
Deep Learning Specialization (Coursera) Self-Study Notes (RNN W2)

Posted at 2020-08-10

Introduction

These are my notes on RNN, Week 2 (RNN W2) of the Deep Learning Specialization.

(RNN W2L01) Word Representation

Contents

Word representation

  • 1-hot representation
  • $V = [a, aaron, ... , zulu, <\textrm{UNK}>] $ ($|V| = 10000$)
  • For example, man is the 5391st word, so its 10000-element one-hot vector has a 1 at position 5391 and 0 everywhere else; written $O_{5391}$ (see the sketch below)
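
A minimal numpy sketch of this one-hot representation, using the lecture's index 5391 for man (note that the code vector is 0-indexed while the slides count from 1):

```python
import numpy as np

V = 10000          # vocabulary size |V|
idx_man = 5391     # position of "man" in the vocabulary (lecture example)

o_man = np.zeros(V)   # O_5391: all zeros ...
o_man[idx_man] = 1    # ... except a single 1 at the word's index
```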

Featurized representation : word embedding

|        | Man   | Woman | King  | Queen | Apple | Orange | ... |
|--------|-------|-------|-------|-------|-------|--------|-----|
| Gender | -1    | 1     | -0.95 | 0.97  | 0.00  | 0.00   |     |
| Royal  | 0.01  | 0.02  | 0.93  | 0.95  | -0.01 | 0.00   |     |
| Age    | 0.03  | 0.02  | 0.7   | 0.69  | 0.03  | -0.02  |     |
| Food   | 0.09  | 0.01  | 0.02  | 0.01  | 0.95  | 0.97   |     |
| Size   |       |       |       |       |       |        |     |
| Cost   |       |       |       |       |       |        |     |
| Verb   |       |       |       |       |       |        |     |
| ...    |       |       |       |       |       |        |     |
  • Each column is that word's feature vector (embedding); for Man it is written $e_{5391}$

Visualizing word embeddings

  • t-SNE algorithm

(RNN W2L02) Using Word Embeddings

Contents

  • Named entity recognition example
  • Transfer learning and word embedding
    1. Learn word embeddings from large text corpus (1B - 100B words) (Or download pre-trained embedding online)
    2. Transfer embedding to new task with smaller training set (Say, 100k words)
    3. Optional : Continue to finetune the word embeddings with the new data
  • The idea is similar to face encoding

(RNN W2L03) Properties of word embedding

Contents

Analogies

  • Man → Woman as King → ?
  • Find the word ? such that $e_{man} - e_{woman} \approx e_{king} - e_{?}$

Analogies using word vectors

  • Find word $w$
\textrm{arg} \max_w \textrm{sim} \left( e_w, e_{king} - e_{man} + e_{woman} \right)

Cosine similarity


\textrm{sim}\left( u, v \right) = \frac{u^T v}{\|u\|_2 \|v\|_2}

  • Euclidean distance ($||u - v||^2$) can also be used (though it is nonlinear); a small numpy sketch of the similarity and the analogy search follows
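
A minimal numpy sketch of the cosine similarity and the arg-max analogy search above; the `embeddings` dictionary (word → vector) is an assumed input, not something defined in the lecture:

```python
import numpy as np

def cosine_sim(u, v):
    # sim(u, v) = u^T v / (||u||_2 ||v||_2)
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def find_analogy(word_a, word_b, word_c, embeddings):
    """Solve 'a is to b as c is to ?' by maximizing sim(e_w, e_c - e_a + e_b)."""
    target = embeddings[word_c] - embeddings[word_a] + embeddings[word_b]
    candidates = (w for w in embeddings if w not in (word_a, word_b, word_c))
    return max(candidates, key=lambda w: cosine_sim(embeddings[w], target))

# e.g. find_analogy("man", "woman", "king", embeddings)  # -> ideally "queen"
```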

(RNN W2L04) Embedding matrix

Contents

  • $E o_j = e_j$ ; embedding for word $j$
  • In practice, a specialized lookup function is used to fetch an embedding, because multiplying the matrix $E$ by a one-hot vector every time just to obtain $e_j$ is computationally expensive (see the sketch below)
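
A small numpy check of this point: multiplying $E$ by the one-hot vector $o_j$ gives the same result as simply reading out column $j$, which is why frameworks use an embedding-lookup operation instead of the matrix product. The dimensions follow the lecture (300-dimensional embeddings, 10,000 words), and the index 6257 is just an illustrative value:

```python
import numpy as np

n_embed, V = 300, 10000
E = np.random.randn(n_embed, V)   # embedding matrix; columns are word embeddings

j = 6257                          # e.g. the lecture's index for "orange"
o_j = np.zeros(V)
o_j[j] = 1                        # one-hot vector o_j

e_j_matmul = E @ o_j              # mathematically E o_j = e_j, but O(n_embed * V) work
e_j_lookup = E[:, j]              # what is done in practice: a direct column lookup

assert np.allclose(e_j_matmul, e_j_lookup)
```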

(RNN W2L05) Learning Word Embeddings

Contents

Neural language model

  • Example sentence: I want a glass of orange ____.
  • Using only the previous 4 words (4 is a hyperparameter), the input is 300 * 4 = 1200 dimensional; a softmax over the vocabulary then predicts the next word (a rough code sketch follows)
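
A rough numpy sketch of this forward pass: the embeddings of the last 4 words are concatenated into a 1200-dimensional input, passed through one hidden layer, and a softmax over the 10,000-word vocabulary scores the next word. The hidden-layer size, tanh activation, and random weights are placeholder assumptions, not the course's exact settings:

```python
import numpy as np

V, n_embed, context, n_hidden = 10000, 300, 4, 128
E  = np.random.randn(n_embed, V) * 0.01
W1 = np.random.randn(n_hidden, n_embed * context) * 0.01
b1 = np.zeros(n_hidden)
W2 = np.random.randn(V, n_hidden) * 0.01
b2 = np.zeros(V)

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

def predict_next(word_indices):
    """word_indices: indices of the last 4 words, e.g. 'a glass of orange'."""
    x = np.concatenate([E[:, j] for j in word_indices])   # 300 * 4 = 1200 dims
    h = np.tanh(W1 @ x + b1)                               # hidden layer
    return softmax(W2 @ h + b2)                            # distribution over 10,000 words
```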

Other context/target pairs

  • Example sentence: I want a glass of orange juice to go along with my cereal.
  • Ways to choose the context
    • Last 4 words (as above)
    • 4 words on left & right
    • last 1 word
    • nearby 1 word

(RNN W2L06) Word2Vec

Contents

Skip-grams

  • Example sentence: I want a glass of orange juice to go along with my cereal.
| Context | Target |
|---------|--------|
| orange  | juice  |
| orange  | glass  |
| orange  | my     |

Model

  • Vocab size = 10k
  • context $c$ ("orange") → target $t$ ("juice")
  • $o_c$ → $E$ → $e_c$ → 〇(softmax) → $\hat{y}$

p\left( t | c \right) = \frac{e^{\theta_t^T e_c}}{\sum_{j=1}^{10000} e^{\theta_j^T e_c}}

  • $\theta_t$ ; parameter associated with output $t$ (a numpy sketch of this model follows below)
L\left( \hat{y}, y \right) = -\sum_{i=1}^{10000} y_i \log\hat{y}_i
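
A numpy sketch of this softmax model: $\theta$ is stored as a matrix with one row $\theta_t$ per output word, so $p(t \mid c)$ is a softmax over $\theta e_c$, and the cross-entropy loss with a one-hot $y$ reduces to $-\log \hat{y}_t$. All parameter values are random placeholders:

```python
import numpy as np

V, n_embed = 10000, 300
E     = np.random.randn(n_embed, V) * 0.01   # input embeddings; e_c = E[:, c]
theta = np.random.randn(V, n_embed) * 0.01   # output parameters; row t is theta_t

def p_t_given_c(c):
    e_c = E[:, c]
    logits = theta @ e_c                      # theta_t^T e_c for every t
    logits -= logits.max()
    p = np.exp(logits)
    return p / p.sum()                        # softmax over the 10,000 words

def loss(c, t):
    # cross-entropy with one-hot y: only the target term survives
    return -np.log(p_t_given_c(c)[t])
```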

Problems with softmax classification

  • Computing the softmax denominator (the sum over all 10,000 vocabulary words) is slow → hierarchical softmax

(RNN W2L07) Negative sampling

Contents

Defining a new learning problem

  • Example sentence: I want a glass of orange juice to go along with my cereal.
| context | word  | target? |
|---------|-------|---------|
| orange  | juice | 1 (positive) |
| orange  | king  | 0 (negative) |
| orange  | book  | 0 |
| orange  | the   | 0 |
  • The negative words are picked at random from the dictionary
  • The number of negative samples $k$:
    • $k = 5 \sim 20$ for smaller data sets
    • $k = 2 \sim 5$ for larger data sets

Model

  • Binary (logistic) classifiers instead of one big softmax:

p\left( y=1 | c, t \right) = \sigma\left(\theta_t^T e_c \right)

  • Instead of training on every word in the vocabulary, train only on the positive word plus the $k$ randomly chosen negative words (a numpy sketch follows after the sampling distribution below)

Selecting negative examples


p(w_i) = \frac{f(w_i)^\frac{3}{4}}{\sum_{j=1}^{10000} f(w_j)^\frac{3}{4}}
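
A numpy sketch tying the pieces together: negatives are drawn from the $f(w)^{3/4}$ distribution above, and each (context, word) pair is scored with the logistic model $\sigma(\theta_t^T e_c)$, with label 1 for the positive pair and 0 for the $k$ negatives. The word frequencies and parameter values here are random placeholders:

```python
import numpy as np

V, n_embed, k = 10000, 300, 5
E     = np.random.randn(n_embed, V) * 0.01
theta = np.random.randn(V, n_embed) * 0.01

f = np.random.rand(V)            # placeholder word frequencies f(w_i)
p_neg = f ** 0.75                # f(w_i)^(3/4)
p_neg /= p_neg.sum()             # normalize: P(w_i) = f(w_i)^(3/4) / sum_j f(w_j)^(3/4)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def negative_sampling_loss(c, t_pos):
    negatives = np.random.choice(V, size=k, p=p_neg)       # draw k negative words
    e_c = E[:, c]
    loss = -np.log(sigmoid(theta[t_pos] @ e_c))             # positive pair: y = 1
    for t_neg in negatives:
        loss += -np.log(sigmoid(-(theta[t_neg] @ e_c)))     # negative pairs: y = 0
    return loss
```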

Thoughts

  • Honestly, I don't fully understand this part yet

(RNN W2L08) GloVe word vectors

Contents

  • $X_{ij}$ ; #times $i$ appears in context of $j$

Model


\textrm{minimize} \sum_{i=1}^{10000} \sum_{j=1}^{10000} f(X_{ij}) \left( \theta_i^T e_j + b_i + b_j^\prime - \log X_{ij}  \right)^2 \\
f\left( X_{ij} \right) = 0 \ \ \textrm{if} \ X_{ij} = 0 \\
e_w^{(final)} = \frac{e_w + \theta_w}{2} 

  • $\theta_i$ and $e_j$ play symmetric roles (a sketch of evaluating this objective follows below)
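
A small numpy sketch of evaluating this objective on a toy co-occurrence matrix. The weighting function uses the capped $(X_{ij}/x_{max})^{3/4}$ form from the GloVe paper as an assumption; the lecture only specifies that $f(X_{ij}) = 0$ when $X_{ij} = 0$. All array shapes are placeholders:

```python
import numpy as np

def glove_objective(X, theta, e, b, b_prime, x_max=100.0, alpha=0.75):
    """Sum_ij f(X_ij) (theta_i^T e_j + b_i + b'_j - log X_ij)^2 for a small vocabulary."""
    total = 0.0
    V = X.shape[0]
    for i in range(V):
        for j in range(V):
            if X[i, j] == 0:
                continue                                  # f(X_ij) = 0, and log 0 is avoided
            f = min((X[i, j] / x_max) ** alpha, 1.0)      # GloVe-paper weighting (assumption)
            diff = theta[i] @ e[j] + b[i] + b_prime[j] - np.log(X[i, j])
            total += f * diff ** 2
    return total

# Because theta and e play symmetric roles, the final embedding is their average:
# e_w_final = (e_w + theta_w) / 2
```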

(RNN W2L09) Sentiment classification

Contents

Sentiment classification

| Sentence | # stars |
|----------|---------|
| The dessert is excellent | $\star \star \star \star$ |
| Service was quite slow | $\star \star$ |
| Good for a quick meal, but nothing special | $\star \star \star$ |
  • Simply averaging the $e_w$ of every word and applying a softmax is not good enough (word order also matters)

RNN for sentiment classification

  • Use the many-to-one architecture (a small sketch follows below)
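
A minimal numpy sketch of that many-to-one idea: the word embeddings of one review are fed through a basic RNN one step at a time, and only the final hidden state goes into the softmax over star ratings. The layer sizes, tanh activation, and weight names are illustrative assumptions, not the course's exact model:

```python
import numpy as np

n_embed, n_hidden, n_classes = 300, 64, 5     # 5 possible star ratings
Waa = np.random.randn(n_hidden, n_hidden) * 0.01
Wax = np.random.randn(n_hidden, n_embed) * 0.01
ba  = np.zeros(n_hidden)
Wya = np.random.randn(n_classes, n_hidden) * 0.01
by  = np.zeros(n_classes)

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

def classify(word_embeddings):
    """word_embeddings: list of e_w vectors for the words of one review."""
    a = np.zeros(n_hidden)
    for e in word_embeddings:                 # many inputs ...
        a = np.tanh(Waa @ a + Wax @ e + ba)
    return softmax(Wya @ a + by)              # ... one output (star-rating distribution)
```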

(RNN W2L10) Debiasing word embeddings

Contents

The problem of bias in word embeddings

| Analogy | Judgment |
|---------|----------|
| Man : Woman as King : Queen | no problem |
| Man : Computer_Programmer as Woman : Homemaker | problem |
| Father : Doctor as Mother : Nurse | problem |
  • Word embeddings can reflect the gender, ethnicity, age, sexual orientation, and other biases of the text used to train the model.

Addressing bias in word embedding

  1. Identify bias direction
    • $e_{he} - e_{she}$
    • $e_{male} -e_{female}$
  2. Neutralize : For every word that is not definitional, project to get rid of bias (neutralize and equalize are sketched in code after this list)
  3. Equalize pairs
    • grandfather, grandmother
    • boy, girl
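
A simplified numpy sketch of the neutralize and equalize steps, assuming the bias direction is a single vector such as $e_{he} - e_{she}$ and that embeddings are unit-normalized (roughly in the spirit of the course's debiasing exercise; Bolukbasi et al.'s full method identifies the direction with SVD over several definitional pairs):

```python
import numpy as np

def neutralize(e, g):
    """Remove the component of e along the bias direction g (for non-definitional words)."""
    e_bias = (e @ g) / (g @ g) * g            # projection of e onto g
    return e - e_bias

def equalize(e_a, e_b, g):
    """Make a definitional pair (e.g. grandfather/grandmother) symmetric about the non-bias axis."""
    mu = (e_a + e_b) / 2
    mu_bias = (mu @ g) / (g @ g) * g          # bias component of the midpoint
    mu_orth = mu - mu_bias                    # shared, bias-free component
    scale = np.sqrt(abs(1 - mu_orth @ mu_orth))   # assumes unit-norm embeddings
    def corrected(e):
        e_bias = (e @ g) / (g @ g) * g
        return mu_orth + scale * (e_bias - mu_bias) / np.linalg.norm(e_bias - mu_bias)
    return corrected(e_a), corrected(e_b)
```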

