LSTM structure

Posted at 2019-10-14

LSTM overview

memory_unit.png

Formulae

Note that some of the notation below differs from that used in the figure above.

Three gates

The three gates are the input gate $\boldsymbol{i}_t$, the forget gate $\boldsymbol{f}_t$, and the output gate $\boldsymbol{o}_t$.
All gate signals share the same basic form:

$$ \sigma(\boldsymbol{W}_{*}\boldsymbol{h}_{t - 1} + \boldsymbol{U}_{*}x_t + \boldsymbol{b}_{*}) $$

which reads as: activation function applied to (recurrent weight × past output + input weight × current input + bias). Written out for each gate:

$$
\begin{aligned}
\boldsymbol{i}_t &= \sigma(\boldsymbol{W}_i\boldsymbol{h}_{t - 1} + \boldsymbol{U}_{i}x_t + \boldsymbol{b}_i) \\
\boldsymbol{f}_t &= \sigma(\boldsymbol{W}_f\boldsymbol{h}_{t - 1} + \boldsymbol{U}_{f}x_t + \boldsymbol{b}_f) \\
\boldsymbol{o}_t &= \sigma(\boldsymbol{W}_o\boldsymbol{h}_{t - 1} + \boldsymbol{U}_{o}x_t + \boldsymbol{b}_o)
\end{aligned}
$$
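
As a minimal sketch of the three gate equations, the NumPy snippet below computes $\boldsymbol{i}_t$, $\boldsymbol{f}_t$ and $\boldsymbol{o}_t$ from one shared helper. The sizes, the random parameters, and the names `gate` and `params` are assumptions made for illustration only; they do not come from the figure or from any particular library.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def gate(W, U, b, h_prev, x_t):
    """One gate signal: sigma(W h_{t-1} + U x_t + b)."""
    return sigmoid(W @ h_prev + U @ x_t + b)

# Hypothetical sizes, chosen only for illustration.
hidden_size, input_size = 4, 3
rng = np.random.default_rng(0)

h_prev = rng.standard_normal(hidden_size)  # past output h_{t-1}
x_t = rng.standard_normal(input_size)      # current input x_t

# Each gate owns its own recurrent weights W, input weights U and bias b.
params = {
    name: (
        rng.standard_normal((hidden_size, hidden_size)),
        rng.standard_normal((hidden_size, input_size)),
        np.zeros(hidden_size),
    )
    for name in ("i", "f", "o")
}

gates = {name: gate(W, U, b, h_prev, x_t) for name, (W, U, b) in params.items()}
for name, value in gates.items():
    print(name, value)
```

The three gates differ only in their parameters, which is why a single helper is enough here.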

Activation function explained (briefly)

Because the sigmoid output lies in $(0, 1)$, each gate controls how much of the information it receives is passed on.

$$
\begin{aligned}
\sigma \approx 1 &\Rightarrow \text{info is (almost) fully "preserved"} \\
\sigma \approx 0 &\Rightarrow \text{info is (almost) completely "discarded"}
\end{aligned}
$$

On the other hand, $\tanh$ is used because its output satisfies $\tanh \in (-1, +1)$: it rescales a received signal into that range while keeping its sign.
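
The roles of the two activations can be checked numerically. The following is a small sketch with hand-picked inputs (all values are illustrative assumptions): the sigmoid acts as a soft switch on a signal, while $\tanh$ bounds its magnitude.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
gate_values = sigmoid(z)  # always strictly between 0 and 1
squashed = np.tanh(z)     # always strictly between -1 and 1

# A gate multiplies a signal element-wise.
signal = np.array([2.0, -3.0, 0.5])
mostly_closed = sigmoid(np.full(3, -10.0)) * signal  # nearly discarded
mostly_open = sigmoid(np.full(3, 10.0)) * signal     # nearly preserved

print(gate_values)
print(squashed)
print(mostly_closed)
print(mostly_open)
```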

Cell state

  • candidate cell state $\tilde{\boldsymbol{c}}_{t}$
$$ \tilde{\boldsymbol{c}}_{t} = \tanh(\boldsymbol{W}_c\boldsymbol{h}_{t - 1} + \boldsymbol{U}_{c}x_t + \boldsymbol{b}_c) $$
  • cell state $\boldsymbol{c}_t$
$$ \boldsymbol{c}_t = \boldsymbol{i}_t \odot \tilde{\boldsymbol{c}}_{t} + \boldsymbol{f}_t \odot \boldsymbol{c}_{t - 1} $$

representing "how much of the new candidate is taken in" + "how much of the past cell state is inherited rather than forgotten"
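
To make the element-wise update concrete, here is a small worked example with hand-picked values. The gate values of exactly 0 and 1 are idealized assumptions; a real sigmoid only approaches them.

```python
import numpy as np

# Hand-picked, idealized values for illustration only.
i_t = np.array([1.0, 0.0, 0.5])       # input gate
f_t = np.array([0.0, 1.0, 0.5])       # forget gate
c_tilde = np.array([0.8, -0.6, 0.4])  # candidate cell state
c_prev = np.array([2.0, -1.0, 1.0])   # previous cell state c_{t-1}

# c_t = i_t ⊙ c~_t + f_t ⊙ c_{t-1}, with ⊙ the element-wise product
c_t = i_t * c_tilde + f_t * c_prev
print(c_t)
```

Component 1 takes only the candidate (0.8), component 2 keeps only the old state (-1.0), and component 3 mixes half of each (0.5 × 0.4 + 0.5 × 1.0 = 0.7).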

Output

The final output $\boldsymbol{h}_t$ ($z_{t, j}$ in the figure) is used in two ways: as the "past output" at the next time step (the recurrent connection) and as the "new input" to the next node.

$$ \boldsymbol{h}_t = \boldsymbol{o}_t \odot \tanh(\boldsymbol{c}_t)$$
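
Putting the formulae together, one time step could be sketched as below. This is a minimal NumPy illustration under stated assumptions, not a reference implementation: the helper names `init_params` and `lstm_step`, the sizes, and the random initialization are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_size, hidden_size, rng):
    """Random (W, U, b) triples for the gates i, f, o and the candidate c."""
    return {
        key: (
            0.1 * rng.standard_normal((hidden_size, hidden_size)),  # W: recurrent weights
            0.1 * rng.standard_normal((hidden_size, input_size)),   # U: input weights
            np.zeros(hidden_size),                                  # b: bias
        )
        for key in ("i", "f", "o", "c")
    }

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new output h_t and cell state c_t."""
    def pre_activation(key):
        W, U, b = params[key]
        return W @ h_prev + U @ x_t + b

    i_t = sigmoid(pre_activation("i"))      # input gate
    f_t = sigmoid(pre_activation("f"))      # forget gate
    o_t = sigmoid(pre_activation("o"))      # output gate
    c_tilde = np.tanh(pre_activation("c"))  # candidate cell state
    c_t = i_t * c_tilde + f_t * c_prev      # cell state update
    h_t = o_t * np.tanh(c_t)                # output
    return h_t, c_t

# Run a short random sequence through the cell.
rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5
params = init_params(input_size, hidden_size, rng)
xs = rng.standard_normal((seq_len, input_size))

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for x_t in xs:
    h, c = lstm_step(x_t, h, c, params)

print(h)
print(c)
```

The loop shows the recurrence: the `h` and `c` returned by one step are fed back in as `h_prev` and `c_prev` for the next.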
