

# Understanding the basics of LSTM units

Long Short-Term Memory (LSTM) is one of the most successful recurrent neural network architectures in modern real-world applications because of its clever use of gates to keep or discard long- and short-term information in its memory.

It was introduced in the long short-term memory paper by *Hochreiter & Schmidhuber (1997)* and later refined by *Gers et al. (2000)* with the addition of a forget gate, making LSTM cells compatible with different sequence lengths. An LSTM unit, as seen in *figure 2*, differs from a standard RNN unit (*figure 1*) in a lot of ways.

LINK TO ARTICLE ABOUT RNN

LSTM mitigates the issue a standard RNN has with long-term dependencies by having dedicated long- and short-term inputs and outputs.

*Figure 2* shows a graphical representation of an LSTM unit. The LSTM unit has several neural network layers inside (*dark gray boxes*), all labelled with their activation functions (σ, tanh). Connections are represented by arrows, and pointwise operations (addition, multiplication) are shown with their respective mathematical signs. An LSTM cell has several outputs: one output *h*ₜ to the layer ahead of it, and the outputs *h*ₜ and *c*ₜ to the next LSTM unit in the temporal dimension, i.e. at the next time step after *t*.

The job of an LSTM unit is to decide what information to remember and what to forget. One can look at an LSTM as a set of steps within the cell. The steps below explain what happens inside the LSTM in *figure 2*.
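To make these steps concrete, here is a minimal NumPy sketch of a single LSTM time step following the standard formulation from *Hochreiter & Schmidhuber (1997)* and *Gers et al. (2000)*. The parameter names (`W_f`, `b_f`, etc.) are my own labels for illustration, not taken from the article or a specific library:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    params holds one weight matrix W_* of shape (hidden, input + hidden)
    and one bias vector b_* of shape (hidden,) per gate/candidate.
    """
    z = np.concatenate([x_t, h_prev])                 # current input + previous short-term state
    f = sigmoid(params["W_f"] @ z + params["b_f"])    # forget gate: what to discard from c
    i = sigmoid(params["W_i"] @ z + params["b_i"])    # input gate: how much new info to write
    g = np.tanh(params["W_g"] @ z + params["b_g"])    # candidate values for the cell state
    o = sigmoid(params["W_o"] @ z + params["b_o"])    # output gate: what part of c to expose
    c_t = f * c_prev + i * g                          # new long-term (cell) state
    h_t = o * np.tanh(c_t)                            # new short-term state = unit output
    return h_t, c_t
```

Both `h_t` and `c_t` are passed on to the next time step, while `h_t` is also the output fed to the layer above, matching the two output paths in *figure 2*.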

Even though LSTM networks are very successful when used for time-series applications, they still suffer from the vanishing and exploding gradients problem explained in LINK TO RNN ARTICLE. This means the problem must still be addressed when building an LSTM network.
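One common way to address the exploding-gradient side of this problem is gradient clipping: rescaling the gradients before the update step so their global norm never exceeds a threshold. A minimal sketch, assuming gradients are held as a list of NumPy arrays (the function name and `max_norm` default are illustrative):

```python
import numpy as np

def clip_gradients(grads, max_norm=5.0):
    """Rescale a list of gradient arrays so their combined L2 norm
    does not exceed max_norm (a standard fix for exploding gradients)."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads
```

Deep-learning frameworks ship equivalents of this (e.g. PyTorch's `torch.nn.utils.clip_grad_norm_`), so in practice you would call the built-in rather than roll your own.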

**References**

Gers, F. A., Schmidhuber, J. & Cummins, F. (2000), 'Learning to forget: Continual prediction with LSTM', Neural Computation 12(10), 2451-2471.

Hochreiter, S. & Schmidhuber, J. (1997), 'Long short-term memory', Neural Computation 9(8), 1735-1780.

Goodfellow, I., Bengio, Y. & Courville, A. (2016), Deep Learning, MIT Press.
