How neural networks learn from sequences such as text, audio, and time‑series data
Recurrent Neural Networks (RNNs) are designed to process sequential data. Unlike regular neural networks, RNNs maintain a hidden state that carries information from previous steps in the sequence, allowing them to learn temporal patterns.
At each time step, an RNN takes an input and the previous hidden state:
h_t = activation(W x_t + U h_{t-1} + b)
Because h_t depends on h_{t-1}, information can propagate through time, and the same weights W and U are reused at every step regardless of sequence length.
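The recurrence above can be sketched in a few lines of NumPy. All dimensions here are made up for illustration, and tanh stands in for the generic "activation":

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 3-dim inputs, 4-dim hidden state, 5 time steps.
input_dim, hidden_dim, steps = 3, 4, 5
W = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden weights
U = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden weights
b = np.zeros(hidden_dim)                       # bias

xs = rng.normal(size=(steps, input_dim))       # one input vector per time step
h = np.zeros(hidden_dim)                       # initial hidden state h_0

for x_t in xs:
    # h_t = activation(W x_t + U h_{t-1} + b), here with tanh as the activation
    h = np.tanh(W @ x_t + U @ h + b)

print(h.shape)  # final hidden state, one value per hidden unit
```

Note that the same W, U, and b are applied at every step; only the hidden state changes.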
Standard RNNs struggle with long sequences due to:
- Vanishing gradients: during backpropagation through time, gradients are repeatedly multiplied by the recurrent weights, and they can shrink toward zero, so the network fails to learn long-range dependencies.
- Exploding gradients: the same repeated multiplication can instead blow gradients up, destabilizing training.
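The vanishing-gradient effect can be seen numerically. The sketch below (a toy assumption: a small random recurrent matrix with norm below 1, and the tanh derivative ignored, which would only shrink gradients further) backpropagates a gradient through 50 steps:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, steps = 4, 50

# Toy recurrent weight matrix, scaled so its norm is comfortably below 1.
U = 0.1 * rng.normal(size=(hidden_dim, hidden_dim))

grad = np.eye(hidden_dim)  # gradient of the last hidden state w.r.t. itself
norms = []
for _ in range(steps):
    # Each backward step multiplies the gradient by (roughly) U transposed.
    grad = U.T @ grad
    norms.append(np.linalg.norm(grad))

print(norms[0], norms[-1])  # the norm collapses toward zero over 50 steps
```

With weights whose norm exceeds 1, the same loop exhibits the opposite, exploding behavior.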
To address these issues, gated architectures were introduced:
- LSTM (Long Short-Term Memory): adds input, forget, and output gates plus a cell state, letting the network preserve information over long spans.
- GRU (Gated Recurrent Unit): a simpler gated variant with fewer parameters that often performs comparably.
from tensorflow import keras
from tensorflow.keras import layers

# Sentiment-style classifier: token IDs -> embeddings -> LSTM -> probability.
model = keras.Sequential([
    keras.Input(shape=(None,)),  # variable-length sequences of token IDs
    layers.Embedding(input_dim=5000, output_dim=32),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

model.summary()  # summary() prints the architecture itself; wrapping it in print() just prints None
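To see the model above in action, here is a sketch that fits it on synthetic data (the sample count, sequence length, and labels are made up purely for illustration):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical data: 8 "documents", each a sequence of 20 token IDs below 5000.
x = np.random.randint(0, 5000, size=(8, 20))
y = np.random.randint(0, 2, size=(8,))

model = keras.Sequential([
    keras.Input(shape=(None,)),
    layers.Embedding(input_dim=5000, output_dim=32),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(x, y, epochs=1, verbose=0)      # one pass over the toy data
probs = model.predict(x, verbose=0)       # one probability per sequence
print(probs.shape)                        # (8, 1), each value in (0, 1)
```

In a real task the random integers would be replaced by tokenized text, typically padded to a common length per batch.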
Although transformers have become the dominant architecture for sequence tasks, RNNs remain important for lightweight models, embedded systems, and understanding the foundations of sequence learning.
Now that you understand RNNs, you're ready to explore modern sequence models in Lesson 34: Introduction to Transformers.