
Understanding Recurrent Neural Networks: From Vanilla RNN to LSTM with Keras

This article introduces recurrent neural networks (RNNs) and their ability to handle sequential data, explains the limitations of vanilla RNNs, presents the LSTM architecture with its gates, and provides complete Keras code for data loading, model building, and training both vanilla RNN and LSTM models.

Model Perspective

Recurrent Neural Networks

Feed‑forward networks such as MLPs and CNNs are powerful but cannot process sequential data because they lack memory of previous inputs; for tasks like language translation, context is required to predict the next word.

Vanilla RNN

Vanilla RNNs have a simple recurrent structure, but they suffer from the long‑term dependency problem: gradients vanish (or explode) as they are propagated back through many time steps, so the network cannot retain memory over long sequences.
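To make the recurrence concrete, here is a minimal NumPy sketch (separate from the article's Keras code) of what a single recurrent cell computes at each step. The names W_x, W_h, and the dimensions are illustrative choices, not anything from the Keras API:

```python
import numpy as np

# One vanilla RNN cell applied step by step:
# h_t = tanh(W_x x_t + W_h h_{t-1} + b)
rng = np.random.default_rng(0)
input_dim, hidden_dim, steps = 4, 3, 10

W_x = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                    # initial hidden state
xs = rng.normal(size=(steps, input_dim))    # a toy input sequence
for x_t in xs:
    h = np.tanh(W_x @ x_t + W_h @ h + b)    # memory is carried only through h

print(h.shape)  # (3,)
```

Because the only path for information from early inputs is repeated multiplication by W_h inside the tanh, its influence on h shrinks quickly over many steps, which is the long‑term dependency problem in miniature.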

LSTM

LSTM (Long Short‑Term Memory) is an improved recurrent architecture that solves the long‑term dependency issue. It replaces the standard recurrent layer with LSTM cells, each combining a memory cell state with an input gate, a forget gate, and an output gate that control what is written to, kept in, and read out of that state. [Figure: diagram of an LSTM cell]
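The same gates can also be written out directly. Below is a minimal single‑step NumPy sketch of an LSTM cell under the usual formulation; the stacked weight names W, U, b and the dimensions are illustrative assumptions, not the article's Keras code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b stack the weights for the input (i),
    forget (f), output (o) gates and the candidate update (g)."""
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b        # pre-activations, shape (4H,)
    i = sigmoid(z[0:H])                 # input gate: what to write
    f = sigmoid(z[H:2*H])               # forget gate: what to keep
    o = sigmoid(z[2*H:3*H])             # output gate: what to expose
    g = np.tanh(z[3*H:4*H])             # candidate cell update
    c = f * c_prev + i * g              # new cell state
    h = o * np.tanh(c)                  # new hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 4, 3                             # toy input and hidden sizes
h, c = np.zeros(H), np.zeros(H)
W = rng.normal(scale=0.1, size=(4*H, D))
U = rng.normal(scale=0.1, size=(4*H, H))
b = np.zeros(4*H)
h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
print(h.shape, c.shape)
```

The additive update `c = f * c_prev + i * g` is what lets gradients flow across many steps: when the forget gate stays near 1, the cell state passes through largely unchanged instead of being squashed at every step.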

Loading Libraries

<code>import numpy as np
from sklearn.metrics import accuracy_score
from tensorflow.keras.datasets import reuters
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical</code>

Loading Data and Splitting

<code># parameters for data load
num_words = 30000
maxlen = 50
test_split = 0.3
(X_train, y_train), (X_test, y_test) = reuters.load_data(num_words=num_words, maxlen=maxlen, test_split=test_split)</code>
<code># pad the sequences with zeros
# padding='post' => zeros are appended to the end of each sequence
X_train = pad_sequences(X_train, padding='post')
X_test = pad_sequences(X_test, padding='post')

# add a feature dimension: (samples, timesteps) -> (samples, timesteps, 1)
X_train = np.array(X_train).reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = np.array(X_test).reshape((X_test.shape[0], X_test.shape[1], 1))

# one-hot encode train and test labels together so both cover all 46 classes
y_data = np.concatenate((y_train, y_test))
y_data = to_categorical(y_data)

# split back using the actual training-set size instead of a hard-coded count
y_train = y_data[:X_train.shape[0]]
y_test = y_data[X_train.shape[0]:]
</code>

Loading Model

<code>from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN, Activation
from tensorflow.keras import optimizers
# note: this wrapper was removed from recent TensorFlow releases;
# on newer versions use `from scikeras.wrappers import KerasClassifier` instead
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier</code>

Vanilla RNN

<code>def vanilla_rnn():
    model = Sequential()
    # 50 recurrent units over 49 timesteps with 1 feature per step
    model.add(SimpleRNN(50, input_shape=(49, 1), return_sequences=False))
    model.add(Dense(46))  # 46 Reuters topic classes
    model.add(Activation('softmax'))

    adam = optimizers.Adam(learning_rate=0.001)
    model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

    return model
</code>

Model Training

<code>model = KerasClassifier(build_fn=vanilla_rnn, epochs=200, batch_size=50, verbose=1)
model.fit(X_train, y_train)
</code>

LSTM

<code>from tensorflow.keras.layers import LSTM

def lstm():
    model = Sequential()
    # same architecture as above, with the SimpleRNN layer swapped for LSTM
    model.add(LSTM(50, input_shape=(49, 1), return_sequences=False))
    model.add(Dense(46))  # 46 Reuters topic classes
    model.add(Activation('softmax'))

    adam = optimizers.Adam(learning_rate=0.001)
    model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

    return model
</code>

Model Training

<code>model = KerasClassifier(build_fn=lstm, epochs=200, batch_size=50, verbose=1)
model.fit(X_train, y_train)
</code>
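A natural follow‑up, not included in the original code, is to score the trained classifier with the accuracy_score imported at the top. KerasClassifier.predict returns class indices while y_test here is one‑hot, so the labels need an argmax first; the sketch below uses toy arrays in place of the real model output to show the shape handling:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Toy stand-ins: in the article, y_pred would come from model.predict(X_test)
# (class indices) and y_test from the one-hot split earlier.
y_test = np.eye(46)[[3, 4, 3, 19]]      # toy one-hot labels, 46 classes
y_pred = np.array([3, 4, 11, 19])       # toy predicted class indices
y_true = np.argmax(y_test, axis=1)      # one-hot -> class indices
print(accuracy_score(y_true, y_pred))   # 0.75
```

With the trained model from this section, the same idea is simply `y_pred = model.predict(X_test)` followed by `accuracy_score(np.argmax(y_test, axis=1), y_pred)`.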
Tags: deep learning, Keras, LSTM, RNN, sequence modeling
Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
