End‑to‑End Time Series Forecasting with LSTM in Python
This tutorial walks through loading Google stock data, preprocessing it with scaling, constructing past‑window features, building and tuning an LSTM model using GridSearchCV, evaluating predictions, and finally forecasting future values, all illustrated with complete Python code.
In many practical scenarios we need to forecast a target series, such as brand sales or product demand. This article demonstrates a complete end‑to‑end workflow for time‑series prediction using an LSTM network in Python.
Data loading and inspection
We read a CSV file containing Google stock data from 2001‑01‑25 to 2021‑09‑29, parse the Date column, and set the first column as the index.
<code>import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor  # newer TF releases: use scikeras.wrappers.KerasRegressor
from sklearn.model_selection import GridSearchCV
df = pd.read_csv("train.csv", parse_dates=["Date"], index_col=[0])
print(df.shape) # (5203, 5)
</code>We aim to predict the Open column, so it serves as the target variable; the remaining columns, together with past Open values, are used as input features.
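The dataset itself isn't bundled here, but the loading step can be sanity-checked on a toy in-memory stand-in for train.csv (column names assumed from the usual Google OHLC export):

```python
import io
import pandas as pd

# Toy stand-in for train.csv with the same five feature columns
csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close\n"
    "2021-09-28,100.0,101.0,99.0,100.5,100.5\n"
    "2021-09-29,100.5,102.0,100.0,101.5,101.5\n"
)
toy_df = pd.read_csv(csv, parse_dates=["Date"], index_col=[0])
print(toy_df.shape)        # (2, 5)
print(toy_df.index.dtype)  # datetime64[ns]
```

If the Date column fails to parse, the index dtype falls back to object, which silently breaks date-based slicing later, so the dtype check is worth the two lines.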
Train‑test split
Because time series must remain ordered, we split the data without shuffling: 80% for training and the last 20% for testing.
<code>test_split = round(len(df) * 0.20)  # 1041 rows
df_for_training = df[:-test_split]
df_for_testing = df[-test_split:]
print(df_for_training.shape)  # (4162, 5)
print(df_for_testing.shape)   # (1041, 5)
</code>Scaling
We apply a MinMaxScaler to bring all features into the range [0, 1]. Note that the scaler is fit only on the training set and then applied to the test set, which avoids leaking test-set statistics into training.
<code>scaler = MinMaxScaler(feature_range=(0, 1))
df_for_training_scaled = scaler.fit_transform(df_for_training)
df_for_testing_scaled = scaler.transform(df_for_testing)
</code>Creating X and Y sequences
Using a sliding window of n_past = 30 time steps, we build input arrays trainX and testX that contain the previous 30 rows of all five features, and target arrays trainY and testY that contain the corresponding Open value at the next time step.
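Before the real implementation, the same windowing idea can be checked on a toy two-feature array (illustrative helper and values, not the tutorial's data):

```python
import numpy as np

def make_windows(data, n_past):
    # Each X sample holds the previous n_past rows of all features;
    # each y is the first column at the next time step
    X, y = [], []
    for i in range(n_past, len(data)):
        X.append(data[i - n_past:i])
        y.append(data[i, 0])
    return np.array(X), np.array(y)

toy = np.arange(20.0).reshape(10, 2)  # 10 time steps, 2 features
X, y = make_windows(toy, 3)
print(X.shape, y.shape)  # (7, 3, 2) (7,)
print(y[0])              # 6.0 -> row 3, column 0
```

A series of length 10 with a window of 3 yields 10 − 3 = 7 samples, which is exactly why the tutorial's trainX has 4132 rows from 4162 scaled training rows.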
<code>def createXY(dataset, n_past):
    dataX, dataY = [], []
    for i in range(n_past, len(dataset)):
        dataX.append(dataset[i - n_past:i, 0:dataset.shape[1]])
        dataY.append(dataset[i, 0])
    return np.array(dataX), np.array(dataY)
trainX, trainY = createXY(df_for_training_scaled, 30)
testX, testY = createXY(df_for_testing_scaled, 30)
print("trainX Shape--", trainX.shape) # (4132, 30, 5)
print("trainY Shape--", trainY.shape) # (4132,)
</code>Model definition and hyper‑parameter search
We define a function that builds a Sequential LSTM model with two LSTM layers, a dropout layer, and a dense output. GridSearchCV searches over batch size, epochs, and optimizer.
<code>def build_model(optimizer):
    model = Sequential()
    model.add(LSTM(50, return_sequences=True, input_shape=(30, 5)))
    model.add(LSTM(50))
    model.add(Dropout(0.2))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer=optimizer)
    return model

grid_model = KerasRegressor(build_fn=build_model, verbose=1, validation_data=(testX, testY))
parameters = {
    'batch_size': [16, 20],
    'epochs': [8, 10],
    'optimizer': ['adam', 'Adadelta']
}
grid_search = GridSearchCV(estimator=grid_model, param_grid=parameters, cv=2)
grid_search = grid_search.fit(trainX, trainY)
print(grid_search.best_params_) # {'batch_size': 20, 'epochs': 10, 'optimizer': 'adam'}
</code>Training the best model
The underlying Keras model of the best estimator is extracted and stored as my_model.
<code>my_model = grid_search.best_estimator_.model
</code>Evaluation on the test set
We predict on testX, then inverse-transform the scaled predictions back to the original price scale. Because the scaler expects five columns, we repeat the single-column predictions five times before applying inverse_transform, then keep only the first column.
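The column-repeating trick is easy to verify in isolation on toy data (shapes and values here are illustrative):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Fit a scaler on toy 5-column data so it expects 5 features
data = np.arange(50.0).reshape(10, 5)
sc = MinMaxScaler(feature_range=(0, 1)).fit(data)

# A single-column "prediction" in scaled space
scaled_pred = np.array([[0.0], [1.0]])
tiled = np.repeat(scaled_pred, 5, axis=-1)    # shape (2, 5)
unscaled = sc.inverse_transform(tiled)[:, 0]  # keep the target column only
print(np.round(unscaled, 6))  # [ 0. 45.] -> min and max of column 0
```

Since MinMaxScaler stores a separate min and scale per column, only the first column of the inverse-transformed array is meaningful here; the other four are discarded.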
<code>prediction = my_model.predict(testX)
prediction_copies = np.repeat(prediction, 5, axis=-1)
pred = scaler.inverse_transform(prediction_copies)[:, 0]
original_copies = np.repeat(testY, 5, axis=-1)
original = scaler.inverse_transform(original_copies.reshape(len(testY), 5))[:, 0]
</code>We plot the real and predicted stock prices.
<code>plt.plot(original, color='red', label='Real Stock Price')
plt.plot(pred, color='blue', label='Predicted Stock Price')
plt.title('Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('Google Stock Price')
plt.legend()
plt.show()
</code>Forecasting future values
To predict the next 30 days, we take the last 30 rows of the historical data and load the future feature values, which lack an Open column. We add a placeholder Open column so the scaler sees five columns, scale both parts, mark the unknown future Open values as NaN, and then iteratively feed a 30-step sliding window into the trained model, writing each prediction back into the frame so it becomes part of the next window.
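The feed-back-the-prediction loop can be sketched on a toy series with a stand-in "model" (a plain function here) before running the real one:

```python
import numpy as np

def fake_model(window):
    # Stand-in for my_model.predict: the mean of the window's first column
    return window[:, 0].mean()

history = np.arange(10.0).reshape(10, 1)  # known past values
buf = list(history[:, 0])
preds = []
for _ in range(3):  # 3-step horizon
    window = np.array(buf[-5:]).reshape(5, 1)
    p = fake_model(window)
    preds.append(p)
    buf.append(p)  # the prediction becomes part of the next window
print(preds)
```

The key point the toy makes visible: from the second step onward, every window contains earlier predictions, so errors can compound over the horizon.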
<code># Load the last 30 days
past_30 = df.iloc[-30:,:]
# Load future feature values (without Open)
future_features = pd.read_csv("test.csv", parse_dates=["Date"], index_col=[0])
future_features["Open"] = 0
future_features = future_features[["Open","High","Low","Close","Adj Close"]]
# Scale both parts
old_scaled = scaler.transform(past_30)
new_scaled = scaler.transform(future_features)
new_scaled_df = pd.DataFrame(new_scaled)
new_scaled_df.iloc[:,0] = np.nan
full_df = pd.concat([pd.DataFrame(old_scaled), new_scaled_df]).reset_index(drop=True)
# Iterative prediction
all_preds = []
for i in range(30, len(full_df)):
    x_input = full_df.iloc[i - 30:i, :].values.reshape(1, 30, 5)
    pred = my_model.predict(x_input)[0, 0]  # extract the scalar prediction
    all_preds.append(pred)
    full_df.iloc[i, 0] = pred  # feed the prediction into the next window
# Inverse transform the future predictions
future_pred = np.array(all_preds).reshape(-1,1)
future_pred_copies = np.repeat(future_pred,5,axis=-1)
y_pred_future = scaler.inverse_transform(future_pred_copies)[:,0]
print(y_pred_future)
</code>The script outputs a list of predicted Open prices for the next 30 days, completing the end‑to‑end forecasting pipeline.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.