How to Quickly Analyze, Visualize, and Predict Stock Prices with Python in 12 Minutes

This tutorial walks through loading historical stock data from Yahoo Finance with pandas, computing moving averages and returns, comparing several competitor stocks, engineering features, training linear, polynomial, and K‑Nearest‑Neighbor models with scikit‑learn, scoring them on held‑out data, and plotting a short‑term price forecast, all in about twelve minutes of reading and coding.

MaGe Linux Operations

Loading Yahoo Finance Data

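We use pandas_datareader to fetch Apple (AAPL) daily price data from 2010‑01‑01 to 2017‑01‑11. The returned DataFrame holds the Open, High, Low, Close, Adj Close, and Volume columns used in the rest of the tutorial.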

import pandas as pd
import datetime
import pandas_datareader.data as web

start = datetime.datetime(2010, 1, 1)
end   = datetime.datetime(2017, 1, 11)

df = web.DataReader("AAPL", "yahoo", start, end)
print(df.tail())

Exploratory Analysis: Moving Average & Returns

Compute a 100‑day rolling average and plot it alongside the closing price.

import matplotlib as mpl
import matplotlib.pyplot as plt

# Jupyter magic: render plots inline (remove this line when running as a script)
%matplotlib inline

mpl.rc('figure', figsize=(8, 7))
mpl.style.use('ggplot')

close_px = df['Adj Close']
mavg = close_px.rolling(window=100).mean()  # the first 99 values are NaN

close_px.plot(label='AAPL')
mavg.plot(label='mavg')
plt.legend()
plt.show()
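
To see what the rolling window does, here is a minimal sketch on a short made-up series. The prices and the window of 3 are illustrative assumptions, not AAPL data; the call itself is the same rolling(...).mean() used above.

```python
import pandas as pd

# Hypothetical prices, chosen only to make the arithmetic easy to follow
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# Same call as above, with a 3-observation window instead of 100
mavg3 = prices.rolling(window=3).mean()

print(mavg3.tolist())  # [nan, nan, 11.0, 12.0, 13.0]
```

The first window - 1 entries are NaN because a full window is not yet available, which is why the moving-average line in the plot starts later than the price line.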

Calculate simple daily returns, r_t = p_t / p_(t-1) - 1, and plot them.

rets = close_px / close_px.shift(1) - 1
rets.plot(label='return')
plt.legend()
plt.show()
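
As a quick check, the manual formula should match pandas' built-in pct_change(). The sketch below uses three hypothetical prices (an assumption for illustration only).

```python
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0])  # hypothetical closing prices

rets_manual = prices / prices.shift(1) - 1   # formula used in this tutorial
rets_builtin = prices.pct_change()           # pandas equivalent

print(rets_manual.round(4).tolist())  # [nan, 0.1, -0.1]
```

The first return is NaN because there is no previous price to compare against.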

Comparing Competitor Stocks

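Fetch adjusted close prices for AAPL, GE, GOOG, IBM, and MSFT, then compute daily percentage changes and their correlation matrix. A scatter plot, a scatter matrix, and a heat map show how closely the stocks move together.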

dfcomp = web.DataReader(["AAPL", "GE", "GOOG", "IBM", "MSFT"], "yahoo", start, end)["Adj Close"]
retscomp = dfcomp.pct_change()
corr = retscomp.corr()
plt.scatter(retscomp.AAPL, retscomp.GE)
plt.xlabel('Returns AAPL')
plt.ylabel('Returns GE')
plt.show()

# scatter_matrix lives in pd.plotting; the old top-level pd.scatter_matrix was removed
pd.plotting.scatter_matrix(retscomp, diagonal='kde', figsize=(10,10))
plt.show()

plt.imshow(corr, cmap='hot', interpolation='none')
plt.colorbar()
plt.xticks(range(len(corr)), corr.columns)
plt.yticks(range(len(corr)), corr.columns)
plt.show()
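
To illustrate how the correlation matrix should be read, the following sketch builds synthetic returns instead of downloading data. The column names A, B, and C, the seed, and the noise levels are all assumptions made for the example: A and B share a common component, while C is independent.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
market = rng.normal(0, 0.01, 500)  # shared component (made up)

demo = pd.DataFrame({
    "A": market + rng.normal(0, 0.005, 500),  # follows the shared component
    "B": market + rng.normal(0, 0.005, 500),  # follows the shared component
    "C": rng.normal(0, 0.01, 500),            # independent of the others
})

corr_demo = demo.corr()  # pairwise Pearson correlation, as in retscomp.corr()
print(corr_demo.round(2))
```

In this toy setup the A/B entry should be clearly positive and the entries involving C close to zero; in the heat map above, brighter cells play the role of the A/B pair.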

Feature Engineering

Create two additional features: high‑low percentage (HL_PCT) and percent change from open to close (PCT_change).

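# Copy so the new columns are added to an independent frame, not a slice of df
dfreg = df.loc[:, ['Adj Close', 'Volume']].copy()
dfreg['HL_PCT'] = (df['High'] - df['Low']) / df['Close'] * 100.0       # daily range as % of close
dfreg['PCT_change'] = (df['Close'] - df['Open']) / df['Open'] * 100.0  # open-to-close move in %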
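
A worked example may make the two features concrete. The single trading day below is hypothetical; only the two formulas come from the code above.

```python
import pandas as pd

# One hypothetical trading day
bar = pd.DataFrame({"Open": [100.0], "High": [104.0], "Low": [98.0], "Close": [102.0]})

hl_pct = (bar["High"] - bar["Low"]) / bar["Close"] * 100.0       # (104 - 98) / 102 * 100
pct_change = (bar["Close"] - bar["Open"]) / bar["Open"] * 100.0  # (102 - 100) / 100 * 100

print(round(float(hl_pct.iloc[0]), 2), round(float(pct_change.iloc[0]), 2))  # 5.88 2.0
```

HL_PCT measures how wide the day's trading range was, and PCT_change measures the direction and size of the move from open to close.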

Pre‑processing & Cross‑validation

Handle missing values, define a forecast horizon, shift the label column, scale features, and split into training and test sets.

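import math
import numpy as np
from sklearn import preprocessing, model_selection

# Replace missing values with a sentinel (this fills them; it does not drop rows)
dfreg.fillna(value=-99999, inplace=True)

# Forecast horizon: 1% of the available rows
forecast_out = int(math.ceil(0.01 * len(dfreg)))
forecast_col = 'Adj Close'
# Label = adjusted close forecast_out rows ahead; the last forecast_out labels are NaN
dfreg['label'] = dfreg[forecast_col].shift(-forecast_out)

# Standardize the features, then hold back the rows that have no label yet
X = np.array(dfreg.drop(columns=['label']))
X = preprocessing.scale(X)
X_lately = X[-forecast_out:]
X = X[:-forecast_out]

y = np.array(dfreg['label'])
y = y[:-forecast_out]

# Random 80/20 split into training and test sets
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2)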
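
The label shift is the step that turns this into a forecasting problem, so a tiny sketch may help. The five prices and the horizon of 2 are assumptions for illustration; the names X_known and y_known are hypothetical and do not appear in the tutorial code.

```python
import pandas as pd

demo = pd.DataFrame({"Adj Close": [10.0, 11.0, 12.0, 13.0, 14.0]})
forecast_out = 2  # hypothetical horizon, in rows

# Each row's label is the price forecast_out rows later
demo["label"] = demo["Adj Close"].shift(-forecast_out)
print(demo)

# Rows with a label are used for fitting; the tail has features but no label yet
X_known = demo[["Adj Close"]].iloc[:-forecast_out]
y_known = demo["label"].iloc[:-forecast_out]
X_lately = demo[["Adj Close"]].iloc[-forecast_out:]
```

The last forecast_out rows have no label because their future price is not yet in the data, and those are exactly the rows the model is later asked to predict.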

Model Training

Train four regressors: ordinary linear regression, polynomial regression of degree 2 (quadratic) and degree 3 (cubic) with a Ridge penalty, and K‑Nearest‑Neighbors.

from sklearn.linear_model import LinearRegression, Ridge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

clfreg   = LinearRegression(n_jobs=-1)
clfpoly2 = make_pipeline(PolynomialFeatures(2), Ridge())
clfpoly3 = make_pipeline(PolynomialFeatures(3), Ridge())
clfknn   = KNeighborsRegressor(n_neighbors=2)

clfreg.fit(X_train, y_train)
clfpoly2.fit(X_train, y_train)
clfpoly3.fit(X_train, y_train)
clfknn.fit(X_train, y_train)
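
To show what the PolynomialFeatures pipeline adds over a plain linear fit, here is a sketch on synthetic data. The curve y = 0.5 x^2 plus noise, the seed, and the variable names are assumptions chosen for the example, not part of the stock data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))                    # one made-up feature
y_curve = 0.5 * x[:, 0] ** 2 + rng.normal(0, 0.1, 200)   # curved target with noise

linear = LinearRegression().fit(x, y_curve)
# PolynomialFeatures(2) expands x into [1, x, x^2] before the Ridge fit
quadratic = make_pipeline(PolynomialFeatures(2), Ridge()).fit(x, y_curve)

print("linear R^2:   ", linear.score(x, y_curve))
print("quadratic R^2:", quadratic.score(x, y_curve))
```

On a curved relationship like this one, the straight-line model should explain very little of the variance while the degree-2 pipeline should fit closely. Whether the extra flexibility helps on real price data is a separate question that the test-set scores address.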

Evaluation

Score each model on the test set. For a scikit‑learn regressor, score() returns the coefficient of determination (R²), which this tutorial calls "confidence"; 1.0 means a perfect fit.

confidencereg   = clfreg.score(X_test, y_test)
confidencepoly2 = clfpoly2.score(X_test, y_test)
confidencepoly3 = clfpoly3.score(X_test, y_test)
confidenceknn   = clfknn.score(X_test, y_test)

print('Linear regression confidence:', confidencereg)
print('Polynomial regression (deg 2) confidence:', confidencepoly2)
print('Polynomial regression (deg 3) confidence:', confidencepoly3)
print('KNN regression confidence:', confidenceknn)
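
The value returned by score() for a regressor is R², computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below checks this on synthetic data; the data, seed, and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X_demo = rng.normal(size=(300, 2))  # two made-up features
y_demo = X_demo[:, 0] - 2 * X_demo[:, 1] + rng.normal(0, 0.1, 300)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
knn = KNeighborsRegressor(n_neighbors=2).fit(X_tr, y_tr)

score = knn.score(X_te, y_te)
pred = knn.predict(X_te)
# R^2 = 1 - SS_res / SS_tot
manual = 1 - np.sum((y_te - pred) ** 2) / np.sum((y_te - y_te.mean()) ** 2)

print(score, manual, r2_score(y_te, pred))
```

All three printed numbers should agree. R² can be negative when a model predicts worse than a constant equal to the mean of the test targets, so it is not a probability.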

Forecasting

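Predict prices for the held‑back rows (X_lately) with the linear model, append them to the DataFrame on the dates after the last observation, and plot them together with the historical adjusted close. The loop advances one calendar day per prediction, so weekend dates also receive a forecast value.

forecast_set = clfreg.predict(X_lately)  # one prediction per held-back row
dfreg['Forecast'] = np.nan

# Start the forecast index on the day after the last observation
last_date = dfreg.iloc[-1].name
next_date = last_date + datetime.timedelta(days=1)

# Append one row per prediction: NaN in every column except Forecast
for price in forecast_set:
    dfreg.loc[next_date] = [np.nan] * (len(dfreg.columns) - 1) + [price]
    next_date += datetime.timedelta(days=1)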

dfreg['Adj Close'].tail(500).plot()
dfreg['Forecast'].tail(500).plot()
plt.legend(loc=4)
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()
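
The forecast rows above are indexed by consecutive calendar days. One possible alternative, sketched here as an assumption rather than as part of the original method, is to index them by business days with pandas' bdate_range. The last observation date and the horizon of 5 are illustrative values.

```python
import pandas as pd

last_date = pd.Timestamp("2017-01-11")  # assumed date of the last observation
n_forecasts = 5                          # hypothetical number of predictions

# Business days (Mon-Fri) following the last observation
future_dates = pd.bdate_range(start=last_date + pd.offsets.BDay(1), periods=n_forecasts)

print(future_dates.strftime("%Y-%m-%d").tolist())
# ['2017-01-12', '2017-01-13', '2017-01-16', '2017-01-17', '2017-01-18']
```

bdate_range skips weekends only. It does not know exchange holidays, so a holiday calendar would still be needed to match actual trading days.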

Future Improvements

Incorporate qualitative economic factors such as news sentiment analysis.

Include quantitative macro‑economic indicators (e.g., HPI, income inequality) to enrich the model.
