Fundamentals 5 min read

How to Build and Forecast an ARMA Model in Python Using Sunspot Data

Learn to load the sunspots.csv dataset, fit an ARMA(4,2) model with statsmodels, evaluate it using AIC/BIC, visualize original and predicted values, and forecast the 1989 sunspot count, all with step‑by‑step Python code and plots.

Model Perspective
Model Perspective
Model Perspective
How to Build and Forecast an ARMA Model in Python Using Sunspot Data

ARMA Model Solution in Python

Using the sunspots.csv file, we build an appropriate ARMA model to predict the number of sunspots in 1989. The statsmodels.api.tsa.ARMA() function is used for fitting.

Code

<code>import pandas as pd, numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
plt.rc('font',family='SimHei'); plt.rc('font',size=16)

d=pd.read_csv('sunspots.csv',usecols=['counts'])
md=sm.tsa.ARMA(d,(9,1)).fit()
years=np.arange(1700,1989)  # known observation years

dhat=md.predict()
plt.plot(years[-20:],d.values[-20:],'o-k')
plt.plot(years[-20:],dhat.values[-20:],'P--')
plt.legend(('Original observations','Predicted values'))
plt.show()

dnext=md.predict(d.shape[0],d.shape[0])
print(dnext)  # show next period prediction
</code>

Below is a complete modeling procedure.

Step 1: Plot the original data series to confirm stationarity.

Step 2: Use AIC and BIC criteria to select ARMA(4,2), fit the model with Python, and obtain residuals and their distribution.

Step 3: Using the fitted model, predict the 1989 sunspot count as 139, and compare with the original data.

Code

<code>import pandas as pd, numpy as np
import statsmodels.api as sm
import matplotlib.pylab as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

plt.rc('axes',unicode_minus=False)
plt.rc('font',family='SimHei'); plt.rc('font',size=16)

d=pd.read_csv('sunspots.csv'); dd=d['counts']
years=d['year'].values.astype(int)

plt.plot(years,dd.values,'-*'); plt.figure()
ax1=plt.subplot(121); plot_acf(dd,ax=ax1,title='Autocorrelation')
ax2=plt.subplot(122); plot_pacf(dd,ax=ax2,title='Partial Autocorrelation')

for i in range(1,6):
    for j in range(1,6):
        md=sm.tsa.ARMA(dd,(i,j)).fit()
        print([i,j,md.aic,md.bic])

zmd=sm.tsa.ARMA(dd,(4,2)).fit()
print(zmd.summary())  # display model information

residuals = pd.DataFrame(zmd.resid)
fig, ax = plt.subplots(1,2)
residuals.plot(title="Residuals", ax=ax[0])
residuals.plot(kind='kde', title='Density', ax=ax[1])
plt.legend(''); plt.ylabel('')

dhat=zmd.predict(); plt.figure()
plt.plot(years[-20:],dd.values[-20:],'o-k')
plt.plot(years[-20:],dhat.values[-20:],'P--')
plt.legend(('Original observations','Predicted values'))

dnext=zmd.predict(d.shape[0],d.shape[0])
print(dnext)  # show next period prediction
plt.show()
</code>
pythonforecastingtime seriesstatsmodelsARMA
Model Perspective
Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.