
Understanding Markov Chains: From Basics to Convergence and Sampling

This article explains the fundamentals of Markov chains, illustrates their transition matrix with a market example, demonstrates convergence through Python code, and outlines how to use the stationary distribution for sampling in Monte Carlo simulations.


Overview of Markov Chains

Markov chains assume that the probability of transitioning to a new state depends only on the current state, not on earlier history. This simplification makes them useful in many time‑series models such as RNNs, HMMs, and MCMC.

Formally, if the sequence of states is \(X_0, X_1, \dots\), the conditional probability of \(X_{t+1}\) given the past depends only on \(X_t\): \[P(X_{t+1}=j\mid X_t=i)=P_{ij}.\]

Example: a three‑state market model with states Bull, Bear, and Stagnant. Each entry \(P_{ij}\) of the 3×3 transition matrix gives the probability of moving from state \(i\) to state \(j\); for instance, the chance of moving from Bull to Stagnant is 0.025:

\[P = \begin{pmatrix} 0.9 & 0.075 & 0.025 \\ 0.15 & 0.8 & 0.05 \\ 0.25 & 0.25 & 0.5 \end{pmatrix}.\]

Properties of the Transition Matrix

Using the market example, we start with an initial distribution (e.g., \([0.3, 0.4, 0.3]\)) and repeatedly multiply by the matrix. The Python code below demonstrates the iteration.

<code>import numpy as np

# Transition matrix: rows are current states (Bull, Bear, Stagnant),
# columns are next states; each row sums to 1.
matrix = np.array([[0.9, 0.075, 0.025],
                   [0.15, 0.8, 0.05],
                   [0.25, 0.25, 0.5]])
vector = np.array([0.3, 0.4, 0.3])  # initial distribution over the states
for i in range(100):
    vector = vector @ matrix  # one step of the chain
    print("Current round:", i + 1)
    print(vector)
</code>

After about 60 iterations the distribution stabilizes at \([0.625, 0.3125, 0.0625]\) regardless of the initial vector, as shown by a second run with a different starting distribution.
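Rather than iterating, the stationary distribution can also be computed directly as the left eigenvector of the transition matrix associated with eigenvalue 1. The sketch below (an addition for illustration, using NumPy's `np.linalg.eig`) confirms the limit observed above:

```python
import numpy as np

# Transition matrix from the market example.
P = np.array([[0.9, 0.075, 0.025],
              [0.15, 0.8, 0.05],
              [0.25, 0.25, 0.5]])

# Left eigenvectors of P are right eigenvectors of P.T.
eigvals, eigvecs = np.linalg.eig(P.T)
# Pick the eigenvector whose eigenvalue is (numerically) 1.
idx = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, idx])
pi = pi / pi.sum()  # normalize so the entries form a probability distribution
print(pi)  # approximately [0.625, 0.3125, 0.0625]
```

This is often the quickest check that an iterated distribution has actually converged to the stationary one.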

<code>matrix = np.array([[0.9, 0.075, 0.025],
                   [0.15, 0.8, 0.05],
                   [0.25, 0.25, 0.5]])
vector = np.array([0.7, 0.1, 0.2])  # a different initial distribution
for i in range(100):
    vector = vector @ matrix
    print("Current round:", i + 1)
    print(vector)
</code>

This convergence property holds for irreducible, aperiodic Markov chains (with positive recurrence when the state space is infinite), whether the state space is discrete or continuous.

Markov Chain Sampling

Given a transition matrix whose stationary distribution is the target, we can generate samples by starting from any simple initial distribution (e.g., Gaussian), repeatedly sampling transitions, and collecting the states after a sufficient number of steps. The procedure is:

Provide the transition matrix and the desired number of steps.

Sample an initial state from a simple distribution.

Iteratively sample the next state from the conditional distribution defined by the matrix.

Collect the states after the burn‑in period as samples from the stationary distribution.
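The steps above can be sketched for the market example. The sample sizes and burn‑in length below are illustrative choices, not prescriptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Transition matrix from the market example (rows: Bull, Bear, Stagnant).
P = np.array([[0.9, 0.075, 0.025],
              [0.15, 0.8, 0.05],
              [0.25, 0.25, 0.5]])

n_steps = 50_000   # total number of transitions to simulate
burn_in = 1_000    # initial steps discarded before collecting samples

state = rng.integers(3)  # arbitrary initial state
samples = []
for t in range(n_steps):
    # Sample the next state from the conditional distribution P[state, :].
    state = rng.choice(3, p=P[state])
    if t >= burn_in:
        samples.append(state)

# Empirical state frequencies after burn-in.
freqs = np.bincount(samples, minlength=3) / len(samples)
print(freqs)  # should be close to [0.625, 0.3125, 0.0625]
```

Because consecutive states are correlated, the empirical frequencies converge more slowly than independent draws would, which is why a generous number of post‑burn‑in samples is collected.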

Obtaining a transition matrix that yields a prescribed stationary distribution is non‑trivial; Markov Chain Monte Carlo (MCMC) methods such as Metropolis‑Hastings and Gibbs sampling address this problem, which will be discussed in a future article.

Tags: sampling, Monte Carlo, Markov chain, stochastic process, convergence
Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
