Mastering Monte Carlo: From Acceptance-Rejection to Gibbs Sampling in Python
This article explains the motivation behind Monte Carlo methods, introduces acceptance‑rejection sampling, develops the key ideas of Markov Chain Monte Carlo, and walks through the Metropolis‑Hastings and Gibbs sampling algorithms with Python implementations, highlighting their use for sampling high‑dimensional probability distributions.
Monte Carlo Simulation
Why MCMC is Needed
Motivation 1
If you need to sample a one‑dimensional random variable with a discrete sample space, you can simply partition the unit interval proportionally to the probabilities and draw a uniform random number to select a region. For continuous distributions, the inverse CDF method works only when the density can be integrated and the resulting CDF has a closed‑form inverse. When these conditions fail, we need other approaches, such as acceptance‑rejection sampling or, ultimately, Markov Chain Monte Carlo (MCMC).
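As a concrete illustration of the inverse CDF method, here is a minimal sketch for the exponential distribution, whose CDF F(x) = 1 − exp(−λx) inverts in closed form to F⁻¹(u) = −ln(1 − u)/λ (the function name is my own):

```python
import math
import random

def sample_exponential(lam, n):
    """Inverse-CDF sampling for Exp(lam): F(x) = 1 - exp(-lam*x),
    so F^{-1}(u) = -ln(1 - u) / lam for u uniform on [0, 1)."""
    return [-math.log(1.0 - random.random()) / lam for _ in range(n)]

random.seed(0)
samples = sample_exponential(lam=2.0, n=100_000)
print(sum(samples) / len(samples))  # close to the true mean 1/lam = 0.5
```

Using 1 − random.random() keeps the argument of the logarithm strictly positive, since random.random() can return 0.0 but never 1.0.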
Motivation 2
In high‑dimensional spaces, enumerating all grid points quickly becomes infeasible (the "curse of dimensionality"). MCMC provides a way to sample without exhaustive enumeration.
Uniform Distribution and Box‑Muller Transform
Uniform random numbers are easy to generate. Many common distributions can be derived from uniform samples; for example, the Box‑Muller transform converts two independent uniform variables into independent standard normal variables. Similar transformations exist for exponential, Gamma, Beta, Dirichlet, and other distributions. However, for complex or high‑dimensional target densities, direct sampling may be impossible.
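The Box‑Muller transform mentioned above can be sketched in a few lines (a minimal illustration, not a library‑grade generator):

```python
import math
import random

def box_muller():
    """Turn two independent Uniform(0,1) draws into two independent N(0,1) draws."""
    u1 = 1.0 - random.random()  # strictly positive, safe for log
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

random.seed(0)
zs = [z for _ in range(50_000) for z in box_muller()]
mean = sum(zs) / len(zs)
var = sum((z - mean) ** 2 for z in zs) / len(zs)
print(mean, var)  # sample mean near 0, sample variance near 1
```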
Acceptance‑Rejection Sampling
Goal: generate samples from a target density f(x) using a proposal distribution g(x) that is easy to sample.
Choose a constant c such that f(x) ≤ c·g(x) for all x.
Draw a candidate x from g(x).
Draw a uniform y in [0, c·g(x)].
If y ≤ f(x), accept x; otherwise reject and repeat.
Intuitive Explanation of Acceptance‑Rejection
Visualize the proposal distribution as a rectangle covering the target density. Points falling under the target curve are accepted; points above are rejected.
Python Implementation of Acceptance‑Rejection
Assume the target density is f(x) and we use a uniform proposal.
<code>import random
import math
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
%matplotlib inline
sns.set_style('darkgrid')
plt.rcParams['figure.figsize'] = (12, 8)

def accept_reject(split_val):
    """Sample from f(x) = (x - split_val)**power / sum_ on [0, 1]
    using a uniform proposal g(x) = 1."""
    while True:
        x = random.uniform(0, 1)  # candidate drawn from the proposal g
        y = random.uniform(0, 1)  # y * c is uniform on [0, c * g(x)]
        if y * c <= math.pow(x - split_val, power) / sum_:  # accept if below f(x)
            return x

power = 4
t = 0.4
# Normalizing constant of (x - t)**power on [0, 1]
sum_ = (math.pow(1 - t, power + 1) - math.pow(-t, power + 1)) / (power + 1)
# c bounds f(x): the maximum of (x - t)**4 on [0, 1] is 0.6**4, reached at x = 1
c = 0.6**4 / sum_

x = np.linspace(0, 1, 100)
cc = [c for _ in x]
plt.plot(x, cc, '--', label='c*g(x)')
y = [math.pow(xi - t, power) / sum_ for xi in x]
plt.plot(x, y, label='f(x)')

samples = []
for i in range(10000):
    samples.append(accept_reject(t))
plt.hist(samples, bins=50, density=True, label='sampling')
plt.legend()
plt.show()
</code>
Monte Carlo Summary
Acceptance‑rejection sampling can generate samples for many distributions, but it may fail for complex or high‑dimensional targets, prompting the need for Markov chain methods.
Markov Chain Sampling
Convergence Properties
An aperiodic, irreducible Markov chain (on a finite state space, or more generally a positive recurrent one) has a unique stationary distribution, and the chain converges to it from any initial state.
Sampling via Markov Chains
Given a transition matrix whose stationary distribution is the target, run the chain from an arbitrary state; after a burn‑in period the chain mixes, and the states visited afterward are approximately samples from the target distribution.
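The convergence property is easy to check numerically. Below, a toy 3‑state transition matrix (a made‑up example, not from the article) is iterated from an arbitrary starting distribution; the distribution settles to the same stationary vector regardless of where it starts:

```python
import numpy as np

# A toy 3-state transition matrix; each row sums to 1
P = np.array([[0.9,  0.075, 0.025],
              [0.15, 0.8,   0.05],
              [0.25, 0.25,  0.5]])

dist = np.array([1.0, 0.0, 0.0])  # arbitrary starting distribution
for _ in range(100):
    dist = dist @ P  # one step of the chain on the distribution

print(dist)  # converges to the stationary distribution [0.625, 0.3125, 0.0625]
```

Restarting from, say, [0.0, 0.0, 1.0] gives the same limit, which is exactly the independence from the initial state stated above.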
Markov Chain Monte Carlo (MCMC) Algorithm
When a simple transition matrix does not satisfy detailed balance with respect to the target distribution, introduce an acceptance probability, analogous to acceptance‑rejection sampling, so that the adjusted chain does have the target as its stationary distribution.
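This construction can be verified numerically: with acceptance probability α(i, j) = π(j)·Q(j, i), the adjusted chain satisfies detailed balance for any proposal matrix Q (the vectors and matrices below are made‑up examples):

```python
import numpy as np

pi_target = np.array([0.2, 0.5, 0.3])   # target stationary distribution (assumed example)
Q = np.array([[0.6, 0.3, 0.1],          # an arbitrary proposal transition matrix
              [0.2, 0.5, 0.3],
              [0.4, 0.4, 0.2]])

# Acceptance probability alpha(i, j) = pi(j) * Q(j, i)
alpha = pi_target[None, :] * Q.T

# Detailed balance: pi(i) * Q(i, j) * alpha(i, j) == pi(j) * Q(j, i) * alpha(j, i)
flow = pi_target[:, None] * Q * alpha
print(np.allclose(flow, flow.T))  # True
```

The drawback, which Metropolis‑Hastings fixes, is that these α values can be very small, so the raw chain rejects most proposals.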
Metropolis‑Hastings (M‑H) Sampling
The Metropolis‑Hastings algorithm improves the acceptance rate by scaling the acceptance probability so that the larger of each pair α(i, j), α(j, i) becomes 1, giving α = min(1, π(j)Q(j, i) / (π(i)Q(i, j))). This allows efficient sampling from target densities known only up to a normalizing constant.
M‑H Sampling Python Implementation
<code>from scipy.stats import norm

def norm_dist_prob(theta):
    # Target density: N(3, 2)
    return norm.pdf(theta, loc=3, scale=2)

T = 5000
pi = [0 for _ in range(T)]
sigma = 1  # std of the normal random-walk proposal
t = 0
while t < T - 1:
    t += 1
    # Propose a move from the current state (symmetric proposal)
    pi_star = norm.rvs(loc=pi[t - 1], scale=sigma, size=1)
    # Metropolis acceptance ratio; proposal terms cancel by symmetry
    alpha = min(1, norm_dist_prob(pi_star[0]) / norm_dist_prob(pi[t - 1]))
    u = random.uniform(0, 1)
    if u < alpha:
        pi[t] = pi_star[0]   # accept the proposal
    else:
        pi[t] = pi[t - 1]    # reject: stay at the current state

plt.scatter(pi, norm.pdf(pi, loc=3, scale=2), label='Target Distribution')
plt.hist(pi, bins=50, density=True, facecolor='red', alpha=0.7, label='Samples Distribution')
plt.legend()
plt.show()
</code>
Challenges of M‑H in High Dimensions
In high dimensions the acceptance rate tends to be low, so most proposals are rejected and the chain mixes slowly.
Joint distributions are often intractable, but conditional distributions are easier to compute, motivating Gibbs sampling.
Gibbs Sampling
Two‑Dimensional Gibbs Sampling
Given a bivariate normal distribution, alternate sampling from the conditional distributions of each variable.
<code>from scipy.stats import multivariate_normal

m1, m2 = 5, -1  # means
s1, s2 = 1, 2   # standard deviations
rho = 0.5       # correlation
# Joint density, used only to evaluate pdf values;
# the covariance matrix is built from s1, s2, rho so it matches the conditionals below
samplesource = multivariate_normal(mean=[m1, m2],
                                   cov=[[s1**2, rho * s1 * s2],
                                        [rho * s1 * s2, s2**2]])

def p_ygivenx(x, m1, m2, s1, s2):
    # Conditional y | x of a bivariate normal
    return random.normalvariate(m2 + rho * s2 / s1 * (x - m1), math.sqrt(1 - rho**2) * s2)

def p_xgiveny(y, m1, m2, s1, s2):
    # Conditional x | y of a bivariate normal
    return random.normalvariate(m1 + rho * s1 / s2 * (y - m2), math.sqrt(1 - rho**2) * s1)

N, K = 5000, 20
x_res, y_res, z_res = [], [], []
y = m2
for i in range(N):
    for j in range(K):
        # One Gibbs sweep: update each coordinate from its conditional
        x = p_xgiveny(y, m1, m2, s1, s2)
        y = p_ygivenx(x, m1, m2, s1, s2)
        z = samplesource.pdf([x, y])
        x_res.append(x)
        y_res.append(y)
        z_res.append(z)

plt.hist(x_res, bins=50, density=True, facecolor='green', alpha=0.5, label='x')
plt.hist(y_res, bins=50, density=True, facecolor='red', alpha=0.5, label='y')
plt.title('Histogram')
plt.legend()
plt.show()
</code>
Gibbs Sampling Summary
Gibbs sampling excels in high‑dimensional settings because each step updates only one coordinate using its conditional distribution, effectively mitigating the curse of dimensionality. It is a special case of MCMC and is widely used when joint densities are hard to evaluate but conditionals are tractable.
Source
https://zhuanlan.zhihu.com/p/37121528
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".