Which Python Causal Inference Library Wins? A Hands‑On Comparison of Six Tools

This article compares six popular Python causal inference libraries (Bnlearn, Pgmpy, CausalNex, DoWhy, PyAgrum, and CausalImpact) on the same benchmark, the U.S. Census Income dataset, asking whether a graduate degree raises the probability of earning over $50K. For each tool it walks through the code and weighs pros, cons, and results.

Data Party THU

In data science, understanding why a variable influences an outcome is often more valuable than merely predicting it. Causal inference helps identify true drivers behind observed relationships, which is essential for predictive maintenance, marketing optimization, and strategic decision‑making.

To help practitioners choose the right tool, we evaluate six widely used Python causal inference libraries on the same benchmark: the U.S. Census Income dataset. The core question is whether holding a graduate degree (Doctorate) increases the probability of an annual income exceeding $50K.
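Stripped of library specifics, the benchmark question is a conditional probability: does P(salary > $50K) change when we condition on education level? A minimal sketch on a toy frame (the values below are illustrative, not drawn from the Census data):

```python
import pandas as pd

# Toy stand-in for the Census frame: education level and a salary label
toy = pd.DataFrame({
    'education': ['Doctorate', 'Doctorate', 'HS-grad', 'HS-grad', 'HS-grad'],
    'salary':    ['>50K',      '<=50K',     '<=50K',   '>50K',    '<=50K'],
})

# Empirical P(salary > 50K | education) -- the quantity each library,
# in its own way, tries to estimate and causally interpret
p = (toy['salary'] == '>50K').groupby(toy['education']).mean()
print(p)
```

The libraries below go beyond this raw conditional probability by modelling (or controlling for) the other variables that confound the education-salary relationship.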

1. Bnlearn

Bnlearn offers a complete Bayesian‑network toolbox supporting discrete, continuous, and mixed data. It bundles structure learning, parameter estimation, inference, synthetic data generation, discretization, model evaluation, and interactive visualisation.

Core functions: structure learning, parameter learning, causal inference, synthetic data, discretization, model evaluation, visualisation.

Pros: Full pipeline, easy to start, good visual output.

Cons: Limited to Bayesian networks; performance may drop with many variables.

# Install
pip install bnlearn

import bnlearn as bn
import datazets as dz
import pandas as pd
import numpy as np

# Load and clean data
df = dz.get(data='census_income')
drop_cols = ['age','fnlwgt','education-num','capital-gain','capital-loss','hours-per-week','race','sex']
df.drop(labels=drop_cols, axis=1, inplace=True)

# Unsupervised structure learning
model = bn.structure_learning.fit(df, methodtype='hillclimbsearch', scoretype='bic')
model = bn.independence_test(model, df, test='chi_square', alpha=0.05, prune=True)
model = bn.parameter_learning.fit(model, df)

# Inference examples
query = bn.inference.fit(model, variables=['salary'], evidence={'education':'Doctorate'})
print(query)
query = bn.inference.fit(model, variables=['salary'], evidence={'education':'HS-grad'})
print(query)

2. Pgmpy

Pgmpy provides low‑level building blocks for probabilistic graphical models, giving maximum flexibility at the cost of more manual work.

Pros: Highly flexible, suitable for custom algorithms.

Cons: Steeper learning curve, requires manual model assembly.

Input data: Must be discrete.
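Because pgmpy expects discrete columns, any continuous feature has to be binned first. A minimal sketch with `pd.cut` on a hypothetical `age` column (the bin edges and labels here are arbitrary choices, not from the article's pipeline):

```python
import pandas as pd

# Continuous values must be discretized before pgmpy can use them
ages = pd.Series([22, 35, 48, 61, 74])

# Bin into three labelled intervals: (0, 30], (30, 50], (50, 100]
bins = pd.cut(ages, bins=[0, 30, 50, 100], labels=['young', 'middle', 'senior'])
print(bins.tolist())
```

In the Census benchmark this step is unnecessary, since the continuous columns are dropped and the remaining ones are already categorical.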

# Install
pip install pgmpy

from pgmpy.estimators import HillClimbSearch, BicScore, MaximumLikelihoodEstimator
from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination

# Structure learning
est = HillClimbSearch(df)
dag = est.estimate(scoring_method=BicScore(df))
print('Learned edges:', dag.edges())

# Wrap the learned DAG in a BayesianNetwork and fit its parameters
# before running inference
model = BayesianNetwork(dag.edges())
model.fit(df, estimator=MaximumLikelihoodEstimator)

infer = VariableElimination(model)
result = infer.query(variables=['salary'], evidence={'education':'Doctorate'})
print(result)

3. CausalNex

CausalNex implements the NOTEARS algorithm for structure learning, focusing on numeric discrete data.

Pros: Advanced NOTEARS algorithm, good for projects needing state‑of‑the‑art structure discovery.

Cons: Requires preprocessing, limited compatibility with newer Python versions.

Input data: Numeric discrete.
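Since CausalNex's NOTEARS implementation needs numeric codes, categorical columns must be encoded first. One caveat with the `LabelEncoder` approach used below is that refitting per column discards the code-to-label mapping; `pd.factorize` is an alternative that keeps it. A small sketch (column values are illustrative):

```python
import pandas as pd

# Encode a categorical column as integers while keeping the mapping,
# so inference results can be decoded back to labels later
col = pd.Series(['HS-grad', 'Doctorate', 'HS-grad', 'Masters'])
codes, categories = pd.factorize(col)

print(codes.tolist())    # numeric representation, in order of first appearance
print(list(categories))  # position i is the label for code i
```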

# Install
pip install causalnex

from causalnex.structure.notears import from_pandas
from causalnex.network import BayesianNetwork
from causalnex.inference import InferenceEngine
from sklearn.preprocessing import LabelEncoder

# NOTEARS requires numeric input
le = LabelEncoder()
df_num = df.apply(le.fit_transform)
sm = from_pandas(df_num)
sm.remove_edges_below_threshold(0.8)

# Visualise the learned structure
import networkx as nx
nx.draw_networkx(sm, with_labels=True)

# Fit CPDs and query (assumes the thresholded graph is acyclic)
bn = BayesianNetwork(sm)
bn = bn.fit_node_states(df_num).fit_cpds(df_num, method='BayesianEstimator', bayes_prior='K2')
ie = InferenceEngine(bn)
marginals = ie.query({'education': 1})  # use the numeric code for the level of interest
print(marginals['salary'])

4. DoWhy

DoWhy focuses on estimating causal effects rather than learning the full graph. It requires the user to specify treatment and outcome variables and works with a supplied causal graph.

Pros: Rigorous effect‑estimation framework with built‑in robustness checks.

Cons: Cannot learn the causal graph automatically; results need statistical interpretation.

Input data: Binary treatment and outcome variables.

# Install
pip install dowhy

from dowhy import CausalModel
import datazets as dz
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Load and clean data
df = dz.get(data='census_income')
drop_cols = ['age','fnlwgt','education-num','capital-gain','capital-loss','hours-per-week','race','sex']
df.drop(labels=drop_cols, axis=1, inplace=True)

# Binary treatment (Doctorate?)
df['education'] = df['education'] == 'Doctorate'

# Encode all columns
le = LabelEncoder()
for col in df.columns:
    df[col] = le.fit_transform(df[col])

model = CausalModel(
    data=df,
    treatment='education',
    outcome='salary',
    common_causes=list(df.columns[~df.columns.isin(['education','salary'])])
)
model.view_model()
identified_estimand = model.identify_effect()
estimate = model.estimate_effect(identified_estimand, method_name='backdoor.propensity_score_stratification')
print(estimate)
refute = model.refute_estimate(identified_estimand, estimate, method_name='random_common_cause')
print(refute)

5. PyAgrum

PyAgrum is a versatile library supporting Bayesian networks, Markov networks, and other graphical models. It offers many learning algorithms but can be heavy for beginners.

Pros: Comprehensive model support, rich functionality.

Cons: Strict preprocessing requirements, visualization depends on Graphviz, smaller community.

Input data: Complete discrete dataset.

# Install
pip install pyagrum setgraphviz

import datazets as dz
import pandas as pd
import pyagrum as gum
from setgraphviz import setgraphviz

setgraphviz()

df = dz.get(data='census_income')
drop_cols = ['age','fnlwgt','education-num','capital-gain','capital-loss','hours-per-week','race','sex']
df.drop(labels=drop_cols, axis=1, inplace=True)

df = df.dropna().copy()
for col in df.columns:
    df[col] = df[col].astype('category')

learner = gum.BNLearner(df)
learner.useScoreBIC()
learner.useGreedyHillClimbing()
bn = learner.learnBN()
bn2 = learner.learnParameters(bn.dag())

# Visualise
import pyagrum.lib.notebook as gnb
gnb.showBN(bn2)

# Query P(salary | education = Doctorate)
ie = gum.LazyPropagation(bn2)
ie.setEvidence({'education': 'Doctorate'})
ie.makeInference()
print(ie.posterior('salary'))

6. CausalImpact

CausalImpact is specialised for time‑series interventions. It fits a Bayesian structural time‑series model to estimate the effect of a single change (e.g., a product launch) on a metric.
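The core idea can be sketched without the full BSTS machinery: learn the pre-intervention relationship between the metric and a control series, predict the post-period counterfactual from it, and read the effect off the gap. A deliberately simplified stand-in using ordinary least squares on simulated data (the real model is far richer, with local trends and Bayesian uncertainty intervals):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(100, dtype=float)            # control series
y = 2.0 * x + rng.normal(0, 0.1, 100)      # metric tracks the control...
y[70:] += 10                               # ...until an intervention at t=70

# Fit the pre-period relationship, then predict the post-period counterfactual
slope, intercept = np.polyfit(x[:70], y[:70], 1)
counterfactual = slope * x[70:] + intercept

# Average gap between observed and counterfactual = estimated effect
effect = (y[70:] - counterfactual).mean()
print(round(effect, 1))
```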

Pros: Direct, visual interpretation of intervention impact on time series.

Cons: Only works with time‑series data; not suitable for general causal graphs.

Input data: Time‑series with a clearly defined intervention point.

# Install
pip install causalimpact

import pandas as pd
from causalimpact import CausalImpact

import numpy as np

# Simulated data: y = traffic metric, x1 = control series; intervention at t=70
np.random.seed(0)
x1_data = np.random.randn(100).cumsum() + 100
y_data = 1.2 * x1_data + np.random.randn(100)
y_data[70:] += 10  # intervention effect

data = pd.DataFrame({'y': y_data, 'x1': x1_data})
impact = CausalImpact(data, pre_period=[0, 69], post_period=[70, 99])
impact.run()
impact.plot()
impact.summary()

Final Recommendations

If you want a quick, out‑of‑the‑box solution that automatically discovers causal structure, choose Bnlearn.

If you need full control over every modelling step and are comfortable with low‑level APIs, go with Pgmpy.

If state‑of‑the‑art structure discovery via NOTEARS matters most and you can handle the preprocessing, consider CausalNex.

When the primary goal is estimating a treatment effect with a rigorous statistical framework, DoWhy is the best fit.

If you need graphical models beyond Bayesian networks, PyAgrum offers the broadest toolbox.

For time‑series interventions, CausalImpact is the dedicated tool.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: Python, causal inference, CausalImpact, DoWhy, pgmpy, library comparison, bnlearn
Written by Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.