Which Python Causal Inference Library Wins? A Hands‑On Comparison of Six Tools
This article compares six popular Python causal inference libraries—Bnlearn, Pgmpy, CausalNex, DoWhy, PyAgrum, and CausalImpact—using the U.S. Census Income dataset to answer whether a graduate degree raises the probability of earning over $50K, and provides detailed code, pros, cons, and results for each tool.
In data science, understanding why a variable influences an outcome is often more valuable than merely predicting it. Causal inference helps identify true drivers behind observed relationships, which is essential for predictive maintenance, marketing optimization, and strategic decision‑making.
To help practitioners choose the right tool, we evaluate six widely used Python causal inference libraries on the same benchmark: the U.S. Census Income dataset. The core question is whether holding a graduate degree (Doctorate) increases the probability of an annual income exceeding $50K.
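Before reaching for causal machinery, the question can be sanity-checked with a plain conditional frequency table. The sketch below uses a tiny hypothetical sample (not the real census data) purely to illustrate the comparison:

```python
import pandas as pd

# Hypothetical toy sample standing in for the census data (illustration only)
df_toy = pd.DataFrame({
    'education': ['Doctorate', 'Doctorate', 'Doctorate', 'HS-grad', 'HS-grad', 'HS-grad'],
    'salary':    ['>50K', '>50K', '<=50K', '<=50K', '<=50K', '>50K'],
})

# Conditional frequency of salary within each education level
rates = pd.crosstab(df_toy['education'], df_toy['salary'], normalize='index')
print(rates)
```

A gap in these raw rates is only an association; the libraries below try to separate the causal contribution of education from confounding.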
1. Bnlearn
Bnlearn offers a complete Bayesian-network toolbox supporting discrete, continuous, and mixed datasets. It bundles structure learning, parameter learning, causal inference, synthetic data generation, discretization, model evaluation, and interactive visualisation.
Pros: Full pipeline, easy to start, good visual output.
Cons: Limited to Bayesian networks; performance may drop with many variables.
Input data: Discrete, continuous, or mixed.
# Install and import
pip install bnlearn
import bnlearn as bn
# Load and clean data
import datazets as dz
import pandas as pd
import numpy as np
df = dz.get(data='census_income')
drop_cols = ['age','fnlwgt','education-num','capital-gain','capital-loss','hours-per-week','race','sex']
df.drop(labels=drop_cols, axis=1, inplace=True)
# Unsupervised structure learning
model = bn.structure_learning.fit(df, methodtype='hillclimbsearch', scoretype='bic')
model = bn.independence_test(model, df, test='chi_square', alpha=0.05, prune=True)
model = bn.parameter_learning.fit(model, df)
# Inference examples
query = bn.inference.fit(model, variables=['salary'], evidence={'education':'Doctorate'})
print(query)
query = bn.inference.fit(model, variables=['salary'], evidence={'education':'HS-grad'})
print(query)

2. Pgmpy
Pgmpy provides low‑level building blocks for probabilistic graphical models, giving maximum flexibility at the cost of more manual work.
Pros: Highly flexible, suitable for custom algorithms.
Cons: Steeper learning curve, requires manual model assembly.
Input data: Must be discrete.
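Because Pgmpy expects discrete variables, any continuous columns (such as the age or hours-per-week fields dropped earlier) would first need binning. A minimal sketch with pandas, using hypothetical bin edges and labels:

```python
import pandas as pd

# Hypothetical continuous column, binned into three labelled intervals
ages = pd.Series([22, 35, 47, 58, 63, 29])
age_binned = pd.cut(ages, bins=[0, 30, 50, 100], labels=['young', 'mid', 'senior'])
print(age_binned.tolist())
```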
# Install
pip install pgmpy
from pgmpy.estimators import HillClimbSearch, BicScore
from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination
est = HillClimbSearch(df)
dag = est.estimate(scoring_method=BicScore(df))
print('Learned edges:', dag.edges())
# The search returns only a structure; wrap it in a BayesianNetwork,
# keep any isolated variables, and fit the CPDs before inference
model = BayesianNetwork(dag.edges())
model.add_nodes_from(df.columns)
model.fit(df)
infer = VariableElimination(model)
result = infer.query(variables=['salary'], evidence={'education':'Doctorate'})
print(result)

3. CausalNex
CausalNex implements the NOTEARS algorithm for structure learning, focusing on numeric discrete data.
Pros: Advanced NOTEARS algorithm, good for projects needing state‑of‑the‑art structure discovery.
Cons: Requires preprocessing, limited compatibility with newer Python versions.
Input data: Numeric discrete.
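Since NOTEARS works on numeric values, categorical columns are encoded to integer codes first (the section's code does this with a single LabelEncoder). Keeping the per-column category mappings lets you decode results back to the original labels; a pandas-only sketch on toy data:

```python
import pandas as pd

df_toy = pd.DataFrame({
    'education': ['Doctorate', 'HS-grad', 'Masters'],
    'salary': ['>50K', '<=50K', '>50K'],
})
# Encode each column as integer category codes, keeping the mappings for decoding
cats = {col: df_toy[col].astype('category').cat.categories for col in df_toy.columns}
df_num = df_toy.apply(lambda s: s.astype('category').cat.codes)
print(df_num)
# Decode: code 0 in 'education' maps back to its original label
print(cats['education'][0])
```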
# Install
pip install causalnex
from causalnex.structure.notears import from_pandas
from sklearn.preprocessing import LabelEncoder
import networkx as nx
# NOTEARS needs numeric input, so label-encode every column first
le = LabelEncoder()
df_num = df.apply(le.fit_transform)
# Learn the structure, then prune weak edges
sm = from_pandas(df_num)
sm.remove_edges_below_threshold(0.8)
nx.draw_networkx(sm, with_labels=True)

4. DoWhy
DoWhy focuses on estimating causal effects rather than learning the full graph. It requires the user to specify treatment and outcome variables and works with a supplied causal graph.
Pros: Rigorous effect‑estimation framework with built‑in robustness checks.
Cons: Cannot learn the causal graph automatically; results need statistical interpretation.
Input data: Binary treatment and outcome variables.
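With a binary treatment, it helps to first see what the unadjusted effect looks like. The sketch below computes a naive difference in outcome means between treated and untreated rows on hypothetical toy data; DoWhy's backdoor adjustment then corrects this kind of figure for confounders:

```python
import pandas as pd

# Toy data: treatment = holds a Doctorate (1/0), outcome = earns >50K (1/0)
df_toy = pd.DataFrame({
    'treatment': [1, 1, 0, 0, 0, 1],
    'outcome':   [1, 1, 0, 1, 0, 0],
})
# Naive (unadjusted) difference in mean outcomes
means = df_toy.groupby('treatment')['outcome'].mean()
naive_effect = means[1] - means[0]
print(naive_effect)
```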
# Install
pip install dowhy
from dowhy import CausalModel
import datazets as dz
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder
# Load and clean data
df = dz.get(data='census_income')
drop_cols = ['age','fnlwgt','education-num','capital-gain','capital-loss','hours-per-week','race','sex']
df.drop(labels=drop_cols, axis=1, inplace=True)
# Binary treatment (Doctorate?)
df['education'] = df['education'] == 'Doctorate'
# Encode all columns
le = LabelEncoder()
for col in df.columns:
    df[col] = le.fit_transform(df[col])
model = CausalModel(
    data=df,
    treatment='education',
    outcome='salary',
    common_causes=list(df.columns[~df.columns.isin(['education','salary'])])
)
model.view_model()
identified_estimand = model.identify_effect()
estimate = model.estimate_effect(identified_estimand, method_name='backdoor.propensity_score_stratification')
print(estimate)
refute = model.refute_estimate(identified_estimand, estimate, method_name='random_common_cause')
print(refute)

5. PyAgrum
PyAgrum is a versatile library supporting Bayesian networks, Markov networks, and other graphical models. It offers many learning algorithms but can be heavy for beginners.
Pros: Comprehensive model support, rich functionality.
Cons: Strict preprocessing requirements, visualization depends on Graphviz, smaller community.
Input data: Complete discrete dataset.
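Because the learner expects a complete discrete dataset, it is worth checking for missing values before handing the frame over (the section's code does this with dropna). A sketch of that check on hypothetical toy data:

```python
import pandas as pd

df_toy = pd.DataFrame({
    'education': ['Doctorate', None, 'HS-grad'],
    'salary': ['>50K', '<=50K', None],
})
# Report missing counts per column, then keep only complete rows as categoricals
print(df_toy.isna().sum())
df_complete = df_toy.dropna().astype('category')
print(len(df_complete))
```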
# Install
pip install pyagrum setgraphviz
import datazets as dz
import pandas as pd
import pyagrum as gum
from setgraphviz import setgraphviz
setgraphviz()
df = dz.get(data='census_income')
drop_cols = ['age','fnlwgt','education-num','capital-gain','capital-loss','hours-per-week','race','sex']
df.drop(labels=drop_cols, axis=1, inplace=True)
df = df.dropna().copy()
for col in df.columns:
    df[col] = df[col].astype('category')
learner = gum.BNLearner(df)
learner.useScoreBIC()
learner.useGreedyHillClimbing()
bn = learner.learnBN()
bn2 = learner.learnParameters(bn.dag())
# Visualise
import pyagrum.lib.notebook as gnb
gnb.showBN(bn2)

6. CausalImpact
CausalImpact is specialised for time‑series interventions. It fits a Bayesian structural time‑series model to estimate the effect of a single change (e.g., a product launch) on a metric.
Pros: Direct, visual interpretation of intervention impact on time series.
Cons: Only works with time‑series data; not suitable for general causal graphs.
Input data: Time‑series with a clearly defined intervention point.
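The snippet below passes y_data and x1_data to CausalImpact without defining them. A minimal simulation, under assumed values (100 time points, intervention at index 70, a fixed +10 lift), keeps the example reproducible:

```python
import numpy as np

# Assumed setup: 100 time points, intervention at index 70, +10 lift afterwards
rng = np.random.default_rng(0)
n, t0 = 100, 70
x1_data = 100 + rng.normal(0, 1, n).cumsum()   # control series (unaffected by the intervention)
y_data = 1.2 * x1_data + rng.normal(0, 1, n)   # response tracks the control...
y_data[t0:] += 10                              # ...plus a lift after the intervention
print(y_data[:3])
```

These arrays match the pre_period=[0, 69] and post_period=[70, 99] split used below.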
# Install
pip install causalimpact
import pandas as pd
from causalimpact import CausalImpact
# Simulated data: y = traffic, x1 = control variable
data = pd.DataFrame({'y': y_data, 'x1': x1_data})
impact = CausalImpact(data, pre_period=[0, 69], post_period=[70, 99])
impact.run()
impact.plot()
impact.summary()

Final Recommendations
If you want a quick, out-of-the-box solution that automatically discovers causal structure, choose Bnlearn.
If you need full control over every modelling step and are comfortable with low-level APIs, go with Pgmpy.
When the primary goal is estimating a treatment effect with a rigorous statistical framework, DoWhy is the best fit.
For time‑series interventions, CausalImpact is the dedicated tool.
Data Party THU
Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.