Causal Inference for Recommender Systems: Fundamentals, the MACR Model, and Practical Experiments
This article introduces the core concepts of causal inference, explains the structural causal model and potential‑outcome frameworks, presents the MACR model for removing popularity bias in recommender systems, and describes two experiments run on the Zhuanzhuan platform, followed by directions for future work.
1. Introduction to Causal Inference
Causal inference studies the relationship between causes and effects, distinguishing correlation from true causation. It aims to estimate the impact of interventions, such as how a user’s decision would change under a different recommendation.
1.1 What Is Causality
Causality describes how a cause leads to an effect, e.g., whether higher education increases personal income.
1.2 Correlation Does Not Imply Causation
Two events may be correlated only because a hidden confounder causes both. For example, sleeping with shoes on is correlated with waking up with a headache, yet neither causes the other: drinking alcohol before going to sleep makes both more likely.
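The shoes-and-headache example can be checked with a small simulation. The sketch below is illustrative only: the probabilities, the sample size, and the helper names `pearson` and `simulate` are assumptions, not values from the article. Drinking is the only cause of both events, so the two are correlated overall but roughly uncorrelated once the confounder is held fixed.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=20000, seed=0):
    """Draw (drank, shoes, headache) triples; shoes never influences headache."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        drank = rng.random() < 0.3                          # hidden confounder
        shoes = rng.random() < (0.8 if drank else 0.1)      # depends on drinking only
        headache = rng.random() < (0.7 if drank else 0.1)   # depends on drinking only
        rows.append((int(drank), int(shoes), int(headache)))
    return rows

rows = simulate()
overall = pearson([r[1] for r in rows], [r[2] for r in rows])

# Hold the confounder fixed: look only at people who did not drink.
sober = [r for r in rows if r[0] == 0]
within = pearson([r[1] for r in sober], [r[2] for r in sober])

print(f"overall corr(shoes, headache) = {overall:.3f}")
print(f"corr among non-drinkers       = {within:.3f}")
```

The overall correlation is clearly positive, while the correlation among non-drinkers stays close to zero, which is what a pure confounding relationship predicts.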
1.3 Representative Causal Frameworks
1.3.1 Structural Causal Model (SCM)
SCM represents variables and their causal relations as a directed acyclic graph (DAG) and defines structural functions to compute each node’s value from its parents.
Typical DAG structures are the chain (A → B → C), the fork (A ← B → C), and the collider (A → B ← C).
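A minimal sketch of an SCM in code may help make the idea of structural functions concrete. The graph below (popularity → exposure → click ← match), its coefficients, and the names `STRUCTURAL_FUNCTIONS`, `sample`, and `click_rate` are all hypothetical and chosen only for illustration. Each node is computed from its parents plus independent noise, and an intervention replaces a node's structural function with a fixed value.

```python
import random

# Hypothetical toy graph: popularity -> exposure -> click <- match.
# Nodes are listed in topological order; each function maps
# (values of earlier nodes, noise in [0, 1)) to the node's value.
STRUCTURAL_FUNCTIONS = {
    "popularity": lambda v, u: u,   # exogenous
    "match":      lambda v, u: u,   # exogenous
    "exposure":   lambda v, u: 1 if u < 0.2 + 0.6 * v["popularity"] else 0,
    "click":      lambda v, u: 1 if v["exposure"] and u < v["match"] else 0,
}

def sample(rng, do=None):
    """Sample every node once; `do` fixes nodes to given values (an intervention)."""
    do = do or {}
    values = {}
    for name, f in STRUCTURAL_FUNCTIONS.items():
        noise = rng.random()
        values[name] = do[name] if name in do else f(values, noise)
    return values

def click_rate(do=None, n=20000, seed=0):
    """Monte Carlo estimate of P(click), optionally under an intervention."""
    rng = random.Random(seed)
    return sum(sample(rng, do)["click"] for _ in range(n)) / n

observed = click_rate()                   # no intervention
forced = click_rate(do={"exposure": 1})   # do(exposure = 1)
print(f"P(click) = {observed:.3f}, P(click | do(exposure=1)) = {forced:.3f}")
```

In this toy graph, forcing exposure cuts the popularity → exposure edge, so the click rate under the intervention depends only on the match variable.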
1.3.2 Potential Outcome Framework
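This framework defines a treatment (intervention) variable and an outcome variable. The individual treatment effect (ITE) is the difference between a unit's outcome with treatment and its outcome without treatment, and the average treatment effect (ATE) is the mean of that difference over the population. Both can be estimated without relying on a causal graph.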
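The two quantities can be illustrated on simulated data, where both potential outcomes of every unit are visible (in real data only one of them is ever observed). The sketch below is an assumption-laden toy: the outcome distributions, the true effect of 0.5, and all variable names are made up for illustration. It computes the ITE and ATE directly, then estimates the ATE from a randomized assignment by a difference in means.

```python
import random

rng = random.Random(0)
n = 10000

# Simulation only: both potential outcomes are known for every unit.
y0 = [rng.gauss(1.0, 1.0) for _ in range(n)]        # outcome without treatment
y1 = [y + 0.5 + rng.gauss(0.0, 0.2) for y in y0]    # outcome with treatment

ite = [b - a for a, b in zip(y0, y1)]   # individual treatment effects
ate = sum(ite) / n                      # average treatment effect

# Randomized assignment: each unit reveals only one potential outcome.
treated = [rng.random() < 0.5 for _ in range(n)]
obs_t = [b for t, b in zip(treated, y1) if t]
obs_c = [a for t, a in zip(treated, y0) if not t]
ate_hat = sum(obs_t) / len(obs_t) - sum(obs_c) / len(obs_c)

print(f"true ATE = {ate:.3f}, difference-in-means estimate = {ate_hat:.3f}")
```

Because treatment is assigned at random here, the difference in means is an unbiased estimate of the ATE; with non-random assignment (for example, exposure driven by popularity) it generally is not.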
2. The MACR Model
2.1 Background
The MACR authors [3] argue that a user’s rating depends on three factors: user‑item matching, item popularity, and user conformity. Existing models focus only on matching and ignore popularity bias.
MACR introduces a model‑agnostic counterfactual reasoning framework that trains recommendation models on a causal graph to eliminate popularity bias during inference.
2.2 Counterfactual Reasoning
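Counterfactual reasoning asks what score an item would receive if user‑item matching played no role and only popularity and conformity acted. Subtracting this counterfactual score from the model's prediction removes the direct effect of popularity, so the final ranking reflects user‑item matching.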
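As a hedged sketch, the inference rule described in the MACR paper [3] can be written as a small function. The symbol names (`y_k` for the matching score, `y_i` and `y_u` for the item and user branch logits, `c` for the reference value) follow one reading of the paper, and the numbers in the example are invented for illustration. Setting `c = 0` gives the fused training-time score; a positive `c` subtracts the part of the score that popularity and conformity would produce on their own.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def macr_score(y_k, y_i, y_u, c=0.0):
    """Fused score minus the counterfactual score obtained with matching fixed to c.

    y_k: matching score from the backbone recommender
    y_i: item-branch logit (item popularity)
    y_u: user-branch logit (user conformity)
    c:   reference matching value, treated as a tunable hyper-parameter
    """
    gate = sigmoid(y_i) * sigmoid(y_u)
    return y_k * gate - c * gate

# Illustrative values: a popular item with a weaker match
# versus a niche item with a stronger match.
popular = dict(y_k=0.5, y_i=2.0, y_u=0.0)
niche = dict(y_k=0.7, y_i=-1.0, y_u=0.0)

biased_popular = macr_score(**popular)            # c = 0: no debiasing
biased_niche = macr_score(**niche)
debiased_popular = macr_score(**popular, c=0.45)
debiased_niche = macr_score(**niche, c=0.45)

print(f"biased:   popular={biased_popular:.4f}  niche={biased_niche:.4f}")
print(f"debiased: popular={debiased_popular:.4f}  niche={debiased_niche:.4f}")
```

With these made-up numbers the popular item ranks first before the correction and the niche item ranks first after it. The value of `c` matters: the corrected score equals `(y_k - c) * gate`, so `c` has to be tuned on validation data rather than fixed in advance.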
2.3 Framework
MACR uses multi‑task learning with three branches (bias, preference, and traditional ranking) and combines their outputs for the final prediction. The overall loss includes the standard BCE loss plus two auxiliary losses.
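A loss of this shape can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one branch per factor from the background section (matching, item, user), the multiplicative fusion described in the MACR paper [3], and hypothetical weights `alpha` and `beta` for the two auxiliary losses. Details such as where the sigmoid is applied are also assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one example with predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def macr_style_loss(batch, alpha=1e-3, beta=1e-3):
    """Mean of main BCE + alpha * item-branch BCE + beta * user-branch BCE.

    batch: iterable of (y_k, y_i, y_u, label), where y_k is the matching
    score and y_i, y_u are the item and user branch logits (assumed names).
    """
    total, n = 0.0, 0
    for y_k, y_i, y_u, label in batch:
        fused = y_k * sigmoid(y_i) * sigmoid(y_u)   # combine the three branches
        loss_main = bce(sigmoid(fused), label)      # standard BCE on fused score
        loss_item = bce(sigmoid(y_i), label)        # auxiliary: item branch alone
        loss_user = bce(sigmoid(y_u), label)        # auxiliary: user branch alone
        total += loss_main + alpha * loss_item + beta * loss_user
        n += 1
    return total / n

# Two made-up examples: one clicked, one not clicked.
batch = [(2.0, 1.0, 0.5, 1), (-1.0, -0.5, 0.0, 0)]
main_only = macr_style_loss(batch, alpha=0.0, beta=0.0)
with_aux = macr_style_loss(batch, alpha=1.0, beta=1.0)
print(f"main only = {main_only:.4f}, with auxiliary losses = {with_aux:.4f}")
```

The auxiliary terms push the item and user branches to explain clicks on their own, which is what lets the popularity and conformity effects be separated from matching at inference time.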
3. Practical Experiments at Zhuanzhuan
3.1 Experiment One
A two‑stage training scheme separates bias features (static item attributes) from preference features. The bias branch is trained first and the preference branch second, and their outputs are summed before the sigmoid. No significant improvement was observed.
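The two-stage idea can be sketched on synthetic data. Everything below is hypothetical: the article does not describe the model architecture, so the sketch uses two one-feature logistic branches, made-up data, and invented names (`fit_logistic`, `log_loss`, `x_bias`, `x_pref`). Stage one fits the bias branch alone; stage two freezes its logit as an offset and fits the preference branch, so the final logit is the sum of both branches before the sigmoid.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_logistic(xs, ys, offsets, lr=0.5, epochs=200):
    """Fit w, b so that sigmoid(offset + w*x + b) matches y; offsets stay frozen."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y, off in zip(xs, ys, offsets):
            err = sigmoid(off + w * x + b) - y   # gradient of BCE w.r.t. the logit
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def log_loss(logits, ys):
    """Mean binary cross-entropy computed from logits."""
    total = 0.0
    for z, y in zip(logits, ys):
        p = sigmoid(z)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(ys)

# Synthetic data: one bias feature and one preference feature per sample.
rng = random.Random(0)
n = 1000
x_bias = [rng.gauss(0, 1) for _ in range(n)]
x_pref = [rng.gauss(0, 1) for _ in range(n)]
labels = [int(rng.random() < sigmoid(1.0 * xb + 1.5 * xp - 0.5))
          for xb, xp in zip(x_bias, x_pref)]

# Stage 1: train the bias branch on its own.
wb, bb = fit_logistic(x_bias, labels, offsets=[0.0] * n)
bias_logits = [wb * x + bb for x in x_bias]

# Stage 2: freeze the bias logits and train the preference branch on top.
wp, bp = fit_logistic(x_pref, labels, offsets=bias_logits)
final_logits = [bl + wp * x + bp for bl, x in zip(bias_logits, x_pref)]

loss_stage1 = log_loss(bias_logits, labels)
loss_stage2 = log_loss(final_logits, labels)
print(f"loss after stage 1 = {loss_stage1:.4f}, after stage 2 = {loss_stage2:.4f}")
```

In this toy setting the second stage lowers the training loss because the preference feature carries real signal. That says nothing about online metrics, which is where the experiment reported no significant gain.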
3.2 Experiment Two
This experiment follows the MACR causal graph but omits the user bias term because of data sparsity. The combined loss uses the hyper‑parameters λ=0.1 and μ=30. Online A/B testing showed +1.95% pCTR and +0.70% uCTR, and the “rich‑get‑richer” effect was reduced.
3.3 Future Work
Apply causal inference to correct exposure bias and other systematic biases.
Integrate Zhuanzhuan’s knowledge graph with causal methods for more targeted business guidance.
References
[1] Causal Inference in Recommender Systems: A Survey and Future Directions
[2] https://www.bradyneal.com/causal-inference-course
[3] Model‑Agnostic Counterfactual Reasoning for Eliminating Popularity Bias in Recommender System
[4] Causal Inference Recommender System Toolbox - MACR: https://www.jianshu.com/p/ffed9c9260e3
[5] Special Topic on Popularity Bias in Recommender Systems: https://zhuanlan.zhihu.com/p/613111042
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.