
Causal Inference‑Based Attribution Methods in Feizhu Advertising Diagnosis System

This article introduces Feizhu's advertising diagnosis platform and explains how recent causal inference techniques, especially the NO TEARS algorithm and Bayesian‑network‑based attribution, are applied to identify the root causes of performance fluctuations across the ad delivery funnel, improve diagnostic accuracy, and guide optimization decisions.

DataFunTalk

The article begins with an overview of Feizhu's advertising diagnosis system, which helps advertisers quickly detect and resolve issues in ad delivery by monitoring key metrics such as exposure, ROI, and spend across the entire funnel.

It then describes the motivation for adopting causal inference: traditional manual diagnostics are time‑consuming and rely heavily on expert experience, while the rapid growth of ad campaigns demands an automated method to pinpoint the layers (recall, targeting, allocation, bidding, ranking, etc.) that cause metric deviations.

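Two attribution approaches are presented. The first is a causal‑discovery method based on the NO TEARS algorithm, which recasts the combinatorial search for a directed acyclic graph (DAG) as a smooth, equality‑constrained optimization problem over a weighted adjacency matrix W, solved with an augmented Lagrangian. The acyclicity constraint h(W) = 0 is built on a function chosen to satisfy four conditions: h(W) is zero exactly when the graph is acyclic, its value measures how far the graph is from being a DAG, it is smooth, and it and its derivatives are cheap to compute.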
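To make the acyclicity constraint concrete, here is a minimal sketch of the function h(W) = tr(exp(W∘W)) − d from the NO TEARS paper, written in pure Python. The truncated power series stands in for a library matrix exponential (for example `scipy.linalg.expm`), and the function names and the `terms` parameter are illustrative assumptions, not Feizhu's implementation.

```python
from math import factorial


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity(w, terms=20):
    """Approximate h(W) = tr(exp(W∘W)) - d with a truncated power series.

    w[i][j] is the weight of edge i -> j. tr(A^k) sums the weights of all
    closed walks of length k, so h is zero exactly when no cycle exists.
    """
    d = len(w)
    a = [[x * x for x in row] for row in w]            # Hadamard square W∘W
    power = [[float(i == j) for j in range(d)] for i in range(d)]  # A^0 = I
    h = 0.0
    for k in range(1, terms + 1):
        power = matmul(power, a)                        # A^k
        h += sum(power[i][i] for i in range(d)) / factorial(k)
    return h


chain = [[0.0, 1.5, 0.0],
         [0.0, 0.0, -2.0],
         [0.0, 0.0, 0.0]]      # 0 -> 1 -> 2, acyclic
cycle = [row[:] for row in chain]
cycle[2][0] = 0.5              # adds 2 -> 0, closing a cycle

print(acyclicity(chain))       # zero: no closed walks
print(acyclicity(cycle))       # strictly positive
```

Because h is smooth in W, a gradient-based solver can drive it toward zero while minimizing the data-fitting loss, which is what lets the method avoid a combinatorial search over graphs.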

Data preprocessing for NO TEARS is highlighted: instead of normalizing absolute funnel values, the system uses relative changes between the current timestamp and the previous one, because relative variations better capture causal relationships in heterogeneous industrial data.
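The preprocessing step can be sketched as follows. This is a minimal illustration that assumes each funnel metric is a plain list of values ordered by timestamp; the metric names, the sample numbers, and the choice to return `None` when the previous value is zero are all assumptions made for the example.

```python
def relative_changes(series, eps=1e-9):
    """Return (x_t - x_{t-1}) / x_{t-1} for each consecutive pair.

    Pairs whose previous value is (near) zero yield None, because the
    ratio is undefined there; how to treat them is a modelling choice.
    """
    changes = []
    for prev, curr in zip(series, series[1:]):
        if abs(prev) < eps:
            changes.append(None)
        else:
            changes.append((curr - prev) / prev)
    return changes


# Hypothetical funnel metrics at three consecutive timestamps.
funnel = {
    "recall": [1000, 1100, 990],
    "exposure": [200, 200, 150],
}
features = {name: relative_changes(values) for name, values in funnel.items()}
print(features)
```

Working with ratios puts a campaign with 1,000 recalls and one with 1,000,000 on the same scale, which is one plausible reading of why relative variation suits heterogeneous data better than normalized absolute values.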

The second approach builds a Bayesian network on the learned causal graph. After the DAG is constructed with NO TEARS, known business knowledge is injected to refine the structure. Variables are discretized (e.g., changes in spend are split into five buckets), and do‑calculus interventions are performed on each factor to compute its causal effect on the target metric. Interventions are also applied to upstream nodes to capture indirect influences, and posterior probabilities are used to adjust for confounding effects.
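A toy version of the discretize-then-intervene idea is sketched below. Everything concrete in it is an assumption for illustration: the bucket cut points and labels, the three-node network (an upstream node Z that influences both a factor X and the target Y), and all probability values. It only shows why intervening on a factor gives a different answer from conditioning on it when an upstream node confounds the two.

```python
from bisect import bisect_right

# Hypothetical cut points on relative change; they define five buckets.
THRESHOLDS = [-0.20, -0.05, 0.05, 0.20]
LABELS = ["sharp drop", "drop", "flat", "rise", "sharp rise"]


def bucket(change):
    """Map a relative change to one of five discrete states."""
    return LABELS[bisect_right(THRESHOLDS, change)]


# Toy network Z -> X, Z -> Y, X -> Y with binary states (1 = "dropped").
# Z: upstream node (say, targeting), X: factor (say, bidding), Y: target.
P_Z = {0: 0.5, 1: 0.5}                        # P(Z=z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}               # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,    # P(Y=1 | X=x, Z=z)
                 (1, 0): 0.3, (1, 1): 0.7}


def p_x_given_z(x, z):
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1


def p_y_do_x(x):
    """P(Y=1 | do(X=x)): the edge Z -> X is cut, so Z keeps its marginal."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in P_Z)


def p_y_given_x(x):
    """P(Y=1 | X=x): observing X also shifts the belief about Z."""
    joint = {z: P_Z[z] * p_x_given_z(x, z) for z in P_Z}
    total = sum(joint.values())
    return sum(joint[z] / total * P_Y1_GIVEN_XZ[(x, z)] for z in P_Z)


causal_effect = p_y_do_x(1) - p_y_do_x(0)     # about 0.20
naive_gap = p_y_given_x(1) - p_y_given_x(0)   # about 0.44, inflated by Z

print(bucket(-0.3), "|", bucket(0.0), "|", bucket(0.5))
print(round(causal_effect, 2), round(naive_gap, 2))
```

In this toy setting the observational gap is more than twice the interventional effect, because a drop in Z tends to lower both X and Y. Adjusting over the upstream node removes that spurious share.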

Advantages of the combined method include higher attribution accuracy (≈85%), better interpretability through explicit probability tables, and the flexibility to adjust the attribution logic when results are unsatisfactory. Limitations are also noted: the NO TEARS optimization can be slow, the graph‑learning step behaves like a black box, and the approach is sensitive to noisy industrial data, so low‑weight edges must be thresholded with care.
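Thresholding low-weight edges might look like the sketch below. The node names, the weight matrix, and the cutoff of 0.3 are illustrative assumptions; in practice the cutoff would have to be tuned against the noise level of the data.

```python
def prune_edges(weights, threshold=0.3):
    """Zero out entries with |w| < threshold.

    Returns the pruned matrix and the surviving edges as (i, j, w),
    where weights[i][j] is the learned weight of edge i -> j.
    """
    pruned = [[w if abs(w) >= threshold else 0.0 for w in row]
              for row in weights]
    edges = [(i, j, w)
             for i, row in enumerate(pruned)
             for j, w in enumerate(row)
             if w != 0.0]
    return pruned, edges


# Hypothetical learned weights over four funnel layers.
NODES = ["recall", "bidding", "ranking", "exposure"]
W = [
    [0.0, 0.02, 0.9, 0.0],
    [0.0, 0.0, 0.6, -0.05],
    [0.0, 0.0, 0.0, 1.2],
    [0.1, 0.0, 0.0, 0.0],
]

pruned, edges = prune_edges(W, threshold=0.3)
for i, j, w in edges:
    print(f"{NODES[i]} -> {NODES[j]} (weight {w})")
```

Here the weak exposure → recall entry is removed, which also happens to eliminate the only cycle in the example. A cutoff that is too high would start deleting real edges, which is why the choice needs care.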

A Q&A section addresses practical concerns such as handling categorical variables, incorporating business priors into the graph learning process, dealing with cycles, and evaluating attribution accuracy, revealing that many decisions are still guided by expert judgment.
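Two of the Q&A themes, business priors and cycles, can be illustrated with a short sketch. It assumes priors are expressed as lists of forbidden and required edges applied after structure learning, which is only one possible reading of the article; the edge names are hypothetical. Cycle detection uses `graphlib.TopologicalSorter` from the Python 3.9+ standard library.

```python
from graphlib import CycleError, TopologicalSorter


def apply_priors(edges, forbidden=(), required=()):
    """Drop forbidden edges and add required ones; edges are (cause, effect)."""
    banned = set(forbidden)
    refined = {edge for edge in edges if edge not in banned}
    refined.update(required)
    return refined


def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if the graph is a DAG."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(effect, set()).add(cause)
        predecessors.setdefault(cause, set())
    try:
        TopologicalSorter(predecessors).prepare()
    except CycleError as err:
        return err.args[1]      # the detected cycle, per the graphlib docs
    return None


# Hypothetical learned structure containing a funnel-violating back edge.
learned = {("recall", "ranking"), ("ranking", "exposure"),
           ("exposure", "recall")}

refined = apply_priors(
    learned,
    forbidden=[("exposure", "recall")],   # downstream cannot cause upstream
    required=[("bidding", "ranking")],    # known from business logic
)

print(find_cycle(learned) is not None)    # the raw graph has a cycle
print(find_cycle(refined))                # no cycle after applying priors
```

Checking for cycles after every manual edit is a cheap safeguard, since a Bayesian network cannot be built on a graph that is not a DAG.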

The article concludes by summarizing the overall workflow: use T‑N day data to train a causal‑discovery model (output I), construct a Bayesian network (output II), combine with rule‑based judgments (output III), and fuse the three results to make final attribution decisions, while acknowledging that further research is needed.
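The article does not say how the three outputs are fused. Purely as an illustration, the sketch below assumes each output is a mapping from funnel layer to a score in [0, 1] and combines them with a weighted average; the weights, scores, and function names are all hypothetical.

```python
def fuse(outputs, weights):
    """Weighted average of per-layer scores across several attribution outputs.

    A layer missing from an output contributes a score of 0 for that output.
    """
    layers = set()
    for scores in outputs.values():
        layers.update(scores)
    total = sum(weights.values())
    return {
        layer: sum(weights[name] * scores.get(layer, 0.0)
                   for name, scores in outputs.items()) / total
        for layer in layers
    }


# Hypothetical scores from the three stages of the workflow.
outputs = {
    "causal_discovery": {"bidding": 0.6, "recall": 0.3},   # output I
    "bayesian_network": {"bidding": 0.7, "ranking": 0.2},  # output II
    "rules": {"recall": 1.0},                              # output III
}
weights = {"causal_discovery": 0.4, "bayesian_network": 0.4, "rules": 0.2}

fused = fuse(outputs, weights)
top_layer = max(fused, key=fused.get)
print(top_layer, round(fused[top_layer], 2))
```

Other fusion schemes, such as letting rule-based judgments override the model outputs, would fit the description in the article equally well.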

Tags: causal inference, Ad Attribution, Bayesian Network, Advertising Diagnosis, NO TEARS
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
