Causal Inference–Based Intelligent Diagnosis for E‑Commerce Merchants: Practice and Key Technologies
This article gives an overview of how causal inference is applied on Alibaba’s Business Intelligence platform. It covers the fundamental concepts, the architecture of the merchant intelligent diagnosis system, key technologies such as hybrid causal network discovery (HCM) and deep attribution, and the resulting operational impact.
The presentation introduces a merchant‑focused intelligent diagnosis practice built on causal inference, outlining four main parts: a brief overview of causal inference, the Business Intelligence (BI) intelligent diagnosis system, selected key technologies, and effect demonstration.
It first explains what causal inference is: it defines causal relationships, describes how to discover them (typically via randomized experiments or from observational data), and distinguishes causality from mere correlation using illustrative examples such as price influencing sales.
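The gap between correlation and causation in the price example can be illustrated with a small simulation. This is a minimal sketch under invented assumptions, not data from the talk: the data-generating process, the hidden `demand` confounder, and the `TRUE_EFFECT` value are all hypothetical. It compares a naive difference in means on observational data with the same estimate from a randomized experiment.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed lift in daily sales caused by a price discount


def simulate(n, randomized, seed=0):
    """Return (discount, sales) pairs from a toy data-generating process."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        demand = rng.gauss(0, 1)  # hidden seasonal demand (the confounder)
        if randomized:
            discount = rng.randint(0, 1)  # coin flip, independent of demand
        else:
            # Assumption: merchants tend to discount when demand is weak.
            discount = 1 if demand + rng.gauss(0, 0.5) < 0 else 0
        sales = 10 + TRUE_EFFECT * discount + 3.0 * demand + rng.gauss(0, 1)
        rows.append((discount, sales))
    return rows


def difference_in_means(rows):
    """Mean sales with a discount minus mean sales without one."""
    treated = [s for d, s in rows if d == 1]
    control = [s for d, s in rows if d == 0]
    return statistics.mean(treated) - statistics.mean(control)


naive_estimate = difference_in_means(simulate(20000, randomized=False))
experiment_estimate = difference_in_means(simulate(20000, randomized=True))
print(f"observational difference in means: {naive_estimate:+.2f}")
print(f"randomized difference in means:    {experiment_estimate:+.2f}")
```

In the observational data the discount is given mostly on weak-demand days, so the naive comparison even gets the sign wrong, while the randomized comparison recovers a value close to the assumed effect.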
Fundamental research directions are described, including the potential‑outcome model (Rubin’s framework) and causal network models (Judea Pearl’s work), as well as extensions that combine causality with machine‑learning topics like causal attribution, stable prediction, and reinforcement‑learning‑based causal discovery.
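For reference, the central quantities of the potential-outcome model can be written as follows. This is the standard textbook formulation rather than notation taken from the talk; the symbols $T_i$ (binary treatment), $Y_i(1)$ and $Y_i(0)$ (potential outcomes), and $\tau_{\mathrm{ATE}}$ are introduced here for illustration.

```latex
\begin{aligned}
Y_i &= T_i\,Y_i(1) + (1 - T_i)\,Y_i(0)
  && \text{(only one potential outcome is observed per unit)} \\
\tau_{\mathrm{ATE}} &= \mathbb{E}\big[\,Y(1) - Y(0)\,\big]
  && \text{(average treatment effect)} \\
\tau_{\mathrm{ATE}} &= \mathbb{E}[\,Y \mid T = 1\,] - \mathbb{E}[\,Y \mid T = 0\,]
  && \text{(valid when } T \text{ is randomized)}
\end{aligned}
```

With observational data the last equality generally fails, which is why adjustment for confounders or a causal graph is needed.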
The core technical contribution highlighted is a hybrid causal network (HCM) model. The pipeline consists of three steps: (1) learning a causal skeleton from mixed‑type data using a novel MRCIT independence test integrated with the PC‑stable algorithm; (2) discovering a directed acyclic graph (DAG) constrained by the skeleton and scored with a new hybrid information criterion (CVMIC); (3) pruning spurious edges with further MRCIT checks. This approach improves scalability and accuracy for large‑scale causal graphs.
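Step (1) of the pipeline can be sketched as follows. The talk does not specify MRCIT, so this sketch swaps in a Fisher z-test on partial correlations, which only suits continuous, roughly Gaussian data rather than mixed types. The function names, the `max_cond` limit, and the toy variables are hypothetical. Steps (2) and (3), DAG scoring with CVMIC and edge pruning, are not shown.

```python
import itertools
import math
import random


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(corr, i, j, cond):
    """Partial correlation of i and j given cond, via the recursive formula."""
    if not cond:
        return corr[i][j]
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(corr, i, j, rest)
    r_ik = partial_corr(corr, i, k, rest)
    r_jk = partial_corr(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def independent(corr, n, i, j, cond, z_crit=3.29):
    """Fisher z-test (stand-in for MRCIT); 3.29 is roughly alpha = 0.001."""
    r = max(min(partial_corr(corr, i, j, cond), 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r))
    return math.sqrt(n - len(cond) - 3) * abs(z) < z_crit


def pc_stable_skeleton(data, max_cond=2):
    """Undirected skeleton from a dict of column name -> list of floats."""
    names = list(data)
    n, p = len(data[names[0]]), len(names)
    corr = [[pearson(data[a], data[b]) for b in names] for a in names]
    adj = {i: set(range(p)) - {i} for i in range(p)}
    for size in range(max_cond + 1):
        # PC-stable: conditioning sets come from neighbour sets frozen per
        # level, so the result does not depend on the variable order.
        frozen = {i: set(nb) for i, nb in adj.items()}
        for i in range(p):
            for j in sorted(frozen[i]):
                if j not in adj[i]:
                    continue  # edge already removed at this level
                for cond in itertools.combinations(sorted(frozen[i] - {j}), size):
                    if independent(corr, n, i, j, cond):
                        adj[i].discard(j)
                        adj[j].discard(i)
                        break
    return {frozenset((names[i], names[j])) for i in adj for j in adj[i]}


# Toy chain: traffic -> orders -> revenue (synthetic data, not from the talk).
rng = random.Random(7)
traffic = [rng.gauss(0, 1) for _ in range(2000)]
orders = [0.8 * t + rng.gauss(0, 1) for t in traffic]
revenue = [0.8 * o + rng.gauss(0, 1) for o in orders]
skeleton = pc_stable_skeleton(
    {"traffic": traffic, "orders": orders, "revenue": revenue}
)
print(sorted(sorted(edge) for edge in skeleton))
```

In this chain, traffic and revenue are correlated but become independent once orders is conditioned on, so the skeleton should keep only the two adjacent edges.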
Using HCM, the system constructs merchant knowledge graphs that capture causal links among promotion factors, traffic, product attributes, and business metrics, enabling downstream tasks such as multi‑touch attribution and deep attribution.
The talk then covers deep attribution in more detail, including its common scenarios (volatility attribution, anomaly point attribution, multi‑touch attribution) and the limitations of correlation‑based methods. A new deep attribution model is proposed with four stages: anomaly detection, causal order identification, Multi‑ATE effect estimation, and contribution calculation for each factor.
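The final stage, contribution calculation, might look like the sketch below. The talk does not give a formula, so this assumes a simple linear decomposition in which each factor's contribution is its estimated causal effect times its observed change. The function name, factor names, and all numbers are hypothetical.

```python
def factor_contributions(effects, baseline, current):
    """Split a metric change across factors.

    effects:  estimated metric change per unit change of each factor
              (assumed to come from an earlier effect-estimation stage)
    baseline: factor values in the reference period
    current:  factor values in the period being explained
    """
    raw = {
        name: effects[name] * (current[name] - baseline[name])
        for name in effects
    }
    total = sum(raw.values())
    shares = {
        name: (value / total if total else 0.0) for name, value in raw.items()
    }
    return raw, shares


# Hypothetical inputs for a drop in daily orders.
effects = {"traffic": 0.05, "conversion_rate": 400.0, "price": -3.0}
baseline = {"traffic": 10000, "conversion_rate": 0.030, "price": 50.0}
current = {"traffic": 8000, "conversion_rate": 0.028, "price": 52.0}

raw, shares = factor_contributions(effects, baseline, current)
for name in sorted(shares, key=shares.get, reverse=True):
    print(f"{name:16s} contribution={raw[name]:+8.2f}  share={shares[name]:.1%}")
```

A linear split like this ignores interactions between factors; it is meant only to show how effect estimates and observed changes combine into per-factor shares.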
The architecture of the BI intelligent diagnosis project is described in three layers: the product layer (diagnosing issues across goods, traffic, content, experience, and service), the system layer (building a differentiated merchant knowledge graph from effect and behavior data), and the algorithm layer (an end‑to‑end chain covering problem discovery, analysis, and solution). Problem discovery uses time‑series anomaly detection; diagnosis uses causal network discovery and deep attribution to pinpoint root causes; solution generation combines causal effect estimation, crowd‑sourced expert knowledge, and automated strategy recommendation.
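The problem-discovery step can be illustrated with a simple detector. The talk does not say which time-series anomaly detection method is used, so this sketch assumes a rolling median with a MAD-based robust z-score. The function name, window size, threshold, and sample series are hypothetical.

```python
import statistics


def rolling_mad_anomalies(series, window=7, threshold=3.5):
    """Flag points that deviate strongly from the trailing window.

    Returns a list of (index, value, robust_z_score) tuples.
    """
    anomalies = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        median = statistics.median(history)
        mad = statistics.median([abs(v - median) for v in history])
        # 1.4826 rescales the MAD to match a standard deviation under
        # normality; the fallback avoids division by zero on flat history.
        scale = 1.4826 * mad if mad > 0 else 1e-9
        score = (series[t] - median) / scale
        if abs(score) > threshold:
            anomalies.append((t, series[t], score))
    return anomalies


# Hypothetical daily order counts with one sharp drop.
daily_orders = [100, 102, 98, 101, 99, 103, 100, 97, 101, 60, 100, 102]
anomalies = rolling_mad_anomalies(daily_orders)
for index, value, score in anomalies:
    print(f"day {index}: value={value}, robust z={score:.1f}")
```

Using the median and MAD instead of the mean and standard deviation keeps a single outlier in the window from masking the points that follow it.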
The effect demonstration shows that the system processes 100,000–200,000 merchant strategies daily, surfaces specific product‑level issues with data‑backed suggestions, and presents each recommendation in three parts: the opportunity or problem, the supporting data, and the action. The talk concludes with thanks to the speaker, Liu Chunchen, a senior algorithm expert at Alibaba.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.