Metric Anomaly Detection and Diagnosis Practices at NetEase Yanxuan

This article presents NetEase Yanxuan's end‑to‑end approach for automatically detecting and diagnosing metric anomalies in e‑commerce, covering background motivations, statistical detection methods (absolute, volatility, trend), contribution‑decomposition diagnosis, optimization techniques for dimensional explosion, and a Q&A on practical implementation.

DataFunTalk

Metrics are critical for business health, and rapid, accurate detection of abnormal metrics helps identify and resolve issues promptly. NetEase Yanxuan outlines a fully automated solution that eliminates manual rule definition, supports diverse metric distributions, operates at day‑ and hour‑level granularity, and ensures high accuracy.

Background : Rapid e‑commerce iteration leads to a growing number of heterogeneous metrics, making manual threshold setting error‑prone and costly. The goal is an automated system that requires no user input, is universally applicable, timely, and accurate.

Metric Anomaly Detection : Three anomaly types are defined—absolute value anomalies (statistical outliers), volatility anomalies (sharp up/down spikes), and trend anomalies (long‑term upward or downward shifts). Detection frameworks include:

Absolute value detection using the Generalized ESD (GESD) test, which removes the most extreme point one at a time, up to a preset maximum number of candidate outliers, and compares the test statistic at each step to a critical value derived from the t‑distribution.
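A minimal sketch of the GESD procedure described above, using only the Python standard library. The function names, the `max_outliers` and `alpha` parameters, and the sample data are illustrative assumptions, not NetEase Yanxuan's implementation. Because the standard library has no Student‑t quantile, the sketch approximates it with a Cornish‑Fisher expansion; a statistics library would normally supply the exact value.

```python
import math
from statistics import NormalDist, mean, stdev


def t_quantile(p, df):
    """Approximate Student-t quantile (Cornish-Fisher expansion around the
    normal quantile). Assumption: accurate enough for a sketch only."""
    z = NormalDist().inv_cdf(p)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    g3 = (3 * z**7 + 19 * z**5 + 17 * z**3 - 15 * z) / 384
    return z + g1 / df + g2 / df**2 + g3 / df**3


def gesd_outliers(values, max_outliers, alpha=0.05):
    """Return the indices of points flagged by the Generalized ESD test.

    Requires max_outliers <= len(values) - 2 so the t-distribution keeps
    at least one degree of freedom.
    """
    n = len(values)
    remaining = list(enumerate(values))
    removed = []      # indices in removal order
    n_outliers = 0    # largest step whose statistic exceeded its critical value
    for i in range(1, max_outliers + 1):
        xs = [v for _, v in remaining]
        mu, sd = mean(xs), stdev(xs)
        if sd == 0:
            break
        # Most extreme remaining point and its test statistic R_i.
        idx, val = max(remaining, key=lambda iv: abs(iv[1] - mu))
        r_i = abs(val - mu) / sd
        # Critical value lambda_i from the t-distribution.
        p = 1 - alpha / (2 * (n - i + 1))
        t = t_quantile(p, n - i - 1)
        lam = (n - i) * t / math.sqrt((n - i - 1 + t * t) * (n - i + 1))
        removed.append(idx)
        remaining.remove((idx, val))
        if r_i > lam:
            n_outliers = i
    return sorted(removed[:n_outliers])


daily_metric = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99, 250, 100]
print(gesd_outliers(daily_metric, max_outliers=3))  # → [10]
```

Taking the largest step that exceeds its critical value, rather than stopping at the first non‑significant step, is what lets GESD cope with one outlier masking another.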

Volatility detection based on volatility‑rate distribution and second‑order derivative analysis to locate inflection points.
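The summary gives no formula for the volatility detector, so the following is only one plausible reading, sketched under stated assumptions: compute period‑over‑period change rates, sort their magnitudes, and take the point of largest second‑order difference (the discrete analogue of a second derivative) as the inflection point separating normal from abnormal volatility. The function names and the `min_rate` floor are hypothetical.

```python
def change_rates(series):
    """Period-over-period change rate; assumes strictly positive values."""
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]


def elbow_threshold(rates, min_rate=0.1):
    """Locate the inflection point of the sorted |rate| curve.

    The second-order difference peaks where the curve bends most sharply
    upward. min_rate is a hypothetical floor that keeps a calm series,
    which has no meaningful bend, from producing alerts.
    """
    mags = sorted(abs(r) for r in rates)
    if len(mags) < 3:
        return min_rate
    second = [mags[i + 1] - 2 * mags[i] + mags[i - 1]
              for i in range(1, len(mags) - 1)]
    elbow = max(range(len(second)), key=second.__getitem__) + 1
    return max(mags[elbow], min_rate)


def volatility_anomalies(series, min_rate=0.1):
    """Return positions in series whose change rate exceeds the threshold."""
    rates = change_rates(series)
    threshold = elbow_threshold(rates, min_rate)
    return [i + 1 for i, r in enumerate(rates) if abs(r) > threshold]


hourly_metric = [100, 101, 102, 100, 101, 150, 151]
print(volatility_anomalies(hourly_metric))  # → [5]
```

Deriving the threshold from the metric's own volatility distribution is what removes the need for a hand‑set rule per metric, which is the stated goal of the system.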

Trend detection employing the Mann‑Kendall non‑parametric test, where a p‑value < 0.05 indicates a significant trend.
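The Mann‑Kendall test can be sketched in a few lines of standard‑library Python. This version uses the usual normal approximation with a tie correction, which is generally considered reasonable for roughly ten or more observations. The function name and return format are illustrative assumptions; the 0.05 threshold comes from the text above.

```python
import math
from collections import Counter
from statistics import NormalDist


def mann_kendall(series, alpha=0.05):
    """Return (trend, p_value), where trend is 'increasing', 'decreasing',
    or 'no trend', using the normal approximation of the S statistic."""
    n = len(series)
    # S counts concordant minus discordant pairs over all i < j.
    s = sum((xj > xi) - (xj < xi)
            for i, xi in enumerate(series)
            for xj in series[i + 1:])
    # Variance of S, corrected for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    if var_s == 0:
        return "no trend", 1.0
    # Continuity-corrected Z score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    if p_value >= alpha:
        return "no trend", p_value
    return ("increasing" if z > 0 else "decreasing"), p_value


trend, p = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print(trend)  # → increasing
```

Because the test only uses the sign of pairwise differences, it makes no assumption about the metric's distribution, which suits the heterogeneous metrics described in the background.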

Post‑processing steps filter out redundant alerts, such as cascading volatility caused by prior absolute anomalies, and suppress alerts during known large‑scale promotional events.
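The two filtering rules above might look like the following sketch. The `Alert` structure, the one‑day look‑back for cascading volatility, and the promotion calendar are all hypothetical; the source does not describe how the rules are encoded.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Alert:
    metric: str
    day: date
    kind: str  # "absolute", "volatility", or "trend"


def filter_alerts(alerts, promo_days):
    """Drop alerts on known promotion days, and drop volatility alerts that
    directly follow an absolute anomaly on the same metric (the metric
    swinging back to normal is not a new incident)."""
    absolute = {(a.metric, a.day) for a in alerts if a.kind == "absolute"}
    kept = []
    for a in alerts:
        if a.day in promo_days:
            continue
        previous_day = a.day - timedelta(days=1)
        if a.kind == "volatility" and (a.metric, previous_day) in absolute:
            continue
        kept.append(a)
    return kept


# Illustrative data only.
alerts = [
    Alert("gmv", date(2024, 6, 1), "absolute"),
    Alert("gmv", date(2024, 6, 2), "volatility"),   # cascade of the 6/1 anomaly
    Alert("gmv", date(2024, 6, 18), "volatility"),  # falls on a promotion day
    Alert("uv", date(2024, 6, 5), "trend"),
]
kept = filter_alerts(alerts, promo_days={date(2024, 6, 18)})
print([(a.metric, a.kind) for a in kept])  # → [('gmv', 'absolute'), ('uv', 'trend')]
```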

Metric Diagnosis : Diagnosis is categorized into deterministic, probabilistic, and speculative inference. Deterministic diagnosis uses contribution‑decomposition methods (additive, multiplicative LMDI, and divisional) to attribute metric changes to specific dimensions, offering clear, white‑box explanations. Probabilistic methods (machine learning, SHAP, Bayesian networks) provide insights but lack precise root‑cause attribution.

The contribution‑decomposition formulas break a target metric Y (e.g., GMV) down into dimension values Xᵢ. Depending on how Y is composed, the contribution of each Xᵢ is computed as the additive share ΔXᵢ/Y₀ (the change in Xᵢ relative to the baseline total Y₀), as a multiplicative LMDI factor, or as divisional components (a volatility contribution Aᵢ and a structural contribution Bᵢ). NetEase Yanxuan emphasizes the additive method because of its interpretability.
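To make the additive and LMDI cases concrete, here is a small sketch. The additive function implements ΔXᵢ/Y₀ as stated above. The LMDI function uses the standard logarithmic‑mean weight L(Y₁, Y₀) = (Y₁ − Y₀)/(ln Y₁ − ln Y₀) for a metric that is a product of factors. The function names, the factor breakdown of GMV, and all numbers are illustrative assumptions; the divisional case is omitted because the summary does not give the formulas for Aᵢ and Bᵢ.

```python
import math


def additive_contributions(before, after):
    """Y = sum of X_i. Contribution of each X_i is ΔX_i / Y_0, so the
    contributions sum to the relative change of Y."""
    y0 = sum(before.values())
    keys = before.keys() | after.keys()
    return {k: (after.get(k, 0.0) - before.get(k, 0.0)) / y0 for k in keys}


def lmdi_contributions(before, after):
    """Y = product of factors. Returns the absolute change in Y attributed
    to each factor; the attributions sum exactly to Y_1 - Y_0."""
    y0 = math.prod(before.values())
    y1 = math.prod(after.values())
    # Logarithmic mean of Y_1 and Y_0 (its limit is Y_0 when they are equal).
    weight = y0 if y1 == y0 else (y1 - y0) / (math.log(y1) - math.log(y0))
    return {k: weight * math.log(after[k] / before[k]) for k in before}


# Additive: GMV split by channel (illustrative numbers).
add = additive_contributions(
    {"app": 600, "web": 300, "mini_program": 100},
    {"app": 540, "web": 310, "mini_program": 100},
)
print(add["app"])  # → -0.06

# Multiplicative: GMV = visitors x conversion rate x average order value.
lmdi = lmdi_contributions(
    {"uv": 10000, "cvr": 0.05, "aov": 200},
    {"uv": 11000, "cvr": 0.045, "aov": 200},
)
print(round(sum(lmdi.values()), 6))
```

In the additive case every contribution is read directly off the data, which is the white‑box property the article cites. LMDI leaves no residual term, but its log‑weighted attributions are harder to explain to business users.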

Dimensional Explosion Optimization : To avoid exponential growth of intermediate tables, the workflow is transformed to aggregate contributions after a single fine‑grained computation, followed by grouping. Additional optimizations include pruning dimension combinations based on hierarchical relationships and limiting the number of combined dimensions, as well as ranking dimensions by a Gini‑coefficient‑based metric to pinpoint the most influential factors.
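The "compute once, then group" idea can be sketched as follows: additive contributions are calculated a single time at the finest grain, then summed for each dimension combination up to a size limit, and dimensions are ranked by how concentrated their contributions are. The row layout, the `max_dims` parameter, and the use of a plain Gini coefficient over absolute contributions are assumptions; the source only says the ranking metric is Gini‑based, and hierarchy‑based pruning is not shown.

```python
from collections import defaultdict
from itertools import combinations


def rollup_contributions(rows, dims, max_dims=2):
    """Compute additive contributions once per fine-grained row, then
    aggregate them for every combination of at most max_dims dimensions."""
    total_before = sum(r["before"] for r in rows)
    fine = [(r, (r["after"] - r["before"]) / total_before) for r in rows]
    result = {}
    for k in range(1, max_dims + 1):
        for combo in combinations(dims, k):
            grouped = defaultdict(float)
            for row, contribution in fine:
                grouped[tuple(row[d] for d in combo)] += contribution
            result[combo] = dict(grouped)
    return result


def gini(values):
    """Gini coefficient of absolute values: 0 means evenly spread, values
    near 1 mean the change is concentrated in a few members."""
    xs = sorted(abs(v) for v in values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    return 2 * sum(i * x for i, x in enumerate(xs, 1)) / (n * total) - (n + 1) / n


# Illustrative fine-grained rows.
rows = [
    {"channel": "app", "region": "north", "before": 400, "after": 300},
    {"channel": "app", "region": "south", "before": 200, "after": 200},
    {"channel": "web", "region": "north", "before": 200, "after": 190},
    {"channel": "web", "region": "south", "before": 200, "after": 210},
]
table = rollup_contributions(rows, dims=["channel", "region"])
ranking = sorted(
    ["channel", "region"],
    key=lambda d: gini(table[(d,)].values()),
    reverse=True,
)
print(ranking)  # → ['channel', 'region']
```

Because additive contributions sum cleanly, no per‑combination intermediate table has to be materialized from raw data; each grouping is a cheap aggregation over the same fine‑grained result.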

QA : The Q&A clarifies evaluation of diagnostic accuracy (using deterministic validation and bad‑case collection) and discusses when to mix additive and multiplicative decomposition, concluding that additive decomposition remains the primary approach for the current business scenario.

statistical methods, diagnostic analysis, metric anomaly detection, e-commerce analytics, contribution decomposition
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
