
Building a Traffic and Event‑Tracking System at NetEase Yanxuan: Tagging, Management, Attribution, and Quality Assurance

This article details how NetEase Yanxuan designed and implemented a comprehensive traffic system—including event‑tagging methods, a top‑down management framework, data‑quality controls, testing strategies, and attribution models—to turn fragmented user behavior into actionable e‑commerce insights.

DataFunTalk

Introduction

In China's internet market, with its massive user base, traffic, users, and capital are tightly coupled. As the market matures, e-commerce companies must extract more value from each user, which makes traffic infrastructure and behavior analysis essential. This article shares NetEase Yanxuan's experience building a traffic system from scratch.

Tagging System Construction

The core of traffic construction is a robust event-tagging (埋点) system. There are three common tagging approaches: code tagging, visual tagging, and full (all-event) tagging, each with its own advantages and drawbacks. To balance cost, development efficiency, and data accuracy, Yanxuan adopts a hybrid solution: automated tagging based on Alibaba's SPM and SCM models, combined with selective code tagging. The automated scheme encodes each event as click + page + module + slot + content parameters, which enables low-cost, high-precision data collection on high-traffic pages.
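
As a rough illustration of the click + page + module + slot + content encoding, the sketch below models one event as a small record. The class name, field names, and the dot-separated position format are assumptions made for this example; the article does not specify Yanxuan's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackedEvent:
    """Hypothetical event record; field names are illustrative only."""
    action: str   # what the user did, e.g. "click"
    page: str     # page on which the action happened
    module: str   # module within the page
    slot: int     # position of the element inside the module
    content: str  # identifier of the content shown in that slot

    def position_code(self) -> str:
        # Assumed format: page.module.slot, describing where the action occurred.
        return f"{self.page}.{self.module}.{self.slot}"

    def to_record(self) -> dict:
        # Flatten the event into the dictionary that would be sent to the collector.
        return {
            "action": self.action,
            "position": self.position_code(),
            "content": self.content,
        }


event = TrackedEvent("click", "home", "guess_you_like", 3, "item_1001")
print(event.to_record())
```

Because the position is derived from structured fields rather than typed by hand for each event, new slots on a high-traffic page can be tracked without writing new tagging code for every one of them.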

Tagging Management Framework

The management framework follows a top-down design with five layers: rule definition, production standardization, process workflow, data-quality assurance, and online monitoring and alerting. This structure clarifies each role's responsibilities and streamlines the lifecycle from definition through development, testing, and release to alerting.

Rule Definition and Standardization

Yanxuan abstracts user actions into events, pages, modules, parameters, and versions. Consolidating hundreds of parameters into 18 standardized fields makes the ETL process simpler and more reliable. A structured naming convention (e.g., "User clicks 'Guess You Like' product image") makes the source of each data point transparent.
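
The consolidation step can be pictured as mapping ad hoc parameter names onto a fixed schema. The following is a minimal sketch under stated assumptions: the article does not list the 18 fields, so `STANDARD_FIELDS` is a hypothetical subset and `ALIASES` is an invented example of names different teams might have used for the same concept.

```python
# Hypothetical subset of the standardized fields (the article does not list them).
STANDARD_FIELDS = ("event_name", "page", "module", "item_id", "sequence", "app_version")

# Hypothetical aliases for the same concept under different legacy names.
ALIASES = {
    "itemId": "item_id",
    "goods_id": "item_id",
    "seq": "sequence",
    "idx": "sequence",
    "ver": "app_version",
}


def standardize(raw: dict) -> tuple[dict, dict]:
    """Map raw parameters onto the fixed schema; return (record, unmapped extras)."""
    record = dict.fromkeys(STANDARD_FIELDS)  # every standard field starts as None
    extras = {}
    for key, value in raw.items():
        name = ALIASES.get(key, key)
        if name in record:
            record[name] = value
        else:
            extras[key] = value  # kept aside so nothing is silently dropped
    return record, extras


record, extras = standardize(
    {"event_name": "click", "goods_id": 42, "idx": 3, "color": "red"}
)
print(record)
print(extras)
```

With every record carrying the same columns, downstream ETL jobs can rely on one table layout instead of special-casing each event's parameters.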

Process Flow

From tag definition to deployment, a task-flow system (integrated with JIRA and email notifications) assigns responsibility for development, QA, and release. This workflow improves cross-team collaboration, ensures timely data availability, and prevents a common pitfall: tagging requirements being overlooked during product development.

Data-Quality Assurance

Quality is the cornerstone of the tagging system. Yanxuan combines three kinds of testing: manual testing (covering multiple devices, accounts, and event types), automated testing (checking field completeness, traceability, and dependency correctness), and UI-automation testing. Together they reduce QA effort while maintaining high data fidelity.
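
The automated checks might look roughly like the sketch below. It interprets "traceability" as every event mapping back to a registered definition and "dependency correctness" as an event firing only after its prerequisite; both readings, and all names and sample values, are assumptions rather than Yanxuan's actual rules.

```python
# Hypothetical rule set; a real system would load these from the tag registry.
REQUIRED_FIELDS = ("event_name", "page", "timestamp")
REGISTERED_EVENTS = {"view_home", "click_guess_you_like_item"}
DEPENDS_ON = {"click_guess_you_like_item": "view_home"}  # a click needs a prior page view


def check_events(events: list[dict]) -> list[str]:
    """Return a list of human-readable problems found in an ordered event log."""
    problems = []
    seen = set()
    for i, ev in enumerate(events):
        # Completeness: every required field must be present and non-empty.
        missing = [f for f in REQUIRED_FIELDS if ev.get(f) in (None, "")]
        if missing:
            problems.append(f"event {i}: missing fields {missing}")
        # Traceability: the event must correspond to a registered definition.
        name = ev.get("event_name")
        if name not in REGISTERED_EVENTS:
            problems.append(f"event {i}: '{name}' is not a registered event")
        # Dependency: the prerequisite event must already have been seen.
        prerequisite = DEPENDS_ON.get(name)
        if prerequisite is not None and prerequisite not in seen:
            problems.append(f"event {i}: '{name}' fired before '{prerequisite}'")
        seen.add(name)
    return problems


good_log = [
    {"event_name": "view_home", "page": "home", "timestamp": 1},
    {"event_name": "click_guess_you_like_item", "page": "home", "timestamp": 2},
]
bad_log = [{"event_name": "click_guess_you_like_item", "page": "", "timestamp": 2}]
print(check_events(good_log))
print(check_events(bad_log))
```

Running such checks against logs captured from a test build lets QA focus manual effort on the cases that rules cannot express.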

Online Monitoring and Alerting

After release, daily email alerts notify owners of data anomalies so that hidden bugs and regressions are caught promptly. Visual dashboards highlight abnormal metrics and support rapid response.
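
One simple way to decide what goes into a daily alert is to compare today's event count with a trailing average. The sketch below uses that approach; the 50% threshold, the function name, and the sample counts are illustrative assumptions, and the article does not describe the detection rule Yanxuan actually uses.

```python
from statistics import mean


def find_anomalies(daily_counts: dict[str, list[int]], threshold: float = 0.5) -> list[str]:
    """Flag events whose latest daily count deviates from the trailing mean.

    daily_counts maps an event name to its counts, oldest first; the last
    entry is treated as today. threshold is the allowed relative change.
    """
    alerts = []
    for event_name, counts in daily_counts.items():
        if len(counts) < 2:
            continue  # not enough history to form a baseline
        *history, today = counts
        baseline = mean(history)
        if baseline == 0:
            continue  # avoid dividing by zero for events with no prior traffic
        change = (today - baseline) / baseline
        if abs(change) > threshold:
            alerts.append(
                f"{event_name}: today={today}, baseline={baseline:.0f}, change={change:+.0%}"
            )
    return alerts


counts = {
    "click_item": [1000, 1100, 900, 100],  # sharp drop, likely a broken tag
    "view_home": [500, 520, 510, 505],     # normal fluctuation
}
alerts = find_anomalies(counts)
print(alerts)
```

The resulting list could form the body of the daily email, with each line routed to the owner of the affected event.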

Page Guide System: User Behavior Tracking and Attribution

With the tagging foundation in place, Yanxuan implements a page-guide system to trace user journeys and attribute conversions. It uses two attribution models: last-click (for entry pages such as home or search) and multi-count (for downstream pages such as columns, activities, and topics). The system supports both full-link transmission and three-step FIFO transmission, the latter capturing up to three steps of a user's path.
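
The two attribution models and the three-step FIFO window can be sketched as follows. This is one possible reading, not the article's implementation: last-click credits only the most recent entry page, multi-count credits every distinct downstream page on the path, and the FIFO window is modeled with a bounded deque. The page names are hypothetical.

```python
from collections import deque

ENTRY_PAGES = {"home", "search"}  # entry-page examples named in the article


def attribute(path: list[str]) -> dict:
    """Attribute a conversion given the pages visited before it, oldest first."""
    recent = deque(maxlen=3)  # FIFO window: the oldest step drops out at the fourth
    last_entry = None
    for page in path:
        recent.append(page)
        if page in ENTRY_PAGES:
            last_entry = page  # last-click: a later entry page overwrites an earlier one
    # Multi-count: each distinct downstream page on the path receives credit.
    downstream = [p for p in dict.fromkeys(path) if p not in ENTRY_PAGES]
    return {
        "last_click": last_entry,
        "multi_count": downstream,
        "last_three_steps": list(recent),
    }


result = attribute(["home", "column_x", "search", "topic_a", "activity_b"])
print(result)
```

The bounded window keeps the payload carried from page to page small, while full-link transmission would correspond to keeping the whole `path`.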

Conclusion

The traffic system is only the tip of Yanxuan's middle-platform iceberg. The tagging and page-guide subsystems provide deep, fine-grained insight into user behavior, enabling rapid matching of people, products, and scenarios and maximizing the efficiency of product-shelf resources.

