Real-Time Detection of E‑commerce Search Advertising Errors Using a Big Data Processing Pipeline
This article describes how an e-commerce platform built a big-data pipeline (TTLog, Lindorm, BCP, MetaQ, Jingwei, Blink, and Xflush) to detect and verify search advertising placement errors in real time. It covers the background, implementation details, core challenges, and future optimization directions.
E-commerce search advertising depends on long, complex data processing chains, where a failure or delay at any node can cause financial loss for advertisers and the platform. Traditional testing cannot cover every path through these chains, and existing business monitoring focuses only on performance metrics.
To address this, the team designed an online real‑time detection system that traces every exposed ad back to its last known state in the database before exposure, enabling immediate identification of mismatched ad statuses across the entire pipeline.
Stage Results: By leveraging TTLog, Lindorm (a NoSQL store based on HBase), BCP, MetaQ, the real-time synchronization service Jingwei, and the Xflush analytics platform, the solution achieved full-traffic coverage for error detection and provided a real-time quality dashboard for the ICBU advertising system.
Technical Implementation:
1. Engine exposure log processing: TTLog collects logs from all search nodes, BCP cleans and samples the stream, and the data is pushed to MetaQ for downstream verification.
2. Database handling: MySQL stores the latest state of each business object. Jingwei captures every DB change and writes snapshots to Lindorm for fast random reads.
3. Data consistency verification: The igps service consumes messages from MetaQ, extracts the ad's exposure state, queries Lindorm for the corresponding pre-exposure DB state, and compares the two. Inconsistencies are logged and aggregated via Xflush for monitoring and alerting.
4. Proactive verification at change nodes: In addition to reactive checks, the system actively queries the engine when DB changes occur or when index switches happen, reusing the same verification pipeline.
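The comparison in steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the igps implementation: `SnapshotStore` is a hypothetical in-memory stand-in for the Lindorm snapshot table, and the message fields `ad_id`, `exposure_ts`, and `engine_state` are assumed names that do not come from the original article.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class SnapshotStore:
    """Hypothetical in-memory stand-in for the Lindorm snapshot table."""
    # entity_id -> time-ordered change timestamps / matching states
    _times: dict = field(default_factory=dict)
    _states: dict = field(default_factory=dict)

    def record_change(self, entity_id: str, ts: int, state: str) -> None:
        """Store one DB change; assumes changes arrive in timestamp order."""
        self._times.setdefault(entity_id, []).append(ts)
        self._states.setdefault(entity_id, []).append(state)

    def state_before(self, entity_id: str, ts: int):
        """Return the last state written at or before ts, or None if unknown."""
        times = self._times.get(entity_id, [])
        idx = bisect_right(times, ts)
        return self._states[entity_id][idx - 1] if idx else None


def verify_exposure(store: SnapshotStore, message: dict) -> dict:
    """Compare the engine's view of an ad with its pre-exposure DB state."""
    db_state = store.state_before(message["ad_id"], message["exposure_ts"])
    if db_state is None:
        verdict = "unknown"  # no snapshot yet, so nothing to compare against
    elif db_state == message["engine_state"]:
        verdict = "consistent"
    else:
        verdict = "mismatch"
    return {"ad_id": message["ad_id"], "db_state": db_state,
            "engine_state": message["engine_state"], "verdict": verdict}


store = SnapshotStore()
store.record_change("ad-1", 100, "online")
store.record_change("ad-1", 200, "offline")
# The ad was taken offline at t=200 but still exposed as online at t=250.
result = verify_exposure(
    store, {"ad_id": "ad-1", "exposure_ts": 250, "engine_state": "online"})
print(result["verdict"])
```

Keeping a time-ordered history per entity, rather than only the latest row, is what lets the check ask for the state just before a given exposure; a real deployment would presumably encode this in the store's key design instead of Python lists.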
The monitoring dashboard visualizes error spikes caused by DB sync delays, index switches, or logic bugs, covering entities such as campaigns, ad groups, customers, keywords, and feeds across exposure and click stages.
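As a rough illustration of the kind of grouping such a dashboard needs, the sketch below counts mismatch records per time bucket, entity type, and stage. It is plain Python rather than Xflush configuration, which the article does not show; the record fields `ts`, `entity`, and `stage` and the 60-second bucket are assumptions made for the example.

```python
from collections import Counter


def aggregate_mismatches(records, bucket_seconds=60):
    """Count mismatches per (bucket start, entity type, stage)."""
    counts = Counter()
    for rec in records:
        # Round the timestamp down to the start of its bucket.
        bucket_start = rec["ts"] - rec["ts"] % bucket_seconds
        counts[(bucket_start, rec["entity"], rec["stage"])] += 1
    return counts


mismatches = [
    {"ts": 5, "entity": "campaign", "stage": "exposure"},
    {"ts": 30, "entity": "campaign", "stage": "exposure"},
    {"ts": 61, "entity": "keyword", "stage": "click"},
]
for key, count in sorted(aggregate_mismatches(mismatches).items()):
    print(key, count)
```

A sudden jump in one bucket for a single entity type would correspond to the error spikes described above, for example after an index switch.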
Core Issues Discussed:
Why Lindorm was chosen: to overcome MySQL performance bottlenecks when handling billions of rows; query latency dropped from ~1 s to ~70 ms.
Why BCP + MetaQ + igps were used: to decouple processing, reduce HSF call overhead, and maintain low CPU usage under high sampling rates.
Why not replace everything with Blink: because certain low‑traffic checks are more conveniently handled directly in BCP, and Blink still exhibits stability concerns.
How to split dynamic SP request keys: using Blink’s UDTF to parse request strings into key‑value pairs formatted for Xflush grouping.
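The key-splitting idea can be illustrated with a small generator that, like a table function, emits one row per parameter. This is a Python stand-in and not the Blink UDTF itself; it assumes the request string uses URL query syntax (`k1=v1&k2=v2`), which the article does not confirm, and the exact row format expected by Xflush is not reproduced here.

```python
from urllib.parse import parse_qsl


def split_request_keys(request: str):
    """Yield one (key, value) row per parameter in the request string.

    Assumes URL-query-style input; values are percent-decoded by parse_qsl.
    """
    # keep_blank_values=True keeps parameters such as "debug=" instead of
    # silently dropping them.
    for key, value in parse_qsl(request, keep_blank_values=True):
        yield key, value


rows = list(split_request_keys("pid=123&scene=search&debug="))
print(rows)
```

Emitting one row per key means new request parameters become new grouping dimensions without any change to the parsing logic.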
Summary and Future Plans: The solution demonstrates end-to-end, real-time detection of advertising data inconsistencies using big-data technologies. Future plans are to expose richer real-time dimensions and to shift the testing framework left to pre-release stages, automating functional, performance, and effectiveness validation across the full search-ad pipeline.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
