Multi‑Stage Funnel Architecture and Optimization Practices in an Advertising Engine
The advertising engine uses a five-stage funnel: retrieval, recall, coarse ranking, fine ranking, and re-ranking. Each stage is optimized separately, using specialized indexes, multi-channel recall, multi-objective twin-tower models, deep CTR/CVR predictors, and dedicated cold-start paths. Reported gains include up to 33 % growth in advertiser spend and a 6 % eCPM lift, together with lower latency and preserved ad diversity.
Advertising engines are the core of performance‑based ad monetization. Their task is to select the highest‑value ad(s) from a library of millions for each request and to provide an accurate eCPM estimate for billing. Directly sorting the entire library is infeasible due to compute and latency constraints, so a cascade (funnel) architecture is used to progressively filter candidates while increasing algorithmic complexity.
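The cascade idea can be illustrated with a minimal sketch. It assumes each stage is just a scoring function plus a cut-off; the scorer names, ad fields, and cut-off sizes are hypothetical and much smaller than in a production system.

```python
from typing import Callable, Dict, List, Tuple

Ad = Dict[str, float]  # hypothetical ad record: field name -> value
Stage = Tuple[Callable[[Ad], float], int]  # (scorer, number of ads to keep)

def top_k(ads: List[Ad], scorer: Callable[[Ad], float], k: int) -> List[Ad]:
    """Keep the k highest-scoring ads under the given scorer."""
    return sorted(ads, key=scorer, reverse=True)[:k]

def run_funnel(ads: List[Ad], stages: List[Stage]) -> List[Ad]:
    """Apply the stages in order; each later stage sees fewer candidates."""
    candidates = ads
    for scorer, keep in stages:
        candidates = top_k(candidates, scorer, keep)
    return candidates

# Toy scorers ordered from cheapest to most expensive (all hypothetical).
def cheap_score(ad: Ad) -> float:
    return ad["bid"]

def mid_score(ad: Ad) -> float:
    return ad["bid"] * ad["ctr_prior"]

def full_score(ad: Ad) -> float:
    return ad["bid"] * ad["ctr_prior"] * ad["cvr_prior"]

ads = [
    {"id": 1, "bid": 2.0, "ctr_prior": 0.01, "cvr_prior": 0.10},
    {"id": 2, "bid": 1.0, "ctr_prior": 0.05, "cvr_prior": 0.02},
    {"id": 3, "bid": 3.0, "ctr_prior": 0.02, "cvr_prior": 0.05},
    {"id": 4, "bid": 0.5, "ctr_prior": 0.03, "cvr_prior": 0.20},
]
winners = run_funnel(ads, [(cheap_score, 3), (mid_score, 2), (full_score, 1)])
print([ad["id"] for ad in winners])
```

In this toy data, ad 4 is cut by the cheap first stage even though the most expensive scorer would have rated it about as highly as the winner. That is the basic trade-off of a funnel: early stages save compute but can discard candidates that later stages would have valued.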
The funnel consists of five layers: retrieval, recall, coarse ranking, fine ranking, and re-ranking. Retrieval scans the full library with low-cost matching (e.g., Boolean rules) to produce a high-value candidate set. Recall narrows this to about 100 k items, using multi-channel approaches (I2I, U2I, etc.) to improve relevance and explainability. Coarse ranking reduces the set to a few thousand, typically with simple linear models or cached twin-tower deep models. Fine ranking selects the top ads, predicts eCPM with deep-learning CTR/CVR models, and applies a bidding score that blends eCPM, relevance, user priors, and other constraints. Re-ranking spreads similar ads apart in multi-slot feeds to ensure diversity.
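The re-ranking step at the end of the funnel can be sketched as a greedy scatter. This is only an illustration: the article does not say which diversity rule the engine uses, and the category labels and `window` parameter here are assumptions.

```python
from typing import List, Tuple

RankedAd = Tuple[str, str]  # (ad_id, category), hypothetical fields

def diversify(ranked: List[RankedAd], window: int = 2) -> List[RankedAd]:
    """Greedy scatter: avoid repeating a category within `window` adjacent slots.

    `ranked` is ordered best first. When every remaining ad would violate
    the rule, the best remaining ad is placed anyway.
    """
    remaining = list(ranked)
    result: List[RankedAd] = []
    while remaining:
        # Categories shown in the last (window - 1) filled slots.
        recent = {cat for _, cat in result[-(window - 1):]} if window > 1 else set()
        # Index of the best remaining ad with a category not shown recently.
        pick = next(
            (i for i, (_, cat) in enumerate(remaining) if cat not in recent),
            0,  # fall back to the best remaining ad
        )
        result.append(remaining.pop(pick))
    return result

feed = [("a1", "game"), ("a2", "game"), ("a3", "shop"), ("a4", "game"), ("a5", "edu")]
print([ad_id for ad_id, _ in diversify(feed)])
```

The greedy rule keeps the original order wherever it can and only defers an ad when it would sit next to one of the same category, so diversity costs as little ranking quality as possible.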
Optimization practices are described for each layer. In retrieval, a “cold‑hot‑warm‑zero” index replaces a single index, allocating compute resources based on ad lifecycle stage; this yields a +3 % eCPM lift and a 7.5 % latency reduction. Recall adopts a multi‑channel framework that isolates different relevance goals (cold‑start, contextual, item‑CF) to improve explainability and extensibility.
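A multi-channel recall merge might look like the sketch below. The channel names, quotas, and scores are hypothetical. The point is that each channel is capped independently and every recalled ad keeps a record of which channels found it, which is what makes the result explainable.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (ad_id, channel-local score)

def merge_channels(
    channels: Dict[str, List[Hit]], quotas: Dict[str, int]
) -> Dict[str, List[str]]:
    """Merge per-channel recall results.

    Each channel contributes at most quotas[name] ads (its best by local
    score). Returns ad_id -> list of channels that recalled it.
    """
    merged: Dict[str, List[str]] = {}
    for name, hits in channels.items():
        best = sorted(hits, key=lambda h: h[1], reverse=True)[: quotas.get(name, 0)]
        for ad_id, _ in best:
            merged.setdefault(ad_id, []).append(name)
    return merged

channels = {
    "u2i": [("a1", 0.9), ("a2", 0.8), ("a3", 0.1)],
    "i2i": [("a2", 0.7), ("a4", 0.6)],
    "cold_start": [("a5", 0.3)],
}
quotas = {"u2i": 2, "i2i": 2, "cold_start": 1}
recalled = merge_channels(channels, quotas)
print(recalled)
```

Adding a new relevance goal then means adding one more entry to `channels` and `quotas`, without touching the existing channels.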
Coarse ranking must score the large candidate set passed on from recall with limited compute. A unified multi-objective strategy ("exclusive + preferred") balances three business goals (external-loop performance ads, internal-loop native ads, and time-constrained host ads) plus a shared queue. After deployment, self-serve ads saw a 5 % eCPM increase. The coarse-ranking model evolved from logistic regression to single-objective LTR twin-tower models, and finally to multi-objective twin-tower models for CTR and CVR.
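The article does not spell out how "exclusive + preferred" works. One plausible reading, sketched here purely as an assumption, is that each business goal first fills a reserved quota and the leftover capacity is filled from a shared queue ordered by score. The goal names, quotas, and scores are hypothetical.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[str, str, float]  # (ad_id, business goal, coarse score)

def allocate(
    candidates: List[Candidate], exclusive: Dict[str, int], total: int
) -> List[str]:
    """Fill per-goal exclusive quotas first, then a shared queue by score.

    Assumes the exclusive quotas sum to at most `total`.
    """
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    used = {goal: 0 for goal in exclusive}
    chosen: List[str] = []
    for ad_id, goal, _ in ranked:
        if used.get(goal, 0) < exclusive.get(goal, 0):
            chosen.append(ad_id)
            used[goal] += 1
    taken = set(chosen)
    # Shared queue: everything not yet chosen, still ordered by score.
    shared = [ad_id for ad_id, _, _ in ranked if ad_id not in taken]
    return chosen + shared[: max(0, total - len(chosen))]

candidates = [
    ("e1", "external", 0.9), ("e2", "external", 0.8), ("e3", "external", 0.7),
    ("n1", "native", 0.3), ("h1", "host", 0.2), ("n2", "native", 0.1),
]
selected = allocate(candidates, {"external": 1, "native": 1, "host": 1}, total=4)
print(selected)
```

Without the reserved quotas, the three high-scoring external ads would take most of the four slots and the native and host ads would rarely pass coarse ranking.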
Fine ranking focuses on the most accurate eCPM prediction and balanced bidding. It incorporates deep CTR/CVR estimators, real-time negative-feedback gating, commercial-value gating, and a system-wide optimizer that automatically allocates budget across dozens of ad slots. This system-wide selection raised advertiser spend by 33 % and increased eCPM, and the number of active accounts per auction rose by 133 %.
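For reference, eCPM is conventionally derived from the predicted rates and the bid, as in the sketch below. The two eCPM formulas are the standard definitions. The blended `rank_score`, its weights, and the `blocked` flag standing in for negative-feedback gating are hypothetical and are not taken from the engine described here.

```python
def ecpm_cpc(p_ctr: float, cpc_bid: float) -> float:
    """Expected revenue per 1000 impressions when the advertiser pays per click."""
    return p_ctr * cpc_bid * 1000.0

def ecpm_cpa(p_ctr: float, p_cvr: float, cpa_bid: float) -> float:
    """Expected revenue per 1000 impressions when the advertiser pays per conversion."""
    return p_ctr * p_cvr * cpa_bid * 1000.0

def rank_score(
    ecpm: float,
    relevance: float,
    user_prior: float,
    w_rel: float = 10.0,    # hypothetical weight
    w_prior: float = 5.0,   # hypothetical weight
    blocked: bool = False,  # stands in for a negative-feedback gate
) -> float:
    """Hypothetical linear blend of eCPM with relevance and a user prior."""
    if blocked:
        return 0.0
    return ecpm + w_rel * relevance + w_prior * user_prior

# (ad_id, eCPM, relevance, user_prior); all values are made up.
ads = [
    ("a1", ecpm_cpc(0.02, 1.5), 0.2, 0.1),
    ("a2", ecpm_cpa(0.03, 0.05, 18.0), 0.9, 0.4),
]
ranked = sorted(ads, key=lambda a: rank_score(a[1], a[2], a[3]), reverse=True)
print([ad_id for ad_id, *_ in ranked])
```

With these made-up numbers, `a2` has the lower eCPM but ranks first because its relevance and user-prior terms outweigh the gap, which is the effect a blended bidding score is meant to have.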
Cold-start optimization spans the entire funnel. A dedicated "new index" in retrieval creates a separate path for fresh ads, guaranteeing exposure. Quotas are assigned at plan and impression levels to control exploration cost, delivering a 6 % eCPM lift. In recall, offline-mined high-quality audiences are preferentially recalled, boosting cold-start CTCVR (the click-through conversion rate, i.e. CTR × CVR) by 16.7 %. In fine ranking, pre-ranking with pCVR gating and weighted scoring (considering ad type, predicted CTR/CVR, and recall signals) ensures balanced exposure and quality.
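The gating and weighted scoring for cold-start ads could be sketched as follows. The pCVR floor, the type weights, the quota, and the exact scoring formula are assumptions made for illustration. The article only names the inputs (ad type, predicted CTR/CVR, and recall signals).

```python
from typing import Dict, List, Union

AdRecord = Dict[str, Union[str, float]]

TYPE_WEIGHT = {"new": 1.2, "regular": 1.0}  # hypothetical boost per ad type

def cold_start_score(ad: AdRecord) -> float:
    """Hypothetical weighted score from ad type, pCTR, pCVR, and recall signal."""
    weight = TYPE_WEIGHT.get(str(ad["type"]), 1.0)
    return weight * float(ad["p_ctr"]) * float(ad["p_cvr"]) * (1.0 + float(ad["recall_score"]))

def cold_start_rank(
    ads: List[AdRecord], pcvr_floor: float = 0.01, quota: int = 2
) -> List[str]:
    """Gate on pCVR, rank the survivors by weighted score, cap by quota."""
    passed = [ad for ad in ads if float(ad["p_cvr"]) >= pcvr_floor]
    ranked = sorted(passed, key=cold_start_score, reverse=True)
    return [str(ad["id"]) for ad in ranked[:quota]]

new_ads: List[AdRecord] = [
    {"id": "n1", "type": "new", "p_ctr": 0.05, "p_cvr": 0.005, "recall_score": 0.9},
    {"id": "n2", "type": "new", "p_ctr": 0.03, "p_cvr": 0.02, "recall_score": 0.5},
    {"id": "n3", "type": "new", "p_ctr": 0.02, "p_cvr": 0.04, "recall_score": 0.1},
    {"id": "n4", "type": "new", "p_ctr": 0.01, "p_cvr": 0.02, "recall_score": 0.0},
]
print(cold_start_rank(new_ads))
```

Here `n1` has the highest predicted CTR but is removed by the pCVR gate, so the exploration quota is spent only on new ads with a reasonable chance of converting.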
The overall analysis shows that a multi‑stage funnel is essential for balancing compute resources and business effectiveness in high‑QPS, low‑latency ad serving. Local optimizations at each layer must be coordinated to avoid conflicts between local and global objectives, and diversity mechanisms are crucial to prevent monopolization of the ranking queue.
Ximalaya Technology Team
Official account of Ximalaya's technology team, sharing distilled technical experience and insights to grow together.