How Baidu Boosted Search Push Clicks with Model Calibration and DeltaCTR Strategies
This article details Baidu Search's personalized push system, covering challenges in material selection and user targeting, the end‑to‑end workflow, model accuracy improvements, pCTR calibration techniques, deltaCTR‑based ranking, and the combined offline‑online experiments that significantly raised both CTR and DAU.
Background
Baidu Search aims to provide fast, accurate information retrieval, and its Search Push feature delivers timely content to users via query‑based notifications, which are more precise than traditional feed pushes and help increase search DAU.
Challenges
Although Baidu's overall push ecosystem is large, Search Push accounts for only a small share of total push volume and clicks. Growing that share requires higher-quality materials and more accurate matching of materials to users.
Overall Solution Design
The workflow consists of material production, Search Push strategy, and push delivery, all tightly linked.
1. Material Production
High‑quality materials are filtered from abundant data using a comprehensive screening mechanism that evaluates content quality, user demand, and compliance, including manual review and LLM‑based rewriting.
2. Search Push Strategy
Efficient algorithms match users with relevant content by analyzing user profiles, interests, and behavior, producing precise recommendations.
3. Push Channel
A stable, high‑performance channel is built to deliver <ID, ITEM> pairs after coarse and fine ranking, selecting the highest‑scoring material for delivery.
Model Accuracy Improvements
The model evolved from a dual‑tower architecture to a fully connected network, incorporating richer features and larger parameter counts using PaddlePaddle, and finally adding a deltaCTR bias to balance predicted pCTR with user‑specific uCTR.
Key Features
Item ID feature: captures fine‑grained article information.
User ID feature: modeled with long‑term samples to handle sparsity.
Push scene portrait and sequence: enhances understanding of user behavior and boosts DAU.
pCTR Calibration
Negative sampling introduces bias between predicted pCTR and true CTR, so calibration is needed. Two main methods are used:
Negative‑sample‑rate adjustment based on Facebook’s CTR correction formula.
Smoothed isotonic regression, which preserves the rank order of pCTR values while smoothing the calibration curve.
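The negative‑sample‑rate adjustment can be illustrated with the correction commonly attributed to Facebook's ads CTR work: q = p / (p + (1 − p) / w), where p is the prediction from a model trained on downsampled data and w is the fraction of negatives kept. The sketch below is a minimal illustration under that assumption; the function name is hypothetical, and the article does not state the exact formula Baidu uses.

```python
def correct_for_negative_sampling(p: float, w: float) -> float:
    """Map a pCTR learned on downsampled data back to the original scale.

    p: prediction from a model trained with negatives kept at rate w.
    w: fraction of negative samples retained, 0 < w <= 1 (assumed known).
    """
    if not 0.0 < w <= 1.0:
        raise ValueError("w must be in (0, 1]")
    # Downsampling negatives inflates the odds by 1/w; this undoes it.
    return p / (p + (1.0 - p) / w)


if __name__ == "__main__":
    # A prediction of 0.5 on data that kept 10% of negatives
    # corresponds to a much lower CTR on the full traffic.
    print(correct_for_negative_sampling(0.5, 0.1))
```

With w = 1 (no downsampling) the function returns p unchanged, which is a quick sanity check on the formula.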
Calibration steps:
Sort pCTR values into K buckets (e.g., 100 buckets).
Compute the average actual CTR for each bucket.
Merge adjacent buckets when the actual CTR falls outside the bucket range to maintain order.
Fit a piecewise calibration function across bucket boundaries to obtain smooth, monotonic adjustments.
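The four steps above can be sketched roughly as follows. This is an illustrative approximation, not Baidu's implementation: it assumes equal‑frequency buckets, uses a pool‑adjacent‑violators style merge as a stand‑in for the bucket‑merging rule, and uses linear interpolation between bucket means as the piecewise function. The function names are hypothetical.

```python
import numpy as np


def fit_bucket_calibrator(pctr, clicks, num_buckets=100):
    """Return (xs, ys): knots of a monotone piecewise-linear calibration map."""
    pctr = np.asarray(pctr, dtype=float)
    clicks = np.asarray(clicks, dtype=float)
    order = np.argsort(pctr)

    blocks = []  # each block: [mean pCTR, observed CTR, sample count]
    for idx in np.array_split(order, num_buckets):
        if len(idx) == 0:
            continue
        # Steps 1-2: bucket by sorted pCTR, average the observed clicks.
        blocks.append([pctr[idx].mean(), clicks[idx].mean(), len(idx)])
        # Step 3: merge with the previous bucket while the order is violated.
        while len(blocks) > 1 and blocks[-2][1] > blocks[-1][1]:
            x2, y2, n2 = blocks.pop()
            x1, y1, n1 = blocks.pop()
            n = n1 + n2
            blocks.append([(x1 * n1 + x2 * n2) / n, (y1 * n1 + y2 * n2) / n, n])

    xs = np.array([b[0] for b in blocks])
    ys = np.array([b[1] for b in blocks])
    return xs, ys


def calibrate(p, xs, ys):
    # Step 4: interpolate linearly between knots; values outside the
    # fitted range are clamped to the first or last knot by np.interp.
    return np.interp(p, xs, ys)
```

Interpolating between bucket means, rather than returning a constant per bucket, is what keeps the mapping smooth, so two items with slightly different pCTR values do not collapse to the same calibrated score.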
DeltaCTR Strategy Design
After ranking materials by post‑hoc CTR, the top‑K items are scored, and pCTR is calibrated. The deltaCTR (pCTR − uCTR) is computed to estimate incremental click value. Items are then ranked by deltaCTR to select the optimal audience, balancing revenue and DAU growth.
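As a minimal sketch of the deltaCTR step, the snippet below computes pCTR − uCTR for each candidate user of one item and keeps the users with the largest increment. The Candidate fields, the function name, and the fixed audience_size cut‑off are assumptions made for illustration; the article does not describe how the audience size is chosen.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    user_id: str
    item_id: str
    pctr: float  # calibrated predicted CTR for this user-item pair
    uctr: float  # the user's own historical CTR (baseline)


def delta_ctr(c: Candidate) -> float:
    # Incremental click value: how much this item beats the user's baseline.
    return c.pctr - c.uctr


def rank_by_delta_ctr(candidates, audience_size):
    """Return the audience_size candidates with the largest deltaCTR."""
    ranked = sorted(candidates, key=delta_ctr, reverse=True)
    return ranked[:audience_size]


if __name__ == "__main__":
    pool = [
        Candidate("u1", "item42", pctr=0.30, uctr=0.25),
        Candidate("u2", "item42", pctr=0.20, uctr=0.05),
        Candidate("u3", "item42", pctr=0.10, uctr=0.10),
    ]
    for c in rank_by_delta_ctr(pool, audience_size=2):
        print(c.user_id, round(delta_ctr(c), 3))
```

In this toy pool, u1 has the highest pCTR but u2 ranks first by deltaCTR, because u1 already clicks often regardless of the item. That is the trade‑off the strategy is designed to expose.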
Parameter Optimization
Two parameters (a and b) control the weighting of pCTR and deltaCTR. Offline analysis explores candidate (a, b) pairs, and online experiments validate the best combinations. The optimization seeks to keep experimental and control group CTR equal while maximizing total clicks.
Offline steps include:
Calculate model‑predicted CTR (pCTR).
Compute user activity clusters and actual uCTR.
Handle unreliable uCTR values.
Derive deltaCTR.
Sort pCTR and deltaCTR into separate queues.
Search for optimal (a, b) that maximizes clicks under constraints.
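The final search step might look like the grid search below. The article does not define how a and b enter the score, so this sketch makes several loudly stated assumptions: the score is the linear blend a · pCTR + b · deltaCTR, a candidate is pushed when its score reaches a hypothetical threshold, expected clicks are estimated as the sum of calibrated pCTR over pushed candidates, and a pair is feasible when the expected CTR stays within a tolerance of the control CTR.

```python
import itertools

import numpy as np


def search_ab(pctr, delta, control_ctr, a_grid, b_grid, threshold, tol=0.002):
    """Grid-search (a, b); return (a, b, expected_clicks, expected_ctr) or None."""
    pctr = np.asarray(pctr, dtype=float)
    delta = np.asarray(delta, dtype=float)
    best = None
    for a, b in itertools.product(a_grid, b_grid):
        # Assumed selection rule: push when the blended score clears the bar.
        selected = a * pctr + b * delta >= threshold
        if not selected.any():
            continue
        expected_ctr = pctr[selected].mean()
        # Constraint: do not let CTR drop below the control group's CTR.
        if expected_ctr < control_ctr - tol:
            continue
        expected_clicks = pctr[selected].sum()
        if best is None or expected_clicks > best[2]:
            best = (a, b, expected_clicks, expected_ctr)
    return best
```

A search like this only narrows the candidate pairs offline; as the article notes, the shortlisted (a, b) combinations still need to be validated in online experiments.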
Results
Three experimental configurations were tested. All showed significant DAU improvements; the configuration with the highest deltaCTR yielded the greatest DAU lift, while another achieved the largest CTR gain. Ongoing model iterations and calibration refinements continue to push performance higher.
Conclusion
The Baidu Search Push personalization pipeline—covering material selection, model upgrades, pCTR calibration, and deltaCTR‑driven ranking—successfully increased both click‑through rate and daily active users. Future work will focus on further model and parameter optimization to sustain growth.
