Cold Start Optimization for New Content in Autohome Recommendation System
The article describes how Autohome tackled the cold‑start problem for newly generated content by redesigning the recommendation pipeline, introducing multi‑path recall, refining the ranking and re‑ranking formulas, and applying strategic controls. Together these changes raised the cold‑start success rate from about 27% to over 99% and CTR from 5% to over 14%.
1. Material Cold‑Start Definition A recommendation system addresses information overload by delivering content precisely to the users who are likely to want it. At Autohome, tens of thousands of new items appear daily. Because these items have no user‑item interaction data yet, they are hard to match to users, so they are distributed slowly and perform poorly.
2. Material Cold‑Start Optimization
2.1 Link Optimization A new item passes through a chain of stages: tag generation, real‑time indexing, recall, ranking, engine re‑ranking, and exposure. Reducing cold‑start latency requires real‑time guarantees at each stage, limits on content distribution, streamlined ranking paths, and an effective re‑ranking formula.
2.2 Algorithm Optimization This covers three components:
Recall: Seven cold‑start recall routes, covering tag, similarity, and supplement recall, are combined through multi‑path strategies and reinforcement learning. Similarity recall uses OpenAI's text‑embedding‑ada‑002 embeddings with FAISS for vector search.
Fine‑ranking Model: Applies dropout to statistical features, and uses time‑decay weighting so that samples from new items carry a larger loss weight, which encourages the model to rank new items higher.
Engine Re‑ranking: Uses a revised weighting formula that combines rank_score (the model score) with weight (an adjustment factor) to produce the final score. The weight is limited to the range 0‑100 for stable control.
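The three components above could look roughly like the following in plain Python. This is a minimal sketch under stated assumptions, not Autohome's implementation. The brute-force cosine search stands in for a FAISS index. The half-life decay and the multiplicative boost in final_score are assumed forms, since the article gives only the 0‑100 weight limit. All function names, parameters, and default values are hypothetical.

```python
import math
import random


def top_k_similar(query, item_vectors, k=3):
    """Brute-force cosine top-k over item embeddings.

    Stand-in for a vector index search; embeddings are plain lists here.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    scored = [(cosine(query, vec), item_id) for item_id, vec in item_vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


def time_decay_weight(age_hours, boost=1.0, half_life_hours=24.0):
    """Loss weight for a training sample (ASSUMED half-life form).

    A brand-new item gets 1 + boost; the extra weight halves every
    half_life_hours and tends toward 1 for old items.
    """
    return 1.0 + boost * 0.5 ** (age_hours / half_life_hours)


def drop_stat_features(features, stat_keys, rate, rng):
    """Zero each statistical feature with probability `rate` during training.

    Non-statistical (content) features are passed through unchanged.
    """
    return {
        key: (0.0 if key in stat_keys and rng.random() < rate else value)
        for key, value in features.items()
    }


def final_score(rank_score, weight):
    """Re-ranking score (ASSUMED multiplicative form).

    Only the clamp of `weight` to [0, 100] comes from the article.
    """
    clamped = min(max(weight, 0.0), 100.0)
    return rank_score * (1.0 + clamped / 100.0)
```

Clamping the weight before it enters the formula keeps a misconfigured strategy from pushing a single item arbitrarily high, which matches the article's stated goal of stable control.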
2.3 Strategy Optimization Strategies include volume control, feature coverage, periodic strategy polling, weight limits, recall limits, audience restrictions, and various cold‑start patches to handle author level changes, timing, and pool entry delays.
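As an illustration of volume control and related limits, the sketch below boosts an item only while it is still young and under an exposure quota. The article does not give the actual rules, so the thresholds, field names, and least-exposed-first ordering are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ColdStartItem:
    item_id: str
    age_hours: float
    exposures: int = 0


def eligible_for_boost(item, max_age_hours=24.0, exposure_quota=500):
    """An item stays in the cold-start pool while it is new and under quota.

    Both thresholds are hypothetical defaults, not values from the article.
    """
    return item.age_hours <= max_age_hours and item.exposures < exposure_quota


def select_boosted(candidates, slots, **limits):
    """Pick up to `slots` eligible items, least-exposed first.

    Ties are broken by item_id so the result is deterministic.
    """
    pool = [c for c in candidates if eligible_for_boost(c, **limits)]
    pool.sort(key=lambda c: (c.exposures, c.item_id))
    return [c.item_id for c in pool[:slots]]
```

For example, with items a (2 hours old, 10 exposures), b (30 hours old), c (at quota), and d (1 hour old, 3 exposures), `select_boosted(items, 2)` returns d and then a, since b has aged out and c has used its quota.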
3. Effects and Outlook Since mid‑2022, the cold‑start success rate has risen from about 27% to over 99%, and CTR from 5% to over 14%. Future work may broaden the definition of a new item (e.g., to 3‑5 days), fuse real‑time and offline features, and pursue further model‑level optimizations.
HomeTech