How Kuaishou Overcame the ‘Impossible Triangle’ of Performance, Flexibility, and Cost in Real‑Time Big Data Analytics
This article details how Kuaishou’s content middle platform tackled the massive challenges of real‑time, flexible, and cost‑effective data analysis at trillion‑scale by redesigning its architecture, adopting ClickHouse, splitting wide tables, and implementing a scatter‑gather execution model with pre‑shuffle and bitmap optimizations.
Project Background
Kuaishou’s content middle platform integrates content retrieval, diagnostics, and management, serving internal product, operation, and analysis teams with real‑time and hourly video and live‑stream data. The system consists of four core modules: retrieval, diagnostics, content pool, and foundational services.
Challenges
The platform faced an “impossible triangle” of high performance, high flexibility, and low cost: more than 40 million new content items per day, over 100 billion daily video plays, queries spanning half a year of data, and 30+ consumption metrics.
Data volume: massive daily ingestion and long‑term storage.
Flexibility: need for arbitrary time, product, and page filters with dynamic ranking.
Cost: petabyte‑scale SSD storage and high‑throughput compute.
Technical Selection
After evaluating Elasticsearch and ClickHouse, ClickHouse was chosen as the primary analytical engine for its columnar storage, compression, and aggregation capabilities, especially for OLAP workloads.
Table Design – Solving Cost & Performance
To avoid the drawbacks of a single ultra‑wide table, a hybrid approach combining vertical partitioning (splitting columns into multiple tables) and vertical tables (storing key‑value pairs as rows) was adopted. This reduced storage bloat, improved query speed, and allowed independent maintenance of hot and cold data.
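The split can be illustrated with a minimal sketch (field names here are hypothetical, not Kuaishou's actual schema): a hot "horizontal" table for frequently queried metrics, a cold table for rarely used columns, and a vertical key-value table for sparse metrics.

```python
# Minimal sketch of the hybrid table design (hypothetical field names).
# A single ultra-wide row is split into a hot table, a cold table,
# and a "vertical" (photo_id, key, value) table for sparse metrics.

WIDE_ROW = {
    "photo_id": 1001,
    "play_cnt": 250,        # hot: queried constantly
    "like_cnt": 40,         # hot
    "upload_meta": "...",   # cold: rarely filtered on
    "rare_metric_x": 7,     # sparse: most rows lack it
}

HOT_COLS = {"photo_id", "play_cnt", "like_cnt"}
COLD_COLS = {"photo_id", "upload_meta"}

def split_row(row):
    hot = {k: v for k, v in row.items() if k in HOT_COLS}
    cold = {k: v for k, v in row.items() if k in COLD_COLS}
    # Everything else becomes rows in the vertical key-value table.
    vertical = [
        (row["photo_id"], k, v)
        for k, v in row.items()
        if k not in HOT_COLS and k not in COLD_COLS
    ]
    return hot, cold, vertical

hot, cold, vertical = split_row(WIDE_ROW)
```

Hot and cold tables can then be stored, compressed, and maintained independently, and the vertical table avoids paying storage for columns most rows never populate.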
Query Performance Challenges
After table splitting, queries became more complex, often touching dozens of tables and billions of rows. High‑cardinality aggregations (e.g., grouping by photo_id across billions of distinct values) and distributed joins caused memory pressure and heavy network shuffling. A representative top‑N query:
select sum(play_cnt) as order_cnt, photo_id
from photo_behav
where p_date between '20250101' and '20250103'
and product = '极速版'
group by photo_id
order by order_cnt desc
limit 100;

Optimization Strategies
Data Pre‑Shuffle: During ETL, data is written with a distribution key so that rows sharing the same key land on the same ClickHouse node, enabling colocated queries without cross‑node shuffling.
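A minimal sketch of the idea (the hash function and shard count are illustrative, not Kuaishou's actual scheme): routing every row by a hash of its distribution key guarantees that all rows for one photo_id sit on exactly one shard, so later joins and aggregations on that key need no network shuffle.

```python
# Sketch of data pre-shuffle during ETL (hypothetical hashing scheme).
# All rows sharing a photo_id are routed to the same shard.
import zlib

NUM_SHARDS = 3

def shard_for(photo_id: int) -> int:
    # Stand-in for the sharding-key hash used when writing to ClickHouse.
    return zlib.crc32(str(photo_id).encode()) % NUM_SHARDS

def pre_shuffle(rows):
    shards = [[] for _ in range(NUM_SHARDS)]
    for row in rows:
        shards[shard_for(row["photo_id"])].append(row)
    return shards

rows = [{"photo_id": pid, "play_cnt": pid % 7} for pid in range(100)]
shards = pre_shuffle(rows)
# Every row for a given photo_id now lives on exactly one shard.
```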
Scatter‑Gather Execution Model: Queries are split (scattered) to worker nodes, executed locally, and the partial results are gathered and merged on a coordinator, reducing network traffic.
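Combined with pre-shuffle, the top‑N query above can run almost entirely on the workers. A conceptual sketch (plain Python, not ClickHouse internals): each shard aggregates its own rows, and the coordinator only merges small partial results and applies the final ordering and limit.

```python
# Sketch of scatter-gather execution for "sum(play_cnt) group by photo_id
# order by sum desc limit N" (conceptual, not ClickHouse internals).
from collections import Counter

def local_aggregate(shard_rows):
    # Worker step: aggregate locally, on this shard's rows only.
    agg = Counter()
    for row in shard_rows:
        agg[row["photo_id"]] += row["play_cnt"]
    return agg

def scatter_gather(shards, limit=3):
    partials = [local_aggregate(s) for s in shards]  # scatter
    merged = Counter()
    for p in partials:                               # gather + merge
        merged.update(p)                             # Counter.update adds counts
    return merged.most_common(limit)                 # final order by / limit

shards = [
    [{"photo_id": 1, "play_cnt": 10}, {"photo_id": 2, "play_cnt": 5}],
    [{"photo_id": 1, "play_cnt": 7}, {"photo_id": 3, "play_cnt": 20}],
]
top = scatter_gather(shards, limit=2)  # [(3, 20), (1, 17)]
```

Because rows were pre-shuffled by photo_id, each partial result already contains the complete count for every key it holds, so the merge step only deduplicates across shard boundaries in the general case.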
Bitmap Aggregation: ClickHouse’s groupBitmap64State function builds Roaring Bitmaps for high‑cardinality distinct sets, dramatically lowering memory usage.
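The core idea can be shown with plain integers standing in for Roaring Bitmaps (which add compression on top): each shard encodes the ids it has seen as set bits, the bitmaps are OR‑merged, and the final distinct count is a popcount rather than a giant hash table of raw ids.

```python
# Sketch of bitmap-based distinct counting (a stand-in for ClickHouse's
# groupBitmap64State; real Roaring Bitmaps compress the bit ranges).

def to_bitmap(ids):
    bm = 0
    for i in ids:
        bm |= 1 << i          # set bit i for each id seen on this shard
    return bm

def merged_cardinality(bitmaps):
    merged = 0
    for bm in bitmaps:
        merged |= bm          # bitwise OR = set union across shards
    return bin(merged).count("1")  # popcount = number of distinct ids

shard_a = to_bitmap([1, 5, 9])
shard_b = to_bitmap([5, 9, 42])
# distinct ids across both shards: {1, 5, 9, 42} -> 4
```

With billions of real photo_ids a plain integer bitmap would be far too large, which is exactly why the compressed Roaring representation is used in practice.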
Results
Performance improved from >20 seconds to <5 seconds for P80 queries, and required cluster size dropped from ~200 to ~70 machines. The hybrid table design and execution optimizations achieved a balanced solution to the performance‑flexibility‑cost triangle.
Conclusion
The case demonstrates that with careful data modeling, ClickHouse selection, and manual control over query execution, large‑scale real‑time analytics can meet stringent business requirements while keeping costs manageable.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Kuaishou Tech
Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.
