Optimizing ClickHouse Performance in WeChat: Observation Tools, Lakehouse Reading, Bitmap Acceleration, and AI Integration
This article details how the WeChat team leverages ClickHouse at massive scale, introduces a suite of performance observation tools, describes lakehouse reading and bitmap optimizations, and explains the integration of AI workloads, demonstrating overall query speedups of up to tenfold across diverse scenarios.
ClickHouse is widely used within WeChat for real‑time reporting, A/B testing, and online analytics. The deployment spans thousands of nodes and serves millions of queries per day with sub‑second response times. The team identified performance bottlenecks and built an observation toolbox that includes Query Log, Query Thread Log, Sampling Query Profiler, Flame Graph, Processors Profile Log, and a custom Profile Engine for pre‑ and post‑analysis.
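To make the link between a sampling profiler and a flame graph concrete, here is a minimal sketch of the usual intermediate step: collapsing sampled call stacks into "folded" lines (`frame;frame;frame count`), the text format that flame graph renderers commonly consume. The function name `fold_stacks` and the frame names are hypothetical and used only for illustration; this is not the team's tooling.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse sampled call stacks into folded-stack lines.

    Each sample is a list of frame names ordered from outermost to innermost.
    Identical stacks are merged and their sample counts summed.
    """
    counts = Counter(";".join(frames) for frames in samples)
    # Most frequently sampled stacks first; ties broken by stack text.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{stack} {count}" for stack, count in ordered]


# Hypothetical samples: two landed in decompression, one in aggregation.
samples = [
    ["execute", "read", "decompress"],
    ["execute", "read", "decompress"],
    ["execute", "aggregate"],
]
folded = fold_stacks(samples)
for line in folded:
    print(line)
```

The wider a stack's bar in the rendered graph, the more samples it collected, which is what points to where a query spends its time.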
To address new use cases, the team explored lakehouse reading by integrating Iceberg/Hive, reducing data silos, and supporting both batch and streaming workloads. Challenges such as ClickHouse’s single‑node Hive support, limited Iceberg S3 compatibility, and metadata synchronization were tackled by adding an external HTTP‑based Iceberg API server, hash‑based file distribution, caching of metadata, and decoupling compute from storage.
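The hash‑based file distribution mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not the team's implementation: the node names, file paths, and the function `assign_files` are hypothetical, and the choice of MD5 with a simple modulo is an assumption made only to get a stable, process‑independent hash.

```python
import hashlib


def assign_files(files, nodes):
    """Map each data file to one node using a stable hash of its path.

    The same path always maps to the same node for a fixed node list,
    so a node-local cache for that file stays useful across queries.
    """
    assignment = {node: [] for node in nodes}
    for path in files:
        digest = hashlib.md5(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(nodes)
        assignment[nodes[index]].append(path)
    return assignment


# Hypothetical cluster and file listing.
nodes = ["node-a", "node-b", "node-c"]
files = [f"warehouse/events/part-{i:05d}.parquet" for i in range(12)]
plan = assign_files(files, nodes)
for node, assigned in plan.items():
    print(node, len(assigned))
```

A plain modulo reshuffles most files whenever the node count changes; consistent hashing is a common alternative when nodes are added or removed often.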
Bitmap‑based acceleration was introduced for experiment analysis and user profiling, converting set operations to bitmap manipulations. To mitigate data skew, a repartition stage was added, and row‑level parallelism with asynchronous deserialization was implemented, yielding 10‑20% performance gains and up to 100× speedup for large bitmap unions and intersections.
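As a rough illustration of how set operations become bitmap manipulations, and why repartitioning by user id keeps the answer correct, the sketch below uses Python integers as uncompressed bitmaps. All names and the sample ids are hypothetical; a production engine would use a compressed bitmap structure rather than raw integers.

```python
def to_bitmap(user_ids):
    """Set bit `uid` for every user id in the collection."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap


def cardinality(bitmap):
    """Count the set bits, i.e. the number of distinct users."""
    return bin(bitmap).count("1")


def repartition(user_ids, num_partitions):
    """Split ids into disjoint bitmaps by uid % num_partitions."""
    parts = [0] * num_partitions
    for uid in user_ids:
        parts[uid % num_partitions] |= 1 << uid
    return parts


# Hypothetical experiment: users exposed to a variant, users who purchased.
exposed = [1, 2, 3, 5, 8]
purchased = [2, 3, 4, 8, 9]

both = cardinality(to_bitmap(exposed) & to_bitmap(purchased))    # intersection
either = cardinality(to_bitmap(exposed) | to_bitmap(purchased))  # union

# Each partition holds a disjoint id range, so partitions can be intersected
# independently (in parallel) and the per-partition counts simply summed.
partitioned = sum(
    cardinality(a & b)
    for a, b in zip(repartition(exposed, 2), repartition(purchased, 2))
)
print(both, either, partitioned)
```

Because both sides are split by the same function of the user id, no user can appear in two partitions, which is what makes the per‑partition results additive.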
In the AI scenario, the team rebuilt the entire algorithm pipeline on ClickHouse, using high‑dimensional embeddings for similarity search. They added a NormalizedCosineDistance function, optimized embedding vector distance calculations, and rewrote queries with CTEs and pre‑filters, achieving up to 4× speedup. Additional improvements included ZSTD compression, repartitioning, and dictionary encoding.
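The source does not define NormalizedCosineDistance, so the sketch below rests on an assumption: that it denotes cosine distance over embeddings already scaled to unit length, in which case the two norm computations drop out and the distance reduces to one minus a dot product. The function names, row layout, and `category` filter are hypothetical; the pre‑filter mirrors the idea of narrowing candidates before any distance is computed.

```python
import math


def normalize(vector):
    """Scale a vector to unit length (done once, at ingestion time)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_distance(a, b):
    """General cosine distance: computes both norms on every call."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def normalized_cosine_distance(a, b):
    """Cosine distance assuming a and b are already unit length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def top_k(query, rows, k, category):
    """Pre-filter by category, then rank the survivors by distance."""
    candidates = [row for row in rows if row["category"] == category]
    candidates.sort(key=lambda row: normalized_cosine_distance(query, row["embedding"]))
    return [row["id"] for row in candidates[:k]]


# Hypothetical rows with tiny 2-dimensional embeddings.
rows = [
    {"id": 1, "category": "video", "embedding": normalize([1.0, 0.0])},
    {"id": 2, "category": "video", "embedding": normalize([0.6, 0.8])},
    {"id": 3, "category": "article", "embedding": normalize([1.0, 0.0])},
    {"id": 4, "category": "video", "embedding": normalize([0.0, 1.0])},
]
nearest = top_k(normalize([1.0, 0.0]), rows, k=2, category="video")
print(nearest)
```

Row 3 is as close to the query as row 1 but never reaches the distance computation, which is the saving a pre‑filter buys.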
Overall, the combined optimizations (observation tooling, lakehouse reading enhancements, bitmap acceleration, and AI‑focused query rewrites) delivered 5‑10× performance improvements in typical workloads and enabled ClickHouse to serve as a unified OLAP engine for both analytical and AI workloads.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.