Optimizing Supply Chain Planning Systems: ClickHouse ReplacingMergeTree and Local Join Solutions
This article discusses solutions to system bottlenecks in supply chain planning business development, focusing on ClickHouse ReplacingMergeTree table creation, local join optimization, and real-time data synchronization between TiDB and ClickHouse to improve query performance and system stability.
This article addresses system bottlenecks in supply chain planning business development and presents solutions for improving query performance and system stability. The business currently uses TiDB for storing plan data, dimension data, and business configurations, while ClickHouse handles large volumes of historical reference data.
Initially, the system faced several challenges: long configuration activation cycles (T+1) due to offline data processing, high memory consumption from in-memory aggregation leading to frequent OOM issues, and slow query responses sometimes exceeding 10 seconds. These problems significantly impacted user experience and system stability.
The proposed solution involves synchronizing business configuration data to ClickHouse and performing joins between TiDB and ClickHouse to output results. This approach leverages SQL aggregation queries, which are significantly faster than in-memory processing - reducing 5-second in-memory logic to approximately 300ms SQL execution.
Key technical implementations include:
1. ClickHouse ReplacingMergeTree table creation with ReplicatedReplacingMergeTree engine for data deduplication. The table structure includes department IDs, salesperson information, purchase controller data, update timestamps, and deletion flags. The ORDER BY clause uses composite keys for proper data organization.
2. Local join optimization in ClickHouse, which outperforms Global Join by performing joins on each distributed node before aggregating results. This requires consistent distribution functions across tables and specific SQL syntax requirements. The solution uses a dimension table model where each node stores a complete dataset.
3. Real-time data synchronization from TiDB to ClickHouse using TiCDC, enabling near-instantaneous configuration updates.
The implementation demonstrates significant performance improvements, reducing resource consumption by over ten times in a 9-shard, 18-node cluster. Query response times are stabilized, and memory-related OOM issues are resolved. The solution provides a scalable approach for handling large-scale supply chain planning data with complex business rules and real-time requirements.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.