How Ctrip Boosted Hotel Data Warehouse Performance 400% with ClickHouse
Ctrip’s hotel data team tackled a 3 TB daily data load by building a ClickHouse cluster on VMware, creating custom sync and execution tools, applying query optimizations, and handling merge and memory errors, ultimately achieving over 400% performance gains across multiple reporting themes.
Background
As Ctrip’s hotel business grew, daily data volume reached about 3 TB, and Hive (Spark) could not meet the requirement for overseas services that needed data several hours earlier than domestic pipelines. In early 2020 the team began evaluating ClickHouse for data‑warehouse workloads.
Technical Research and Solution Selection
There was no existing ClickHouse cluster, so the team built one on VMware. By March 2020 four physical servers (256 GB RAM, 40 CPU cores, 3.5 TB disks) were ready. To ensure a smooth production transition, data were first synchronized from the Hive ODS layer to a ClickHouse ODS layer.
Integration Development Environment Packaging
1. Data‑Sync Tool
The original sync process consumed many times more time than the actual computation. The team adopted ClickHouse’s ORC import capability:
cat filename.orc | clickhouse-client --query="INSERT INTO some_table FORMAT ORC"They wrapped this command into a script orc2ck.sh that adds caching and batch processing, boosting sync speed by more than 500%.
2. Execution Tool
To reduce code duplication and improve developer efficiency, a wrapper script ck.sh was created for invoking ClickHouse client commands. The execution environment and parameters are illustrated in the accompanying diagram.
ClickHouse Code Optimization Techniques
1. Place Small Tables on the Right Side of JOIN
Moving small tables to the right side of a JOIN reduces memory consumption, as shown in the diagram.
2. Use IN Instead of JOIN
Replacing certain JOINs with IN clauses can significantly speed up execution.
3. Reduce Data Scanning
Adding filter logic cuts the amount of data scanned, lowering memory usage and improving speed. The team also tuned server settings such as max_memory_usage, max_bytes_before_external_group_by, and max_bytes_before_external_sort to handle large queries.
Exception Handling
Code: 252, e.displayText() = DB::Exception: Too many parts (301). Merges are processing significantly slower than inserts.The merge process is asynchronous; inserting too fast triggers this error. Recommended insert speed is about 1 million rows per second.
Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceededThis occurs when query memory exceeds the system limit. Solutions include increasing the memory quota via max_memory_usage or offloading excess data to disk using max_bytes_before_external_group_by and max_bytes_before_external_sort.
When server memory is sufficient, raise the quota with max_memory_usage.
If memory is constrained, enable external processing parameters, accepting slower execution.
If these adjustments do not resolve the issue, review and optimize the query logic as described in the optimization section.
Server Fault Handling
A forced restart during a fault‑injection test left ClickHouse unable to start because it could not load table metadata. Two remediation paths were used:
Delete or move the table’s data files (the method applied in the incident, see log screenshot).
Rebuild the table’s metadata (the more proper approach).
Conclusion
By mid‑2020, all hotel‑order themes and P1 reporting tables had been migrated to ClickHouse, achieving performance improvements of over 200% for most cases and an average increase of around 400%, effectively solving the majority of application scenarios. Future work includes ingesting more business themes and exploring Flink + ClickHouse for real‑time data‑warehouse capabilities.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
