How Ctrip Scaled Its Vacation Product Log System to Billions of Records
This article recounts the evolution of Ctrip's vacation product log platform, from a single‑table database solution to a platformized Elasticsearch (ES) + HBase architecture. It covers the challenges of massive data volume, the RowKey design, the write and query flows, and the later work to empower business users and suppliers.
Development Trajectory
Ctrip's vacation product system handles extremely complex product structures. It generates over 6 billion data change records per day across thousands of tables and has accumulated more than 1.7 trillion log entries.
Evolution Process
V1.0 – DB Single‑Table Storage
Before 2019, logs were stored in a single MySQL table with two columns, id and LogContent. The log content was unstructured text and could only be searched with LIKE queries. The result was a very large table (over 1 billion rows, ~370 GB), poor query performance, low readability, tight coupling with business code, and limited extensibility.
V2.0 – Platformization
To address V1.0 limitations, a unified log platform was built. After evaluating solutions (ES + HBase, MongoDB, ClickHouse), the team chose ES + HBase for its ability to handle petabyte‑scale data with strong search capabilities.
The architecture uses HBase for durable storage and Elasticsearch for fast search. Logs are ingested via an API, placed onto a message queue (MQ), and processed asynchronously, decoupling write latency from business operations.
RowKey Design
The RowKey is built from five components: an MD5‑derived prefix of the primary key (PK), a zero‑padded tableId, the PK with a random suffix, a padded log type, and a timestamp. Together they are meant to provide uniqueness, even hash distribution, ordering, compactness, and readability.
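The five components can be assembled roughly as in the sketch below. The article does not give field widths, separators, or the size of the random suffix, so the 4‑character hash prefix, the 6‑digit tableId, the 2‑digit suffix, the 8‑character log type, and the `|` separator are all illustrative assumptions, and `build_row_key` is a hypothetical helper name.

```python
import hashlib
import random


def build_row_key(table_id, pk, log_type, ts_millis, rng=random):
    """Assemble a RowKey from the five components named in the text.

    All widths and the "|" separator are assumptions for illustration.
    """
    # 1. Hash prefix derived from the PK, so that keys spread across regions.
    prefix = hashlib.md5(pk.encode("utf-8")).hexdigest()[:4]
    # 2. Zero-padded table id, giving a fixed width.
    table_part = f"{table_id:06d}"
    # 3. PK followed by a short random suffix to reduce collisions.
    pk_part = f"{pk}{rng.randrange(100):02d}"
    # 4. Log type padded (or truncated) to a fixed width.
    type_part = log_type.ljust(8, "_")[:8]
    # 5. Millisecond timestamp, zero-padded so that keys sort chronologically.
    ts_part = f"{ts_millis:013d}"
    return "|".join([prefix, table_part, pk_part, type_part, ts_part])


if __name__ == "__main__":
    print(build_row_key(42, "10086", "price", 1700000000000))
```

Prefixing the key with a hash of the PK is a common way to avoid write hotspots in HBase, because rows are stored in key order and monotonically increasing keys would otherwise land on a single region.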
Write Flow
Clients call the log API; the service pushes the request to MQ. Consumers generate the RowKey, write the raw log to HBase, then index the document in Elasticsearch. Failure in either store triggers a fallback to a Redis compensation cluster for retry.
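The consumer side of this flow can be sketched as follows. This is a minimal in‑memory model, not the platform's code: `InMemoryStore` stands in for both HBase and Elasticsearch, a plain list stands in for the Redis compensation cluster, and the names `consume` and `drain_retries` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Stand-in for HBase or Elasticsearch; set fail=True to simulate an outage."""
    data: dict = field(default_factory=dict)
    fail: bool = False

    def put(self, key, value):
        if self.fail:
            raise ConnectionError("store unavailable")
        self.data[key] = value


def consume(message, row_key, hbase, es, retry_queue):
    """Write the raw log to HBase, then index its metadata in Elasticsearch.

    If either write fails, the message is parked in retry_queue
    (a stand-in for the Redis compensation cluster).
    """
    try:
        hbase.put(row_key, message["content"])
        # Index only the searchable fields; the full content stays in HBase.
        es.put(row_key, {k: v for k, v in message.items() if k != "content"})
        return True
    except ConnectionError:
        retry_queue.append((row_key, message))
        return False


def drain_retries(hbase, es, retry_queue):
    """Replay parked messages; anything that fails again is parked again."""
    pending = list(retry_queue)
    retry_queue.clear()
    for row_key, message in pending:
        consume(message, row_key, hbase, es, retry_queue)
```

Because both writes are keyed by the RowKey, replaying a message that already reached HBase simply overwrites the same row, so retries in this sketch are idempotent.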
Query Flow
Clients invoke the query API; the service translates parameters into an Elasticsearch paginated request, retrieves matching RowKeys, and batch‑fetches full log content from HBase.
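The two‑step lookup can be illustrated with in‑memory stand‑ins. In this sketch a list of dicts plays the role of the Elasticsearch index and a dict plays the role of the HBase table; the field names `row_key` and `ts`, the newest‑first ordering, and the function name `query_logs` are assumptions made for the example.

```python
def query_logs(es_docs, hbase_rows, filters, page, page_size):
    """Step 1: filter and paginate in the search index to get RowKeys.
    Step 2: batch-fetch the full log content for those RowKeys.
    """
    # Search step: keep documents whose fields match every filter.
    matches = [
        doc for doc in es_docs
        if all(doc.get(name) == value for name, value in filters.items())
    ]
    # Newest first, then slice out the requested page (pages start at 1).
    matches.sort(key=lambda doc: doc["ts"], reverse=True)
    start = (page - 1) * page_size
    row_keys = [doc["row_key"] for doc in matches[start:start + page_size]]
    # Storage step: fetch content by RowKey, skipping keys that are missing.
    return [hbase_rows[key] for key in row_keys if key in hbase_rows]


if __name__ == "__main__":
    index = [
        {"row_key": "a", "table": "product", "ts": 1},
        {"row_key": "b", "table": "product", "ts": 3},
        {"row_key": "c", "table": "order", "ts": 2},
    ]
    rows = {"a": "log a", "b": "log b", "c": "log c"}
    print(query_logs(index, rows, {"table": "product"}, page=1, page_size=10))
```

Keeping only searchable fields and the RowKey in the index, and the full content in HBase, means the index stays small while each page of results costs one batched read.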
V3.0 – Business Empowerment
To reduce reliance on developers, a B‑side log query portal was created for suppliers and business users. Logs are transformed into user‑friendly formats, supporting text fields, data association, enumeration mapping, bit‑wise decoding, field combination, external API enrichment, and diff comparison.
Text field logs are displayed directly.
Data‑association logs resolve foreign keys to readable names.
Enumeration logs map codes to descriptive values.
Bit‑storage logs decode bit‑wise encoded numbers.
Field‑combination logs merge related fields for a holistic view.
External‑interface logs call services (e.g., city ID to city name).
Diff‑comparison logs highlight changes between snapshots.
All transformation rules are configurable, allowing rapid onboarding of new log types without code changes.
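A configuration‑driven transformer for three of the log types above (enumeration, bit‑storage, and diff‑comparison) might look like the following sketch. The rule format, the `RULES` table, the field names, and the flag names are all invented for illustration; the article does not describe the real configuration schema.

```python
def decode_bits(value, flag_names):
    """Return the names of the flags whose bit is set (bit 0 = first name)."""
    return [name for i, name in enumerate(flag_names) if (value >> i) & 1]


def apply_rule(rule, value):
    """Turn one raw field value into a readable one according to its rule."""
    kind = rule["type"]
    if kind == "text":
        return value
    if kind == "enum":
        return rule["mapping"].get(value, f"unknown({value})")
    if kind == "bits":
        return decode_bits(value, rule["flags"])
    raise ValueError(f"unsupported rule type: {kind}")


def render(log, rules):
    """Apply per-field rules; fields without a rule are shown as plain text."""
    return {
        name: apply_rule(rules.get(name, {"type": "text"}), value)
        for name, value in log.items()
    }


def diff(before, after):
    """Map each changed field to its (old, new) pair."""
    changed = {}
    for name in sorted(set(before) | set(after)):
        if before.get(name) != after.get(name):
            changed[name] = (before.get(name), after.get(name))
    return changed


# Hypothetical rule configuration: adding a log field means adding an entry here.
RULES = {
    "status": {"type": "enum", "mapping": {1: "Online", 2: "Offline"}},
    "options": {"type": "bits", "flags": ["breakfast", "pickup", "insurance"]},
}
```

With rules kept as data, supporting a new field is a configuration change rather than a code change, which matches the goal stated above.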
Conclusion
The vacation product log platform evolved from a simple DB table to a scalable ES + HBase solution, achieving sub‑500 ms query latency on trillion‑level data. By exposing a B‑side portal and flexible configuration, the system now serves multiple business lines, suppliers, and developers, reducing troubleshooting effort and supporting continued growth.
dbaplus Community