How Ctrip Scaled Its Vacation Product Log System to Billions of Records

This article traces the evolution of Ctrip's vacation product log platform from a single-table database design to a platform built on Elasticsearch (ES) and HBase. It covers the challenges of massive data volume, the RowKey design, the write and query flows, and how the platform was later opened up to business users and suppliers.

dbaplus Community

Development Trajectory

The vacation product system at Ctrip has extremely complex product structures. It generates over 6 billion data change records per day across thousands of tables and has accumulated more than 1.7 trillion log entries.

Evolution Process

V1.0 – DB Single‑Table Storage

Before 2019, logs were stored in a single MySQL table with two columns, id and LogContent. The log content was unstructured text and was queried with LIKE statements. The table grew to over 1 billion rows (~370 GB), and the design suffered from poor query performance, low readability, tight coupling with business code, and limited extensibility.
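To make the performance problem concrete, here is a minimal sketch of the V1.0 query pattern. It uses an in-memory SQLite table as a stand-in for the MySQL table (an assumption made so the example runs anywhere); the sample rows and the helper names `find_logs` and `query_plan` are hypothetical. A LIKE pattern with a leading wildcard cannot be served by an ordinary B-tree index, so the engine has to examine every row.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL table described above (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, LogContent TEXT)")
conn.executemany(
    "INSERT INTO log (LogContent) VALUES (?)",
    [
        ("product 1001 price changed from 100 to 120",),
        ("product 1002 name changed",),
        ("product 1001 stock changed from 5 to 3",),
    ],
)


def find_logs(keyword: str) -> list:
    """Return log lines containing keyword, using a leading-wildcard LIKE."""
    rows = conn.execute(
        "SELECT LogContent FROM log WHERE LogContent LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]


def query_plan(keyword: str) -> str:
    """Return SQLite's query plan text for the same LIKE filter."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT LogContent FROM log WHERE LogContent LIKE ?",
        (f"%{keyword}%",),
    )
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[3] for row in rows)
```

Calling `query_plan("1001")` reports a full table scan in SQLite. At three rows that is harmless; at a billion rows it means every lookup reads the whole table.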

V2.0 – Platformization

To address the limitations of V1.0, the team built a unified log platform. After evaluating ES + HBase, MongoDB, and ClickHouse, they chose ES + HBase for its ability to handle petabyte-scale data with strong search capabilities.

The architecture uses HBase for durable storage and Elasticsearch for fast search. Logs are ingested via an API, placed onto a message queue (MQ), and processed asynchronously, decoupling write latency from business operations.

Overall architecture diagram
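The decoupling described above can be sketched in a few lines. This is an in-process illustration only: a `queue.Queue` stands in for the message queue, plain Python containers stand in for HBase and Elasticsearch, and the RowKey is a simplified placeholder. All names (`log_api`, `consumer`, `run_demo`) are hypothetical and not taken from Ctrip's code.

```python
import queue
import threading

mq = queue.Queue()   # stands in for the message queue
hbase = {}           # rowkey -> full log record (durable store stand-in)
es_index = []        # searchable documents that point back to HBase


def log_api(record: dict) -> None:
    """Called on the business path: enqueue the record and return at once."""
    mq.put(record)


def consumer() -> None:
    """Runs off the business path and performs the slower storage writes."""
    while True:
        record = mq.get()
        try:
            if record is None:  # sentinel: stop the worker
                return
            # Placeholder key; the real RowKey layout is more elaborate.
            rowkey = f"{record['table_id']}:{record['pk']}:{record['ts']}"
            hbase[rowkey] = record
            es_index.append(
                {"rowkey": rowkey, "table_id": record["table_id"], "pk": record["pk"]}
            )
        finally:
            mq.task_done()


def run_demo() -> int:
    """Send two records through the pipeline and return how many were stored."""
    worker = threading.Thread(target=consumer, daemon=True)
    worker.start()
    log_api({"table_id": 12, "pk": "1001", "ts": 1700000000000, "content": "price 100 -> 120"})
    log_api({"table_id": 12, "pk": "1002", "ts": 1700000000500, "content": "name changed"})
    mq.put(None)   # ask the worker to stop once the queue is drained
    worker.join()
    return len(hbase)
```

The point of the design is that `log_api` only pays the cost of an enqueue, so a slow or unavailable store delays log visibility rather than the business request.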

RowKey Design

The RowKey is built from five components: an MD5-derived prefix of the primary key (PK), a zero-padded tableId, the PK with a random suffix, a padded log type, and a timestamp. The design aims for uniqueness, even hash distribution, ordering, compactness, and readability.

RowKey composition
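As a rough illustration, the five components could be assembled as follows. The source names the components but not their widths, separator, or padding characters, so every one of those choices here (a 4-character MD5 prefix, a 6-digit tableId, a 4-character hex suffix, `_` padding, `|` as separator, millisecond timestamps) is an assumption, as is the function name `build_rowkey`.

```python
import hashlib
import secrets
import time
from typing import Optional


def build_rowkey(
    table_id: int,
    pk: str,
    log_type: str,
    ts_ms: Optional[int] = None,
    suffix: Optional[str] = None,
) -> str:
    """Assemble a RowKey from the five components named in the article.

    All widths, the separator, and the padding characters are illustrative
    assumptions, not Ctrip's actual layout.
    """
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)
    if suffix is None:
        suffix = secrets.token_hex(2)  # 4 random hex characters

    # 1. Short MD5-derived prefix of the PK spreads keys across regions.
    prefix = hashlib.md5(pk.encode("utf-8")).hexdigest()[:4]
    # 2. Zero-padded table id keeps this component fixed-width.
    table_part = f"{table_id:06d}"
    # 3. PK plus a random suffix guards against collisions.
    pk_part = f"{pk}_{suffix}"
    # 4. Log type padded to a fixed width.
    type_part = log_type.ljust(6, "_")
    # 5. Fixed-width timestamp in milliseconds.
    ts_part = f"{ts_ms:013d}"

    return "|".join([prefix, table_part, pk_part, type_part, ts_part])
```

Because the prefix is derived from the PK rather than chosen at random, all records for the same entity share the same leading bytes, while different entities are scattered across the key space.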

Write Flow

Clients call the log API; the service pushes the request to MQ. Consumers generate the RowKey, write the raw log to HBase, then index the document in Elasticsearch. Failure in either store triggers a fallback to a Redis compensation cluster for retry.

Write process diagram
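The consumer side of this flow, including the compensation path, might look like the sketch below. The stores are dict-backed fakes that can be told to fail, and a `deque` stands in for the Redis compensation cluster; these stand-ins and the names `FakeStore`, `write_log`, and `retry_compensation` are assumptions made for illustration.

```python
from collections import deque


class StoreError(Exception):
    """Raised by a store stand-in to simulate a failed write."""


class FakeStore:
    """Dict-backed stand-in for HBase or Elasticsearch (assumption)."""

    def __init__(self, fail_times: int = 0):
        self.data = {}
        self.fail_times = fail_times  # number of initial writes that fail

    def put(self, key: str, value: dict) -> None:
        if self.fail_times > 0:
            self.fail_times -= 1
            raise StoreError("simulated write failure")
        self.data[key] = value


hbase = FakeStore()
es = FakeStore(fail_times=1)  # the first index attempt fails
compensation = deque()        # stands in for the Redis compensation cluster


def write_log(rowkey: str, record: dict) -> bool:
    """Write to HBase, then index in ES; park the record for retry on failure."""
    try:
        hbase.put(rowkey, record)
        es.put(rowkey, {"rowkey": rowkey, **record["index_fields"]})
        return True
    except StoreError:
        compensation.append((rowkey, record))
        return False


def retry_compensation() -> int:
    """Replay each parked record once; records that fail again stay queued."""
    succeeded = 0
    for _ in range(len(compensation)):
        rowkey, record = compensation.popleft()
        if write_log(rowkey, record):
            succeeded += 1
    return succeeded
```

Replaying the whole write is safe in this sketch because both stores are keyed by RowKey, so a second write of the same record overwrites the first instead of duplicating it.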

Query Flow

Clients invoke the query API; the service translates parameters into an Elasticsearch paginated request, retrieves matching RowKeys, and batch‑fetches full log content from HBase.

Query process diagram
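The two-step lookup can be sketched with in-memory stand-ins: a list of small index documents plays the role of Elasticsearch, and a dict keyed by RowKey plays the role of HBase. The sample data, field names, and function names below are hypothetical.

```python
# Index documents are small and searchable; full records live in the KV store.
es_docs = [
    {"rowkey": "rk1", "table_id": 12, "pk": "1001", "ts": 100},
    {"rowkey": "rk2", "table_id": 12, "pk": "1001", "ts": 200},
    {"rowkey": "rk3", "table_id": 12, "pk": "1001", "ts": 300},
    {"rowkey": "rk4", "table_id": 12, "pk": "1002", "ts": 150},
]
hbase_rows = {
    "rk1": {"content": "price 100 -> 120"},
    "rk2": {"content": "stock 5 -> 3"},
    "rk3": {"content": "name changed"},
    "rk4": {"content": "created"},
}


def es_search(filters: dict, page: int, size: int) -> list:
    """Return one page of matching RowKeys, newest first."""
    hits = [d for d in es_docs if all(d.get(k) == v for k, v in filters.items())]
    hits.sort(key=lambda d: d["ts"], reverse=True)
    start = (page - 1) * size
    return [d["rowkey"] for d in hits[start:start + size]]


def hbase_multi_get(rowkeys: list) -> list:
    """Fetch full records for the given RowKeys, preserving their order."""
    return [hbase_rows[k] for k in rowkeys if k in hbase_rows]


def query_logs(filters: dict, page: int = 1, size: int = 10) -> list:
    """Step 1: search the index for RowKeys. Step 2: batch-fetch from the store."""
    rowkeys = es_search(filters, page, size)
    return hbase_multi_get(rowkeys)
```

Keeping only searchable fields and the RowKey in the index keeps it small, while the bulky log bodies are read by key. With a real Elasticsearch cluster, offset-based paging is bounded by the index's result window (10,000 hits by default), so deep pagination would likely need `search_after` or a similar cursor.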

V3.0 – Business Empowerment

To reduce reliance on developers, a B‑side log query portal was created for suppliers and business users. Logs are transformed into user‑friendly formats, supporting text fields, data association, enumeration mapping, bit‑wise decoding, field combination, external API enrichment, and diff comparison.

Text field logs are displayed directly.

Data‑association logs resolve foreign keys to readable names.

Enumeration logs map codes to descriptive values.

Bit‑storage logs decode bit‑wise encoded numbers.

Field‑combination logs merge related fields for a holistic view.

External‑interface logs call services (e.g., city ID to city name).

Diff‑comparison logs highlight changes between snapshots.
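Of the log kinds listed above, diff comparison is the most algorithmic. A minimal field-level version, assuming each snapshot is a flat dict (the source does not describe the snapshot format, and `diff_snapshots` is a hypothetical name), could be:

```python
def diff_snapshots(before: dict, after: dict) -> list:
    """Return the fields whose values differ between two flat snapshots.

    A field missing from one side is reported with None on that side, so this
    sketch cannot tell "absent" apart from an explicit None value.
    """
    changes = []
    for field in sorted(set(before) | set(after)):
        old, new = before.get(field), after.get(field)
        if old != new:
            changes.append({"field": field, "before": old, "after": new})
    return changes
```

For example, comparing `{"name": "Tour A", "price": 100}` with `{"name": "Tour A", "price": 120}` yields a single entry for `price`, which a portal could then render as a highlighted change.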

All transformation rules are configurable, allowing rapid onboarding of new log types without code changes.
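One way to picture rule-driven rendering is a table of per-field rules that a generic renderer interprets. The rule format, field names, and sample values below are invented for illustration, and a local dict stands in for the external lookup service; the article does not show the platform's actual configuration schema.

```python
# Hypothetical rule table: onboarding a new field means adding an entry here.
RULES = {
    "status": {"kind": "enum", "values": {0: "Offline", 1: "Online"}},
    "flags": {"kind": "bits", "names": ["Refundable", "Instant confirm", "Child friendly"]},
    "city_id": {"kind": "lookup", "source": "cities"},
}

# Stand-in for foreign-key tables or an external service (assumption).
LOOKUPS = {"cities": {2: "Shanghai", 43: "Sanya"}}


def render_field(name: str, value) -> str:
    """Turn one raw field value into display text according to its rule."""
    rule = RULES.get(name)
    if rule is None:
        return str(value)  # plain text field: shown as-is
    if rule["kind"] == "enum":
        return rule["values"].get(value, f"unknown({value})")
    if rule["kind"] == "bits":
        # Bit i of the stored number switches on the i-th named option.
        enabled = [n for i, n in enumerate(rule["names"]) if (value >> i) & 1]
        return ", ".join(enabled) or "none"
    if rule["kind"] == "lookup":
        return LOOKUPS[rule["source"]].get(value, f"unknown({value})")
    raise ValueError(f"unsupported rule kind: {rule['kind']}")


def render_log(record: dict) -> dict:
    """Render every field of a raw log record for display."""
    return {name: render_field(name, value) for name, value in record.items()}
```

In this arrangement the renderer stays fixed while the rule table grows, which is the property the configurable design is after.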

Conclusion

The vacation product log platform evolved from a simple DB table to a scalable ES + HBase solution, achieving sub‑500 ms query latency on trillion‑level data. By exposing a B‑side portal and flexible configuration, the system now serves multiple business lines, suppliers, and developers, reducing troubleshooting effort and supporting continued growth.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
