Design and Implementation of Ctrip Real‑Time User Data Collection System
This article describes the design, technology selection, and performance evaluation of Ctrip's real‑time user behavior data collection platform, covering Netty‑based network handling, Kafka/Hermes messaging, encryption, compression, Avro backup, and related analytics products, with detailed feasibility analysis and benchmark results.
The author, Wang Xiaobo, a senior engineer at Ctrip's framework R&D department, presents the design and implementation of a real‑time user data collection system that addresses the limitations of traditional PC‑based logging in the mobile‑first era.
The system is built on a Java stack, using Netty (a high‑performance NIO framework) for network communication and Hermes—a Ctrip‑customized Kafka‑based distributed message queue—for durable storage. It consists of five main components: a client SDK that sends data via HTTP/TCP/UDP, the Mechanic (UBT‑Collector) server, Hermes/Kafka for asynchronous processing, HBase for monitoring data, and a Dashboard for real‑time visualization.
The team chose Netty over the alternatives (MINA, xSocket, Nginx) after performance testing, citing its rich protocol support, asynchronous I/O, and extensibility. The article explains Netty’s three‑layer architecture (Reactor, Pipeline, Business Logic), its threading models (single‑threaded Reactor, multi‑threaded Reactor, and master‑slave Reactor), and its serialization options (Protobuf, Avro, Thrift).
For data security, three encryption strategies are discussed: embedding keys in compiled native libraries, retrieving keys via HTTPS from a server, and using asymmetric encryption to exchange a temporary symmetric key. Compression is handled with GZIP or a custom LZ77 algorithm.
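The third strategy and the GZIP option can be sketched together using only the JDK's built‑in crypto and compression classes. This is a minimal sketch, not Ctrip's implementation: the class name PayloadCodecSketch, the use of RSA‑OAEP for the key exchange and AES‑GCM for the payload, the key sizes, and the compress‑then‑encrypt order are all assumptions made for illustration, since the article does not name specific algorithms.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: GZIP + AES-GCM payload, RSA-OAEP wrapped session key (all assumed choices).
class PayloadCodecSketch {
    static final int IV_BYTES = 12;   // recommended IV length for GCM
    static final int TAG_BITS = 128;  // GCM authentication tag length
    static final String RSA_MODE = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    static byte[] gzip(byte[] raw) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    static byte[] gunzip(byte[] packed) throws Exception {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            return gz.readAllBytes();
        }
    }

    // Client side: encrypt the temporary symmetric key with the server's public key.
    static byte[] wrapKey(SecretKey sessionKey, PublicKey serverKey) throws Exception {
        Cipher rsa = Cipher.getInstance(RSA_MODE);
        rsa.init(Cipher.ENCRYPT_MODE, serverKey);
        return rsa.doFinal(sessionKey.getEncoded());
    }

    // Server side: recover the temporary symmetric key with the private key.
    static SecretKey unwrapKey(byte[] wrapped, PrivateKey serverKey) throws Exception {
        Cipher rsa = Cipher.getInstance(RSA_MODE);
        rsa.init(Cipher.DECRYPT_MODE, serverKey);
        return new SecretKeySpec(rsa.doFinal(wrapped), "AES");
    }

    // Compress, then encrypt; the random IV is prepended to the ciphertext.
    static byte[] seal(byte[] raw, SecretKey key) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = aes.doFinal(gzip(raw));
        byte[] out = new byte[IV_BYTES + ct.length];
        System.arraycopy(iv, 0, out, 0, IV_BYTES);
        System.arraycopy(ct, 0, out, IV_BYTES, ct.length);
        return out;
    }

    // Decrypt (which also verifies the GCM tag), then decompress.
    static byte[] open(byte[] sealed, SecretKey key) throws Exception {
        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, sealed, 0, IV_BYTES));
        return gunzip(aes.doFinal(sealed, IV_BYTES, sealed.length - IV_BYTES));
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair server = gen.generateKeyPair();   // stands in for the collector's key pair

        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey session = kg.generateKey();     // temporary symmetric key created by the client

        byte[] event = "{\"page\":\"home\",\"action\":\"click\"}".getBytes(StandardCharsets.UTF_8);
        byte[] wrapped = wrapKey(session, server.getPublic());
        byte[] sealed = seal(event, session);

        SecretKey recovered = unwrapKey(wrapped, server.getPrivate());
        System.out.println(Arrays.equals(event, open(sealed, recovered)));
    }
}
```

Compressing before encrypting is the order assumed here because well‑encrypted data is effectively incompressible; a production design would also need to weigh key rotation and replay protection, which this sketch leaves out.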
Hermes, built on Kafka, provides high‑throughput, low‑latency messaging and supports three storage back ends: MySQL, Kafka, and a custom broker. To preserve per‑user ordering, the design routes all of a user’s events to the same Kafka partition.
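The per‑user routing idea can be illustrated with a small hash‑based partition function. This is a sketch under stated assumptions: the class UserPartitionerSketch, the use of the user ID as the key, and the CRC32 hash are all hypothetical. Kafka's own default partitioner applies a different hash to the record key, so CRC32 here is only a stand‑in for "a stable hash of the key, modulo the partition count".

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration of key-based routing; not Hermes or Kafka code.
class UserPartitionerSketch {

    // Maps a user ID to a partition index in [0, numPartitions).
    // The same user ID always yields the same index for a fixed partition count.
    static int partitionFor(String userId, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        CRC32 crc = new CRC32();
        crc.update(userId.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is non-negative too.
        return (int) (crc.getValue() % numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 8; // assumed partition count for the demo
        for (String user : new String[] {"user-1", "user-2", "user-1"}) {
            System.out.println(user + " -> partition " + partitionFor(user, partitions));
        }
    }
}
```

Because ordering is only guaranteed within a partition, this mapping keeps one user's events in order relative to each other; it stays stable only as long as the partition count does not change.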
Disaster recovery relies on Avro container files: when Hermes/Kafka is unavailable, events are written to local files, and a background thread later replays those files into the message queue, deleting each file once its replay succeeds.
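The backup‑and‑replay flow can be sketched with plain JDK file APIs. To stay self‑contained, this sketch substitutes a simple length‑prefixed binary file for the Avro container format the article describes; the class LocalSpoolSketch, the file naming, and the send callback (standing in for a Hermes/Kafka producer) are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical local spool: length-prefixed records stand in for Avro container files.
class LocalSpoolSketch {
    private final Path dir;

    LocalSpoolSketch(Path dir) throws IOException {
        this.dir = Files.createDirectories(dir);
    }

    // Called when the queue is unreachable: persist one batch as one backup file.
    Path backup(List<byte[]> batch) throws IOException {
        Path tmp = Files.createTempFile(dir, "batch-", ".tmp");
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(tmp)))) {
            out.writeInt(batch.size());
            for (byte[] record : batch) {
                out.writeInt(record.length);
                out.write(record);
            }
        }
        // Rename only after the file is complete, so replay never sees a partial file.
        String name = tmp.getFileName().toString().replace(".tmp", ".spool");
        return Files.move(tmp, tmp.resolveSibling(name), StandardCopyOption.ATOMIC_MOVE);
    }

    // Run periodically by a background thread: resend each file, delete it on success.
    // Returns the number of files fully replayed and deleted.
    int replay(Predicate<byte[]> send) throws IOException {
        int replayed = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.spool")) {
            for (Path file : files) {
                if (sendAll(file, send)) {
                    Files.delete(file);
                    replayed++;
                }
            }
        }
        return replayed;
    }

    // Sends every record in the file; stops and reports failure on the first rejected record.
    private static boolean sendAll(Path file, Predicate<byte[]> send) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                byte[] record = new byte[in.readInt()];
                in.readFully(record);
                if (!send.test(record)) {
                    return false; // keep the file for the next attempt
                }
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        LocalSpoolSketch spool = new LocalSpoolSketch(Files.createTempDirectory("ubt-spool"));
        spool.backup(List.of("event-1".getBytes(StandardCharsets.UTF_8),
                             "event-2".getBytes(StandardCharsets.UTF_8)));
        int files = spool.replay(record -> true); // pretend the queue is healthy again
        System.out.println("replayed files: " + files);
    }
}
```

A file that fails partway through is kept and resent in full on the next pass, so this sketch gives at‑least‑once delivery and downstream consumers would need to tolerate duplicates.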
The feasibility analysis includes benchmarks comparing Netty and Nginx (both handled more than 46k requests/sec at a concurrency level of 5k) and end‑to‑end latency measurements for data parsing and storage. The results indicate that the system can process roughly 30k requests/sec with 99% of requests completing in under 800 ms.
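For readers reproducing this kind of measurement, the two reported figures (throughput and the latency bound met by 99% of requests) can be derived from raw samples as follows. This is a minimal sketch assuming the nearest‑rank percentile definition; the class LatencyReportSketch and the sample values in main are made up for illustration and are not the article's benchmark data.

```java
import java.util.Arrays;

// Hypothetical helper for summarizing a load test; sample data below is synthetic.
class LatencyReportSketch {

    // Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
    static long percentile(long[] latenciesMs, int p) {
        if (latenciesMs.length == 0 || p < 1 || p > 100) {
            throw new IllegalArgumentException("need at least one sample and 1 <= p <= 100");
        }
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        // Integer form of ceil(p / 100 * n), avoiding floating-point rounding.
        int rank = (int) (((long) p * sorted.length + 99) / 100);
        return sorted[rank - 1];
    }

    // Average throughput over the measured window.
    static double requestsPerSecond(long requests, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return requests * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        long[] samples = {12, 15, 20, 35, 40, 44, 58, 90, 120, 760}; // synthetic latencies in ms
        System.out.println("p99 latency (ms): " + percentile(samples, 99));
        System.out.println("throughput (req/s): " + requestsPerSecond(samples.length, 2));
    }
}
```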
The article also outlines related analytics products built on the collected data, such as single‑user browsing tracking, page conversion rates, user flow analysis, click heatmaps, data validation tools, and system performance reports, all supporting multi‑platform SDKs (iOS, Android, Web, Hybrid, RN, Mini‑Programs).
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
