How Ctrip Built a Real-Time User Data Collection System with Netty and Kafka
This article details Ctrip's design and implementation of a high‑throughput, low‑latency user data collection platform that leverages Java NIO, Netty, and a custom Kafka‑based messaging layer, covering architecture, encryption, compression, disaster‑recovery, performance testing, and downstream analytics products.
Introduction
With the rise of mobile internet, traditional PC‑based log collection can no longer meet the demands for real‑time user behavior analysis, traffic statistics, and location‑based services. Ctrip therefore designed a real‑time, high‑throughput user data collection system to address latency, throughput, and terminal coverage challenges.
Technical Selection and Design
The system is built on Netty, a network framework based on Java NIO, and on Hermes, Ctrip's messaging layer built on the distributed message queue Kafka. Together they provide real‑time processing, high throughput, and broad protocol support.
Netty Network Framework
Netty was chosen after benchmarking it against other Java NIO frameworks (MINA, xSocket) and against Nginx. It offers asynchronous, non‑blocking I/O, multi‑protocol support, rich codec features, and excellent performance. Netty’s architecture has three layers: the Reactor scheduling layer, the Pipeline chain layer, and the business logic layer.
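The pipeline layer can be pictured as an ordered chain of handlers, each transforming a message before passing it on. The sketch below is a plain-Java illustration of that idea only; it does not use Netty's API, and the class name HandlerChain, the method collectorChain, and the three example handlers are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Plain-Java stand-in for the "pipeline chain" idea; this is NOT Netty's ChannelPipeline API.
final class HandlerChain {
    private final List<Function<Object, Object>> handlers = new ArrayList<>();

    // Append a handler; handlers run in the order they were added.
    HandlerChain addLast(Function<Object, Object> handler) {
        handlers.add(handler);
        return this;
    }

    // Pass one inbound message through every handler in turn.
    Object fire(Object inbound) {
        Object current = inbound;
        for (Function<Object, Object> handler : handlers) {
            current = handler.apply(current);
        }
        return current;
    }

    // Hypothetical chain: raw bytes -> text -> trimmed event name -> business result.
    static HandlerChain collectorChain() {
        return new HandlerChain()
                .addLast(msg -> new String((byte[]) msg, StandardCharsets.UTF_8)) // decoder
                .addLast(msg -> ((String) msg).trim())                            // parser
                .addLast(msg -> "accepted:" + msg);                               // business logic
    }
}
```

In Netty itself the Reactor layer (event loops) would read bytes from the socket and hand them to the first handler; here fire() plays that role for a single message.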
Client‑Side Encryption and Compression
To protect sensitive data, three encryption approaches are proposed: embedding keys in compiled native libraries, fetching keys via HTTPS, and using asymmetric encryption to exchange a temporary symmetric key. Compression uses standard GZIP or a custom LZ77 algorithm to reduce bandwidth.
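The third approach, combined with GZIP compression, might look roughly like the following sketch. It assumes the client already holds the server's RSA public key, compresses before encrypting (encrypted bytes do not compress well), and uses AES-GCM for the temporary symmetric key. The class PayloadEnvelope, its method names, and the cipher choices are assumptions made for illustration; the article does not specify them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical envelope: GZIP the payload, encrypt it with a one-off AES key,
// and wrap that key with the server's RSA public key.
final class PayloadEnvelope {
    private static final String RSA = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    private static final String AES = "AES/GCM/NoPadding";

    // What travels over the wire: wrapped session key, nonce, and ciphertext.
    static final class Sealed {
        final byte[] wrappedKey;
        final byte[] iv;
        final byte[] ciphertext;

        Sealed(byte[] wrappedKey, byte[] iv, byte[] ciphertext) {
            this.wrappedKey = wrappedKey;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        }
        return buffer.toByteArray();
    }

    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return buffer.toByteArray();
    }

    // Client side: compress, encrypt with a fresh AES key, wrap the key with RSA.
    static Sealed seal(byte[] raw, PublicKey serverKey)
            throws IOException, GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey sessionKey = generator.generateKey();

        byte[] iv = new byte[12]; // 96-bit nonce, generated fresh for every payload
        new SecureRandom().nextBytes(iv);

        Cipher aes = Cipher.getInstance(AES);
        aes.init(Cipher.ENCRYPT_MODE, sessionKey, new GCMParameterSpec(128, iv));
        byte[] ciphertext = aes.doFinal(gzip(raw));

        Cipher rsa = Cipher.getInstance(RSA);
        rsa.init(Cipher.ENCRYPT_MODE, serverKey);
        return new Sealed(rsa.doFinal(sessionKey.getEncoded()), iv, ciphertext);
    }

    // Server side: unwrap the AES key, decrypt, then decompress.
    static byte[] open(Sealed sealed, PrivateKey serverKey)
            throws IOException, GeneralSecurityException {
        Cipher rsa = Cipher.getInstance(RSA);
        rsa.init(Cipher.DECRYPT_MODE, serverKey);
        SecretKey sessionKey = new SecretKeySpec(rsa.doFinal(sealed.wrappedKey), "AES");

        Cipher aes = Cipher.getInstance(AES);
        aes.init(Cipher.DECRYPT_MODE, sessionKey, new GCMParameterSpec(128, sealed.iv));
        return gunzip(aes.doFinal(sealed.ciphertext));
    }
}
```

Because the symmetric key is generated per payload and only ever sent in wrapped form, no long-lived secret has to be embedded in the client, which is the main appeal of this option over the first two.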
Hermes (Kafka) Storage Solution
Hermes, built on open‑source Kafka, stores the collected data. It supports three storage backends: MySQL for moderate volumes, Kafka for large‑scale streams, and a custom broker file system for extended storage. Kafka offers constant‑time (O(1)) message persistence regardless of how much data is already stored, high throughput, ordering within each partition, and horizontal scalability.
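Ordering is preserved per partition because every message with the same key is routed to the same partition. The sketch below illustrates that mapping with a simple stand-in hash; Kafka's Java client hashes the serialized key with murmur2 by default, so the actual partition numbers would differ. The class KeyPartitioner and the key format are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative key -> partition mapping; the hash is a stand-in, not Kafka's murmur2.
final class KeyPartitioner {
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(hash, numPartitions);
    }
}
```

Keying events by, say, a user or device identifier would keep one user's events in order while still spreading load across partitions; adding partitions changes the mapping, which is why partition counts are usually planned ahead.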
Disaster‑Recovery with Avro
When the network or Hermes fails, data is serialized to Avro files on local disk, partitioned by Kafka topic and rotated hourly. Once the system recovers, a background thread reads the Avro files, writes the data back to the appropriate Kafka partitions, and then deletes the files.
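The spool-and-replay flow can be sketched with the standard library alone. To stay self-contained, this version writes length-prefixed binary records instead of Avro and replaces the Kafka producer with a hypothetical Sink interface; the class LocalSpool, the .spool extension, and the directory layout (one folder per topic, one file per UTC hour) are assumptions for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local fallback store: one directory per topic, one file per hour.
// Length-prefixed records stand in for Avro; Sink stands in for a Kafka producer.
final class LocalSpool {
    interface Sink {
        void send(String topic, byte[] payload) throws IOException;
    }

    private static final DateTimeFormatter HOUR =
            DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);

    private final Path root;

    LocalSpool(Path root) {
        this.root = root;
    }

    // Called when the downstream write fails: append the record to the current hour's file.
    synchronized void append(String topic, byte[] payload, Instant now) throws IOException {
        Path dir = Files.createDirectories(root.resolve(topic));
        Path file = dir.resolve(HOUR.format(now) + ".spool");
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(
                file, StandardOpenOption.CREATE, StandardOpenOption.APPEND))) {
            out.writeInt(payload.length);
            out.write(payload);
        }
    }

    // Called by a background thread after recovery. A file is deleted only after all of
    // its records were sent, so a failure mid-file leads to re-sending (at-least-once).
    synchronized int replay(Sink sink) throws IOException {
        if (!Files.isDirectory(root)) {
            return 0;
        }
        int replayed = 0;
        try (DirectoryStream<Path> topics = Files.newDirectoryStream(root)) {
            for (Path topicDir : topics) {
                if (!Files.isDirectory(topicDir)) {
                    continue;
                }
                List<Path> files = new ArrayList<>();
                try (DirectoryStream<Path> hourly = Files.newDirectoryStream(topicDir, "*.spool")) {
                    for (Path file : hourly) {
                        files.add(file);
                    }
                }
                Collections.sort(files); // hour-stamped names sort oldest first
                for (Path file : files) {
                    replayed += replayFile(topicDir.getFileName().toString(), file, sink);
                    Files.delete(file);
                }
            }
        }
        return replayed;
    }

    private static int replayFile(String topic, Path file, Sink sink) throws IOException {
        int count = 0;
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfFile) {
                    break;
                }
                byte[] payload = new byte[length];
                in.readFully(payload);
                sink.send(topic, payload);
                count++;
            }
        }
        return count;
    }
}
```

Deleting a file only after its last record has been handed to the sink trades possible duplicates for no data loss, so downstream consumers in such a design would need to tolerate repeated events.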
Feasibility and Performance Tests
A benchmark on identical test servers compared how Netty and Nginx handled 5,000 concurrent keep‑alive requests. Both reached comparable request rates (~46k req/s), with Netty showing slightly higher latency. End‑to‑end tests showed that Netty‑based data collection, parsing, and writing to Hermes can process ~30k requests per second while keeping 99% of requests under 800 ms.
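A target such as "99% of requests under 800 ms" is a statement about the 99th-percentile latency. As a minimal sketch of how that check could be computed from recorded latencies, the helper below uses the nearest-rank percentile definition; the class LatencyStats and its method names are hypothetical, and the article does not say which tool or percentile method Ctrip used.

```java
import java.util.Arrays;

// Nearest-rank percentile over a set of latency samples (milliseconds).
final class LatencyStats {
    // Smallest sample such that at least pct percent of all samples are <= it.
    static long percentile(long[] samplesMs, double pct) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (pct <= 0 || pct > 100) {
            throw new IllegalArgumentException("pct must be in (0, 100]");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // True when the given percentile stays strictly under the limit.
    static boolean meetsTarget(long[] samplesMs, double pct, long limitMs) {
        return percentile(samplesMs, pct) < limitMs;
    }
}
```

With this definition, a single very slow request among 100 samples does not break a p99 target, which is why percentile targets are usually read alongside the maximum latency.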
Related Data Analysis Products
Based on the collected data, Ctrip offers several analytics products, including single‑user browsing tracking, page conversion rates, user flow analysis, click heatmaps, data validation tools, and system performance dashboards, all supporting multi‑platform SDKs (iOS, Android, Web, Hybrid, RN, Mini‑Programs).
21CTO
