How Ctrip Built a Real-Time User Data Collection System with Netty and Kafka

This article details Ctrip's design and implementation of a high‑throughput, low‑latency user data collection platform that leverages Java NIO, Netty, and a custom Kafka‑based messaging layer, covering architecture, encryption, compression, disaster‑recovery, performance testing, and downstream analytics products.

21CTO

Introduction

With the rise of mobile internet, traditional PC‑based log collection can no longer meet the demands for real‑time user behavior analysis, traffic statistics, and location‑based services. Ctrip therefore designed a real‑time, high‑throughput user data collection system to address latency, throughput, and terminal coverage challenges.

Technical Selection and Design

The system is built on Netty, a network framework based on Java NIO, and on Hermes, Ctrip's messaging layer built on the distributed message queue Kafka. Together they provide real‑time processing, high throughput, and broad protocol support.

Netty Network Framework

Netty was chosen after benchmarking against alternative Java NIO frameworks (MINA, xSocket) and against Nginx. It offers asynchronous, non‑blocking I/O, multi‑protocol support, rich codec features, and excellent performance. Netty’s architecture has three layers: the Reactor layer, which schedules I/O events; the Pipeline layer, which passes each message through a chain of handlers; and the business logic layer.
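The Pipeline layer can be pictured as an ordered chain of handlers, each transforming an inbound message before handing it to the next. The toy sketch below is plain Python rather than Netty's API; the `Pipeline` class and the handler names (`decode_utf8`, `parse_json`, `tag_received`) are hypothetical and only illustrate the chaining idea.

```python
import json
from typing import Any, Callable, List

Handler = Callable[[Any], Any]


class Pipeline:
    """Ordered chain of handlers; each receives the previous handler's output."""

    def __init__(self) -> None:
        self._handlers: List[Handler] = []

    def add_last(self, handler: Handler) -> "Pipeline":
        self._handlers.append(handler)
        return self  # returning self allows chained add_last calls

    def fire(self, message: Any) -> Any:
        for handler in self._handlers:
            message = handler(message)
        return message


# Hypothetical handlers for a collected-event payload.
def decode_utf8(raw: bytes) -> str:
    return raw.decode("utf-8")


def parse_json(text: str) -> dict:
    return json.loads(text)


def tag_received(event: dict) -> dict:
    # Business-logic step: mark the event as accepted by the collector.
    return {**event, "received": True}


pipeline = Pipeline().add_last(decode_utf8).add_last(parse_json).add_last(tag_received)
result = pipeline.fire(b'{"page": "home", "uid": 42}')
```

Keeping decoding, parsing, and business logic in separate handlers means a step such as decompression or decryption could be slotted into the chain without touching the others.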

Client‑Side Encryption and Compression

To protect sensitive data, three encryption approaches are proposed: embedding keys in compiled native libraries, fetching keys via HTTPS, and using asymmetric encryption to exchange a temporary symmetric key. Compression uses standard GZIP or a custom LZ77 algorithm to reduce bandwidth.
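As a rough illustration of the GZIP option, the sketch below compresses a batch of events before upload and restores it on the receiving side. It uses Python's standard `gzip` module; the event fields (`uid`, `page`, `ts`) and the JSON encoding are assumptions made for the example, not the payload format Ctrip uses.

```python
import gzip
import json

# Hypothetical batch of client events; field names are illustrative only.
events = [{"uid": 42, "page": "hotel_list", "ts": 1700000000 + i} for i in range(200)]

raw = json.dumps(events).encode("utf-8")

# Client side: compress the serialized batch before sending it.
compressed = gzip.compress(raw)

# Server side: decompress and parse the batch back into events.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))

# Fraction of the original size that actually goes over the network.
ratio = len(compressed) / len(raw)
```

Batches of similar events tend to compress well because field names and common values repeat, which is where the bandwidth saving comes from.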

Hermes (Kafka) Storage Solution

Hermes, based on open‑source Kafka, stores the collected data. It supports three storage backends: MySQL for moderate volumes, Kafka for large‑scale streams, and a custom broker file system for extended storage. Kafka provides constant‑time (O(1)) message persistence, high throughput, message ordering within each partition, and horizontal scalability.
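Ordering within a partition is useful because messages that share a key can be routed to the same partition. The sketch below shows the idea with a hypothetical `choose_partition` helper; CRC32 is a stand‑in hash chosen for the example and is not the hash function Kafka's clients use.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition; the same key always maps to the same partition."""
    # CRC32 is a stand-in hash for illustration, not Kafka's partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Events for one user all land in one partition, so their relative order is kept.
user_events = [("user-42", "open_app"), ("user-42", "search"), ("user-42", "book")]
partitions = {choose_partition(uid, 8) for uid, _ in user_events}
```

Keying by user ID in this way would keep one user's events in sequence while still spreading different users across partitions.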

Disaster‑Recovery with Avro

When the network or Hermes fails, data is serialized to Avro files on local disk, partitioned by Kafka topic and rotated hourly. Once the system recovers, a background thread reads the Avro files, writes the data back to the appropriate Kafka partitions, and then deletes the files.
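The spool‑and‑replay flow can be sketched as follows. To stay within the standard library, this example writes JSON lines instead of Avro; the `LocalSpool` class, the file naming scheme, and the `send` callback are all hypothetical stand‑ins for the real components.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable, Dict


class LocalSpool:
    """Buffers events on disk while the message broker is unreachable."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _path(self, topic: str, ts: float) -> Path:
        # One file per topic per hour: <root>/<topic>/<YYYYMMDDHH>.jsonl
        hour = time.strftime("%Y%m%d%H", time.gmtime(ts))
        return self.root / topic / f"{hour}.jsonl"

    def write(self, topic: str, event: Dict, ts: float) -> None:
        path = self._path(topic, ts)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(event) + "\n")

    def replay(self, send: Callable[[str, Dict], None]) -> int:
        """Re-send every spooled event, deleting each file once it is drained."""
        count = 0
        for path in sorted(self.root.glob("*/*.jsonl")):
            topic = path.parent.name
            with path.open(encoding="utf-8") as fh:
                for line in fh:
                    send(topic, json.loads(line))
                    count += 1
            path.unlink()  # delete only after every record was handed to send()
        return count


# Usage: spool three events during an outage, then replay them on recovery.
spool_dir = Path(tempfile.mkdtemp())
spool = LocalSpool(spool_dir)
spool.write("pageview", {"uid": 1}, ts=0)
spool.write("pageview", {"uid": 2}, ts=3600)  # one hour later: a new file
spool.write("click", {"uid": 3}, ts=0)

sent = []
replayed = spool.replay(lambda topic, event: sent.append((topic, event)))
```

In this sketch a file is deleted only after all of its records have been passed to `send`, so a failure partway through leaves the file in place and a retry may re‑send some records. Whether duplicates are acceptable depends on the downstream consumers.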

Feasibility and Performance Tests

Benchmarks on identical test servers compared how Netty and Nginx handled 5,000 concurrent keep‑alive requests. Both sustained comparable request rates (about 46k requests per second), with Netty showing slightly higher latency. In end‑to‑end tests, the Netty‑based path (collecting data, parsing it, and writing it to Hermes) processed about 30k requests per second while keeping 99% of requests under 800 ms.
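A target such as "99% of requests under 800 ms" is a statement about the 99th‑percentile latency. The sketch below shows one common way to check it, the nearest‑rank method; the `percentile` helper is hypothetical and the latency samples are synthetic, not figures from Ctrip's tests.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic latencies in milliseconds, for illustration only (not measured data).
latencies_ms = [i * 0.5 for i in range(1, 1001)]

p99 = percentile(latencies_ms, 99)
meets_target = p99 < 800  # the 800 ms threshold from the end-to-end test
```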

Related Data Analysis Products

Based on the collected data, Ctrip offers several analytics products, including single‑user browsing tracking, page conversion rates, user flow analysis, click heatmaps, data validation tools, and system performance dashboards. All of them are fed by SDKs for multiple platforms (iOS, Android, Web, Hybrid, RN, Mini‑Programs).
