How Ximalaya Scaled Its Gateway to 200B Daily Calls: Architecture & Optimizations
This article details Ximalaya's evolution of its HTTP gateway—from a Tomcat NIO prototype to a fully asynchronous Netty design—covering architectural diagrams, performance bottlenecks, traffic management features, monitoring, GC tuning, and future plans for HTTP/2 and graceful degradation.
Background
A gateway is mature middleware that most major internet companies run to host features shared by all services, so that those features can be updated in one place rather than in every service. Ximalaya’s gateway processes 200 billion daily calls, with peak QPS above 40k, and fronts 500+ web services for over 600 million users.
First Version: Tomcat NIO + AsyncServlet
The initial design used Tomcat with NIO and AsyncServlet. The architecture required a separate Push layer and an HttpNioClient for backend communication. Performance bottlenecks appeared at about 5k QPS, caused by Tomcat’s request caching, memory copying, and blocking body reads.
Tomcat issues
Excessive object pooling, which causes GC pressure under load.
Extra heap memory copies between Tomcat and Netty.
Blocking body reads in Tomcat’s NIO model.
HttpNioClient issues
Lock contention on connection acquire/release.
Second Version: Netty + Full Asynchronous
Switching to Netty eliminated the above problems, delivering a fully asynchronous, lock‑free, layered architecture.
Access Layer
Netty I/O threads handle HTTP encoding/decoding and protocol‑level monitoring. Tuned request and response size limits, together with attack detection, let the gateway answer oversized requests with an immediate 400.
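A minimal sketch of the size check described above, written in plain Java so it runs without Netty on the classpath. The class name RequestSizeGuard and the limit values are assumptions for illustration; the article does not state the actual thresholds.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical access-layer guard; names and limits are illustrative only. */
final class RequestSizeGuard {
    static final int PASS = 0;          // request may continue down the pipeline
    static final int BAD_REQUEST = 400; // reject immediately

    private final int maxUriBytes;
    private final long maxBodyBytes;

    RequestSizeGuard(int maxUriBytes, long maxBodyBytes) {
        this.maxUriBytes = maxUriBytes;
        this.maxBodyBytes = maxBodyBytes;
    }

    /** Returns 400 when the URI or the declared body length exceeds its limit, otherwise PASS. */
    int check(String uri, long declaredContentLength) {
        if (uri.getBytes(StandardCharsets.UTF_8).length > maxUriBytes) {
            return BAD_REQUEST;
        }
        // A negative declared length is treated as malformed.
        if (declaredContentLength < 0 || declaredContentLength > maxBodyBytes) {
            return BAD_REQUEST;
        }
        return PASS;
    }
}
```

Checking the declared length before reading the body is what makes the rejection immediate: the gateway never buffers a payload it is going to refuse.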
Business Logic Layer
Implements public features such as authentication, black/white lists, flow control, intelligent circuit breaking, gray release, unified degradation, traffic scheduling, traffic copy, and request sampling using a filter‑based design.
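The filter‑based design can be sketched roughly as follows. This is a simplified, synchronous illustration: the types GatewayRequest, GatewayFilter, and FilterChain are hypothetical, and a real asynchronous gateway would likely have filters return futures or callbacks instead of status codes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical request context; the fields are illustrative. */
final class GatewayRequest {
    final String clientIp;
    final String token;

    GatewayRequest(String clientIp, String token) {
        this.clientIp = clientIp;
        this.token = token;
    }
}

/** A filter returns an HTTP status to reject the request, or 0 to let it continue. */
interface GatewayFilter {
    int apply(GatewayRequest request);
}

final class FilterChain {
    private final List<GatewayFilter> filters = new ArrayList<>();

    FilterChain add(GatewayFilter filter) {
        filters.add(filter);
        return this;
    }

    /** Runs filters in order and stops at the first rejection; returns 200 if all pass. */
    int run(GatewayRequest request) {
        for (GatewayFilter filter : filters) {
            int status = filter.apply(request);
            if (status != 0) {
                return status;
            }
        }
        return 200;
    }

    /** Example wiring: blacklist check first, then a trivial authentication check. */
    static FilterChain example(Set<String> blacklist) {
        return new FilterChain()
                .add(r -> blacklist.contains(r.clientIp) ? 403 : 0)
                .add(r -> (r.token == null || r.token.isEmpty()) ? 401 : 0);
    }
}
```

Because each feature is an independent filter, adding something like flow control or request sampling means appending one more entry to the chain rather than touching existing logic.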
Service Call Layer
Asynchronous service calls use Netty’s connection pool with lock‑free operations. The timeout timer starts only after the request has been flushed successfully, so that time spent before the write completes is not counted against the backend.
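One way to express "start the timer only after a successful flush" is shown below using the JDK's CompletableFuture and ScheduledExecutorService rather than Netty's own futures. The class FlushAwareTimeout and its parameters are hypothetical stand‑ins: flushed represents the write‑and‑flush future, and response represents the pending backend reply.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class FlushAwareTimeout {
    /**
     * Arms the response timer only once `flushed` completes successfully.
     * A failed write fails the response directly, without starting a timer.
     */
    static <T> CompletableFuture<T> arm(CompletableFuture<Void> flushed,
                                        CompletableFuture<T> response,
                                        ScheduledExecutorService timer,
                                        long timeoutMillis) {
        flushed.whenComplete((ignored, writeError) -> {
            if (writeError != null) {
                response.completeExceptionally(writeError);
                return;
            }
            // The clock starts here, after the bytes have left the gateway.
            ScheduledFuture<?> task = timer.schedule(() -> {
                response.completeExceptionally(new TimeoutException(
                        "no response within " + timeoutMillis + " ms after flush"));
            }, timeoutMillis, TimeUnit.MILLISECONDS);
            // Cancel the timer as soon as the response settles either way.
            response.whenComplete((result, error) -> task.cancel(false));
        });
        return response;
    }
}
```

With this arrangement a request that waits a long time for a connection or a write slot does not consume the backend's response budget.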
Full‑Link Timeout Mechanism
Timeouts cover every stage of a request: protocol parsing, queue waiting, connection establishment, link waiting, pre‑write, write, and response.
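As an illustration only, the stages listed above could be tracked with a small per‑request state object like the one below. The enum constants mirror the stages named in the text; the class StageDeadlines, its methods, and any limit values are assumptions, not the gateway's actual implementation.

```java
import java.util.EnumMap;
import java.util.Map;

/** Stages named in the article, in the order a request passes through them. */
enum Stage { PROTOCOL_PARSE, QUEUE_WAIT, CONNECT, LINK_WAIT, PRE_WRITE, WRITE, RESPONSE }

/** Hypothetical per-request tracker: one limit per stage, checked against a supplied clock. */
final class StageDeadlines {
    private final Map<Stage, Long> limitMillis = new EnumMap<>(Stage.class);
    private Stage current;
    private long enteredAtMillis;

    StageDeadlines limit(Stage stage, long millis) {
        limitMillis.put(stage, millis);
        return this;
    }

    /** Marks the request as having entered a stage at the given time. */
    void enter(Stage stage, long nowMillis) {
        current = stage;
        enteredAtMillis = nowMillis;
    }

    /** Returns the stage that has overrun its limit, or null if the request is still within budget. */
    Stage expired(long nowMillis) {
        if (current == null) {
            return null;
        }
        Long limit = limitMillis.get(current);
        if (limit != null && nowMillis - enteredAtMillis > limit) {
            return current;
        }
        return null;
    }
}
```

Knowing which stage timed out, rather than only that the request did, makes it possible to tell a slow backend apart from a saturated gateway.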
Monitoring & Alarm
Provides per‑second monitoring and alerts for protocol‑level attacks, oversized requests, latency, QPS, bandwidth, response codes, link health, failure rates, and traffic jitter. The data is stored in InfluxDB.
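A rough sketch of per‑second aggregation follows. It counts calls in a one‑second window and renders the window as an InfluxDB line‑protocol point. The measurement name gateway_calls, the tag and field names, and the class SecondWindow are assumptions; the article does not describe the actual schema.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical one-second metrics window; schema names are illustrative. */
final class SecondWindow {
    private final LongAdder requests = new LongAdder();
    private final LongAdder failures = new LongAdder();
    private final LongAdder latencyMicros = new LongAdder();

    /** Called on the hot path; LongAdder keeps contention low across I/O threads. */
    void record(boolean success, long micros) {
        requests.increment();
        if (!success) {
            failures.increment();
        }
        latencyMicros.add(micros);
    }

    /**
     * Drains the counters and formats one line-protocol point:
     * measurement,tag=value field=value,... timestamp(ns).
     * Assumes `service` contains no spaces or commas, which would need escaping.
     * Updates racing with the drain may land in the next window.
     */
    String flush(String service, long epochSeconds) {
        long n = requests.sumThenReset();
        long f = failures.sumThenReset();
        long total = latencyMicros.sumThenReset();
        long avg = n == 0 ? 0 : total / n;
        return "gateway_calls,service=" + service
                + " qps=" + n + "i,failures=" + f + "i,avg_latency_us=" + avg + "i "
                + (epochSeconds * 1_000_000_000L);
    }
}
```

A scheduler would call flush once per second and send the resulting line to InfluxDB; alert rules can then be evaluated on the same per‑second values.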
Performance Optimizations
Object Pooling: Reuse frequently used objects (e.g., thread‑pool tasks, StringBuffer) to reduce allocation and GC pressure.
Context Switching: Asynchronous design reduces thread switches; tuning showed a 20% reduction in CPU context switches when pushing through Netty I/O threads.
GC Tuning: Large young generation, SurvivorRatio = 2, max tenuring age = 15, and careful handling of socket finalizers to prevent old‑generation growth.
    /**
     * Cleans up if the user forgets to close it.
     */
    protected void finalize() throws IOException {
        close();
    }
Logging: Avoid synchronous console appender flushing; use asynchronous logging to prevent I/O thread blockage.
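The object‑pooling item in the list above can be illustrated with a very small pool. This is a sketch under assumptions: SimplePool is a hypothetical class, it is not thread‑safe, and it is meant to be owned by a single thread (for example one instance per I/O thread) so that no locking is needed.

```java
import java.util.ArrayDeque;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical single-threaded object pool; not safe for concurrent use. */
final class SimplePool<T> {
    private final ArrayDeque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final Consumer<T> reset;
    private final int maxIdle;

    SimplePool(Supplier<T> factory, Consumer<T> reset, int maxIdle) {
        this.factory = factory;
        this.reset = reset;
        this.maxIdle = maxIdle;
    }

    /** Reuses an idle instance when one exists, otherwise allocates a new one. */
    T acquire() {
        T pooled = idle.pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    /** Clears the object's state and keeps it for reuse, up to maxIdle instances. */
    void release(T object) {
        reset.accept(object);
        if (idle.size() < maxIdle) {
            idle.addFirst(object);
        }
    }
}
```

A pool of string builders would be created as new SimplePool<>(StringBuilder::new, sb -> sb.setLength(0), 64). The maxIdle cap matters: an unbounded pool can itself become the source of memory pressure, which is the Tomcat problem described earlier.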
Future Plans
Upgrade to HTTP/2 for multiplexed connections, continue improving monitoring accuracy, and enhance graceful degradation mechanisms to ensure site‑wide reliability.
Conclusion
The gateway has become a standard component in large‑scale internet services. The article shares practical insights and ongoing improvements, inviting interested engineers to join the project.
Source: https://www.jianshu.com/p/165b1941cdfa
Java High-Performance Architecture