How Taobao Boosted Performance with HTTP3/QUIC and XQUIC: A Deep Dive
Taobao’s network team details the evolution from the proprietary Slight SSL protocol to HTTP3/QUIC via the XQUIC library. The article covers TNET capabilities, deployment challenges, and performance gains across shopping, transaction, upload, and video scenarios, along with the optimizations (UDP probing, a higher 0‑RTT ratio, and packet handling) that delivered measurable latency reductions and higher success rates.
Introduction
The diagram below shows key milestones in the evolution of Taobao’s network protocol. In 2015, to address the slow TLS 1.2 handshake, the team developed a lightweight private encryption protocol called Slight SSL, which enabled 0‑RTT by combining session negotiation and data encryption in a single TCP packet. However, issues such as Wi‑Fi link failures, requests for TLS 1.3 support, and domain‑side deployment constraints highlighted the need to evolve toward HTTP3/QUIC. The main limitations were:
Private protocols require end‑to‑end deployment support (intrusive).
Lack of TLS 1.3 support.
Network middleboxes occasionally drop connections that use the private protocol.
TNET Capability Evolution
TNET (TAOBAO NET) is a foundational network capability library that now carries over 90% of Taobao’s HTTPS traffic. It provides composable protocol stacks (SPDY, HTTP2, HTTP3, Custom, Tunnel, etc.) and network diagnostic tools (DNS, traceroute, MTU detection, ICMP ping, IPv4/IPv6 probing). The library abstracts protocol implementations behind a unified interface, allowing callers to select protocol combinations during connection establishment.
Supports HTTP requests, private protocol channels, and ACCS messaging; standard TLS is used only for overseas scenarios where Slight SSL cannot be deployed.
Provides self‑implemented DNS resolution, traceroute, MTU probing, ICMP ping, and IPv4/IPv6 stack probing for network diagnostics.
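The unified interface described above can be pictured as a table of function pointers that each protocol stack fills in. The following is a minimal sketch under that assumption; `tnet_proto_ops`, `tnet_select_stack`, and the stub functions are hypothetical names chosen for illustration and are not TNET's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operations table: every protocol stack provides one. */
typedef struct {
    const char *name;                                    /* e.g. "http3" */
    int (*connect)(const char *host, int port);
    int (*send)(int conn, const void *buf, size_t len);
} tnet_proto_ops;

/* Stubs standing in for real protocol implementations. */
int stub_connect(const char *host, int port) { (void)host; (void)port; return 1; }
int stub_send(int conn, const void *buf, size_t len) { (void)conn; (void)buf; return (int)len; }

static const tnet_proto_ops k_stacks[] = {
    { "http3", stub_connect, stub_send },
    { "http2", stub_connect, stub_send },
    { "spdy",  stub_connect, stub_send },
};

/* Look up a stack by name; callers never touch protocol internals. */
const tnet_proto_ops *tnet_select_stack(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(k_stacks) / sizeof(k_stacks[0]); i++) {
        if (strcmp(k_stacks[i].name, name) == 0) return &k_stacks[i];
    }
    return NULL;   /* unknown protocol */
}
```

A caller would pick a stack once at connection setup and then use only the `connect` and `send` pointers, so swapping HTTP2 for HTTP3 does not change calling code.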
HTTP3/QUIC Protocol Upgrade and Performance
Endpoint Upgrade Plan
XQUIC is Taobao’s in‑house IETF‑compliant QUIC library. By integrating XQUIC, TNET now supports the full HTTP3 stack while shielding upper layers from protocol differences. Clients first fetch a policy from AMDC, which may include both HTTP3 and HTTP2 entries. After UDP connectivity probing, an HTTP3 long‑lived connection is established only if the network permits; otherwise the client stays on HTTP2.
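The selection step can be sketched as follows. This is an illustrative sketch only: the `policy_entry` layout and `choose_entry` are assumptions, since the article does not describe the format of the AMDC policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { PROTO_HTTP2, PROTO_HTTP3 } proto_t;

/* Hypothetical shape of one entry in the policy returned by AMDC. */
typedef struct {
    proto_t proto;
    const char *ip;
    int port;
} policy_entry;

/* Prefer an HTTP3 entry only when the UDP probe succeeded; otherwise
 * fall back to the first HTTP2 entry. Returns NULL if nothing is usable. */
const policy_entry *choose_entry(const policy_entry *entries, size_t n,
                                 bool udp_reachable) {
    assert(entries != NULL || n == 0);
    const policy_entry *fallback = NULL;
    for (size_t i = 0; i < n; i++) {
        if (entries[i].proto == PROTO_HTTP3 && udp_reachable) return &entries[i];
        if (entries[i].proto == PROTO_HTTP2 && fallback == NULL) fallback = &entries[i];
    }
    return fallback;
}
```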
Upgrade Effects
Overall Upgrade Progress & Impact
Since the IPv4 HTTP3 rollout and the subsequent IPv6 QUIC migration, key scenarios such as recommendation, transaction, short video, and upload have achieved full coverage. A/B testing shows significant reductions in total latency and P99 latency, higher one‑second completion rates, and improved performance on weak networks.
Recommendation: total latency ↓22%/33% (P99), 1‑second completion ↑1.2 pts.
Transaction: total latency ↓23%/32%, 1‑second completion ↑0.55 pts.
Upload: video/image upload speed ↑7.7%/21%, success rate ↑0.18 pts.
Short‑video download: total latency ↓15%/16%, download speed ↑18%.
Typical Business Scenarios
Interaction Interrupt Rate
HTTP3 experiments reduced interaction interrupt UVs by 24.02% (Android) and 20.91% (iOS), with corresponding drops in lost UVs.
Cart & Detail Pages
After switching to HTTP3, average latency for cart and detail page APIs decreased noticeably, improving overall user experience.
Deployment Issues & Optimizations
UDP Penetration
Some carriers drop UDP packets, causing high downgrade rates. Taobao introduced asynchronous UDP connectivity probing during startup or network switches, caching results locally. Success rates improved from ~95% to ~98% after targeted mitigations.
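One way to cache probe results locally is a small table keyed by a network identifier, with an expiry so that stale results trigger a new probe. The sketch below assumes such a design; the key format, slot count, and the 600‑second freshness window are illustrative choices, not values from Taobao's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

#define PROBE_CACHE_SLOTS 8
#define PROBE_TTL_SECONDS 600   /* assumed freshness window */

typedef struct {
    char network_id[64];   /* e.g. SSID or carrier name; hypothetical key */
    bool udp_ok;
    time_t probed_at;
    bool used;
} probe_entry;

typedef struct {
    probe_entry slots[PROBE_CACHE_SLOTS];
    size_t next;           /* next slot to overwrite when the key is new */
} probe_cache;

/* Record the outcome of a finished probe for this network. */
void probe_cache_put(probe_cache *c, const char *network_id, bool udp_ok, time_t now) {
    assert(c != NULL && network_id != NULL);
    probe_entry *e = NULL;
    for (size_t i = 0; i < PROBE_CACHE_SLOTS; i++) {
        if (c->slots[i].used && strcmp(c->slots[i].network_id, network_id) == 0) {
            e = &c->slots[i];
            break;
        }
    }
    if (e == NULL) {
        e = &c->slots[c->next];
        c->next = (c->next + 1) % PROBE_CACHE_SLOTS;
    }
    strncpy(e->network_id, network_id, sizeof(e->network_id) - 1);
    e->network_id[sizeof(e->network_id) - 1] = '\0';
    e->udp_ok = udp_ok;
    e->probed_at = now;
    e->used = true;
}

/* Returns 1 (UDP works), 0 (UDP blocked), or -1 (unknown or stale: probe again). */
int probe_cache_get(const probe_cache *c, const char *network_id, time_t now) {
    assert(c != NULL && network_id != NULL);
    for (size_t i = 0; i < PROBE_CACHE_SLOTS; i++) {
        const probe_entry *e = &c->slots[i];
        if (!e->used || strcmp(e->network_id, network_id) != 0) continue;
        if (now - e->probed_at > PROBE_TTL_SECONDS) return -1;
        return e->udp_ok ? 1 : 0;
    }
    return -1;
}
```

Because the probe runs asynchronously, a connection attempt that sees `-1` can proceed over HTTP2 immediately and pick up the probe result on the next attempt.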
UDP Port NAT Rebind
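QUIC’s connection migration and multipath features identify a connection by its connection ID (CID) rather than by the 5‑tuple used by TCP. In mobile scenarios, a NAT timeout can rebind the client’s UDP port mid‑connection. A load balancer that routes by 5‑tuple then treats the packets as a new flow and may forward them to a different backend, which resets the connection. CID‑based load balancing resolves this problem, because the CID stays the same across the rebind.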
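The difference between the two routing keys can be shown with a toy hash. The sketch below is a simplification: it hashes the raw CID with FNV‑1a as a stand‑in, whereas the QUIC‑LB draft defines its own CID encoding. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t src_ip;
    uint16_t src_port;   /* rewritten by the NAT after a rebind */
    uint32_t dst_ip;
    uint16_t dst_port;
} udp_tuple;

/* 32-bit FNV-1a, used here only as a simple stand-in hash. */
uint32_t fnv1a(const uint8_t *data, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash of the address tuple: the result depends on the source port. */
uint32_t tuple_hash(const udp_tuple *t) {
    assert(t != NULL);
    const uint8_t bytes[12] = {
        (uint8_t)(t->src_ip >> 24), (uint8_t)(t->src_ip >> 16),
        (uint8_t)(t->src_ip >> 8),  (uint8_t)t->src_ip,
        (uint8_t)(t->src_port >> 8), (uint8_t)t->src_port,
        (uint8_t)(t->dst_ip >> 24), (uint8_t)(t->dst_ip >> 16),
        (uint8_t)(t->dst_ip >> 8),  (uint8_t)t->dst_ip,
        (uint8_t)(t->dst_port >> 8), (uint8_t)t->dst_port,
    };
    return fnv1a(bytes, sizeof(bytes));
}

/* Backend chosen from the CID alone: unaffected by address or port changes. */
size_t backend_by_cid(const uint8_t *cid, size_t cid_len, size_t n_backends) {
    assert(cid != NULL && n_backends > 0);
    return fnv1a(cid, cid_len) % n_backends;
}
```

After a rebind from port 40000 to 40001, `tuple_hash` yields a different value, while `backend_by_cid` returns the same backend because its input never included the port.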
0‑RTT Ratio Increase
By caching session tickets and transport parameters, the share of 0‑RTT connections rose from 40% to 65%, raising the overall latency improvement over HTTP2 from ~15% to ~20%.
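A client can only attempt 0‑RTT when it still holds both a session ticket and the transport parameters remembered from the previous connection to the same server. The check might look like the sketch below; the `zero_rtt_state` layout and buffer sizes are assumptions for illustration, not XQUIC's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-server cache entry kept by the client. */
typedef struct {
    uint8_t ticket[512];
    size_t  ticket_len;
    uint8_t transport_params[256];
    size_t  tp_len;
    time_t  expires_at;    /* end of the ticket lifetime */
} zero_rtt_state;

/* 0-RTT needs both cached pieces and an unexpired ticket;
 * otherwise the client performs a full 1-RTT handshake. */
bool can_attempt_0rtt(const zero_rtt_state *s, time_t now) {
    if (s == NULL) return false;
    assert(s->ticket_len <= sizeof(s->ticket));
    assert(s->tp_len <= sizeof(s->transport_params));
    if (s->ticket_len == 0 || s->tp_len == 0) return false;
    return now < s->expires_at;
}
```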
Business Non‑Encryption Demand
For large video payloads, XQUIC can negotiate clear‑text transmission after the handshake, eliminating unnecessary encryption overhead.
XQUIC Stack Performance
Protocol‑stack optimizations yielded an 85.93% processing‑speed increase, outperforming nginx‑quic by 15.62%.
Overall Optimization Strategies
Choose high‑performance programming languages and enable compiler optimizations.
Reduce memory copies by moving data directly from application to transport layer.
Inline critical functions, use larger lookup tables (e.g., 64 KB Huffman tables), and apply branch prediction hints.
Minimize redundant frames and per‑packet overhead so that each packet carries more useful payload.
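Several of these techniques fit in a few lines of C. The sketch below decodes a QUIC variable‑length integer (RFC 9000, Section 16) using an inlined function, a lookup table indexed by the top two bits of the first byte, and branch hints for the rare short‑buffer paths. It is not XQUIC's code, and the table here is tiny compared with the 64 KB Huffman tables mentioned above; `LIKELY`, `UNLIKELY`, and `varint_decode` are names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Encoded length in bytes, indexed by the top two bits of the first byte. */
static const uint8_t k_varint_len[4] = { 1, 2, 4, 8 };

/* Decodes one varint. Returns bytes consumed, or 0 if the buffer is too short. */
static inline size_t varint_decode(const uint8_t *buf, size_t len, uint64_t *out) {
    assert(buf != NULL && out != NULL);
    if (UNLIKELY(len == 0)) return 0;
    size_t n = k_varint_len[buf[0] >> 6];
    if (UNLIKELY(len < n)) return 0;
    uint64_t v = buf[0] & 0x3f;            /* drop the two length bits */
    for (size_t i = 1; i < n; i++) {
        v = (v << 8) | buf[i];             /* network byte order */
    }
    *out = v;
    return n;
}
```

The table lookup replaces a chain of comparisons on the length bits, and the hints tell the compiler to lay out the common case (enough bytes available) as the fall‑through path.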
Group‑Wide Link Testing Protocol Upgrade
Amazon Full‑Link Platform Upgrade
The testing platform was upgraded to support both HTTP2 and HTTP3, addressing UDP hash lookup performance and kernel‑level UDP loss issues. Kernel patches and socket options (e.g., setsockopt(s, SOL_UDP, 200, (const void *)&value, sizeof(int))) were applied.
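Option 200 in the call above is specific to the patched kernel, so it cannot be reproduced on a stock system. As a generic illustration of tuning a UDP socket against kernel‑level drops, the sketch below enlarges the receive buffer with the standard SO_RCVBUF option. This is an assumed, commonly used mitigation and not the patch the article refers to; `udp_set_rcvbuf` is a hypothetical helper name.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Requests a larger receive buffer. Returns the size the kernel
 * actually granted (it may cap or adjust the request), or -1 on error. */
int udp_set_rcvbuf(int fd, int bytes) {
    assert(fd >= 0 && bytes > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0) {
        return -1;
    }
    int granted = 0;
    socklen_t len = sizeof(granted);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) != 0) {
        return -1;
    }
    return granted;
}
```

Reading the value back matters because the kernel silently limits the request to its configured maximum.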
Ongoing Work
HTTP3 Coverage for Image Domains
Recommendation, transaction, short‑video, and upload links are fully upgraded; image domain migration is in gradual rollout.
HTTP3 over MPQUIC Scale‑Out
MPQUIC has been deployed across client, SLB, and Aserver. Android clients are live, offering long‑tail compensation and multi‑path acceleration modes, with an additional 8% speed gain over single‑path QUIC. The XQUIC MPQUIC implementation is open‑source.
Appendix
QUIC‑LB: https://datatracker.ietf.org/doc/html/draft-ietf-quic-load-balancers-15
RFC 9000: https://quicwg.org/base-drafts/rfc9000.html
RFC 9114: https://quicwg.org/base-drafts/rfc9114.html
XQUIC: https://github.com/alibaba/xquic
