TrafficRoute GTM: GEO‑Based Routing and Traffic Orchestration at ByteDance
This article explains how ByteDance’s TrafficRoute GTM, a DNS‑based global traffic routing service, uses GEO‑based routing, health‑check orchestration, and intelligent load‑balancing to achieve high stability, performance, and cost efficiency for ultra‑large‑scale traffic across multiple regions and CDN providers.
Abstract: At ByteDance, balancing stability, performance, and cost under massive traffic is a common challenge, and TrafficRoute GTM plays a crucial role in addressing it. It is a DNS‑based traffic routing service that handles billions of requests and supports large‑scale scenarios. This first part introduces the GEO‑based routing mode and its benefits.
1. Volcano Engine TrafficRoute GTM Overview
TrafficRoute GTM is a DNS‑based traffic routing service backed by over 1,100 distributed probing nodes worldwide. It measures link quality along the path from end users to edge and cloud nodes, and dynamically schedules traffic based on real‑time access quality, node load, and health status.
Beyond flexible scheduling strategies, GEO‑basic routing provides load balancing, session stickiness, and failover, while Perf‑intelligent routing adds performance‑first and load‑feedback scheduling.
2. GEO‑Basic Routing: Custom Traffic Orchestration
2.1 Address Resource Orchestration
Resources (e.g., public‑cloud EIPs, CDN CNAMEs, edge access points) can be classified, combined, and orchestrated into address pools, which are referenced by routing rules to create customized traffic dispatch and disaster‑recovery solutions.
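As a rough illustration of the idea, an address pool can be modeled as a named group of weighted addresses. This is a minimal sketch, not the TrafficRoute GTM data model: the class names, the `weight` field, the pool names, and the sample addresses (documentation IP ranges and `.invalid` hostnames) are all assumptions made for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Address:
    value: str        # an IP (for example an EIP) or a CNAME target
    kind: str         # "A" or "CNAME"
    weight: int = 1   # hypothetical relative traffic share inside the pool


@dataclass
class AddressPool:
    name: str
    addresses: list[Address] = field(default_factory=list)

    def pick(self, rng: random.Random) -> Address:
        """Weighted random choice among the pool's addresses."""
        if not self.addresses:
            raise ValueError(f"pool {self.name!r} is empty")
        weights = [a.weight for a in self.addresses]
        return rng.choices(self.addresses, weights=weights, k=1)[0]


# Hypothetical pools: one of public-cloud EIPs, one wrapping a CDN CNAME.
pools = {
    "bj-eip": AddressPool("bj-eip", [
        Address("192.0.2.10", "A", weight=3),
        Address("192.0.2.11", "A", weight=1),
    ]),
    "cdn-a": AddressPool("cdn-a", [
        Address("example.cdn-a.invalid", "CNAME"),
    ]),
}
```

Routing rules can then refer to pools by name (here `"bj-eip"` or `"cdn-a"`) rather than to individual addresses, which is what makes the pools reusable across dispatch and disaster‑recovery plans.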
2.2 Health‑Check Orchestration
TrafficRoute GTM provides global L3/L4/L7 health checks with configurable sensitivity, enabling minute‑level automatic disaster recovery based on precise health data.
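One common way to express "configurable sensitivity" is with consecutive‑result thresholds: a target is marked down only after several failed probes in a row, and marked up again only after several successes. The sketch below assumes that model; the source does not describe how GTM actually aggregates probe results, and `fail_threshold` and `rise_threshold` are hypothetical parameter names.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    fail_threshold: int = 3   # consecutive failures before marking unhealthy
    rise_threshold: int = 2   # consecutive successes before marking healthy
    healthy: bool = True
    _streak: int = 0          # run length of results that contradict `healthy`

    def observe(self, probe_ok: bool) -> bool:
        """Record one probe result and return the current health state."""
        if probe_ok == self.healthy:
            # The result agrees with the current state, so reset the run.
            self._streak = 0
            return self.healthy
        self._streak += 1
        limit = self.rise_threshold if probe_ok else self.fail_threshold
        if self._streak >= limit:
            self.healthy = probe_ok
            self._streak = 0
        return self.healthy
```

Lower thresholds react faster but flap more easily on a single lost probe; higher thresholds trade detection time for stability.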
2.3 Routing‑Rule Orchestration
By configuring routing rules, users can precisely control traffic sources and destinations, and ensure automatic failover according to predefined disaster‑recovery plans.
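A routing rule of this kind can be pictured as a match on the client's location and ISP plus an ordered list of address pools, where the first healthy pool wins. The following is a hedged sketch under that assumption; the `Rule` fields, the `"*"` wildcard, and the pool names are illustrative and are not taken from the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    geo: str                 # for example "CN-BJ", or "*" for any location
    isp: str                 # for example "telecom", or "*" for any ISP
    pools: tuple[str, ...]   # pool names in failover order


def matches(rule: Rule, geo: str, isp: str) -> bool:
    return rule.geo in (geo, "*") and rule.isp in (isp, "*")


def resolve(rules, healthy_pools, geo, isp):
    """Walk rules from most to least specific and return the first healthy pool.

    `rules` is assumed to be ordered by specificity, with a catch-all last.
    Returns None when no matching rule has a healthy pool.
    """
    for rule in rules:
        if not matches(rule, geo, isp):
            continue
        for pool in rule.pools:
            if pool in healthy_pools:
                return pool
    return None


rules = [
    Rule("CN-BJ", "telecom", ("bj-telecom", "bj-multi")),
    Rule("CN-BJ", "*", ("bj-multi",)),
    Rule("*", "*", ("global-default",)),
]
```

With every pool healthy, `resolve(rules, {"bj-telecom", "bj-multi", "global-default"}, "CN-BJ", "telecom")` returns `"bj-telecom"`; if both Beijing pools are marked unhealthy, the same query falls through to `"global-default"`.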
3. ByteDance Internal Traffic Orchestration Practices
Using TrafficRoute GTM, ByteDance has implemented classic architectures such as same‑city multi‑active, cross‑region disaster recovery, global CDN scheduling, and CDN origin pull scheduling, achieving the following benefits:
Stability: MTTR reduced to minutes, with more than 90% of traffic converging within 3‑5 minutes.
Performance: End‑to‑end latency reduced by over 15% by directing users to optimal nodes.
Cost: Bandwidth cost lowered by more than 10% by preferring lower‑cost resources.
3.1 Same‑City Multi‑Active & Cross‑Region Disaster Recovery
GEO‑basic routing enables AZ‑level load balancing, region‑level disaster recovery, proximity access based on client location and ISP, and minute‑level automatic disaster recovery, ensuring continuous service.
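To make the interplay of AZ‑level load balancing and region‑level disaster recovery concrete, the sketch below computes how traffic would be split: weights are renormalized over the healthy AZs of the preferred region, and traffic moves to the next region only when no AZ in the preferred one is healthy. The function, region names, and weights are assumptions for illustration, not a description of ByteDance's configuration.

```python
def traffic_shares(regions, healthy, order):
    """Return {az: share} for the first region in `order` with a healthy AZ.

    regions: {region: {az: weight}}
    healthy: set of AZ names currently passing health checks
    order:   region names from most to least preferred
    """
    for region in order:
        live = {az: w for az, w in regions[region].items() if az in healthy}
        total = sum(live.values())
        if total > 0:
            # Redistribute the region's traffic across its surviving AZs.
            return {az: w / total for az, w in live.items()}
    return {}  # nothing healthy anywhere


# Hypothetical layout: two AZs in a primary region, one AZ in a backup region.
regions = {
    "bj": {"bj-a": 1, "bj-b": 1},
    "sh": {"sh-a": 1},
}
order = ["bj", "sh"]
```

Losing one AZ leaves users in the same region (all traffic moves to the surviving AZ), while losing the whole region shifts traffic to the backup.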
3.2 Global Multi‑CDN Scheduling
TrafficRoute GTM allows dynamic selection of the most suitable CDN provider per region, balancing coverage, performance, and cost while ensuring proximity access for each ISP.
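One plausible way to balance coverage, performance, and cost when choosing a CDN for a region is a tiered policy: drop providers below an availability floor, keep those whose latency is close to the best, then take the cheapest. This is an illustrative policy only; the source does not state the selection logic, and the fields, thresholds, and sample figures below are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdnStats:
    name: str
    availability: float   # fraction of successful probes, 0.0 to 1.0
    latency_ms: float     # measured latency for this region and ISP
    cost_per_gb: float    # relative bandwidth price


def choose_cdn(candidates, min_availability=0.99, latency_slack_ms=10.0):
    """Pick the cheapest CDN that is available and nearly as fast as the best."""
    eligible = [c for c in candidates if c.availability >= min_availability]
    if not eligible:
        return None
    best = min(c.latency_ms for c in eligible)
    close = [c for c in eligible if c.latency_ms <= best + latency_slack_ms]
    # Cheapest first; latency breaks ties between equally priced providers.
    return min(close, key=lambda c: (c.cost_per_gb, c.latency_ms)).name


# Made-up sample measurements for a single region.
candidates = [
    CdnStats("cdn-a", availability=0.999, latency_ms=40.0, cost_per_gb=0.05),
    CdnStats("cdn-b", availability=0.995, latency_ms=45.0, cost_per_gb=0.03),
    CdnStats("cdn-c", availability=0.950, latency_ms=20.0, cost_per_gb=0.01),
]
```

Here `cdn-c` is excluded for low availability, and `cdn-b` is chosen over `cdn-a` because it is within the latency slack and cheaper. Setting `latency_slack_ms=0` makes the policy purely performance‑first.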
3.3 CDN Origin Pull Scheduling
By encapsulating origin endpoints into GTM‑managed origin domains, ByteDance gains load balancing, health checks, and minute‑level automatic disaster recovery for the origin‑pull path, improving origin availability.
Overall, through same‑city multi‑active, cross‑region disaster recovery, global CDN scheduling, and CDN origin pull orchestration, TrafficRoute GTM enables ByteDance’s services to withstand ultra‑large‑scale traffic while maintaining stability, performance, and cost efficiency.