
Gaode’s Unit‑Based Architecture: Scaling Services with Smart Routing and Data Sync

This article details Gaode's practical experience in building a unit‑based service architecture, covering challenges like request routing, unit isolation, and data synchronization, and explains the design choices, deployment strategies, performance metrics, and future optimization plans.

Amap Tech

Why Unitization?

As Gaode's user base and service volume grew, a single data center, or even a pair of co-located data centers, could no longer support continued expansion. Multi-region disaster recovery also became a core requirement. Both pressures prompted the unit-based redesign.

Gaode’s Unitization Characteristics

Gaode provides navigation‑related travel services with stringent response‑time (RT) demands, so the unitization design aims to keep data close to users and minimize impact on overall service latency. Two key requirements are:

Users should be routed to the nearest unit based on their real geographic location (e.g., North‑China users to the Zhangbei unit, South‑China users to the Shenzhen unit).

The unit a user is assigned to should be the same as that user's nearest unit, so that requests are served where they enter and do not need cross-unit routing.

Because many Gaode services are login‑free, the unitization solution must support both user‑ID and device‑ID based routing.

Practical Implementation

The unit‑based transformation addresses three core problems: request routing, unit isolation, and data synchronization.

Request Routing: Gaode uses two strategies, mod-based routing and routing-table routing. The routing-table approach is currently the more widely deployed of the two.

Unit Isolation: Using the group's infrastructure (vipserver, HSF), service calls are confined to the same data center, which achieves unit isolation. The HSF unit mode is also feasible, but the same-data-center architecture is simpler to maintain.

Data Synchronization: Gaode relies on DRC, the group's database product, to replicate data across units.

Three deployment options for the unit routing service were evaluated; the SDK‑based approach was rejected due to high intrusiveness. Instead, a decentralized plugin integration method was chosen, embedding UnitRouter into the application’s Nginx.

Routing Strategies

Gaode’s services (e.g., account system, cloud sync, user comments) have been unitized across three regions and four data centers, with peak write QPS reaching tens of thousands. The account system stores data in Tair cache and XDB, with full synchronization across regions. UnitRouter on Tengine identifies the appropriate unit via a routing table that maps users to units.

To achieve low latency, two key links are optimized: external access through aserver for geographic segmentation, and internal service routing to avoid cross‑unit hops.

Two measures are applied:

Clients are assigned to the nearest unit via aserver configuration (e.g., North‑China users to Zhangbei).

User‑unit relationships are recorded in a routing table derived from log analysis, ensuring request entry points align with unit assignment.

Both mod‑based and routing‑table‑based strategies support user‑ID and device‑ID routing.

Routing Table Design

The routing table consists of two mappings: user‑to‑group and group‑to‑unit. Groups correspond to the seven mainland China regions. The user‑group mapping is periodically refreshed from logs, assigning users to groups based on recent IP location; the group‑unit mapping provides a default geographic mapping (e.g., South‑China → Shenzhen) plus additional entries for traffic‑shaping scenarios.

New users, lacking a routing‑table entry, fall back to the mod‑based strategy until the next table update.
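The two-level lookup described above can be sketched as follows. This is a minimal illustration, not Gaode's implementation: the class name `RoutingTable`, the `overrides` dictionary for traffic shaping, the group names, and the `fallback` callable are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional


class RoutingTable:
    """Hypothetical two-level table: user -> group, then group -> unit."""

    def __init__(self,
                 user_to_group: Dict[str, str],
                 group_to_unit: Dict[str, str],
                 overrides: Optional[Dict[str, str]] = None) -> None:
        self.user_to_group = dict(user_to_group)
        # Default geographic mapping, e.g. South China -> Shenzhen.
        self.group_to_unit = dict(group_to_unit)
        # Assumed representation of the extra traffic-shaping entries.
        self.overrides = dict(overrides or {})

    def lookup(self, routing_id: str) -> Optional[str]:
        """Return the unit for a known user or device, or None if unknown."""
        group = self.user_to_group.get(routing_id)
        if group is None:
            return None
        return self.overrides.get(group, self.group_to_unit[group])


def route(table: RoutingTable,
          routing_id: str,
          fallback: Callable[[str], str]) -> str:
    """Use the routing table when possible, otherwise the fallback strategy."""
    unit = table.lookup(routing_id)
    return unit if unit is not None else fallback(routing_id)
```

In this sketch a new user simply misses the first dictionary and is handed to `fallback`, which stands in for the mod-based strategy until the next table refresh adds an entry.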

Routing Computation

Routing computation has to balance performance, memory footprint, flexibility, accuracy, and stability. Looking up the mapping in external storage would add risk to the request path, so an in-memory Bloom filter is used to filter groups instead, accepting a tiny false-positive rate (≈0.03 %).

Two special cases arise:

Bloom filter false positives may assign a user to a neighboring region; the impact is negligible and acceptable.

New users not present in the group data trigger the fallback mod‑based routing.
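To make the trade-off concrete, here is a small Bloom filter sketch using the standard sizing formulas. It assumes one filter per group, checked in turn; the article does not say how the filters are organized, and the class, function names, and hashing scheme (SHA-256 with double hashing) are illustrative choices.

```python
import hashlib
import math
from typing import Dict, Iterator, Optional, Tuple


def bloom_parameters(n_items: int, fp_rate: float) -> Tuple[int, int]:
    """Standard sizing: bits m = -n ln(p) / (ln 2)^2, hashes k = (m / n) ln 2."""
    bits = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes


class BloomFilter:
    def __init__(self, n_items: int, fp_rate: float) -> None:
        self.bits, self.hashes = bloom_parameters(n_items, fp_rate)
        self.array = bytearray((self.bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Double hashing: derive k bit positions from two 64-bit values.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.hashes):
            yield (h1 + i * h2) % self.bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.array[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, key: str) -> bool:
        return all(self.array[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(key))


def find_group(filters: Dict[str, BloomFilter], routing_id: str) -> Optional[str]:
    """Return the first group whose filter matches, or None for unknown IDs.

    A false positive returns a wrong group; None means "use mod routing".
    """
    for group, bloom in filters.items():
        if routing_id in bloom:
            return group
    return None
```

With these formulas, a 0.03 % false-positive target costs roughly 17 bits and 12 hash probes per stored ID, which is why the whole mapping can stay in memory. A Bloom filter never reports a stored ID as absent, so the only errors are the two special cases listed above.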

If the service uses mod routing directly, the calculation is straightforward (tid or uid modulo division).
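A minimal sketch of mod routing follows. The article only says the tid or uid is reduced by a modulo operation; using the number of units as the divisor, and hashing non-numeric IDs with CRC32 first, are assumptions of this example.

```python
import zlib
from typing import List, Union


def mod_route(routing_id: Union[int, str], units: List[str]) -> str:
    """Pick a unit by taking the ID modulo the number of units."""
    if isinstance(routing_id, int):
        key = routing_id
    else:
        # Python's built-in hash() of a str varies between processes,
        # so a stable checksum is used to keep routing deterministic.
        key = zlib.crc32(routing_id.encode("utf-8"))
    return units[key % len(units)]
```

For example, `mod_route(10, ["zhangbei", "shenzhen"])` selects index 10 % 2 = 0. The result depends only on the ID and the unit list, which is what makes this strategy easy to maintain, but it ignores where the user actually is.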

Unit Cutover Process

Enable the unit write lock (optional for non-sensitive writes).

Check service latency.

Execute the switch plan.

Release the write lock.

Updating the routing table follows the same steps, with the switch plan replaced by loading a new routing version.
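The four steps can be sketched as a guarded procedure. Everything here is hypothetical: the `state` dictionary, the latency threshold, and the decision to abort when latency is too high are assumptions, since the article does not say what happens when the check fails.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


@contextmanager
def unit_write_lock(state: Dict[str, object], enabled: bool = True) -> Iterator[None]:
    """Hold the write lock for the duration of the block, then release it."""
    if enabled:
        state["write_locked"] = True
    try:
        yield
    finally:
        # Released even if the latency check or the switch raises.
        if enabled:
            state["write_locked"] = False


def run_cutover(state: Dict[str, object],
                check_latency_ms: Callable[[], float],
                execute_switch: Callable[[], None],
                max_latency_ms: float = 100.0,
                lock_writes: bool = True) -> None:
    """Lock writes, check latency, execute the switch, release the lock."""
    with unit_write_lock(state, lock_writes):
        latency = check_latency_ms()
        if latency > max_latency_ms:
            raise RuntimeError(
                f"latency {latency} ms exceeds {max_latency_ms} ms; cutover aborted")
        execute_switch()
```

For a routing-table update, `execute_switch` would be the function that loads the new routing version. Passing `lock_writes=False` models the optional case for non-sensitive writes.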

Key Metrics

Unit calculation latency: 1–2 ms.

Cross‑unit routing ratio: below 5 % (typically around 3 %).
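As a rough illustration of how the cross-unit ratio could be measured, the sketch below counts requests whose entry unit differs from the unit their user is assigned to. The input format, a sequence of (entry unit, assigned unit) pairs, is an assumption; the article does not describe how the metric is collected.

```python
from typing import Iterable, Tuple


def cross_unit_ratio(requests: Iterable[Tuple[str, str]]) -> float:
    """Fraction of requests that entered one unit but belong to another."""
    total = 0
    crossed = 0
    for entry_unit, assigned_unit in requests:
        total += 1
        if entry_unit != assigned_unit:
            crossed += 1
    return crossed / total if total else 0.0
```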

Future Optimizations

Unified Gateway Integration: Most services now use a unified gateway; embedding unitization capabilities there will reduce deployment cost and let teams focus on isolation and data sync.

Group Mechanism Improvements:

Determine a user’s unit from actual entry logs instead of IP‑derived regions.

Split each unit into four virtual sub‑groups for finer‑grained traffic shifting.

Use a single mod calculation to map a user to a virtual subgroup, reducing computation for new users.
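One possible reading of the virtual sub-group idea is sketched below. It assumes numeric IDs, a slot count of four per unit, and a slot-to-unit dictionary that operators can edit; none of these details are given in the article.

```python
from typing import Dict, List

SUBGROUPS_PER_UNIT = 4  # from the plan above: four virtual sub-groups per unit


def default_slot_map(units: List[str]) -> Dict[int, str]:
    """Give each unit four consecutive virtual sub-group slots."""
    total = len(units) * SUBGROUPS_PER_UNIT
    return {slot: units[slot // SUBGROUPS_PER_UNIT] for slot in range(total)}


def route_by_slot(numeric_id: int, slot_map: Dict[int, str]) -> str:
    """A single modulo picks the sub-group; the map resolves it to a unit."""
    return slot_map[numeric_id % len(slot_map)]
```

Under these assumptions, reassigning one slot to a different unit moves only that sub-group's share of traffic, and a new user needs nothing beyond the one modulo operation.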

Dual-Table Routing During Hot Updates: Instead of taking a full write lock, load the old and new routing tables simultaneously and block only the requests whose unit changes until the new table is fully active. This dramatically reduces downtime.
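The dual-table idea can be sketched as follows, assuming each table has been flattened into a plain ID-to-unit dictionary and that blocking is signaled with an exception. Both choices, and the names used, are hypothetical.

```python
from typing import Dict, Optional


class UnitChangeInProgress(Exception):
    """Raised for requests whose unit differs between the two tables."""


def route_during_update(routing_id: str,
                        old_table: Dict[str, str],
                        new_table: Dict[str, str],
                        new_table_active: bool) -> Optional[str]:
    """Route with both tables loaded; block only IDs whose unit changes."""
    old_unit = old_table.get(routing_id)
    new_unit = new_table.get(routing_id)
    if old_unit == new_unit:
        # Unchanged assignment: safe to serve during the update.
        # None means the ID is in neither table (caller falls back to mod).
        return old_unit
    if new_table_active:
        return new_unit
    raise UnitChangeInProgress(routing_id)
```

Only the IDs whose assignment differs between versions are ever blocked, which is typically a small fraction of traffic compared with locking all writes.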

Data-Driven Unitization: Beyond user/device dimensions, future designs will consider data-domain driven routing, especially for map data, 5G, and autonomous driving workloads, where data growth outpaces user-centric traffic.

Final Thoughts: Different scenarios require different unitization strategies: mod routing for simplicity and maintainability, the routing-table strategy for RT-sensitive services. Strongly coupled services should share the same routing approach.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
