Inside Uber's Real‑Time Dispatch: How the Company Scales Its Marketplace
This article details Uber's rapid growth and the engineering choices behind its real‑time dispatch platform, covering geospatial indexing, microservice architecture, scaling techniques like Ringpop and TChannel, and strategies for availability and fault tolerance.
Statistics
Uber's geospatial index targets one million writes per second, and must serve reads at a rate many times higher.
The dispatch system runs on thousands of nodes.
Platform
Node.js, Python, Java, Go, iOS and Android native apps.
Microservices, Redis, Postgres, MySQL, Riak, Twemproxy, Google S2 library, Ringpop, TChannel, Thrift.
Overview
Uber connects passengers and drivers; the core challenge is real‑time matching of dynamic supply and demand.
The dispatch system is a real‑time market platform that uses mobile phones for communication.
New Year’s Eve is the busiest period of the year.
Architecture Overview
Clients run native apps; the backend serves mobile traffic over the public internet.
Dispatch (DISCO) coordinates drivers and riders, built mainly in Node.js.
Scaling relies on a consistent‑hash ring and a gossip protocol (Ringpop).
Geospatial Index
Every active driver sends a location update every four seconds, which at scale adds up to a design target of one million writes per second; read volume is many times higher still.
Uses Google S2 library to partition the earth into level‑12 cells, each identified by a 64‑bit ID.
Cell IDs serve as partition keys for location updates and queries.
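To make the partitioning concrete, here is a minimal sketch of the write‑path key derivation using the Go port of the S2 library (github.com/golang/geo/s2); the coordinates and the cellKey helper are illustrative, not Uber's code.

```go
package main

import (
	"fmt"

	"github.com/golang/geo/s2"
)

// cellKey maps a driver's location to the level-12 S2 cell whose 64-bit ID
// serves as the partition key for the location update.
func cellKey(lat, lng float64) s2.CellID {
	ll := s2.LatLngFromDegrees(lat, lng)
	// CellIDFromLatLng returns a leaf cell (level 30); coarsen it to level 12.
	return s2.CellIDFromLatLng(ll).Parent(12)
}

func main() {
	// A driver update somewhere in San Francisco.
	id := cellKey(37.7749, -122.4194)
	fmt.Printf("cell id: %d  token: %s  level: %d\n", uint64(id), id.ToToken(), id.Level())
}
```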
Routing and Matching (DISCO)
DISCO matches supply and demand using geo‑supply and geo‑demand indexes.
Considers ETA, empty‑drive reduction, and overall wait time.
Supports future planning, route changes, and multi‑product requests such as rides, parcels, and food delivery.
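The read side can be sketched the same way: cover a circle around the rider with level‑12 cells, pull candidate drivers out of each covering cell, and rank them. The supply map and the etaSeconds scorer below are hypothetical stand‑ins for DISCO's geo‑supply index and its routing service.

```go
package dispatch

import (
	"math"

	"github.com/golang/geo/s1"
	"github.com/golang/geo/s2"
)

// Driver is a pared-down supply record; the real index carries far more state.
type Driver struct {
	ID       string
	Lat, Lng float64
}

// etaSeconds is a hypothetical stand-in for a routing-service call;
// here it just scores by straight-line distance.
func etaSeconds(d Driver, lat, lng float64) float64 {
	return math.Hypot(d.Lat-lat, d.Lng-lng) * 1e5
}

// nearbyCells returns the level-12 cells covering a circle around the rider;
// each cell ID is a lookup key into the geo-supply index.
func nearbyCells(lat, lng, radiusMeters float64) s2.CellUnion {
	const earthRadiusMeters = 6371000.0
	center := s2.PointFromLatLng(s2.LatLngFromDegrees(lat, lng))
	circle := s2.CapFromCenterAngle(center, s1.Angle(radiusMeters/earthRadiusMeters))
	coverer := &s2.RegionCoverer{MinLevel: 12, MaxLevel: 12, MaxCells: 64}
	return coverer.Covering(circle)
}

// matchDriver scans candidates in the covering cells and keeps the lowest-ETA
// one: the simplest form of the matching DISCO performs.
func matchDriver(lat, lng float64, supply map[s2.CellID][]Driver) (best Driver, ok bool) {
	bestETA := math.MaxFloat64
	for _, cell := range nearbyCells(lat, lng, 3000) { // 3 km search radius
		for _, d := range supply[cell] {
			if eta := etaSeconds(d, lat, lng); eta < bestETA {
				bestETA, best, ok = eta, d, true
			}
		}
	}
	return best, ok
}
```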
Scalable Dispatch
Node.js dispatch processes hold state in memory, so they cannot sit behind a stateless load balancer; instead, Ringpop's gossip‑based consistent‑hash ring assigns each key to one owning process and forwards requests to it.
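The ring itself is a simple structure even though Ringpop adds a lot around it (SWIM‑style gossip membership, handle‑or‑forward request routing). A bare‑bones consistent‑hash ring, written here in plain Go rather than against Ringpop's actual API, looks like this:

```go
package dispatch

import (
	"hash/fnv"
	"sort"
)

// Ring is a minimal consistent-hash ring in the spirit of Ringpop's sharding;
// the real library adds gossip-driven membership and virtual nodes.
type Ring struct {
	points []uint32          // sorted hash points on the ring
	owner  map[uint32]string // hash point -> node address
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func NewRing(nodes []string) *Ring {
	r := &Ring{owner: make(map[uint32]string)}
	for _, n := range nodes {
		p := hash32(n)
		r.points = append(r.points, p)
		r.owner[p] = n
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Lookup walks clockwise from the key's hash to the next node point, so every
// key (a rider ID, a trip ID) has exactly one owning process at any moment.
func (r *Ring) Lookup(key string) string {
	h := hash32(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0 // wrap around the ring
	}
	return r.owner[r.points[i]]
}
```

Because only keys near a joining or leaving node's hash points change owners, membership churn reshuffles a small slice of the state instead of all of it.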
TChannel provides high‑performance RPC, roughly twenty times faster than HTTP in Uber's Node.js stack, and uses Thrift as its interface definition language.
Availability and Failure Handling
All operations are idempotent and retryable; services are partitioned into small, replaceable units.
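One way to get the retryable property is to key every command with an idempotency token and memoize its result, so a duplicate delivery becomes a no‑op. The in‑memory Store below is a minimal sketch of that idea, not Uber's implementation; a production version would be replicated.

```go
package dispatch

import (
	"sync"
	"time"
)

// Store remembers results by idempotency key so a retried command is
// applied at most once, which is what makes blind retries safe.
type Store struct {
	mu   sync.Mutex
	seen map[string]string
}

func (s *Store) Apply(key, op string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.seen == nil {
		s.seen = make(map[string]string)
	}
	if res, ok := s.seen[key]; ok {
		return res // duplicate delivery: return the recorded result
	}
	res := "applied:" + op
	s.seen[key] = res
	return res
}

// callWithRetry retries a flaky call with exponential backoff; this is only
// safe because Apply is idempotent.
func callWithRetry(fn func() (string, error), attempts int) (string, error) {
	var err error
	for i := 0; i < attempts; i++ {
		var res string
		if res, err = fn(); err == nil {
			return res, nil
		}
		time.Sleep(time.Duration(1<<i) * 50 * time.Millisecond)
	}
	return "", err
}
```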
Data‑center failover uses driver phones as the external store of trip state: the backend periodically pushes an encrypted state digest to each phone, and after a switch the new data center restores trips from the digests the phones send back, allowing seamless continuation.
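A sketch of that hand‑off, assuming JSON encoding and an AES‑GCM blob (both choices are ours for illustration, not confirmed details of Uber's protocol):

```go
package dispatch

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/json"
	"errors"
)

// TripState is a pared-down version of the state a data center would need
// to resume a trip; the field names are illustrative.
type TripState struct {
	TripID   string
	DriverID string
	Leg      int
}

// checkpoint encrypts the current trip state into an opaque blob that is
// pushed to the driver's phone with each update; the phone just stores it.
func checkpoint(key []byte, s TripState) ([]byte, error) {
	plain, err := json.Marshal(s)
	if err != nil {
		return nil, err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plain, nil), nil
}

// resume runs in the failover data center: when the phone reconnects and
// replays its last blob, the trip continues with no shared database.
func resume(key, blob []byte) (TripState, error) {
	var s TripState
	block, err := aes.NewCipher(key)
	if err != nil {
		return s, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return s, err
	}
	n := gcm.NonceSize()
	if len(blob) < n {
		return s, errors.New("blob too short")
	}
	plain, err := gcm.Open(nil, blob[:n], blob[n:], nil)
	if err != nil {
		return s, err
	}
	err = json.Unmarshal(plain, &s)
	return s, err
}
```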
Limitations
Fan‑out amplifies tail latency: a request that touches many Node.js processes is only as fast as the slowest of them, and under heavy load slow outliers become common.
TChannel's cross‑server request cancellation mitigates these spikes by letting a caller abandon in‑flight work on every server once it no longer needs the result.
