How Uber Scaled from Monolith to Service‑Oriented Architecture
Uber transitioned from a monolithic codebase to a service-oriented architecture. This overview covers the goals behind the move, such as 99.99% reliability and separating core from optional code, the DISCO dispatch optimization system, real-time location tracking built on Kafka, Hadoop, and Spark, and supporting infrastructure such as load balancers, WebSockets, and security layers.
Uber originally started as a monolithic architecture serving only San Francisco (UberBlack). As the core domain model grew and new features were added, component coupling became severe, making continuous integration and deployment burdensome, prompting a shift to a service‑oriented architecture.
Goal
Achieve 99.99% reliability for the core ride-hailing experience: at most about an hour of downtime per year (roughly one minute per week), or equivalently no more than one failure per 10,000 operations.
Split the codebase into core and optional parts. Core code handles passenger registration, ride requests, completions, or cancellations and requires strict review. Optional code is reviewed minimally and can be dynamically disabled, encouraging independent development and rapid feature experimentation.
Define core architecture: class names, inheritance relationships between business‑logic units, main business logic, plugin points (names, dependencies, structure), reactive programming chains, and unified platform components.
Solution
Evolve the iOS app architecture (from MVC to VIPER) and introduce Riblets as small, composable building blocks.
Features & Functional Requirements
Passengers can view nearby drivers.
Passengers can initiate a ride request.
Passengers can see estimated arrival time and price.
After a driver accepts, passengers can track the driver’s location and communicate throughout the trip.
Passengers can pre‑book a taxi.
Automatic matching of passengers and drivers.
Display nearby taxis on the map.
Location tracking.
Post‑ride actions: rating, email notifications, database updates, payment.
Dynamic pricing and incentives: when demand rises and supply falls, prices increase; incentives help balance supply by encouraging more drivers to come online.
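The supply-and-demand balancing described above can be sketched as a simple surge multiplier. The formula, the cap, and the rates below are illustrative assumptions, not Uber's actual pricing model:

```python
def surge_multiplier(open_requests: int, online_drivers: int,
                     cap: float = 3.0) -> float:
    """Illustrative surge pricing: price scales with the demand/supply
    ratio, never drops below 1.0, and is capped to protect riders."""
    if online_drivers == 0:
        return cap  # no supply at all: charge the maximum multiplier
    ratio = open_requests / online_drivers
    return max(1.0, min(cap, ratio))

# Balanced market: no surge.
print(surge_multiplier(10, 10))  # 1.0
# Demand is double the supply: fares double, nudging drivers online.
print(surge_multiplier(20, 10))  # 2.0
```

The cap models the real-world constraint that surge cannot grow without bound; the floor of 1.0 ensures oversupply lowers wait times rather than fares.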
Non‑Functional Requirements
Globalization.
Low latency.
High availability.
Strong consistency.
Scalability.
Data‑center failure handling: a backup data center can take over routing, though in‑flight trip data may lack backup.
DISCO – Uber Dispatch Optimization
DISCO matches supply (drivers) and demand (passengers) using precise location data. It minimizes total service time and driver travel time by leveraging Google’s S2 library to partition the map into small cells with unique IDs, enabling efficient storage and consistent hashing.
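The core idea of the S2 partitioning is mapping a location to a cell ID that can serve as a storage and sharding key. S2 itself projects the sphere onto cube faces and numbers cells along a Hilbert curve; the flat latitude/longitude grid below is only a simplified stand-in for that idea, with a made-up resolution:

```python
CELL_DEG = 0.01  # ~1.1 km of latitude per cell; illustrative resolution

def cell_id(lat: float, lng: float) -> int:
    """Map a location to an integer cell ID on a flat lat/lng grid.
    (Google S2 does this on the sphere with Hilbert-curve cell IDs;
    this grid is a simplified stand-in for the same idea.)"""
    row = int((lat + 90.0) / CELL_DEG)
    col = int((lng + 180.0) / CELL_DEG)
    return row * 36000 + col  # 36000 = 360 / CELL_DEG columns per row

def neighbor_cells(lat: float, lng: float) -> list[int]:
    """Cell IDs covering a 3x3 block around the location, used to
    query drivers near a passenger without scanning the whole city."""
    return [cell_id(lat + dr * CELL_DEG, lng + dc * CELL_DEG)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
```

Nearby points share a cell ID, so "drivers near this passenger" becomes a lookup over a handful of cell keys instead of a geographic scan.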
Implemented in Node.js, DISCO uses an event-driven asynchronous model with WebSocket communication. DISCO servers are arranged on a consistent-hash ring, and the SWIM gossip protocol detects node joins and failures so load can be rebalanced. Servers communicate via RPC.
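A minimal consistent-hash ring shows why adding or removing a DISCO server only remaps the keys that server owned. This sketch uses virtual nodes and MD5 purely for illustration; it is a stand-in for Uber's actual ring membership library, not its implementation:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: each server owns many points on
    the ring (virtual nodes), and a key is served by the first point
    at or after the key's hash, wrapping around at the end."""
    def __init__(self, servers, vnodes=100):
        self._ring = sorted(
            (self._hash(f"{s}#{i}"), s)
            for s in servers for i in range(vnodes))
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def owner(self, key: str) -> str:
        """Server responsible for this key (e.g. a geo cell ID)."""
        i = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[i][1]
```

If a server leaves the ring, keys it owned move to their next ring point, while every other key keeps its owner, which is exactly the rebalancing property the gossip-driven ring relies on.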
Request Service
Passenger initiates a ride request.
System obtains the passenger’s request location.
Microservice receives the request via WebSocket.
Tracks passenger GPS.
Accepts specific passenger requirements.
Passes the request to the dispatch system to connect with supply services.
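The hand-off from the request service to dispatch can be sketched as a nearest-driver match. Real DISCO optimizes overall service time across many concurrent trips; this sketch greedily picks the single closest available driver, and all names are illustrative:

```python
import math

def haversine_km(a, b):
    """Great-circle distance between two (lat, lng) points, in km."""
    lat1, lng1, lat2, lng2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2)
         * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def dispatch(request_loc, drivers):
    """Greedy match: assign the request to the closest available driver.
    `drivers` maps driver ID -> (lat, lng) of the last reported position."""
    if not drivers:
        return None
    return min(drivers, key=lambda d: haversine_km(request_loc, drivers[d]))
```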
Supply Service
Provides services to drivers.
Tracks taxi locations using latitude/longitude.
Every 5 seconds, online taxis send their location to a load balancer via a web‑application firewall.
The load balancer forwards GPS data to a Kafka REST API.
Kafka updates the location in real time, replicating it to databases and DISCO for consumption by all services.
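The update path (driver app, through the load balancer, into Kafka, out to consumers) can be mimicked with an in-memory append-only log. Real Kafka partitions, replicates, and persists the log across a cluster; this stand-in only illustrates the key property that each consumer reads the same stream at its own pace:

```python
class LocationLog:
    """In-memory stand-in for a Kafka topic: producers append location
    updates, and each consumer reads from its own offset, so DISCO and
    the databases can consume the same stream independently."""
    def __init__(self):
        self._log = []        # append-only list of records
        self._offsets = {}    # consumer name -> next index to read

    def produce(self, driver_id: str, lat: float, lng: float, ts: float):
        self._log.append({"driver": driver_id, "lat": lat,
                          "lng": lng, "ts": ts})

    def consume(self, consumer: str):
        """Return all records this consumer has not yet seen."""
        start = self._offsets.get(consumer, 0)
        self._offsets[consumer] = len(self._log)
        return self._log[start:]
```

Because offsets are per consumer, replicating the stream to a new service is just a matter of reading the log from the start.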
Data Flow
Taxi location data.
Post‑ride billing data with timestamps for fare calculation.
Database Architecture
Supports high‑frequency reads and writes.
Taxi positions are updated every 5 seconds, generating heavy write load; ride requests generate heavy read load.
Transitioned from relational PostgreSQL to Schemaless, a schema-less NoSQL-style store that Uber built on top of MySQL.
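Uber's schema-less store (published under the name "Schemaless") models data as immutable JSON cells addressed by a row key, a column, and a version-like ref key, sharded across many MySQL instances. The in-memory sketch below shows only that data model, leaving out the MySQL sharding; all names are illustrative:

```python
import json

class SchemalessSketch:
    """Append-only cell store in the style of Uber's Schemaless:
    cells are addressed by (row_key, column, ref_key) and never
    overwritten; readers ask for the latest ref_key of a cell."""
    def __init__(self):
        self._cells = {}  # (row_key, column) -> {ref_key: json_body}

    def put(self, row_key: str, column: str, ref_key: int, body: dict):
        versions = self._cells.setdefault((row_key, column), {})
        if ref_key in versions:
            raise ValueError("cells are immutable; use a new ref_key")
        versions[ref_key] = json.dumps(body)

    def get_latest(self, row_key: str, column: str):
        versions = self._cells.get((row_key, column))
        if not versions:
            return None
        return json.loads(versions[max(versions)])
```

The append-only design suits the write-heavy location workload: writes never contend on updates in place, and history is retained for free.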
System Architecture
System Components
Map – Sending Taxi Locations to Passengers
When a passenger requests a ride, the app displays nearby drivers on a map; the client queries the server for drivers in the vicinity.
Real‑time location data from Kafka is used to calculate driver ETA, informing the passenger of arrival time and destination ETA.
Dijkstra’s algorithm finds the shortest path on road networks; more advanced AI algorithms estimate travel time considering traffic.
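The shortest-path baseline mentioned above is a standard Dijkstra search over a road graph whose edge weights are travel times. A minimal version might look like this (the graph and weights are made up):

```python
import heapq

def shortest_travel_time(graph, source, target):
    """Dijkstra's algorithm over a road graph whose edge weights are
    travel times. `graph` maps node -> list of (neighbor, minutes).
    Returns total minutes, or None if the target is unreachable."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was found
        for neighbor, minutes in graph.get(node, []):
            nd = d + minutes
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return None
```

Production ETA systems layer traffic-aware, learned travel-time estimates on top of this kind of graph search rather than using raw edge weights.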
Web Application Firewall
Security firewall blocks requests from suspicious sources or unsupported regions.
Load Balancing
Uber employs three layers of load balancing: layer 3 (network level, routing on IP addresses), layer 4 (transport level, balancing TCP connections), and layer 7 (application level, aware of HTTP requests).
Kafka
Kafka provides a log layer, instantly recording updates for consumption by multiple microservices, ensuring no data loss via a clustered Kafka deployment.
Web Sockets
Clients (passenger and driver apps) maintain long‑lived connections with servers via WebSockets for real‑time communication.
Hadoop
Uber archives Kafka streams into Hadoop for batch analysis, enabling insights such as driver density and ride request patterns.
MySQL‑Based Payment Database
Payment service, triggered by Kafka after a ride, calculates fare based on distance and time, inserts records into a MySQL database, and exposes APIs for account queries.
Supports payment options, pre‑authorizations, refunds, tip handling, scheduled rides, promotions, retries, currency conversion, and error handling.
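A distance-and-time fare with a surge multiplier and a minimum fare can be sketched as below. All rates and the function name are illustrative assumptions, not Uber's actual pricing:

```python
def calculate_fare(distance_km: float, duration_min: float,
                   surge: float = 1.0,
                   base: float = 2.50, per_km: float = 1.20,
                   per_min: float = 0.30, minimum: float = 7.00) -> float:
    """Illustrative fare: base charge plus per-km and per-minute
    components, scaled by surge, never below the minimum fare."""
    fare = (base + distance_km * per_km + duration_min * per_min) * surge
    return round(max(fare, minimum), 2)

print(calculate_fare(10, 20))             # 20.5
print(calculate_fare(1, 3))               # 7.0 (minimum fare applies)
print(calculate_fare(10, 20, surge=2.0))  # 41.0
```

Both distance and duration come from the trip's timestamped location records, which is why the billing data carries timestamps.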
Spark Streaming Cluster
Tracks events like driver shortage, dumps all events to Hadoop for deeper analysis (user segmentation, driver behavior, etc.).
Driver Profile Engine
Classifies drivers based on ratings, service punctuality, and other metrics.
Fraud Engine
Detects collusive behavior (e.g., same driver repeatedly serving the same passenger) and leverages data for traffic condition insights.
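The simplest signal for the collusion pattern described above is counting how often each driver-rider pair appears together. The threshold below is an illustrative cutoff, not a parameter of any real fraud model:

```python
from collections import Counter

def flag_collusive_pairs(trips, threshold=5):
    """Flag driver-rider pairs that ride together suspiciously often.
    `trips` is an iterable of (driver_id, rider_id) tuples."""
    counts = Counter((driver, rider) for driver, rider in trips)
    return {pair for pair, n in counts.items() if n >= threshold}
```

In practice such counts would feed a richer model alongside location, payment, and account signals rather than triggering action on their own.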
Kibana/Grafana – Elasticsearch
Log analysis, HTTP API tracing, configuration management, feedback collection, promotions, fraud detection, payment fraud, incentive abuse, and account theft monitoring.
Big Data
Big data solutions are essential for Uber’s continued evolution.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.