Inside Uber’s Complex Tech Stack: How They Scale Services Worldwide

This article breaks down Uber’s hybrid‑cloud infrastructure, storage choices, logging pipeline, service discovery, development languages, deployment tools, and monitoring system, revealing how the company builds a highly available, low‑latency platform that powers its global ride‑hailing service.

Java High-Performance Architecture

Jul 26, 2016

Underlying Foundations

Uber runs on a hybrid‑cloud model that combines multiple cloud providers and data centers worldwide. If one data center fails, traffic fails over to another. Each city’s data is also replicated to a remote site, so service continues without a dedicated backup data center.
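The failover behavior described above can be sketched as a preference list per city. This is only an illustration, not Uber's routing logic: the function name, the data center names, and the city-to-data-center assignments are all hypothetical.

```python
def pick_data_center(city, preferences, healthy):
    """Return the first healthy data center in the city's preference list."""
    for dc in preferences[city]:
        if dc in healthy:
            return dc
    raise RuntimeError(f"no healthy data center available for {city!r}")


# Hypothetical assignments: the first entry is the primary,
# the remaining entries are failover targets in order.
PREFERENCES = {
    "san_francisco": ["dc-west", "dc-east"],
    "new_york": ["dc-east", "dc-west"],
}

if __name__ == "__main__":
    # Normal operation: the primary is healthy.
    print(pick_data_center("san_francisco", PREFERENCES, {"dc-west", "dc-east"}))
    # dc-west is down, so the same city is served from dc-east.
    print(pick_data_center("san_francisco", PREFERENCES, {"dc-east"}))
```

Because every data center can act as a failover target for another, no site has to sit idle as a dedicated backup.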

Storage started with a single Postgres database, but growing demands for higher availability and lower latency pushed Uber toward other solutions. Uber now uses Schemaless (an internal MySQL‑based system) for long‑term storage, and Riak and Cassandra for high‑availability, low‑latency needs. Distributed storage and analytics rely on the Hadoop ecosystem. Caching is handled by Redis with Twemproxy, providing scalable cache clusters without sacrificing hit rates.
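One common way a proxy such as Twemproxy can spread keys across Redis nodes while keeping hit rates stable is consistent hashing. The article does not say how Uber configures its cache tier, so the following is a minimal sketch of the general technique, not Twemproxy's actual algorithm. The class name, node names, and replica count are assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring that maps cache keys to node names."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node, smooths the distribution
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        return int(digest, 16)

    def add_node(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        if not self._points:
            raise ValueError("hash ring has no nodes")
        # Walk clockwise to the first point at or after the key, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing(["redis-a", "redis-b", "redis-c"])  # hypothetical node names
    keys = [f"trip:{i}" for i in range(1000)]
    before = {k: ring.get_node(k) for k in keys}
    ring.remove_node("redis-c")
    moved = sum(1 for k in keys if ring.get_node(k) != before[k])
    print(f"{moved} of {len(keys)} keys moved after removing one node")
```

Only the keys that lived on the removed node are remapped. Keys on the surviving nodes stay where they were, which is why resizing a cluster this way does not invalidate the whole cache.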

Logging

Logs are critical for troubleshooting and business analysis. They are fed into a Kafka cluster and consumed from there by Hadoop, file storage, real‑time processing services, and other downstream systems. Log search and visualization are powered by the ELK stack (Elasticsearch, Logstash, Kibana).
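The reason one log stream can feed several consumers is that a Kafka‑style log keeps records in place and tracks a separate read position per consumer group. The sketch below is an in‑memory stand‑in for that idea only. It does not use a Kafka client, and the class, method, and group names are hypothetical.

```python
from collections import defaultdict


class LogTopic:
    """In-memory stand-in for a single partition of a log topic."""

    def __init__(self):
        self._records = []
        self._offsets = defaultdict(int)  # consumer group -> next offset to read

    def append(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return the next batch for this group and advance only its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


if __name__ == "__main__":
    topic = LogTopic()
    for line in ["trip started", "trip ended", "payment failed"]:
        topic.append(line)

    # Two independent consumer groups read the same records.
    print(topic.poll("hadoop-archiver"))
    print(topic.poll("realtime-alerts"))
    # A group that has caught up gets an empty batch until new records arrive.
    print(topic.poll("hadoop-archiver"))
```

Each group sees every record exactly once in this toy model, and a slow consumer does not hold back a fast one.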

Service Discovery and Routing

Uber adopts a service‑oriented architecture (SOA). Service communication is managed with HAProxy and Uber’s open‑source Hyperbahn system, which simplifies discovery and routing for massive microservice environments. Older services use HAProxy to route HTTP/JSON requests, while newer services employ protocols such as SPDY, HTTP/2, and TChannel together with IDLs like Thrift and Protobuf to improve speed and reliability.
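At its core, service discovery lets a caller address a service by name and have the request routed to a live instance. The following is a minimal sketch of that idea using heartbeats and round‑robin selection. It is not Hyperbahn's or HAProxy's API: the class, the TTL value, and the service and address names are assumptions, and time is passed in explicitly to keep the example deterministic.

```python
class ServiceRegistry:
    """Toy registry: instances heartbeat in, callers resolve by service name."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self._instances = {}  # service -> {address: time of last heartbeat}
        self._counters = {}   # service -> number of resolves so far

    def heartbeat(self, service, address, now):
        """Record that an instance was alive at time `now` (seconds)."""
        self._instances.setdefault(service, {})[address] = now

    def healthy(self, service, now):
        """Return addresses whose last heartbeat is within the TTL."""
        entries = self._instances.get(service, {})
        return sorted(a for a, seen in entries.items() if now - seen <= self.ttl)

    def resolve(self, service, now):
        """Pick a healthy instance, rotating round-robin between calls."""
        live = self.healthy(service, now)
        if not live:
            raise LookupError(f"no healthy instance of {service!r}")
        n = self._counters.get(service, 0)
        self._counters[service] = n + 1
        return live[n % len(live)]


if __name__ == "__main__":
    registry = ServiceRegistry(ttl_seconds=30.0)
    registry.heartbeat("pricing", "10.0.0.1:9000", now=0)
    registry.heartbeat("pricing", "10.0.0.2:9000", now=0)
    print(registry.resolve("pricing", now=1))
    print(registry.resolve("pricing", now=1))
    # Only the second instance keeps heartbeating, so the first one ages out.
    registry.heartbeat("pricing", "10.0.0.2:9000", now=90)
    print(registry.resolve("pricing", now=100))
```

Callers never hard‑code host addresses in this model, so instances can come and go without any client configuration changes.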

Development and Deployment

Primary languages are Python, Node.js, Go, and Java. Early services were written in Python and Node.js, and Java and Go were added later for performance. Java benefits from a rich open‑source ecosystem (e.g., Hadoop), while Go offers efficiency and simplicity. System‑level components use C/C++ for maximum performance.

Tools such as Phabricator (code review, bug tracking, project management) and OpenGrok (code search) support development, while Sphinx generates documentation. Deployment integrates many open‑source tools: Packer (machine image building), Vagrant (development environments), Boto (AWS API), Unison (file sync), Puppet (configuration management), and Jenkins (continuous integration).

Monitoring

Uber built a Go‑based metrics collection system that gathers data from servers, services, and code. Collected metrics are analyzed for trends and visualized with Grafana dashboards. An anomaly‑detection tool compares current values against historical models to flag out‑of‑range measurements.
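The article does not describe the historical model Uber's anomaly detector uses. As a minimal sketch of the general idea, the function below flags a value that lies more than a chosen number of standard deviations from the historical mean. The function name, the threshold, and the sample values are hypothetical.

```python
import statistics


def is_anomalous(history, current, threshold=3.0):
    """Flag `current` if it is more than `threshold` standard deviations
    away from the mean of `history` (a simple z-score check)."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # A perfectly flat history: any change at all is out of range.
        return current != mean
    return abs(current - mean) / stdev > threshold


if __name__ == "__main__":
    # Hypothetical request latencies in milliseconds.
    latency_ms = [120, 118, 125, 122, 119, 121, 123]
    print(is_anomalous(latency_ms, 124))  # close to the historical range
    print(is_anomalous(latency_ms, 180))  # far outside the historical range
```

A production system would likely account for seasonality and trend, which a plain z‑score ignores.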

Conclusion

Uber’s technology stack is highly complex, combining numerous open‑source projects, internally developed systems, and several open‑sourced components of its own.
