How Ele.me Scaled Its Architecture from All‑in‑One to Cloud‑Ready: Key Lessons
This article chronicles Ele.me’s evolution from a monolithic All‑in‑One system to a cloud‑ready, service‑oriented architecture, highlighting the cultural, operational, and technical lessons learned across four major phases of rapid growth and scaling.
As an internet startup, Ele.me grew tenfold in both business volume and engineering team size, a trajectory that mirrors that of many tech startups.
Ele.me’s technical system evolved through four distinct stages:
Early All‑in‑One core system architecture.
Domain‑driven service‑oriented architecture separating business systems and middleware.
Infrastructure transformation toward DevOps with automation platforms and container orchestration.
Cloud‑Ready architecture built on a multi‑data‑center foundation.
Stage 1: All‑in‑One
The founding engineers built a single codebase (named zeus) in Python (with some PHP) that supported millions of orders, handling merchant, user, and transaction services together. Rapid growth soon outpaced the database, prompting the creation of a big‑data warehouse and separate data centers in Beijing and Shanghai.
During this period, engineers wore many hats (frontend, backend, testing, and operations), gaining deep product knowledge and fostering a culture of responsibility, openness, and rapid problem solving.
Stage 2: Refactoring and Infrastructure
To improve delivery efficiency, the monolithic system was split into domain‑specific services, and the organization was restructured accordingly. New technology stacks (Java alongside Python) emerged as talent markets shifted.
Key challenges included:
Insufficient Java talent compared to Python.
Entangled code across business domains hindering delivery.
Engineers juggling both infrastructure and application development.
Manual deployment and monitoring processes causing slow incident recovery.
Domain boundaries were clarified, but over‑splitting sometimes led to low cohesion and increased complexity.
Lessons: Conway’s Law and Technical Culture
Repeated boundary disputes highlighted the need to respect Conway’s Law: a system’s design tends to mirror the communication structure of the organization that builds it. A strong engineering culture emphasizing resource efficiency, source‑code reading, and rapid consensus‑building proved essential.
Stage 3: DevOps and Monitoring
Automation platforms and container scheduling matured, shifting governance from traditional ops to DevOps. Monitoring moved from manual log checks to telemetry‑based systems (ELK stack, 24/7 NOC). Core metrics were categorized into business, application, and system indicators.
A notable incident involved an int32 overflow silently swallowed by code, underscoring the importance of exposing true error signals in metrics.
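The article does not show the code involved in that incident, so the following is only a minimal sketch of the general failure mode, under stated assumptions. The function names, the `error_counts` store, and the metric name `order_id.int32_overflow` are all hypothetical; `struct.pack` stands in for any boundary where a value must fit a signed 32‑bit field.

```python
import struct
from collections import Counter
from typing import Optional

INT32_MAX = 2**31 - 1

# Hypothetical in-process metric store, used only for illustration.
# A real service would report through its own metrics client.
error_counts: Counter = Counter()


def encode_id_silent(value: int) -> Optional[bytes]:
    """Anti-pattern: the overflow is caught and discarded, so no metric moves."""
    try:
        return struct.pack(">i", value)  # ">i" is a big-endian signed 32-bit int
    except struct.error:
        return None  # the caller cannot tell "overflow" from "no data"


def encode_id_observed(value: int) -> bytes:
    """Count the failure under a specific name, then let it propagate."""
    try:
        return struct.pack(">i", value)
    except struct.error:
        error_counts["order_id.int32_overflow"] += 1
        raise


if __name__ == "__main__":
    too_big = INT32_MAX + 1
    print(encode_id_silent(too_big))   # failure is invisible to monitoring
    try:
        encode_id_observed(too_big)
    except struct.error as exc:
        print("surfaced:", exc)
    print(dict(error_counts))
```

In the first function the failure never reaches any dashboard. In the second, the same failure increments a named counter and still raises, so both an application-level error metric and the caller see the true signal.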
Stage 4: Cloud‑Ready Architecture
With multiple data centers, the architecture became cloud‑ready, supporting large‑scale containerized deployments and virtualized environments.
Operational Teams and Practices
Specialized teams were formed for software delivery, system operations, DBA, and stability assurance. Monitoring dashboards now track business health, application performance, and system resources, enabling faster root‑cause analysis.
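One way to picture the three metric categories is to tag every metric with its layer and group threshold breaches by layer. This is an illustrative sketch, not Ele.me's actual monitoring code: the `Layer` and `Metric` types, the metric names, and the thresholds are assumptions, and the sketch assumes that a higher value is always worse.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List


class Layer(Enum):
    BUSINESS = "business"        # e.g. order volume, failure rate
    APPLICATION = "application"  # e.g. latency, error counts
    SYSTEM = "system"            # e.g. CPU, memory, disk


@dataclass(frozen=True)
class Metric:
    name: str
    layer: Layer
    value: float
    threshold: float  # assumption: a value above the threshold is a breach

    @property
    def breached(self) -> bool:
        return self.value > self.threshold


def breaches_by_layer(metrics: Iterable[Metric]) -> Dict[Layer, List[str]]:
    """Group the names of breached metrics under their layer."""
    grouped: Dict[Layer, List[str]] = {layer: [] for layer in Layer}
    for metric in metrics:
        if metric.breached:
            grouped[metric.layer].append(metric.name)
    return grouped


if __name__ == "__main__":
    # Hypothetical sample readings.
    sample = [
        Metric("order_failure_rate", Layer.BUSINESS, 0.08, 0.02),
        Metric("api_p99_latency_ms", Layer.APPLICATION, 850.0, 500.0),
        Metric("cpu_utilization", Layer.SYSTEM, 0.55, 0.85),
    ]
    for layer, names in breaches_by_layer(sample).items():
        print(layer.value, names)
```

With these sample readings, the business and application layers each report one breach while the system layer is clean, which would point an investigation toward application code before hardware or capacity.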
Key Takeaways for Architects
Architects must design and evolve technical solutions, define non‑functional requirements, manage domain boundaries, and guide long‑term evolution. Their role extends beyond design to influencing culture, ensuring stability, and participating throughout the product lifecycle.
Author: Huang Xiaolu (alias Maikun), joined Ele.me in October 2015, responsible for global architecture.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
