How Ele.me Scaled to 9M Daily Orders: Architecture, Service Splitting & Ops
This article explains how Ele.me grew from a student startup to handling over nine million daily orders by evolving its website architecture, adopting SOA, splitting services, implementing a robust release system, and building comprehensive monitoring and data‑access layers.
Ele.me was founded in 2009 by Zhang Xuhao and fellow university students as a simple phone‑order delivery platform. After early success, they launched a website (ele.me) and attracted investment from Tencent and Alibaba, while daily orders grew from tens of thousands to over nine million.
Overview
In the mobile‑internet era, apps have become the primary channel for user acquisition, and rapid growth forces front‑ and back‑end engineers to continuously evolve their architectures. High traffic and complex business logic require a scalable, service‑oriented approach.
1. Website Basic Architecture
The early site used an SOA framework to enable easy expansion and team collaboration. As order volume grew from thousands to millions, the team expanded from a handful of engineers to over 900, necessitating service division, centralized logging, and monitoring systems.
2. Service Splitting
Large repositories and monolithic services were broken into smaller repos and micro‑services, and common functionalities (payment, SMS, push) were extracted into independent services. Defining stable APIs early was crucial to avoid costly changes later.
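One way to picture a stable API for an extracted common service is a small, explicit contract that callers depend on instead of the implementation. The sketch below is illustrative only: `SmsRequest`, `SmsResult`, `SmsService`, and the `order_placed` template name are hypothetical and do not come from Ele.me's codebase.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SmsRequest:
    """Hypothetical request shape; fields are assumptions for illustration."""
    phone: str
    template_id: str
    params: dict


@dataclass(frozen=True)
class SmsResult:
    accepted: bool
    message_id: str


class SmsService(Protocol):
    """The contract callers code against; implementations can change freely."""

    def send(self, request: SmsRequest) -> SmsResult: ...


class InMemorySmsService:
    """Stand-in implementation that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, request: SmsRequest) -> SmsResult:
        self.sent.append(request)
        return SmsResult(accepted=True, message_id=f"msg-{len(self.sent)}")


def notify_order_placed(sms: SmsService, phone: str, order_id: str) -> SmsResult:
    # Business code only knows the SmsService contract, not who implements it.
    return sms.send(SmsRequest(phone, "order_placed", {"order_id": order_id}))
```

Because the order code depends only on `SmsService`, the team behind the SMS service can swap providers or rewrite internals without forcing changes on callers.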
3. Release System
Release windows were strictly controlled. Because releases initially lacked a simple rollback mechanism, the team built a unified release system that enforces a standardized rollback procedure for every service.
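The core idea of a standardized rollback can be sketched as a per-service version history where rolling back always means returning to the previously deployed version. This is a minimal model under that assumption, not Ele.me's release system; `ReleaseHistory` and the version strings are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReleaseHistory:
    """Tracks the versions deployed for one service (hypothetical model)."""
    service: str
    versions: list = field(default_factory=list)

    def deploy(self, version: str) -> str:
        # The most recently deployed version is the active one.
        self.versions.append(version)
        return version

    @property
    def active(self):
        return self.versions[-1] if self.versions else None

    def rollback(self) -> str:
        # Every service rolls back the same way: drop the active version
        # and return to the one deployed before it.
        if len(self.versions) < 2:
            raise RuntimeError(f"{self.service}: no earlier version to roll back to")
        self.versions.pop()
        return self.versions[-1]
```

For example, after deploying `v1` and then `v2`, calling `rollback()` makes `v1` active again, and a second `rollback()` raises because nothing older exists.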
4. Service Framework
A distributed service framework provides registration, discovery, load balancing, routing, flow control, circuit breaking, and degradation. The platform supports multiple languages (Python, Java, Go) to accommodate diverse teams.
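Three of the capabilities listed above (registration, discovery with round-robin load balancing, and circuit breaking) can be sketched in a few lines. This is a simplified in-process illustration, not the framework described in the article; the class names, thresholds, and addresses are assumptions.

```python
import itertools
import time


class Registry:
    """In-memory service registry with round-robin selection (illustrative)."""

    def __init__(self):
        self._instances = {}  # service name -> list of addresses
        self._cursors = {}    # service name -> round-robin counter

    def register(self, service, address):
        self._instances.setdefault(service, []).append(address)
        self._cursors.setdefault(service, itertools.count())

    def discover(self, service):
        return list(self._instances.get(service, []))

    def pick(self, service):
        instances = self._instances.get(service)
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        return instances[next(self._cursors[service]) % len(instances)]


class CircuitBreaker:
    """Opens after consecutive failures, allows a trial call after a cool-down."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        # Once the cool-down has elapsed, let a trial request through.
        return self.clock() - self.opened_at >= self.reset_after

    def record(self, ok):
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
```

A caller would check `allow()` before invoking the address returned by `pick()`, then report the outcome with `record()`. A production framework would also need health checks, instance deregistration, and thread safety, all of which this sketch omits.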
5. DAL (Data Access Layer)
Database bottlenecks were addressed through hardware upgrades, connection pooling, and rate limiting. The DAL reuses a small pool of connections across many processes, implements circuit breaking, and protects the database from overload.
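The pooling and overload protection described here can be illustrated with a fixed-size pool that sheds requests when every connection is busy. Note the simplification: the article's DAL shares connections across processes, whereas this sketch is a single-process pool shared by threads. `ConnectionPool`, `PoolExhausted`, and the `connect` factory are hypothetical names.

```python
import queue
from contextlib import contextmanager


class PoolExhausted(Exception):
    """Raised when no connection frees up in time, so the request is shed."""


class ConnectionPool:
    def __init__(self, connect, size=4, wait_timeout=0.05):
        # Open a fixed number of connections up front; the database never
        # sees more than `size` connections from this pool.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())
        self._wait_timeout = wait_timeout

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get(timeout=self._wait_timeout)
        except queue.Empty:
            # Fail fast instead of piling more load onto a busy database.
            raise PoolExhausted("all connections busy") from None
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)
```

Callers wrap each query in `with pool.connection() as conn:`. When the pool is saturated they receive `PoolExhausted` quickly, which acts as a crude form of rate limiting in front of the database.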
6. Service Governance
Comprehensive monitoring captures request success, latency, and other metrics. An alerting system prioritizes critical signals, and efforts are underway to provide per‑machine metrics, intelligent alert filtering, and link‑level analysis to pinpoint failures.
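As a rough illustration of the metrics mentioned above, the following sketch records per-request latency and outcome, then derives a success rate and a nearest-rank latency percentile that an alert rule can check. The class name and the thresholds (99% success, 500 ms p99) are assumptions chosen for the example, not values from the article.

```python
import math


class EndpointStats:
    """Collects latency and outcome samples for one endpoint (illustrative)."""

    def __init__(self):
        self.latencies_ms = []
        self.errors = 0

    def record(self, latency_ms, ok=True):
        self.latencies_ms.append(latency_ms)
        if not ok:
            self.errors += 1

    @property
    def success_rate(self):
        total = len(self.latencies_ms)
        return 1.0 if total == 0 else (total - self.errors) / total

    def percentile(self, p):
        # Nearest-rank percentile: the smallest sample such that at least
        # p percent of all samples are less than or equal to it.
        if not self.latencies_ms:
            raise ValueError("no samples recorded")
        ordered = sorted(self.latencies_ms)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]


def should_alert(stats, min_success_rate=0.99, max_p99_ms=500):
    # Hypothetical rule: alert on a low success rate or a slow tail.
    if not stats.latencies_ms:
        return False
    return stats.success_rate < min_success_rate or stats.percentile(99) > max_p99_ms
```

Keeping the alert rule separate from the collection code makes it easier to tune thresholds per service, which is one simple way to prioritize critical signals over noise.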
Conclusion
There is no silver bullet in software architecture; Ele.me’s architecture continuously evolves to meet business demands. The key is building systems that are good enough today while allowing forward‑looking planning for future growth.
Author: Lan Jiangang, Ele.me Architect
21CTO
