eBay’s Scalability Best Practices: Functional Partitioning, Horizontal Sharding, Avoiding Distributed Transactions, Asynchronous Decoupling, Caching, and Virtualization
The article outlines eBay’s key scalability best practices (functional partitioning, horizontal sharding, eliminating distributed transactions, aggressive asynchronous decoupling, moving processing to asynchronous flows, pervasive virtualization, and appropriate caching) to illustrate how large‑scale web systems can keep resource usage growing linearly, or better than linearly, with load while maintaining availability and performance.
At eBay, scalability is a core architectural driver that influences every design decision: the site must support hundreds of millions of users, more than two billion page views per day, and petabytes of data.
Best Practice #1: Functional Partitioning
Group related features into the same service pool and isolate unrelated features so that each pool can be scaled independently. eBay organizes roughly 16,000 application servers into 220 pools and applies the same idea at the data tier, separating databases by data type across 1,000 logical databases on 400 physical hosts.
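A minimal sketch of routing by function, assuming a simple URL-prefix scheme. The pool names, host names, and the `pool_for` and `scale_pool` helpers are hypothetical and are not eBay's actual layout; the point is only that each function maps to its own pool, which can grow without touching the others.

```python
# Hypothetical pools: each function gets its own set of hosts.
POOLS = {
    "selling": ["sell-01", "sell-02"],
    "bidding": ["bid-01", "bid-02", "bid-03"],
    "search": ["search-01", "search-02"],
}

# Hypothetical mapping from URL prefix to the pool that serves it.
ROUTES = {"/sell": "selling", "/bid": "bidding", "/search": "search"}


def pool_for(path: str) -> str:
    """Return the name of the pool responsible for a request path."""
    for prefix, pool in ROUTES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    raise LookupError(f"no pool serves {path!r}")


def scale_pool(pool: str, extra_hosts: list[str]) -> None:
    """Add capacity to one pool; the other pools are unaffected."""
    POOLS[pool].extend(extra_hosts)


scale_pool("bidding", ["bid-04"])  # bidding traffic grew; search did not
```

Because the routing table is the only place that knows which pool serves which function, a pool can be resized or split by editing data rather than application code.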
Best Practice #2: Horizontal Sharding
Break workloads into manageable units that can be scaled independently. Because application servers are stateless, a load balancer can send any request to any server in a pool, and capacity grows by adding servers. Data is sharded by key (e.g., user data across 20 hosts) and re‑sharded as volume grows.
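As an illustration of key-based sharding, the sketch below maps a user id to one of 20 shards with a stable hash and a modulo. The hashing scheme, the host naming, and the helper names are assumptions made for the example; the article does not say how eBay assigns users to hosts.

```python
import hashlib

NUM_SHARDS = 20  # matches the article's example of user data on 20 hosts


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user id to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def host_for(user_id: str, num_shards: int = NUM_SHARDS) -> str:
    """Hypothetical host naming: one database host per shard."""
    return f"user-db-{shard_for(user_id, num_shards):02d}"


def moved_on_reshard(user_ids: list[str], old: int, new: int) -> list[str]:
    """Return the users whose shard changes when the shard count changes."""
    return [u for u in user_ids if shard_for(u, old) != shard_for(u, new)]


users = [f"user{i}" for i in range(1000)]
moved = moved_on_reshard(users, 20, 40)  # doubling the shard count
```

With a modulo scheme, doubling the shard count splits each old shard into exactly two new ones (index s and index s + 20 here), so a re-shard only moves data out of one old shard per new shard. Other schemes, such as a lookup directory, trade that simplicity for finer control over placement.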
Best Practice #3: Avoid Distributed Transactions
Reject two‑phase commit in favor of relaxed consistency. Rely instead on eventual consistency techniques such as carefully ordered database operations, asynchronous event replay, and batch settlement, trading immediate consistency for availability and partition tolerance rather than insisting on strict ACID guarantees across resources.
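The ordering-plus-recovery idea can be sketched as follows. Two in-memory dictionaries stand in for two independent databases, and a `fail_second_write` flag simulates an outage; all names are hypothetical. Instead of a distributed transaction, the primary write is committed first and a failed secondary write is queued for later replay.

```python
from collections import deque

items_db = {}             # stands in for one database
users_db = {}             # stands in for a second, independent database
recovery_queue = deque()  # pending secondary writes to replay later


def record_bid(item_id, user_id, amount, fail_second_write=False):
    """Write to two stores without a distributed transaction."""
    # Step 1: commit the most important record first.
    items_db[item_id] = {"high_bidder": user_id, "amount": amount}
    # Step 2: attempt the dependent write; on failure, queue it
    # for recovery instead of rolling back step 1.
    try:
        if fail_second_write:
            raise ConnectionError("users_db unavailable (simulated)")
        users_db.setdefault(user_id, []).append(item_id)
    except ConnectionError:
        recovery_queue.append((user_id, item_id))


def replay_recovery_events():
    """Apply queued writes; safe to run more than once (idempotent)."""
    while recovery_queue:
        user_id, item_id = recovery_queue.popleft()
        bids = users_db.setdefault(user_id, [])
        if item_id not in bids:
            bids.append(item_id)


record_bid("item-1", "user-1", 10, fail_second_write=True)
inconsistent = "user-1" not in users_db  # the stores disagree for a while
replay_recovery_events()                 # and converge after replay
```

The two stores are briefly inconsistent, which is the price of avoiding two-phase commit; the replay step has to be idempotent because a recovery event may be delivered more than once.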
Best Practice #4: Asynchronous Decoupling
Prefer asynchronous communication (queues, multicast, batch jobs) so components can scale and fail independently. The same principle applies inside a component: techniques such as SEDA (staged event‑driven architecture) enable intra‑component asynchronous processing while preserving a simple programming model.
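A small sketch of queue-based decoupling using only the Python standard library. The event names and the `publish` and `consumer` functions are illustrative assumptions; in production the in-process queue would typically be a durable message broker.

```python
import queue
import threading

events = queue.Queue()  # the only thing producer and consumer share
processed = []


def publish(event):
    """Producer side: enqueue and return without waiting for the consumer."""
    events.put(event)


def consumer():
    """Consumer side: drain events at its own pace until told to stop."""
    while True:
        event = events.get()
        try:
            if event is None:  # sentinel: shut down
                return
            processed.append(event.upper())
        finally:
            events.task_done()


worker = threading.Thread(target=consumer, daemon=True)
worker.start()

for name in ("item_listed", "bid_placed"):
    publish(name)

events.join()   # wait until everything published so far is handled
publish(None)   # ask the consumer to stop
worker.join()
```

The producer never calls the consumer directly, so a slow or temporarily unavailable consumer shows up as queue depth rather than as producer latency or failure.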
Best Practice #5: Move Processing to Asynchronous Flows
Shift non‑critical work (e.g., analytics, billing, reporting) to background pipelines, reducing request‑time latency and allowing infrastructure to be sized for average load rather than peak spikes.
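One way to picture this, assuming a hypothetical page-view handler: the request path only appends a small record, and a separate batch job does the expensive aggregation later. The function names and the in-memory log are placeholders for whatever logging and batch infrastructure is actually in use.

```python
from collections import Counter

activity_log = []  # stands in for an append-only log or queue


def handle_page_view(user_id, page):
    """Request path: do the essential work, defer everything else."""
    activity_log.append((user_id, page))  # cheap append, no aggregation here
    return f"rendered {page}"


def run_reporting_batch():
    """Background job: aggregate whatever has accumulated, then clear it."""
    counts = Counter(page for _, page in activity_log)
    activity_log.clear()
    return counts


handle_page_view("u1", "home")
handle_page_view("u2", "home")
handle_page_view("u1", "item")
report = run_reporting_batch()
```

Because the batch can run whenever capacity is available, a traffic spike lengthens the backlog instead of slowing down each request.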
Best Practice #6: Virtualize at All Levels
Apply virtualization and abstraction—from OS and VM layers to ORM, load balancers, and virtual IPs—to enable flexible rebalancing of logical hosts across physical machines without code changes.
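The logical-to-physical indirection can be sketched with a plain lookup table. The database and host names below are invented for the example; the idea is only that application code refers to logical names, so moving a logical database is a change to the mapping and not to the code.

```python
# Hypothetical placement of logical databases on physical hosts.
placement = {
    "user_db_01": "dbhost-a",
    "user_db_02": "dbhost-a",
    "item_db_01": "dbhost-b",
}


def physical_host(logical_db: str) -> str:
    """Resolve a logical database name to the host currently serving it."""
    return placement[logical_db]


def rebalance(logical_db: str, new_host: str) -> None:
    """Move a logical database; callers keep using the same logical name."""
    placement[logical_db] = new_host


def load_user(user_id: int) -> tuple[str, int]:
    """Application code names only the logical database."""
    return (physical_host("user_db_02"), user_id)


before = load_user(42)
rebalance("user_db_02", "dbhost-c")  # dbhost-a was running hot
after = load_user(42)
```

The same pattern applies at other layers the article mentions, such as virtual IPs in front of a pool of servers.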
Best Practice #7: Proper Caching
Use caching wisely to maximize hit rates within memory limits, balancing freshness, availability, and cost; cache slowly changing metadata aggressively while avoiding caching rapidly changing transactional data unless consistency requirements permit.
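To make the trade-off between freshness and memory concrete, here is a minimal sketch of a size-bounded cache with a time-to-live. The `TTLCache` class and its parameters are assumptions for illustration, not a description of eBay's caching layer; the clock is injectable so that expiry can be exercised without sleeping.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Bounded cache: entries expire after ttl_seconds, oldest-used evicted first."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, loader):
        """Return a fresh cached value, or call loader(key) and cache the result."""
        now = self.clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            self._data.move_to_end(key)  # mark as recently used
            return entry[1]
        value = loader(key)  # miss or expired: go to the source
        self._data[key] = (now + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used
        return value


# Usage with a fake clock and a loader that records each backing-store hit.
now = [0.0]
loads = []


def load_category(key):
    loads.append(key)
    return key.upper()


cache = TTLCache(max_entries=2, ttl_seconds=10, clock=lambda: now[0])
cache.get("books", load_category)  # miss: loads from the source
cache.get("books", load_category)  # hit: served from memory
now[0] = 11.0
cache.get("books", load_category)  # expired: loads again
```

A long TTL suits slowly changing metadata; for rapidly changing transactional data the TTL would have to be so short that the cache adds little, which matches the caution in the text.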
Conclusion
Scalability is not a secondary non‑functional requirement but a prerequisite for functionality; the described practices help architects design systems that can grow efficiently and remain highly available.
Architects Research Society