eBay’s Scalability Best Practices: Functional Partitioning, Horizontal Sharding, Avoiding Distributed Transactions, Asynchronous Decoupling, Caching, and Virtualization

This article outlines eBay’s key scalability practices (functional partitioning, horizontal sharding, avoiding distributed transactions, asynchronous decoupling, moving processing to asynchronous flows, pervasive virtualization, and appropriate caching) to show how large‑scale web systems can keep resource usage growing linearly or sub‑linearly with load while maintaining availability and performance.

Architects Research Society

At eBay, scalability is a core architectural driver that influences every design decision: the site must support hundreds of millions of users, over 2 billion page views per day, and petabytes of data.

Best Practice #1: Functional Partitioning

Group related features into the same service pool and isolate unrelated features, allowing independent scaling; eBay organizes ~16,000 application servers into 220 pools and separates databases by data type across 1,000 logical databases on 400 physical hosts.
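
As a rough illustration of the idea, the sketch below maps functional areas to separate server pools so that each pool can grow on its own. The pool names, host names, and URL prefixes are hypothetical and are not taken from eBay's actual topology.

```python
# Hypothetical pool names and hosts, for illustration only.
POOLS = {
    "selling": ["sell-app-01", "sell-app-02"],
    "bidding": ["bid-app-01", "bid-app-02", "bid-app-03"],
    "search": ["search-app-01"],
}

# Each URL prefix stands in for one functional area and names its pool.
ROUTES = {
    "/sell": "selling",
    "/bid": "bidding",
    "/search": "search",
}

def pool_for(path: str) -> str:
    """Return the name of the pool that owns the given request path."""
    for prefix, pool in ROUTES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    raise LookupError(f"no pool owns {path!r}")

def add_capacity(pool: str, host: str) -> None:
    """Scale one functional area without touching the others."""
    POOLS[pool].append(host)
```

Calling `add_capacity("search", "search-app-02")` grows only the search pool; the selling and bidding pools keep their size, which is the independence that functional partitioning is meant to provide.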

Best Practice #2: Horizontal Sharding

Break workloads into manageable units that can be scaled independently; stateless application servers are load‑balanced, while data is sharded (e.g., user data across 20 hosts) and re‑sharded as volume grows.
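
A minimal sketch of key-based sharding follows. It assumes a stable hash (CRC32) taken modulo the shard count, which is only one of several possible schemes (range-based or lookup-table sharding are others); the source does not say which scheme is used for user data, and the function names are illustrative.

```python
import zlib

def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user ID to a shard index using CRC32 modulo the shard count."""
    return zlib.crc32(user_id.encode("utf-8")) % shard_count

def moved_on_reshard(user_ids, old_count, new_count):
    """Return the IDs whose shard changes when the shard count changes."""
    return [
        u for u in user_ids
        if shard_for(u, old_count) != shard_for(u, new_count)
    ]

users = [f"user{i}" for i in range(1000)]
# Re-sharding from 20 to 40 hosts: each key either stays where it was
# or moves to exactly one new shard (its old index plus 20).
moved = moved_on_reshard(users, 20, 40)
```

Doubling the shard count is a convenient special case of modulo sharding because every old shard splits into exactly two, so data only ever moves from an existing host to one new host.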

Best Practice #3: Avoid Distributed Transactions

Reject two‑phase commit in favor of relaxed consistency; rely on eventual consistency techniques such as ordered DB operations, asynchronous event replay, and batch settlement, choosing availability and partition tolerance over strict ACID guarantees.
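
The combination of ordered operations and asynchronous replay can be sketched as below. Two in-memory dictionaries stand in for two separate databases, and all names (`bids`, `item_summary`, `place_bid`, and so on) are hypothetical. The write to the source of truth happens first; the second write is best-effort and idempotent, and a failure is recorded as a recovery event instead of rolling back.

```python
from collections import deque

bids = {}                 # stand-in for database A (source of truth)
item_summary = {}         # stand-in for database B (derived data)
recovery_queue = deque()  # events to replay when the second write fails

def update_summary(item_id, amount, fail=False):
    """Idempotent write to database B: keep the highest bid seen."""
    if fail:
        raise ConnectionError("database B unavailable")
    item_summary[item_id] = max(item_summary.get(item_id, 0), amount)

def place_bid(bid_id, item_id, amount, fail_b=False):
    # Step 1: commit to the source of truth first, as a local operation.
    bids[bid_id] = (item_id, amount)
    # Step 2: best-effort update of the derived store; no two-phase commit.
    try:
        update_summary(item_id, amount, fail=fail_b)
    except ConnectionError:
        # Record the failed write so it can be replayed later.
        recovery_queue.append((item_id, amount))

def replay_recovery_events():
    """Re-apply failed writes so the two stores eventually converge."""
    while recovery_queue:
        item_id, amount = recovery_queue.popleft()
        update_summary(item_id, amount)
```

Because `update_summary` is idempotent, replaying an event more than once is harmless, which is what makes this kind of recovery safe without a transaction coordinator.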

Best Practice #4: Asynchronous Decoupling

Prefer asynchronous communication (queues, multicast, batch jobs) between components so that each can scale and fail independently; within a component, techniques such as SEDA (staged event‑driven architecture) enable asynchronous processing while preserving a simple programming model.
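
A minimal in-process sketch of queue-based decoupling is shown below, using Python's standard `queue` and `threading` modules. In a real deployment the queue would be a separate messaging system; the function names here are illustrative only.

```python
import queue
import threading

def run_demo(n=5):
    events = queue.Queue()   # buffer that decouples producer from consumer
    processed = []

    def handle_request(order_id):
        """Producer: enqueue the event and return without waiting."""
        events.put(order_id)
        return "accepted"

    def worker():
        """Consumer: drain the queue at its own pace."""
        while True:
            order_id = events.get()
            if order_id is None:      # sentinel value: shut down
                events.task_done()
                break
            processed.append(order_id)
            events.task_done()

    consumer = threading.Thread(target=worker, daemon=True)
    consumer.start()
    for i in range(n):
        handle_request(i)
    events.put(None)
    events.join()      # block until every queued item is marked done
    consumer.join()
    return processed
```

The producer only depends on the queue being available, not on the consumer, so a slow or briefly unavailable consumer delays processing without failing the request path.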

Best Practice #5: Move Processing to Asynchronous Flows

Shift non‑critical work (e.g., analytics, billing, reporting) to background pipelines, reducing request‑time latency and allowing infrastructure to be sized for average load rather than peak spikes.
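
The sizing argument can be checked with a small simulation. The arrival counts and capacity below are made-up numbers: a spike of 50 jobs arrives in one tick, but a queue lets a system that processes only 20 jobs per tick absorb it and drain the backlog over the following ticks.

```python
def simulate_backlog(arrivals, capacity):
    """Return the backlog after each tick when at most `capacity`
    jobs are processed per tick and the rest wait in a queue."""
    backlog = 0
    history = []
    for arriving in arrivals:
        backlog += arriving
        backlog -= min(backlog, capacity)
        history.append(backlog)
    return history

# Peak load is 50 jobs per tick, average is about 16.7, capacity is 20.
print(simulate_backlog([10, 10, 50, 10, 10, 10], 20))
# prints [0, 0, 30, 20, 10, 0]
```

A synchronous design would need capacity for 50 jobs per tick to avoid rejecting work; the asynchronous version trades that extra capacity for a few ticks of added latency on non-critical work.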

Best Practice #6: Virtualize at All Levels

Apply virtualization and abstraction at every layer, from the operating system and language virtual machine to object‑relational mapping, load balancers, and virtual IPs, so that logical hosts can be rebalanced across physical machines without code changes.
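
One way to picture the logical-to-physical indirection is a placement table that application code never bypasses. The host names and the `placement` table below are hypothetical; the point is only that moving a logical host is a data change, not a code change.

```python
# Hypothetical logical-to-physical placement table.
placement = {
    "users-db-01": "phys-host-a",
    "users-db-02": "phys-host-a",
    "items-db-01": "phys-host-b",
}

def resolve(logical_host: str) -> str:
    """Look up the physical machine currently serving a logical host."""
    return placement[logical_host]

def rebalance(logical_host: str, new_physical_host: str) -> None:
    """Move a logical host to another machine by updating the table."""
    placement[logical_host] = new_physical_host

def fetch_user(user_id: int) -> str:
    """Application code names only the logical host."""
    logical = "users-db-02"
    return f"query user {user_id} on {resolve(logical)}"
```

After `rebalance("users-db-02", "phys-host-c")`, the unchanged `fetch_user` function resolves to the new machine on its next call.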

Best Practice #7: Proper Caching

Use caching wisely to maximize hit rates within memory limits, balancing freshness, availability, and cost; cache slowly changing metadata aggressively while avoiding caching rapidly changing transactional data unless consistency requirements permit.
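
The trade-off between freshness and memory can be sketched with a small cache that combines a time-to-live (bounding staleness) with least-recently-used eviction (bounding memory). The class name `TTLCache` and its parameters are illustrative, not a specific library's API.

```python
import time
from collections import OrderedDict

class TTLCache:
    """Bounded cache: entries expire after a TTL, and the least
    recently used entry is evicted when the cache is full."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.clock = clock            # injectable so tests can control time
        self._data = OrderedDict()    # key -> (expires_at, value)

    def get(self, key, loader):
        now = self.clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            self._data.move_to_end(key)     # mark as recently used
            return entry[1]
        value = loader(key)                 # miss or stale: reload from source
        self._data[key] = (now + self.ttl, value)
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used
        return value
```

A long TTL suits slowly changing metadata, where a stale read is cheap; for rapidly changing transactional data the TTL would have to be so short that the cache stops paying for itself.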

Conclusion

Scalability is not a secondary non‑functional requirement but a prerequisite for functionality; the described practices help architects design systems that can grow efficiently and remain highly available.
