Cracking Alibaba’s 10M Orders Interview: Architecture Seven‑Suite + Heterogeneous Storage Solution

The article dissects Alibaba’s second‑round interview question on handling 10 million daily order queries, exposing why a single sharding answer fails and presenting a comprehensive architecture‑seven‑suite combined with heterogeneous storage (MySQL, HBase, ClickHouse, ES, Redis, MQ) to achieve high concurrency, low latency, and reliable data consistency.

Tech Freedom Circle

In an Alibaba second‑round interview, a candidate answered the "10 million daily order query" scenario with only sharding, which led to failure. The core lesson is that architecture must be tailored to the specific scenario, covering traffic, service, cache, MQ, search, storage, and data‑consistency layers.

Overall Approach

The solution adopts a "seven‑suite architecture" plus heterogeneous storage (MySQL + CDC + HBase + ClickHouse). It builds a full‑link defensive system that integrates traffic control, service decoupling, caching, asynchronous messaging, search optimization, and layered storage.

1. Traffic Architecture Three‑Suite

Rate limiting : Apply per‑user/IP QPS thresholds (e.g., 100 k QPS for front‑end, 100 QPS for admin) using Sentinel and token‑bucket algorithms.

Degradation : When DB CPU > 85 % or HBase latency > 500 ms, switch to degraded responses (only core fields).

Scaling : Use Kubernetes HPA/KEDA for horizontal scaling; pre‑scale storage nodes before traffic spikes.
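The rate-limiting leg of this trio can be illustrated with a minimal token bucket. In production this role is played by Sentinel flow rules; the class and the 100-QPS admin threshold below are only an illustrative sketch of the underlying mechanism:

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter (illustrative; Sentinel provides this
    in production, with per-resource rules instead of a hand-rolled class)."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate            # tokens replenished per second
        self.capacity = capacity    # burst ceiling
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self, cost: int = 1) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False          # over budget: reject or queue the request

# Example: a 100-QPS admin endpoint that tolerates a burst of 10.
admin_limiter = TokenBucket(rate=100, capacity=10)
```

A request is admitted only while tokens remain, so sustained load converges to the configured rate while short bursts up to `capacity` pass through.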

2. Service Architecture Three‑Suite

Decoupling : Split order service into order‑write and order‑query microservices (OpenFeign/gRPC/Dubbo3, Nacos discovery).

Asynchrony : Offload heavy tasks via CDC + MQ, ensuring the main transaction path remains fast.

Query offloading : Route online point queries to HBase and analytical queries to ClickHouse, reducing MySQL load.
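The read-routing rule in this trio amounts to a small dispatch table. The query-type labels below are assumptions chosen for illustration, not names from the article:

```python
def route_query(query_type: str) -> str:
    """Sketch of read routing: point lookups to HBase, analytics to
    ClickHouse, strongly consistent reads to MySQL (the source of truth)."""
    targets = {
        "point": "hbase",           # get-by-order_id, high-throughput reads
        "analytics": "clickhouse",  # aggregations and batch scans
        "transactional": "mysql",   # must observe the latest committed write
    }
    # Unknown query types fall back to the source of truth.
    return targets.get(query_type, "mysql")
```

In a real query microservice this decision would be driven by the API being called rather than a string tag, but the shape of the rule is the same.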

3. Cache Architecture Three‑Suite

Penetration protection : Bloom filter covering all valid order_id from MySQL and HBase; dynamic updates for new orders.

Cache‑break protection : Cache empty results (TTL = 2 min for MySQL lookups, 1 min for HBase/ClickHouse) so that repeated queries for nonexistent orders never reach the database.

Cache‑avalanche protection : Staggered expiration and distributed mutex (SETNX) for top‑hot items.
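The empty-result caching, staggered expiration, and SETNX mutex can be combined in one read path. The sketch below uses an in-memory stand-in for Redis (the `FakeRedis` class, sentinel value, and TTL jitter range are assumptions for illustration; a real deployment would use a Redis client and give the lock its own short TTL):

```python
import random

class FakeRedis:
    """In-memory stand-in for Redis, just enough for this sketch
    (TTLs are accepted but not enforced by the toy store)."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value, ttl=None):
        self.store[key] = value
    def setnx(self, key, value):
        if key in self.store:
            return False
        self.store[key] = value
        return True

NULL_SENTINEL = "__EMPTY__"   # cached marker for "this order does not exist"

def get_order(r, order_id, load_from_db):
    key = f"order:{order_id}"
    cached = r.get(key)
    if cached is not None:
        # Empty results are cached too, so misses never storm the database.
        return None if cached == NULL_SENTINEL else cached
    # Cache-break protection: only one caller rebuilds a missing hot key.
    if r.setnx(f"lock:{key}", "1"):
        row = load_from_db(order_id)
        # Staggered expiry: add jitter so hot keys don't expire together.
        ttl = 120 + random.randint(0, 30)
        r.set(key, row if row is not None else NULL_SENTINEL, ttl=ttl)
        return row
    return None   # lock held by another caller; degrade or retry briefly
```

The Bloom-filter check from the penetration suite would sit in front of this function, rejecting order ids that cannot exist before Redis is even consulted.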

4. MQ Architecture Three‑Suite

Zero loss : Transactional messages (e.g., RocketMQ) guarantee atomic "order persist + message send"; enable persistence, clustering, and ACK after cache/ES sync.

Ordering : Partition by order_id for intra‑order ordering; global ordered queues for high‑value orders.

Idempotence : Unique idempotent keys (order_id + msg_type) stored in Redis with 24 h TTL; duplicate detection before processing.
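The ordering and idempotence legs can be sketched together: a stable hash keeps every message of one order on one partition, and a deduplication key rejects redelivered messages. The in-memory `seen` set stands in for the Redis keys with a 24 h TTL described above:

```python
import zlib

def partition_for(order_id: str, num_partitions: int) -> int:
    """Ordering suite: a stable hash sends every message of one order to
    the same partition, so the broker preserves intra-order sequence."""
    return zlib.crc32(order_id.encode()) % num_partitions

class IdempotentConsumer:
    """Idempotence suite: dedupe on (order_id, msg_type). In production the
    seen-set lives in Redis (SETNX with a 24 h TTL); a set stands in here."""
    def __init__(self):
        self.seen = set()

    def handle(self, order_id, msg_type, process):
        key = (order_id, msg_type)
        if key in self.seen:
            return False          # duplicate delivery: skip processing
        self.seen.add(key)
        process(order_id, msg_type)
        return True
```

Note that `zlib.crc32` is used instead of Python's built-in `hash()` because the latter is randomized per process, which would break the "same order, same partition" guarantee across restarts.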

5. Search Layer Four‑Suite (ES)

Elasticsearch handles real‑time aggregation and fuzzy queries, while ClickHouse takes over heavy batch analytics. The ES stack includes index optimization, query routing, storage, and Redis cache for hot results.

6. Storage Layer Four‑Suite

Cold‑hot separation :

Hot (last 30 days) → MySQL primary‑replica.

Warm (30‑180 days) → CDC to HBase for point queries.

Cold (> 180 days) → HBase compressed storage + ETL to ClickHouse for analytics.
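The cold-hot routing above reduces to a lookup on order age. The function below is a sketch using the 30- and 180-day thresholds from the scheme (the combined `"hbase+clickhouse"` label for cold data is an illustrative shorthand):

```python
from datetime import datetime

def storage_tier(created_at: datetime, now: datetime) -> str:
    """Cold-hot separation: choose the store by order age."""
    age_days = (now - created_at).days
    if age_days <= 30:
        return "mysql"              # hot: primary-replica, strong consistency
    if age_days <= 180:
        return "hbase"              # warm: CDC-fed, fast point queries
    return "hbase+clickhouse"       # cold: compressed storage + analytics
```

A query router would call this before dispatching a read, and an archival job would use the same thresholds to migrate rows between tiers.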

Asynchronous dual‑write : MySQL writes trigger CDC + MQ to write simultaneously to HBase and ClickHouse, with periodic consistency checks.
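The periodic consistency check can be sketched as a reconciliation pass that compares per-order versions between the source of truth and a CDC-fed replica. The row shape (dicts keyed by `order_id` with a `version` field) is an assumption for illustration:

```python
def consistency_check(mysql_rows: dict, replica_rows: dict) -> list:
    """Reconciliation sketch: return order ids that are missing or stale
    in a CDC-fed replica (HBase or ClickHouse) relative to MySQL."""
    drift = []
    for order_id, row in mysql_rows.items():
        replica = replica_rows.get(order_id)
        if replica is None or replica["version"] < row["version"]:
            drift.append(order_id)   # missing or stale: queue for repair
    return sorted(drift)
```

A repair job would then re-emit CDC events (or copy rows directly) for the drifted ids, closing the loop on the asynchronous dual-write.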

Heterogeneous storage : MySQL for strong consistency, HBase for high‑throughput point reads, ClickHouse for column‑oriented batch analysis, ES for real‑time search, Redis for multi‑level caching.

7. Data Consistency Two‑Suite

Combine precise tactics (transactional messages, version checks) with scenario‑based consistency levels to balance performance and correctness.
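Both tactics are small in code. The scenario labels and consistency levels below are illustrative assumptions; the version check is the guard a CDC consumer would run before applying an event to a replica:

```python
# Assumed mapping of read scenarios to consistency levels (illustrative).
CONSISTENCY = {
    "payment_status": "strong",           # read MySQL directly
    "order_detail": "read_your_writes",   # cache with invalidation on write
    "order_history": "eventual",          # replica or cached reads suffice
}

def should_apply(stored_version: int, incoming_version: int) -> bool:
    """Version check: apply a CDC event only if it is strictly newer than
    what the replica holds, ignoring stale or replayed messages."""
    return incoming_version > stored_version
```

Transactional messages handle the write side (the event is only published if the order commit succeeds); the version check handles the read side of the replica, so out-of-order delivery cannot roll a row backwards.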

Scenario‑Driven Design

Four typical scenarios illustrate how to combine suites:

Hotspot queries (TOP 100 merchants) : Multi‑level cache + HBase query offloading + traffic limiting.

Aggregated analytics : ES indexing + ClickHouse batch processing.

Large‑scale scans (monthly finance reconciliation) : Route all scans to ClickHouse, combined with service offloading and traffic degradation.

Write‑heavy, read‑light IoT order streams : MySQL for transaction, CDC + MQ to HBase and ClickHouse, async pre‑compute for rare reads.

Interview Q&A Highlights

The article provides ready‑to‑use answers for common follow‑up questions, such as when to choose sharding vs. ES vs. ClickHouse, how to handle cache penetration, guaranteeing MQ zero‑loss, avoiding cold‑data archive issues, and why HBase is unsuitable for analytics while ClickHouse excels at batch queries.

Overall, the seven‑suite plus heterogeneous storage framework demonstrates a systematic, scalable, and interview‑ready architecture for the "10 million daily order" challenge.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: distributed systems, backend architecture, microservices, high concurrency, interview preparation, database scaling
Written by

Tech Freedom Circle

Crazy Maker Circle (Tech Freedom Architecture Circle): a community of tech enthusiasts, experts, and high‑performance fans. Many top‑level masters, architects, and hobbyists have achieved tech freedom; another wave of go‑getters are hustling hard toward tech freedom.
