Alibaba Hema’s 7‑Layer Funnel & 23 Tactics for Ultra‑Fast Delivery Stability

This article outlines the end‑to‑end stability strategy of Alibaba’s Hema delivery platform: a 7‑layer funnel review process, three core norms (development, architecture, stability), and 23 practical tactics. The tactics include core/non‑core isolation, proactive monitoring, fault prevention, rapid recovery, and service‑level controls, all aimed at reliable 30‑minute deliveries despite complex logistics and external disruptions.

Alibaba Cloud Developer

Background

Hema (盒马) is a large‑scale new‑retail platform that combines online and offline operations. Its delivery service promises 30‑minute door‑to‑door delivery within a 3 km radius, which requires a highly stable end‑to‑end system.

Three Core Norms

The technical department distilled its stability methodology into three “norms”: development norm, architecture norm, and stability norm.

7‑Layer Funnel Model

The 7‑layer funnel (PRD review → Technical solution review → TC review → Coding → Testing & Code Review → Gray‑release → Operations) filters out major faults before they reach the field.

[Figure: 7‑layer funnel model]

Key Review Stages

PRD Review: Bi‑weekly demand pool screening, risk identification, and domain modeling.

Technical Solution Review: Cross‑team technical walkthrough and risk mitigation.

TC Review: Coverage, performance, testability, and release timing assessment.

Coding: Follow corporate coding standards, defensive programming, and high‑availability patterns (caching, retries, transactions, logging).

Testing & Code Review: Self‑test, smoke test, formal test, and code “online review”.

Gray Release: Controlled rollout per store, real‑time monitoring (SLS, A3, EagleEye, CloudDBA) and staged scaling.

Operations: Post‑release monitoring, rapid incident escalation, and coordinated response.
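
The gray‑release stage above (per‑store rollout, real‑time monitoring, staged scaling) can be sketched roughly as follows. This is a minimal illustration, not Hema's actual tooling: `staged_rollout`, the stage sizes, the 1% threshold, and the `error_rate` hook are all hypothetical names and values, and the hook stands in for whatever signal the monitoring systems would provide.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class RolloutResult:
    released: List[str]   # stores that received the new version
    halted_at_stage: int  # index of the stage that failed the gate, or -1


def staged_rollout(
    stores: Sequence[str],
    stage_sizes: Sequence[int],
    error_rate: Callable[[List[str]], float],
    max_error_rate: float = 0.01,
) -> RolloutResult:
    """Release to stores in growing batches; stop when the gate fails."""
    released: List[str] = []
    cursor = 0
    for stage, size in enumerate(stage_sizes):
        batch = list(stores[cursor:cursor + size])
        if not batch:
            break
        released.extend(batch)
        cursor += size
        # Hypothetical monitoring hook: error rate observed on released stores.
        if error_rate(released) > max_error_rate:
            return RolloutResult(released, stage)
    return RolloutResult(released, -1)


if __name__ == "__main__":
    stores = [f"store-{i}" for i in range(10)]
    result = staged_rollout(stores, [1, 3, 6], lambda released: 0.0)
    print(result.halted_at_stage, len(result.released))
```

Starting with a single store keeps the blast radius small; a real rollout would also roll back the stores already released when the gate fails, which this sketch leaves out.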

System Isolation & Service Design

More than 50 systems (20 core) are separated into core and non‑core services, with dedicated databases (MySQL for core, ADS for analytics, OpenSearch/ODPS for non‑core). Calls use HSF request/response and event‑driven messaging, with “carrier‑level” services to shield core functions from external failures.
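
As a rough sketch of the event‑driven side of this isolation: the core path commits its own state and only publishes an event, so a failing non‑core consumer cannot break it. Everything here is an assumption for illustration. `EventBus` is an in‑memory stand‑in for a real message broker, and `assign_order` and the event fields are hypothetical names, not HSF or Hema APIs.

```python
import queue
from typing import Callable, Dict, List

Event = Dict[str, str]


class EventBus:
    """In-memory stand-in for a message broker (illustrative assumption)."""

    def __init__(self) -> None:
        self._events = queue.Queue()
        self._subscribers: List[Callable[[Event], None]] = []
        self.failed: List[Event] = []  # events a consumer could not process

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: Event) -> None:
        self._events.put(event)

    def drain(self) -> None:
        """Deliver queued events; consumer errors are recorded, not raised."""
        while not self._events.empty():
            event = self._events.get()
            for handler in self._subscribers:
                try:
                    handler(event)
                except Exception:
                    self.failed.append(event)


def assign_order(order_id: str, rider_id: str,
                 core_store: Dict[str, str], bus: EventBus) -> None:
    """Core path: commit the assignment first, then notify non-core systems."""
    core_store[order_id] = rider_id
    bus.publish({"type": "order_assigned",
                 "order_id": order_id, "rider_id": rider_id})


if __name__ == "__main__":
    def broken_analytics(event: Event) -> None:
        raise RuntimeError("analytics store unavailable")

    bus, core = EventBus(), {}
    bus.subscribe(broken_analytics)
    assign_order("o-1", "r-7", core, bus)
    bus.drain()
    print(core, len(bus.failed))
```

The failed events are kept so they can be replayed later, which is one way a non‑core system can catch up after it recovers.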

[Figure: Core‑noncore isolation diagram]

Seven Practical Tactics

Core and non‑core isolation at application and database layers.

Timely problem detection via service‑level controls (idempotency, parameter checks, circuit breaking) and system‑level monitoring (traffic scheduling, red‑line enforcement, A3/EagleEye/SLS metrics).

Fault prevention through regular refactoring, timeout/retry mechanisms, and fault‑injection drills.

Fault mitigation with resource buffers, degradation plans, and fallback strategies for partner services.

Rapid recovery via targeted rollbacks, flexible availability, and one‑click repair tools.

Quick compensation using stateless, horizontally‑scaled services.

Release‑based “treatment” for unrecoverable issues, exemplified by a recent high‑load incident resolved by emergency deployment.
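
Two of the tactics listed above, circuit breaking and fallback for partner services, are often combined. The following is a minimal sketch under stated assumptions: the `CircuitBreaker` class, its thresholds, and the injectable `clock` are hypothetical and are not taken from Hema's codebase or any specific library.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Count-based breaker: opens after repeated failures, retries later."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to stay open
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                return fallback()  # open: do not touch the failing dependency
            # Half-open: allow one trial call; a single failure re-opens.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = primary()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return fallback()  # degrade instead of propagating the error
        self._failures = 0
        return result


if __name__ == "__main__":
    def partner_eta() -> int:
        raise TimeoutError("partner service timed out")

    breaker = CircuitBreaker(max_failures=2, reset_after=10.0)
    for _ in range(3):
        # Fallback: a conservative default estimate, in minutes.
        print(breaker.call(partner_eta, lambda: 30))
```

Passing the clock in as a parameter keeps the breaker testable without sleeping; a production breaker would usually also track failure rates over a time window rather than a simple counter.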

Performance Optimization Example

By converting a Cartesian‑product matching problem into a matrix computation, the team cut network calls from 108 to 9 (108 / 9 = 12), which is reported as a 12× performance gain.
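
The source does not give the dimensions behind these numbers, so the sketch below assumes, purely for illustration, 9 riders and 12 orders: 9 × 12 = 108 pairwise calls collapse to 9 calls when each call returns a whole row of the cost matrix. `DistanceService` is a hypothetical stand‑in for a remote service, and Manhattan distance is an arbitrary placeholder cost.

```python
from typing import List, Tuple

Point = Tuple[float, float]


class DistanceService:
    """Stand-in for a remote distance service; counts round trips."""

    def __init__(self) -> None:
        self.calls = 0

    @staticmethod
    def _manhattan(a: Point, b: Point) -> float:
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def pair(self, rider: Point, order: Point) -> float:
        """One round trip per (rider, order) pair."""
        self.calls += 1
        return self._manhattan(rider, order)

    def row(self, rider: Point, orders: List[Point]) -> List[float]:
        """One round trip per rider, returning a full matrix row."""
        self.calls += 1
        return [self._manhattan(rider, o) for o in orders]


def pairwise_matrix(svc: DistanceService, riders: List[Point],
                    orders: List[Point]) -> List[List[float]]:
    # Cartesian product: len(riders) * len(orders) calls.
    return [[svc.pair(r, o) for o in orders] for r in riders]


def batched_matrix(svc: DistanceService, riders: List[Point],
                   orders: List[Point]) -> List[List[float]]:
    # Row-batched: len(riders) calls for the same matrix.
    return [svc.row(r, orders) for r in riders]


if __name__ == "__main__":
    riders = [(float(i), 0.0) for i in range(9)]
    orders = [(0.0, float(j)) for j in range(12)]
    naive, batched = DistanceService(), DistanceService()
    same = pairwise_matrix(naive, riders, orders) == batched_matrix(batched, riders, orders)
    print(same, naive.calls, batched.calls)
```

Both functions build the same cost matrix; only the number of round trips differs, which is where the saving comes from when network latency dominates.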

[Figure: Matrix optimization diagram]

Conclusion

Hema’s delivery stability relies on coordinated efforts across business, product, development, testing, web, app, RF, GOC, algorithms, IoT, NBF, security, middleware, network, weather, traffic, and rider equipment. Continuous learning and rigorous engineering practices keep the system resilient.
