How Alibaba’s Dual-Path Real-Time Computing Powers Search During Double 11

This article explains Alibaba’s dual‑link real‑time computing framework, detailing its micro‑ and macro‑level pipelines, key components such as Pora, iGraph and SP, online learning architectures, pointwise and pairwise ranking models, bandit‑based strategy optimization, PID‑controlled traffic balancing, and the impressive performance gains achieved during the Double 11 shopping festival.

21CTO

0. Introduction

What is a dual‑link real‑time computing system? It consists of two links. The microscopic real‑time link processes the finest‑grained data (items, shops, and users) and updates the underlying models. The macroscopic real‑time link operates on coarser‑grained objects and optimizes strategies with bandit learning and PID‑based traffic control.

1. Search Real-Time Computing System

1.1 System Architecture

(Architecture diagram omitted.)

1.2 Important Components

Pora

Pora is a real‑time computation and online learning system built on Alibaba’s iStream engine (running on Hadoop YARN) and HBase. It processes massive volumes of user behavior and item data within seconds, extracts multi‑dimensional features, and uses a distributed Parameter Server for online learning, enabling real‑time personalization, anti‑fraud, and traffic optimization. During Double 11 it ran around the clock, handling 1,300 billion messages with a peak QPS of 5 million.

iGraph

iGraph provides large‑scale KV/KKV storage, query, update, and computation services for real‑time online graph data, supporting many business lines including personalized search and recommendation. During Double 11 it reached a peak QPS of 2.45 million and supported high‑frequency updates.

SP (Search Planner)

SP is a unified service interface that creates query plans across back‑end systems (QP, iGraph, ISearch5) and returns results to the front‑end, simplifying search calls and improving front‑end performance.

ISearch5 Engine

ISearch5 is the latest search engine platform serving Taobao, Tmall, B2B, etc. It supports real‑time indexing and data updates with latency on the order of seconds.

Real‑time Reporting System

Based on the Galaxy platform, it aggregates multi‑dimensional business metrics (exposures, clicks, conversions, etc.) at minute‑level granularity, providing fast feedback for algorithms, products, and operations during promotions.

BtsServer

BtsServer manages bucket tests, enabling real‑time strategy optimization and traffic control by leveraging the real‑time reporting data.
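The source does not describe how BtsServer assigns traffic to buckets. As a minimal sketch of one common approach, assuming deterministic hashing of a user ID, the example below keeps each user in the same bucket for a given experiment. The function name `assign_bucket`, the salting scheme, and the bucket count are hypothetical and are not BtsServer's API.

```python
import hashlib

def assign_bucket(user_id: str, experiment: str, num_buckets: int = 100) -> int:
    """Map a user to a stable bucket in [0, num_buckets) for one experiment."""
    # Salting with the experiment name decorrelates assignments across experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.md5(key).hexdigest()
    return int(digest, 16) % num_buckets

if __name__ == "__main__":
    # The same user always lands in the same bucket for the same experiment.
    print(assign_bucket("user_42", "ranking_v2"))
```

Because assignment is a pure function of the IDs, per‑bucket metrics from the reporting system can be compared without storing an assignment table.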

2. Online Learning

2.1 Online Learning Framework

Modules:

Sample Worker – generates training samples from logs and fetches latest model features.

FeatureHQ – aggregates gradients of identical features.

Feature Worker – receives gradients and updates the model.

HBase – stores model parameters (weights, FTRL statistics, factorization vectors).

Characteristics: asynchronous, parallel, and platform‑wide, allowing developers to implement custom online learning algorithms via CalcSample, CalcGradient, and CalcWeight interfaces.
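The division of labor among the modules can be sketched as follows. This is an illustrative single‑process version, not Pora's code: the interface names CalcSample, CalcGradient, and CalcWeight come from the text, but their signatures, the log record layout, and the in‑memory dictionary standing in for HBase are all assumptions.

```python
import math
from collections import defaultdict

# Stand-in for the HBase parameter table (feature name -> weight).
weights = defaultdict(float)

def calc_sample(log_record):
    """Sample Worker: turn one log record into (features, label)."""
    features = {name: 1.0 for name in log_record["features"]}
    return features, log_record["clicked"]

def calc_gradient(features, label):
    """Sample Worker: logistic-loss gradient computed with the latest weights."""
    z = sum(weights[name] * value for name, value in features.items())
    p = 1.0 / (1.0 + math.exp(-z))
    return {name: (p - label) * value for name, value in features.items()}

def aggregate(gradients):
    """FeatureHQ: sum the gradients that belong to the same feature."""
    total = defaultdict(float)
    for grad in gradients:
        for name, g in grad.items():
            total[name] += g
    return total

def calc_weight(aggregated, learning_rate=0.1):
    """Feature Worker: apply the aggregated gradients to the stored weights."""
    for name, g in aggregated.items():
        weights[name] -= learning_rate * g

def process_batch(log_records):
    """Run the three stages in sequence for one batch of log records."""
    grads = [calc_gradient(*calc_sample(record)) for record in log_records]
    calc_weight(aggregate(grads))
```

In the real system these stages run asynchronously on separate workers; the sequential `process_batch` above only shows the data flow between them.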

2.2 Pointwise Model

2.2.1 LR/FTRL

Logistic Regression (LR) trained with FTRL (Follow‑The‑Regularized‑Leader) was the first online learning algorithm deployed. Experiments show that asynchronous training converges to an accuracy comparable to that of synchronous training. In AUC comparisons on a CTR prediction task, online asynchronous training trails offline synchronous training only slightly, and both far outperform long‑term batch models.
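For readers unfamiliar with FTRL, the sketch below implements the per‑coordinate FTRL‑Proximal update for logistic regression as it is usually published. The source does not give Alibaba's exact variant or hyperparameters, so the class name and the default values of `alpha`, `beta`, `l1`, and `l2` are assumptions. The `z` and `n` tables correspond to the kind of per‑feature FTRL statistics the text says are stored in HBase.

```python
import math
from collections import defaultdict

class FTRLProximal:
    def __init__(self, alpha=0.1, beta=1.0, l1=0.1, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # adjusted gradient sums, one per feature
        self.n = defaultdict(float)  # squared gradient sums, one per feature

    def _weight(self, name):
        """Derive the weight lazily from z and n; L1 zeroes out weak features."""
        z = self.z[name]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        step = (self.beta + math.sqrt(self.n[name])) / self.alpha + self.l2
        return -(z - sign * self.l1) / step

    def predict(self, x):
        s = sum(self._weight(name) * value for name, value in x.items())
        s = max(min(s, 35.0), -35.0)  # clamp to keep exp() in range
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """Consume one sample (sparse features x, label y in {0, 1})."""
        p = self.predict(x)
        for name, value in x.items():
            g = (p - y) * value
            w = self._weight(name)
            sigma = (math.sqrt(self.n[name] + g * g) - math.sqrt(self.n[name])) / self.alpha
            self.z[name] += g - sigma * w
            self.n[name] += g * g
        return p
```

Each update touches only the features present in the sample, which is what makes the algorithm a natural fit for sparse, asynchronous training.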

Model stability is addressed by averaging model weights over recent rounds or using moving averages to smooth rapid changes.
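The moving‑average option can be sketched as an exponentially weighted average of successive weight snapshots. The function name `ema_update` and the decay value are illustrative; the source does not say which averaging scheme or constants are used in production.

```python
def ema_update(smoothed, latest, decay=0.9):
    """Blend the newest weights into an exponentially weighted moving average."""
    result = dict(smoothed)
    for name, value in latest.items():
        if name in result:
            result[name] = decay * result[name] + (1.0 - decay) * value
        else:
            # A feature seen for the first time starts at its current value.
            result[name] = value
    return result
```

Serving the smoothed weights instead of the raw ones damps sudden swings caused by a burst of unusual traffic, at the cost of reacting slightly more slowly.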

Hotspot features (e.g., bias term) are mitigated by local gradient aggregation in the Sample Worker, reducing data sent to FeatureHQ.
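The effect of local aggregation is easy to see on a small batch. In the sketch below, which assumes gradients are represented as feature‑to‑value dictionaries (a hypothetical layout), a bias feature that appears in every sample is sent downstream once per batch instead of once per sample.

```python
from collections import defaultdict

def aggregate_locally(sample_gradients):
    """Sum per-sample gradients by feature so each feature is sent once per batch."""
    combined = defaultdict(float)
    for grad in sample_gradients:
        for feature, g in grad.items():
            combined[feature] += g
    return dict(combined)

batch = [
    {"bias": 0.2, "q:phone": 0.1},
    {"bias": -0.5, "item:7": 0.3},
    {"bias": 0.1},
]
# Entries that would be sent without local aggregation (one per feature per sample).
messages_without = sum(len(grad) for grad in batch)
# Entries sent after local aggregation (one per distinct feature).
merged = aggregate_locally(batch)
```

Here five gradient entries collapse into three, and the saving grows with batch size for features that occur in almost every sample.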

2.2.2 Online AUC Optimization

One‑Pass AUC optimization (based on Wei Gao et al., AIJ 2014) is implemented to optimize AUC directly. Because maximizing AUC exactly is NP‑hard, the method relies on an efficient approximation algorithm.
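To make the objective concrete, AUC for a linear scorer counts the fraction of positive/negative pairs that are ranked correctly. The indicator function is what makes direct maximization hard, so it is typically replaced by a convex surrogate. The square‑loss surrogate shown here is one standard choice; the source does not state which surrogate the production system uses, and the symbols below are introduced only for illustration.

```latex
\mathrm{AUC}(w) = \frac{1}{n_{+} n_{-}} \sum_{i=1}^{n_{+}} \sum_{j=1}^{n_{-}}
  \mathbb{I}\left[ w^{\top} x_i^{+} > w^{\top} x_j^{-} \right]

L(w) = \frac{\lambda}{2} \lVert w \rVert^{2}
  + \frac{1}{2 n_{+} n_{-}} \sum_{i=1}^{n_{+}} \sum_{j=1}^{n_{-}}
  \left( 1 - w^{\top} \left( x_i^{+} - x_j^{-} \right) \right)^{2}
```

With a square loss, the double sum can be expanded so that its gradient depends only on running means and covariances of the positive and negative examples, which is what allows a single pass over the data without storing it.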

2.3 Pairwise Model

Pairwise learning directly optimizes the relative ordering of items. It avoids the unnecessary constraints that a pointwise loss imposes and is robust to unevenly distributed logs.

2.3.1 Real‑time Matrix Factorization for Ranking with Side Information

The model learns user vectors, item vectors, and item bias terms. Training triples (user, preferred item, less‑preferred item) are extracted from logs, and a hinge‑loss with Laplacian regularization enforces similarity between items frequently co‑purchased.
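One stochastic gradient step for such a model might look like the sketch below. It assumes the score is the dot product of user and item vectors plus the item bias, a hinge margin of 1, and a `neighbors` map from an item to its co‑purchased items with similarity weights. All names, the data layout, and the hyperparameters are hypothetical; the source gives no formulas.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def score(P, Q, b, u, i):
    """Predicted preference of user u for item i."""
    return dot(P[u], Q[i]) + b[i]

def sgd_step(P, Q, b, u, i, j, neighbors, lr=0.05, reg=0.01, lap=0.01):
    """One update for a triple (u, i, j) where user u prefers item i to item j."""
    margin = score(P, Q, b, u, i) - score(P, Q, b, u, j)
    # The hinge subgradient vanishes once the pair is separated by the margin.
    active = 1.0 if margin < 1.0 else 0.0
    pu, qi, qj = P[u][:], Q[i][:], Q[j][:]
    for k in range(len(pu)):
        # Laplacian term pulls each item vector toward its co-purchased neighbors.
        lap_i = sum(s * (qi[k] - Q[n][k]) for n, s in neighbors.get(i, []))
        lap_j = sum(s * (qj[k] - Q[n][k]) for n, s in neighbors.get(j, []))
        P[u][k] += lr * (active * (qi[k] - qj[k]) - reg * pu[k])
        Q[i][k] += lr * (active * pu[k] - reg * qi[k] - lap * lap_i)
        Q[j][k] += lr * (-active * pu[k] - reg * qj[k] - lap * lap_j)
    b[i] += lr * (active - reg * b[i])
    b[j] += lr * (-active - reg * b[j])
```

Here `P` and `Q` are lists of user and item vectors and `b` is the list of item biases. Repeating the step over triples streamed from the logs moves preferred items above less‑preferred ones while keeping co‑purchased items close in the latent space.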

2.3.2 Real‑time Bayesian Personalized Bilinear Model

A Bayesian bilinear model incorporates static and dynamic item descriptors and user features, estimating preference via a bilinear projection and MAP inference.
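A bilinear preference score and its MAP objective can be written as below. This is a sketch under assumptions: the pairwise logistic likelihood (in the style of Bayesian Personalized Ranking) and the Gaussian prior on the projection matrix are inferred from the model's name, not stated in the source, and every symbol is introduced here for illustration.

```latex
s(u, i) = x_u^{\top} W z_i,
\qquad
z_i = \begin{bmatrix} z_i^{\text{static}} \\ z_i^{\text{dynamic}} \end{bmatrix}

\hat{W}_{\text{MAP}} = \arg\max_{W}
  \sum_{(u, i, j)} \ln \sigma\left( s(u, i) - s(u, j) \right)
  - \frac{\lambda}{2} \lVert W \rVert_F^{2}
```

Here $x_u$ is the user feature vector, $z_i$ stacks the static and dynamic item descriptors, $\sigma$ is the logistic function, and each triple $(u, i, j)$ states that user $u$ preferred item $i$ to item $j$.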

2.3.3 Experiments and Results

Real‑time personalized algorithms improve search NDCG and MRR metrics by over 20 % compared to initial models, demonstrating convergence of both matrix factorization and bilinear models during a day of traffic.
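For reference, the two metrics can be computed as in the sketch below. It uses the linear‑gain form of DCG (some definitions use an exponential gain instead), and the input format, a list of relevance labels in ranked order, is an assumption; the source does not specify how the metrics were evaluated.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 position discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def mrr(result_lists):
    """Mean reciprocal rank of the first relevant result in each list."""
    total = 0.0
    for relevances in result_lists:
        for rank, rel in enumerate(relevances, start=1):
            if rel > 0:
                total += 1.0 / rank
                break
    return total / len(result_lists) if result_lists else 0.0
```

For example, a single relevant item at position 3 gives an NDCG of 0.5, because its gain is discounted by log2(4) = 2 relative to the ideal ordering.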

3. Macro Real‑Time (Strategy Optimization & Traffic Balancing)

3.1 System Architecture

(Architecture diagram omitted for brevity.)

3.2 Real‑time Strategy Optimization

Traditional LTR suffers from feature‑score drift and offline‑online gaps. A two‑stage approach combines Multi‑Armed Bandit (MAB) to select the best discrete strategy from a candidate set, followed by Zero‑Order Optimization to fine‑tune continuous parameters. The process continuously pushes the current best strategy to the acceptance bucket.
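Zero‑order optimization tunes parameters using only observed rewards, without an analytic gradient. The sketch below uses a two‑point estimate along a random direction, which is one common scheme; the source does not say which estimator is used, and the function name, step sizes, and the `reward` callable (which in production would be a noisy online metric) are assumptions.

```python
import math
import random

def zero_order_step(reward, x, delta=0.05, lr=0.1, rng=random):
    """One ascent step using a two-point finite difference along a random direction."""
    # Draw a random unit direction u.
    u = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(v * v for v in u)) or 1.0
    u = [v / norm for v in u]
    # Probe the reward on both sides of x along u.
    plus = reward([xi + delta * ui for xi, ui in zip(x, u)])
    minus = reward([xi - delta * ui for xi, ui in zip(x, u)])
    # Estimated directional derivative; move along u in the improving direction.
    slope = (plus - minus) / (2.0 * delta)
    return [xi + lr * slope * ui for xi, ui in zip(x, u)]
```

Each step costs two reward evaluations, which in an online setting corresponds to running two perturbed strategies on separate traffic buckets and comparing their metrics.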

3.2.1 Algorithm Flow

(Flow diagram omitted.)

3.2.2 Multi‑Armed Bandit

The bandit selects among N strategies, updating selection probabilities based on real‑time reward g(i, Yₜ). After convergence, extra‑gradient optimization refines the strategy in continuous space.
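The description of updating selection probabilities from observed rewards matches the general shape of the EXP3 algorithm, so the sketch below uses EXP3 as a stand‑in. The source does not name the bandit algorithm actually deployed; the class name, the exploration rate `gamma`, and the requirement that rewards be scaled to [0, 1] are assumptions of this example.

```python
import math
import random

class Exp3:
    def __init__(self, num_arms, gamma=0.1, rng=None):
        self.gamma = gamma
        self.weights = [1.0] * num_arms
        self.rng = rng or random.Random()

    def probabilities(self):
        """Mix the weight-proportional distribution with uniform exploration."""
        total = sum(self.weights)
        k = len(self.weights)
        return [(1.0 - self.gamma) * w / total + self.gamma / k for w in self.weights]

    def select(self):
        probs = self.probabilities()
        return self.rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, arm, reward):
        """Update the chosen arm with a reward in [0, 1]."""
        probs = self.probabilities()
        # Importance-weighted estimate keeps the reward estimate unbiased.
        estimate = reward / probs[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / len(self.weights))
        # Rescale so weights stay bounded; selection probabilities are unchanged.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]
```

In the setting described above, each arm would be one candidate strategy and the reward would come from the real‑time reporting data for the bucket that strategy served.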

3.3 Real‑time Traffic Control

3.3.1 Keyword Red Packet Traffic Control

PID controllers are used to regulate the issuance speed of keyword red packets during Double 11, ensuring smooth delivery while meeting contractual UV targets.
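A discrete PID controller for pacing can be sketched as follows. The gains, the time step, and the toy `simulate` loop (in which the controller output directly adjusts the issuance rate) are hypothetical; the source does not describe the controller's tuning or what quantity it actuates.

```python
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, target, measured, dt=1.0):
        """Return the control signal for one time step."""
        error = target - measured
        self.integral += error * dt
        # No derivative term on the first step, when there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate(target_rate, steps=100):
    """Toy loop: the controller output is added to the current issuance rate."""
    pid = PID(kp=0.5, ki=0.1, kd=0.0)
    rate = 0.0
    for _ in range(steps):
        rate += pid.step(target_rate, rate)
    return rate
```

With these gains the simulated rate overshoots slightly and then settles on the target. In practice the gains would have to be tuned against the delay and noise of the real‑time reporting feedback.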

3.3.2 Search Traffic Balancing

PID control is also applied to balance traffic among different platforms (e.g., marketplace vs. Tmall) and seller tiers, aiming for long‑term platform health rather than short‑term transaction maximization.

4. Double 11 Production Impact

The real‑time computing system delivered significant gains during the Double 11 shopping festival:

PC and mobile search: 11 % conversion uplift during the pre‑heat period, 8 % on the day.

Tmall search: 7 % day‑of conversion increase.

In‑shop search: 3.4 % day‑of conversion increase.

Total incremental GMV exceeding 20 billion CNY.

5. Conclusion

The dual‑link real‑time computing framework proved essential for search performance during Double 11, with continued expansion across business lines. Ongoing development promises further breakthroughs in real‑time personalization and optimization.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
