How Alibaba’s AI·OS Powers 10 Years of Search & Recommendation at Scale

Alibaba's AI·OS is a big-data deep-learning online serving platform, ten years in the making, that underpins the group's search and recommendation services. It delivers end-to-end updates in under 10 seconds, supports very large models, and combines components such as TPP, RTP, HA3, DII, and iGraph to speed up algorithm iteration and support the group's cloud products.

Alibaba Cloud Developer

On September 28, Alibaba Search celebrated its tenth anniversary. Over that decade, its search and recommendation platform has come to support Taobao, Tmall, Youku, and overseas e-commerce, driving the majority of the group's GMV. With the rise of AI, the platform has evolved into a big-data deep-learning online serving system. It keeps end-to-end update latency under 10 seconds while allowing deep-learning networks to be split flexibly, and it supports multi-terabyte models, heterogeneous and real-time computing, and large-scale training.

AI·OS Overview

AI·OS (Online Serving) is a big-data deep-learning online serving framework built over ten years by Alibaba's engineering, algorithm, and efficiency teams. It underpins all search and recommendation workloads across the Alibaba Group, spanning e-commerce, cloud, video, logistics, and more. Its cloud product matrix targets developers worldwide and brings in revenue in the tens of millions.

Core Service Components

The system comprises five key service components:

TPP – Recommendation business platform

RTP – Deep‑learning prediction engine

HA3 – Search recall engine

DII – Recommendation recall engine

iGraph – Graph query engine

These components enable rapid composition and deployment of algorithmic flows via graph‑based operator pipelines, allowing online services to keep up with model training without lag.
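To make the idea of a graph-based operator pipeline concrete, here is a minimal sketch in Python. It is not the AI·OS API: `OperatorGraph`, the three operators, and `CATALOG` are all hypothetical names invented for illustration. The sketch only shows how recall, scoring, and ranking steps might be declared as nodes of a dependency graph and run in topological order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy catalog; a real recall engine would query an index instead.
CATALOG = [
    {"id": 1, "title": "red shoes", "clicks": 5},
    {"id": 2, "title": "blue shoes", "clicks": 9},
    {"id": 3, "title": "red hat", "clicks": 2},
]


class OperatorGraph:
    """Hypothetical DAG of named operators (an illustration, not AI·OS code)."""

    def __init__(self):
        self._ops = {}   # operator name -> callable
        self._deps = {}  # operator name -> tuple of upstream operator names

    def add(self, name, fn, deps=()):
        self._ops[name] = fn
        self._deps[name] = tuple(deps)
        return self  # allow chained calls

    def run(self, request):
        """Run every operator once, upstream operators first."""
        results = {}
        for name in TopologicalSorter(self._deps).static_order():
            inputs = [results[dep] for dep in self._deps[name]]
            results[name] = self._ops[name](request, *inputs)
        return results


def recall(request):
    # Keep items whose title contains the query string.
    return [item for item in CATALOG if request["query"] in item["title"]]


def score(request, candidates):
    # Stand-in for a model prediction: click count times an optional boost.
    boost = request.get("boost", 1.0)
    return [(item["id"], item["clicks"] * boost) for item in candidates]


def rank(request, scored):
    # Order item ids by descending score.
    ordered = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in ordered]


def build_pipeline():
    return (
        OperatorGraph()
        .add("recall", recall)
        .add("score", score, deps=["recall"])
        .add("rank", rank, deps=["score"])
    )
```

Calling `build_pipeline().run({"query": "shoes"})["rank"]` returns `[2, 1]`. Because each operator declares only its upstream dependencies, swapping in a different scoring step means replacing one node rather than rewriting the flow.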

Suez Framework and Hippo Scheduler

The Suez framework provides a unified abstraction for big-data online services, guaranteeing that data updates become visible within seconds with strong consistency. It standardizes three dimensions: index storage (full-text, graph, model), index management (full, incremental, real-time), and service management (consistency, traffic shaping, scaling).
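The three index-management modes (full, incremental, real-time) can be pictured as layers over the same keys. The sketch below is an assumption-laden illustration, not Suez's design: `LayeredIndex` and its methods are hypothetical, and the rule "the highest version wins" is one simple way to reconcile layers, chosen here for clarity.

```python
class LayeredIndex:
    """Hypothetical three-layer index: full build, incremental batches, real-time writes."""

    def __init__(self):
        # Each layer maps key -> (version, value).
        self._layers = {"full": {}, "incremental": {}, "realtime": {}}

    def load_full(self, snapshot, version):
        """Replace the full layer with a freshly built snapshot."""
        self._layers["full"] = {k: (version, v) for k, v in snapshot.items()}

    def apply_incremental(self, batch, version):
        """Merge a batch of changes produced since the last full build."""
        for key, value in batch.items():
            self._layers["incremental"][key] = (version, value)

    def write_realtime(self, key, value, version):
        """Record a single update as soon as it arrives."""
        self._layers["realtime"][key] = (version, value)

    def get(self, key):
        """Return the value with the highest version across all layers."""
        entries = [layer[key] for layer in self._layers.values() if key in layer]
        if not entries:
            return None
        return max(entries, key=lambda entry: entry[0])[1]
```

For example, after `load_full({"sku1": "price=10"}, version=1)` and `write_realtime("sku1", "price=8", version=3)`, a later `apply_incremental({"sku1": "price=9"}, version=2)` does not hide the newer real-time value: `get("sku1")` still returns `"price=8"`.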

Hippo, the cluster resource scheduler, allocates mixed resource pools for training (PAI-TF) and real-time computation (Blink). At peak, it has managed more than 2,000 machines with thousands of CPU cores, providing a large amount of compute capacity free of charge.
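As a rough picture of what sharing one pool between training and real-time jobs involves, the sketch below places jobs on machines with a first-fit rule. It is not Hippo's algorithm, which the article does not describe: `Machine`, `SharedPool`, and the job names are hypothetical, and first-fit is simply the easiest placement policy to show.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    free_cpus: int
    jobs: dict = field(default_factory=dict)  # job name -> CPUs held on this machine


class SharedPool:
    """Hypothetical pool shared by training and real-time jobs (first-fit placement)."""

    def __init__(self, machines):
        self.machines = list(machines)

    def allocate(self, job, cpus):
        """Place the job on the first machine with enough free CPUs, or return None."""
        for machine in self.machines:
            if machine.free_cpus >= cpus:
                machine.free_cpus -= cpus
                machine.jobs[job] = machine.jobs.get(job, 0) + cpus
                return machine.name
        return None

    def release(self, job):
        """Return every CPU held by the job to its machine."""
        for machine in self.machines:
            machine.free_cpus += machine.jobs.pop(job, 0)
```

With machines `m1` (8 CPUs) and `m2` (16 CPUs), a 6-CPU training job lands on `m1`, a 4-CPU streaming job then lands on `m2`, and a 32-CPU request is refused. Releasing the training job makes `m1` fully available again, which is the property that lets different workloads reuse the same hardware.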

Integration with Blink and PAI

Blink, a general-purpose real-time computation engine, originated from AI·OS and now offers data updates within seconds with eventual consistency. PAI-TF, after being adapted to Hippo's resource constraints, now handles all model training tasks for search and recommendation and integrates with AI·OS's graph execution engine.

Graph Computing and Future Directions

iGraph provides graph query capabilities, and the system’s graph‑based operator pipelines enable rapid experimentation. While classic offline graph computation is well studied, AI·OS pushes graph concepts into online services, demanding strict consistency and low‑latency updates.
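The following sketch illustrates what an online graph query looks like when edges change while queries are being served. It is not iGraph's API: `OnlineGraph` and `co_clicked` are hypothetical, and the user-to-item click graph is an assumed example workload. The point is only that each new edge is visible to the very next query, with no offline recomputation step.

```python
from collections import defaultdict


class OnlineGraph:
    """Hypothetical in-memory click graph with immediate edge updates."""

    def __init__(self):
        self._out = defaultdict(set)  # user -> items the user clicked
        self._in = defaultdict(set)   # item -> users who clicked it

    def add_edge(self, user, item):
        """Record a click; the edge is queryable as soon as this returns."""
        self._out[user].add(item)
        self._in[item].add(user)

    def neighbors(self, user):
        """Items clicked by one user (one-hop query)."""
        return set(self._out.get(user, ()))

    def co_clicked(self, item):
        """Other items clicked by users who clicked `item` (two-hop query)."""
        related = set()
        for user in self._in.get(item, ()):
            related |= self._out[user]
        related.discard(item)
        return related
```

If `u1` clicks shoes and socks while `u2` clicks shoes and a hat, `co_clicked("shoes")` returns `{"socks", "hat"}`. Once `u2` also clicks a belt, the same query includes `"belt"` immediately.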

Future plans include expanding Hippo's boundaries (e.g., merging with YARN), enhancing Suez's capabilities, and delivering AI·OS-based cloud products such as OpenSearch, Elasticsearch (ES), and the upcoming AIRec recommendation service.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
