How Alibaba Built a Scalable Search Middle Platform with DevOps Integration

This article traces the three‑year journey of Alibaba’s search middle platform from manual, labor‑intensive operations to an integrated DevOps and AIOps ecosystem. It describes how SOPHON, Bahamut, and related systems evolved to provide end‑to‑end automation, stability, and cost‑effective scaling for massive search workloads.

Alibaba Cloud Developer
Background

At the end of 2015, Alibaba launched a group‑wide middle‑platform strategy built around a "big middle platform, small front‑end" organizational model. Because of its complexity and scale, the search middle platform faced world‑class challenges in both technology and product design.

DevOps Integration Journey

Initially, operations were manual and labor‑intensive, and headcount grew in proportion to business scale. Over time, repetitive tasks were automated with scripts, which reduced cost but still left development and operations as separate roles.

To resolve the tension between rapid development and stable operations, Alibaba adopted an integrated DevOps approach and established a full‑chain operations model that manages the whole pipeline rather than individual systems.

Target‑Driven Operations

Instead of process‑oriented workflows, the platform uses goal‑driven scheduling: operators declare the desired end state, and the system works out how to reach it. When a rollout target changes (e.g., from index version B to C), the system immediately cancels the current path, cleans up any inconsistent state, and starts converging on the new target. Operators no longer have to script each intermediate step.
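
To make the goal‑driven idea concrete, here is a minimal Python sketch of a reconcile loop. It is not SOPHON's implementation: the `Rollout` class, its fields, and the node names are all hypothetical, and loading an index version is reduced to a dictionary update. It only shows that a changed target drops in‑flight work instead of waiting for the old path to finish.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Rollout:
    """Hypothetical controller that converges nodes toward one target index version."""

    nodes: Dict[str, str]  # node name -> index version it currently serves
    target: str            # index version every node should end up serving
    in_flight: Dict[str, str] = field(default_factory=dict)  # node -> version being loaded

    def set_target(self, version: str) -> None:
        """Change the goal and cancel in-flight loads that no longer match it."""
        self.target = version
        self.in_flight = {n: v for n, v in self.in_flight.items() if v == version}

    def step(self) -> None:
        """One reconcile pass: complete pending loads, then start loads on lagging nodes."""
        for node, version in list(self.in_flight.items()):
            self.nodes[node] = version
            del self.in_flight[node]
        for node, version in self.nodes.items():
            if version != self.target:
                self.in_flight[node] = self.target

    def converged(self) -> bool:
        """True when nothing is loading and every node serves the target version."""
        return not self.in_flight and all(v == self.target for v in self.nodes.values())


rollout = Rollout(nodes={"node-1": "A", "node-2": "A"}, target="B")
rollout.step()           # loads of version B begin on both nodes
rollout.set_target("C")  # the goal changes mid-rollout, so the B loads are dropped
while not rollout.converged():
    rollout.step()
print(rollout.nodes)
```

In this sketch the nodes never serve version B. Once the target moves to C, the only work left is whatever closes the gap between the current state and C.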

Operation Concept Simplification

SOPHON abstracts low‑level operational concepts into data‑relationship models, then further into business‑level abstractions (logic plugins, service deployment, data sources). Users interact only with the business abstraction, which shields them from the underlying complexity.

Stability Guarantees

The platform supports SLAs for core services, automatic disaster recovery, and unit‑level isolation for both online and offline services.

It also enables releases around the clock. Each release passes through multi‑stage verification (daily and pre‑release environments, performance comparison, gray release, and smoke tests) so that iterations stay both fast and safe.
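
The multi‑stage verification described above can be sketched as an ordered list of gates that stops at the first failure. This is an illustrative Python sketch, not the platform's actual pipeline: the function names, the 10% latency tolerance, and the always‑passing placeholder checks are assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

Check = Callable[[str], bool]  # receives a build id, returns True when the stage passes


def performance_ok(baseline_ms: float, candidate_ms: float, tolerance: float = 0.10) -> bool:
    """Pass when candidate latency is at most `tolerance` worse than the baseline."""
    return candidate_ms <= baseline_ms * (1 + tolerance)


def first_failing_stage(build: str, stages: List[Tuple[str, Check]]) -> Optional[str]:
    """Run stages in order and stop at the first failure; None means every stage passed."""
    for name, check in stages:
        if not check(build):
            return name
    return None


def make_stages(baseline_ms: float, candidate_ms: float) -> List[Tuple[str, Check]]:
    """Build the gate list; the always-true lambdas stand in for real environment checks."""
    return [
        ("daily", lambda build: True),
        ("pre-release", lambda build: True),
        ("performance comparison", lambda build: performance_ok(baseline_ms, candidate_ms)),
        ("gray release", lambda build: True),
        ("smoke test", lambda build: True),
    ]


# A candidate 5% slower than the baseline stays within the assumed tolerance.
print(first_failing_stage("build-1", make_stages(baseline_ms=40.0, candidate_ms=42.0)))
# A candidate 25% slower is stopped before it reaches gray release.
print(first_failing_stage("build-2", make_stages(baseline_ms=40.0, candidate_ms=50.0)))
```

Ordering the gates from cheapest to riskiest means a regression caught by the performance comparison never reaches production traffic.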

Embedding Expert Experience

Operational expertise is encoded into DAG execution graphs. The platform decomposes complex tasks, executes them according to expert‑defined flows, and selects optimal execution paths, reducing user effort and improving iteration speed.
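
As a rough illustration of encoding an operational procedure as a DAG, the Python sketch below uses the standard library's `graphlib.TopologicalSorter` to run each task only after its prerequisites. The task names and the `run_dag` helper are hypothetical, and the sketch covers only dependency‑ordered execution; it does not attempt the optimal‑path selection mentioned above.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_dag(deps: Dict[str, Set[str]], actions: Dict[str, Callable[[], None]]) -> List[str]:
    """Run each task after all of its prerequisites and return the execution order."""
    order = list(TopologicalSorter(deps).static_order())
    for task in order:
        actions[task]()
    return order


# Hypothetical index rollout: each key lists the tasks that must finish before it.
deps: Dict[str, Set[str]] = {
    "distribute_index": {"build_index"},
    "warm_cache": {"distribute_index"},
    "switch_traffic": {"distribute_index", "warm_cache"},
    "verify": {"switch_traffic"},
}

executed: List[str] = []
tasks = ["build_index", "distribute_index", "warm_cache", "switch_traffic", "verify"]
# Each action only records its own name; a real action would call an operations API.
actions = {name: (lambda n=name: executed.append(n)) for name in tasks}

order = run_dag(deps, actions)
print(order)
```

The expert writes the dependency map once, and every later user gets the same ordering without having to know why `warm_cache` must come before `switch_traffic`.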

From System to Full‑Link

The platform coordinates online and offline components to provide an end‑to‑end experience. Users define data source relationships visually; the system translates them into executable Blink graphs for incremental sync, bulk load, and join tasks, ultimately feeding the search index.

Offline Component Platform – Bahamut

Bahamut abstracts heterogeneous data sources into dynamic tables, allowing users to define joins (e.g., ODPS ↔ MySQL) on a canvas. It translates the graph into Blink jobs (sync, bulk load, join) that produce intermediate HBase tables and downstream sinks for online indexing.
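
The translation from a user‑drawn graph to jobs might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the `Source`, `Join`, and `Job` types, the `hbase.` table naming, and the rule that every source gets one bulk‑load job and one sync job are not taken from Bahamut, and no real Blink API is involved.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Source:
    name: str  # logical table name drawn on the canvas
    kind: str  # storage system, e.g. "odps" or "mysql"


@dataclass(frozen=True)
class Join:
    left: str   # name of the left source
    right: str  # name of the right source
    key: str    # column both sides are joined on


@dataclass(frozen=True)
class Job:
    kind: str                # "bulk_load", "sync", or "join"
    inputs: Tuple[str, ...]  # tables the job reads
    output: str              # table or sink the job writes


def plan_jobs(sources: List[Source], joins: List[Join], sink: str) -> List[Job]:
    """Turn a source/join graph into an ordered job list (illustrative only)."""
    jobs: List[Job] = []
    for s in sources:
        origin = f"{s.kind}.{s.name}"
        table = f"hbase.{s.name}"  # assumed intermediate table per source
        jobs.append(Job("bulk_load", (origin,), table))  # full copy
        jobs.append(Job("sync", (origin,), table))       # incremental updates
    for i, j in enumerate(joins):
        # The final join feeds the sink; earlier joins write intermediate tables.
        output = sink if i == len(joins) - 1 else f"hbase.{j.left}_{j.right}"
        jobs.append(Job("join", (f"hbase.{j.left}", f"hbase.{j.right}"), output))
    return jobs


plan = plan_jobs(
    sources=[Source("items", "odps"), Source("sellers", "mysql")],
    joins=[Join("items", "sellers", key="seller_id")],
    sink="search_index",
)
for job in plan:
    print(job.kind, job.inputs, "->", job.output)
```

The user describes only tables and joins, while the planner decides how many jobs exist and where the intermediate data lives.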

Conclusion

The three‑year evolution of Alibaba’s search middle platform demonstrates how integrated DevOps, goal‑driven operations, and AIOps can achieve scalable, reliable, and cost‑effective search services, while continuously embedding expert knowledge into the platform.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
