How Baidu Revamped Its Search Display Service: From PHP to Go and Graph Engines
This article traces the evolution of Baidu's search display service from its original C and PHP/HHVM implementations. It outlines three major challenges (high development difficulty, limited architecture capability, and low reusability) and presents a solution built on a graph execution engine, common operators, and a phased migration to Go.
Background
Baidu's search display service is responsible for fetching results from the retrieval system and then performing template selection, real‑time summary supplementation, data adaptation, and rendering to present rich, diverse results to users. Initially built in C, the service later migrated to PHP running on HHVM, but rapid product iteration exposed severe development inefficiencies. Today the service involves dozens of product lines and hundreds of engineers, supporting hundreds of fine‑grained display strategies, and faces increasing complexity due to generative AI models.
Key Challenges
High development difficulty : Complex process management and scattered strategy frameworks make the code hard to simplify and extend.
Insufficient architecture capability : HHVM is no longer maintained and lacks async/multithreading support and streaming, which hampers generative search and makes stability requirements hard to meet.
Low reusability : General and vertical search lack a shared architectural design, so similar requirements are developed more than once.
Solution Overview
The solution rests on three ideas: lower development difficulty by introducing an operator‑graph management engine, improve architecture capability by rewriting the service in Go on Baidu's internal Go Develop Platform (GDP), and boost reusability through abstracted common operators and shared libraries.
Infrastructure Components
GDP (Go Develop Platform) : An internal Go‑based development framework offering complete RPC server and client capabilities for API, web, and backend services.
ExGraph : Baidu's self‑built graph execution engine that drives the operator DAG.
Datahold : A data manager handling configuration, dictionaries, and other module data, supporting hot‑load, automatic registration, and remote data deployment.
Common lib : Shared libraries such as udai (remote data unified access) and Baidu’s own signature library, governed by unified admission standards to avoid duplicated effort.
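The hot‑load behavior described for Datahold can be pictured as an atomic snapshot swap. The sketch below is only an illustration under assumptions: `Dict`, `Holder`, `Reload`, and `Get` are hypothetical names, the JSON payload format is assumed, and none of this is Datahold's actual API, which the article does not describe.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync/atomic"
)

// Dict is a hypothetical dictionary payload; the real data types managed by
// Datahold are not described in the article.
type Dict map[string]string

// Holder keeps the current Dict behind an atomic pointer so that readers
// never block while a reload is in progress.
type Holder struct {
	cur atomic.Pointer[Dict]
}

// Reload parses raw JSON and swaps it in only if parsing succeeds, so a bad
// update leaves the previous snapshot in place.
func (h *Holder) Reload(raw []byte) error {
	var d Dict
	if err := json.Unmarshal(raw, &d); err != nil {
		return fmt.Errorf("reload rejected: %w", err)
	}
	h.cur.Store(&d)
	return nil
}

// Get looks up key in the latest snapshot; it reports false when nothing has
// been loaded yet or the key is absent.
func (h *Holder) Get(key string) (string, bool) {
	d := h.cur.Load()
	if d == nil {
		return "", false
	}
	v, ok := (*d)[key]
	return v, ok
}
```

A file watcher or a remote deployment hook would call `Reload` with the new bytes; request handlers only ever call `Get`. Rejecting a malformed payload before the swap is one simple way to keep a faulty data push from affecting live traffic.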
Design Details
Operator Model
Graph : A simple description language that lets developers visualize the entire execution flow without external tools.
Operator : A minimal interface that hides the engine's implementation details; developers only need to implement this one interface.
Execution : Supports serial groups, parallel groups, sub‑graphs, conditional operators, switch operators, interrupt, and wait mechanisms to accommodate complex business flows.
Efficiency tools : Code generators and scaffolding utilities accelerate application creation.
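To make the operator model concrete, here is a minimal sketch of an operator interface with serial, parallel, and conditional composition. It is not ExGraph's API: every name (`State`, `Operator`, `OpFunc`, `Serial`, `Parallel`, `Cond`, `SetOp`, `Execute`) is hypothetical, and sub‑graphs, switch, interrupt, and wait are left out.

```go
package main

import (
	"context"
	"sync"
)

// State is a hypothetical per-request blackboard shared by all operators.
type State struct {
	mu   sync.Mutex
	data map[string]string
}

// Set stores a value; the mutex keeps it safe inside parallel groups.
func (s *State) Set(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[k] = v
}

// Get returns the value for k, or "" if it was never set.
func (s *State) Get(k string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.data[k]
}

// Operator is the only interface a business developer implements.
type Operator interface {
	Run(ctx context.Context, s *State) error
}

// OpFunc adapts a plain function to the Operator interface.
type OpFunc func(ctx context.Context, s *State) error

func (f OpFunc) Run(ctx context.Context, s *State) error { return f(ctx, s) }

// Serial runs its operators in order and stops at the first error.
type Serial []Operator

func (g Serial) Run(ctx context.Context, s *State) error {
	for _, op := range g {
		if err := ctx.Err(); err != nil {
			return err // request cancelled or timed out
		}
		if err := op.Run(ctx, s); err != nil {
			return err
		}
	}
	return nil
}

// Parallel runs its operators concurrently, waits for all of them, and
// returns the first non-nil error in declaration order.
type Parallel []Operator

func (g Parallel) Run(ctx context.Context, s *State) error {
	errs := make([]error, len(g))
	var wg sync.WaitGroup
	for i, op := range g {
		wg.Add(1)
		go func(i int, op Operator) {
			defer wg.Done()
			errs[i] = op.Run(ctx, s)
		}(i, op)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

// Cond runs Then only when If reports true for the current state.
type Cond struct {
	If   func(*State) bool
	Then Operator
}

func (c Cond) Run(ctx context.Context, s *State) error {
	if c.If(s) {
		return c.Then.Run(ctx, s)
	}
	return nil
}

// SetOp is a toy operator that writes one key; real operators would call
// downstream services or transform results.
func SetOp(k, v string) Operator {
	return OpFunc(func(_ context.Context, s *State) error {
		s.Set(k, v)
		return nil
	})
}

// Execute runs a graph against a fresh State and returns that State.
func Execute(root Operator) (*State, error) {
	s := &State{data: make(map[string]string)}
	return s, root.Run(context.Background(), s)
}
```

Because groups are themselves operators, a flow such as `Serial{SetOp("params", "ok"), Parallel{SetOp("search", "done"), SetOp("ads", "done")}}` nests without special cases, which is one plausible way an engine can offer sub‑graphs while keeping the developer‑facing interface to a single method.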
Migration Phases
Phase 1 – Architecture Graph Migration to Go
Rate‑limiting, parameter handling, search request, ad request, and HTTP‑header rendering are moved to Go using GDP + ExGraph, while the PHP‑based strategy service continues to handle business‑specific logic. This decouples the relatively stable architecture layer from the frequently changing strategy layer.
Phase 2 – Asynchronous Summary Migration
Async summary supplements real‑time data (e.g., video view counts, likes) that cannot be cached at minute‑level granularity. A side‑car bypass system is introduced to write raw async summary results directly to a bypass store, reducing remote communication overhead and avoiding double serialization between Go and PHP.
Phase 3 – Strategy Migration
Strategy groups 1, 2, and 3 are migrated to the Go‑based service in a coordinated manner. Migration can proceed strategy by strategy or on small traffic slices, and new strategies are built directly in Go.
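Routing by strategy and by traffic slice can be sketched as a deterministic hash bucket check. This is an assumption about how such a switch might look, not the article's implementation: `Router`, `Pick`, the percentage table, and the use of FNV‑1a hashing are all illustrative choices.

```go
package main

import "hash/fnv"

// Backend identifies which implementation serves a strategy.
type Backend string

const (
	PHP Backend = "php"
	Go  Backend = "go"
)

// Router holds a hypothetical per-strategy rollout percentage (0-100) for the
// Go implementation. Strategies missing from the map stay on PHP.
type Router struct {
	Percent map[string]uint32
}

// Pick hashes the strategy name and request ID into a bucket in [0, 100) and
// routes to Go when the bucket falls below the strategy's percentage. The
// result is deterministic, so retries of one request hit the same backend.
func (r Router) Pick(strategy, requestID string) Backend {
	pct := r.Percent[strategy] // zero value keeps unknown strategies on PHP
	h := fnv.New32a()
	h.Write([]byte(strategy))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") differ
	h.Write([]byte(requestID))
	if h.Sum32()%100 < pct {
		return Go
	}
	return PHP
}
```

Raising a strategy's percentage from a small value to 100 moves it over gradually, and setting it back to 0 acts as an immediate rollback to the PHP path.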
Phase 4 – Full Migration
All remaining async summary, rendering, and post‑processing logic is finally moved to Go. This removes the last performance bottleneck, the serialization step between PHP and Go, which would add latency if retained.
Stability Assurance
R&D Guarantees
Data governance, removal of legacy “fly‑line” code, and abstraction of common operators raise code quality. Unit tests, automated pipelines (data‑diff & UI‑diff), and log‑based analysis ensure safe migration.
Testing
Data Diff : Replay online requests and compare key data such as search and ad requests to detect regressions.
UI‑Diff : Pixel‑level diff for thousands of result templates, prioritized by traffic volume.
End‑to‑End : Combine automated checks with manual verification of landing‑page redirects, pagination, and ad effects.
Stability Tests : Long‑duration traffic pressure tests simulate production load.
Performance Tests : Flame‑graph analysis and QPS limit testing guide capacity planning.
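The Data Diff check above can be pictured as a field‑by‑field comparison of what the old and new services send downstream for the same replayed request. The sketch below assumes requests are flattened to string key/value pairs and that some fields (timestamps, for example) are expected to differ; `Diff`, `Compare`, `Missing`, and the field names in the usage note are hypothetical.

```go
package main

import "sort"

// Missing marks a field that is absent on one side of the comparison.
const Missing = "<missing>"

// Diff describes one field whose value differs between the baseline (old
// service) and the candidate (new service).
type Diff struct {
	Key       string
	Baseline  string
	Candidate string
}

// Compare returns every non-ignored field that differs between the two
// snapshots, sorted by key so reports are stable across runs.
func Compare(baseline, candidate map[string]string, ignore map[string]bool) []Diff {
	keys := make(map[string]struct{})
	for k := range baseline {
		keys[k] = struct{}{}
	}
	for k := range candidate {
		keys[k] = struct{}{}
	}

	var out []Diff
	for k := range keys {
		if ignore[k] {
			continue // expected noise, such as timestamps
		}
		b, okB := baseline[k]
		c, okC := candidate[k]
		if okB && okC && b == c {
			continue
		}
		if !okB {
			b = Missing
		}
		if !okC {
			c = Missing
		}
		out = append(out, Diff{Key: k, Baseline: b, Candidate: c})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out
}
```

For example, comparing `{"query": "go", "pn": "0", "ts": "1"}` against `{"query": "go", "pn": "10", "ts": "2", "extra": "x"}` while ignoring `ts` reports two differences: `extra` is missing from the baseline and `pn` changed from `0` to `10`. An empty result over a large replay set is the signal that a migrated step behaves like the original.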
Release Guarantees
Pre‑release resource and latency estimation, monitoring, alerting, degradation strategies, internal testing, and gradual rollout with small‑traffic exposure ensure a stable launch.
