Scaling Parallel Development: Baidu App’s Journey to Component‑Based Architecture
Facing massive codebases, hundreds of developers, and rapid release cycles, Baidu App adopted a multi‑stage componentization strategy—spanning compile isolation, library standardization, runtime distribution, service layers, and degradation control—to improve parallel development, reduce complexity, and boost build speed and quality.
Componentization requires coordinated effort across architecture layers, build systems, dependency management, and anti‑degradation rules. The following describes Baidu App’s componentization journey, the technical challenges it addressed, and the concrete implementation steps that can be applied to other large‑scale applications.
Background and Goals
Sources of Complexity in a Large App
Business scale: >70 technical directions, >1.8 M lines of code per client. Goal – isolate component impact to prevent fault propagation and control overall complexity.
Team size: hundreds of developers with code permissions. Goal – guarantee efficient parallel development.
Internal integrations: >30 business modules with complex relationships. Goal – enable fast onboarding and reuse of core capabilities.
Rapid iteration: a new version every three weeks (2 weeks dev, 1 week test). Goal – avoid component degradation under fast releases.
Multiple technology stacks (H5, NA, Hybrid, Talos, Flutter). Goal – ensure basic capability reuse and build‑system support.
Different Product‑Technical Goals at Various Stages
2014 – reuse of third‑party libraries and single‑component output. Challenge: extracting a component without dragging along the tangled "mud" of its implicit dependencies.
2017‑2019 – incubation of matrix products.
2018 – open‑source mini‑program reuse; components must be compatible with different hosts while keeping some dependencies replaceable.
Key Architecture Evolution
Initial Stage – 2013 (Fire‑starting)
All business and core logic lived in a single monolithic project without clear boundaries, making code comprehension and safe development difficult.
Main problems:
Base libraries and open‑source third‑party libraries were easily invaded by business code; no anti‑modification mechanism.
No container isolation for first‑screen modules, causing widespread impact.
Shared services (remote config, device capabilities) were not componentized, leading to tangled if/else logic.
Logic, resources, and data lacked clear ownership, making external component export difficult.
Plugin interfaces were fragile; the integrated business became a “super module” with uncontrolled dependencies.
2014‑2015 (Steam Engine Era)
Team size grew to a few dozen, and the need for external component export emerged.
Extracted third‑party libraries and coarse‑grained base libraries into lower‑level business components; both Baidu App and integrated businesses reused these base libraries.
Introduced a framework container to isolate first‑screen business and stack‑based navigation containers.
Developed new or refactored business components using a component model with clear ownership of logic, resources, data, and external dependencies.
Established a guideline prohibiting reverse dependencies (no tooling enforcement yet).
Component dependencies beyond the base library were injected via adapters.
Main problems:
Ambiguous component ownership; some components floated between base and business layers, and same‑level dependencies were unclear.
Adapter‑based one‑to‑one decoupling was explicit but inefficient.
Residual device‑capability interfaces remained in the main app alongside SDKs introduced via the plugin system.
2016‑2017 (Electric Era)
Focused on building the componentization framework (Pyramid, SchemeRouter) and distribution frameworks (RemoteConfig, PMS, pre‑fetch distribution), plus a data‑splitting framework (CocoaSetting). These ensured each component owned its logic and data.
2018‑2019 (Ideal – Nuclear Era)
Componentization matured; the main project became a thin shell with many shared services. Multiple repositories and the EasyBox build system (https://mp.weixin.qq.com/s?__biz=MzUxMzk2ODI1NQ==&mid=2247483757&idx=1&sn=a63e61fc73beb4d3792ae20557e1897a) assembled the whole app from a central spec list.
Clear hierarchical ownership; shared services belong to either the base‑library layer or the business‑component layer, enabling bottom‑up external export.
App assembled from a central repository spec list via EasyBox.
Framework container loading and system event distribution unified in a lightweight AppLauncher.
SDKs are assigned to architecture layers; if only one business component uses an SDK, that component manages it, reducing external complexity.
Service layer sharing is well‑established.
Platformization (Interstellar Voyage)
The platformization wave adds cloud‑wide reuse. Combined with the shared component library and EasyBox, it enables matrix‑product output capabilities.
Implementation Path (Bottom‑up Componentization)
1. Compile Isolation, Layered Architecture, and Access Restrictions
Compile isolation: EasyBox defines per‑component interface files, making external dependencies explicit and preventing IDE‑driven accidental cross‑component access.
Layer‑level restriction: Lower‑level components cannot access higher‑level ones.
Same‑level restriction: Communication between same‑level components is mediated by the Pyramid framework, preserving clear interface boundaries.
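The two restrictions reduce to a single rule over a component‑to‑layer mapping: a direct dependency is legal only when it points strictly downward, while same‑level components must communicate through the mediation framework. A minimal sketch of such a rule check (hypothetical names, not EasyBox's real API):

```java
import java.util.*;

// Minimal sketch of the layer rule: a direct dependency is allowed only
// from a strictly higher layer to a lower one. Same-level components must
// go through the mediation framework instead of depending on each other.
public class LayerRuleChecker {
    private final Map<String, Integer> layerOf = new HashMap<>();

    public void register(String component, int layer) {
        layerOf.put(component, layer); // higher number = higher layer
    }

    // True only for a strictly downward dependency (upper -> lower layer).
    public boolean isAllowed(String from, String to) {
        return layerOf.get(from) > layerOf.get(to);
    }
}
```

A CI step can walk every component's declared dependency list with such a checker and fail the build on the first upward or same‑level edge it finds.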
2. Third‑Party Library Standardization and Base Library Systematization
Base libraries:
No anti‑modification mechanism – business code could intrude.
Cross‑dependency – the same base dependency’s logic scattered across components.
Solution: abstract base libraries into the bottom architectural layer, distribute them as prebuilt binaries, assign each a component owner, and enforce systematic builds.
Third‑party libraries:
No anti‑modification mechanism.
Bug fixes often delayed in upstream repositories.
Solution: upgrade all third‑party libraries to their latest released versions, distribute them as prebuilt binaries, and apply runtime patches where local behavior must diverge. Document these patches for external export.
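One low‑risk shape for such a runtime patch is a documented wrapper class, so the third‑party binary itself stays untouched. In this sketch, `ThirdPartyJsonParser` is a made‑up stand‑in for any upstream library class, not a real dependency:

```java
// Stand-in for an upstream library class shipped as an untouched binary.
class ThirdPartyJsonParser {
    public String parse(String raw) {
        // Imagine an upstream bug: a null input would crash here.
        return raw.trim();
    }
}

// The divergence lives in one documented wrapper, so the binary dependency
// stays pristine and the patch is easy to list when exporting components.
public class PatchedJsonParser extends ThirdPartyJsonParser {
    // PATCH: guard against the null-input crash until the fix lands upstream.
    @Override
    public String parse(String raw) {
        if (raw == null) return "";
        return super.parse(raw);
    }
}
```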
3. Runtime Distribution and Isolation Services
To avoid centralized processing of shared logic and data, a container and distribution mechanism dispatches events, data, and logic calls.
Pyramid component framework:
Distributes system events to sub‑components.
Upgrades adapter decoupling from one‑to‑one to one‑to‑many, turning strong dependencies into weak ones so that the components being depended on can be replaced.
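The one‑to‑many upgrade can be pictured as a small event distributor: the sender publishes by event name and holds no reference to concrete subscribers, so each one is replaceable. This is an illustrative shape, not Pyramid's actual API:

```java
import java.util.*;
import java.util.function.Consumer;

// Illustrative one-to-many distribution: any number of components subscribe
// to a named event; the sender never learns who, if anyone, is listening.
public class EventDistributor {
    private final Map<String, List<Consumer<Object>>> subscribers = new HashMap<>();

    public void subscribe(String event, Consumer<Object> handler) {
        subscribers.computeIfAbsent(event, k -> new ArrayList<>()).add(handler);
    }

    // Delivers the payload to every subscriber; returns how many received it.
    public int dispatch(String event, Object payload) {
        List<Consumer<Object>> list = subscribers.getOrDefault(event, List.of());
        list.forEach(h -> h.accept(payload));
        return list.size();
    }
}
```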
Scheme handling:
Separates SchemeRouter (service‑layer component) from SchemeHandler (business component).
Scheme parameters are primarily used for H5 communication rather than page routing.
Configuration distribution service: Centralizes parsing and invocation of business logic; later evolved into a cloud‑control service.
Data splitting service: Works with the configuration service to keep data inside each component.
Resource / pre‑fetch distribution service: Provides resource and pre‑fetch distribution.
Framework container: Uses tab navigation and stack navigation containers to split UI data and events to child controllers.
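Several of these services share a register‑and‑dispatch shape. The SchemeRouter/SchemeHandler split described above, for instance, might look like the following sketch (hypothetical API and scheme; the real framework differs):

```java
import java.util.*;

// The router is a service-layer component that only parses and dispatches;
// each business component registers a handler for a scheme host it owns.
interface SchemeHandler {
    // Returns true if the scheme was consumed.
    boolean handle(String host, Map<String, String> params);
}

public class SchemeRouter {
    private final Map<String, SchemeHandler> handlers = new HashMap<>();

    public void register(String host, SchemeHandler handler) {
        handlers.put(host, handler);
    }

    // e.g. "myapp://comment?id=42" -> host "comment", params {id=42}
    public boolean route(String uri) {
        java.net.URI u = java.net.URI.create(uri);
        Map<String, String> params = new HashMap<>();
        if (u.getQuery() != null) {
            for (String pair : u.getQuery().split("&")) {
                String[] kv = pair.split("=", 2);
                params.put(kv[0], kv.length > 1 ? kv[1] : "");
            }
        }
        SchemeHandler h = handlers.get(u.getHost());
        return h != null && h.handle(u.getHost(), params);
    }
}
```

Because the router knows only hosts and handlers, a business component can be removed or swapped without touching the service layer.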
4. Service Layer Construction
Low‑dependency components serving multiple businesses are abstracted into common services such as account, sharing, cloud‑control, analytics, performance, and AI.
5. Component Model Definition
Define a component model so each business module can be quickly componentized.
Guide each business module to clearly define functional scope, ensuring separate ownership of logic, resources, data, private SDKs, performance metrics, and compilation units.
Each component becomes an independent unit of functionality, logic, data/resources, H5 communication, and performance accounting, built as one or more compilation units.
Interface encapsulation and service‑binding layers can be split into different granularity compilation units to enable flexible composition and output.
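The split between an interface compilation unit and an implementation unit can be sketched as follows; all names here are illustrative, and the binding step belongs to the assembling app rather than to callers:

```java
// --- compilation unit 1: comment-api (the only thing other components see) ---
interface CommentService {
    int commentCount(String articleId);
}

// --- compilation unit 2: comment-impl (private logic, resources, data) ---
class CommentServiceImpl implements CommentService {
    public int commentCount(String articleId) {
        return articleId.length(); // stand-in for a real storage lookup
    }
}

// --- service binding: done once by the assembling app at startup ---
public class ServiceRegistry {
    private static final java.util.Map<Class<?>, Object> bindings = new java.util.HashMap<>();

    public static <T> void bind(Class<T> api, T impl) {
        bindings.put(api, impl);
    }

    @SuppressWarnings("unchecked")
    public static <T> T get(Class<T> api) {
        return (T) bindings.get(api);
    }
}
```

Since callers compile only against the thin api unit, the impl unit can be swapped per product, which is what makes matrix‑product composition and output flexible.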
6. Business Componentization
Following the component model, determine functional scope, logical boundaries, and interface contracts, then rapidly componentize business features.
7. Degradation Control
Component interface changes, dependency changes, and warning‑count variations are recorded and notified to owners via the Tekes platform. Without anti‑degradation mechanisms, bug‑fix speed cannot keep up with bug‑creation speed.
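The interface‑change recording can be pictured as a set diff over a component's exposed signatures, snapshotted at each release; the additions and removals are then reported to the owner. A hypothetical sketch (the Tekes platform's real mechanics are not described in the source):

```java
import java.util.*;

// Illustrative anti-degradation check: compare two snapshots of a
// component's exposed interface signatures and report what changed.
public class InterfaceDiff {
    public static Map<String, List<String>> diff(Set<String> previous, Set<String> current) {
        List<String> added = new ArrayList<>();
        List<String> removed = new ArrayList<>();
        for (String s : current) if (!previous.contains(s)) added.add(s);
        for (String s : previous) if (!current.contains(s)) removed.add(s);
        Collections.sort(added);
        Collections.sort(removed);
        Map<String, List<String>> report = new LinkedHashMap<>();
        report.put("added", added);
        report.put("removed", removed);
        return report; // a non-empty report triggers a notification to the owner
    }
}
```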
Benefits Summary
1. R&D Efficiency Gains
Complexity control: Complexity is confined within components, exposing a simple, dependable external interface.
Parallel development: Component frameworks and distribution services provide design‑time isolation; e.g., adding a remote‑config item dropped from >4 hours to ~0.5 hours (8× speedup).
Reuse: Enables matrix‑product wheel output; reuse rate exceeds 50 % across Baidu App’s matrix products.
Build speed: Independent compilation units allow source and artifact swapping, reducing average build time from 15 minutes to 2 minutes per run.
2. Quality Improvements
Design‑time isolation ensures that a fault in a single component remains contained, preventing whole‑app crashes.
3. Quantifiable Metrics for Startup Speed and Binary Size
Componentization provides concrete units for measuring and optimizing startup time and binary size.
4. Robust Architecture for Deep Optimization
A healthy component‑level architecture enables deep performance and resource optimizations.