
Business Storage Resource Optimization for Baidu Mobile Assistant: Process, Analysis, and Lessons

The author describes how Baidu Mobile Assistant cut its storage costs by sampling the more than one billion objects it stored, identifying low-value incremental patch files, and applying a three-step "elephant-in-the-fridge" cleanup. Storage fell from several petabytes to a few hundred terabytes, and the exercise highlighted the need for ROI-driven update policies and ongoing documentation.

Baidu Geek Talk

This article summarizes the author's experience optimizing business storage resources for Baidu Mobile Assistant, focusing on incremental update patch packages and overall storage consumption.

Business overview: Baidu Mobile Assistant is an Android app distribution platform that stores developer-uploaded APKs and incremental patch packages in Baidu Object Storage (BOS). For incremental updates, the platform generates patch files so that users download only the difference between versions rather than the full APK, which reduces download traffic.

Why optimize: Storage consumption had reached several petabytes, costing tens of millions of RMB annually. Analysis of MySQL data showed that patch packages occupied a large share of storage and that a significant amount of storage could not be accounted for.

The optimization had three motivations: reducing the high budget consumption, finding out what the unexplained storage contained, and cleaning up useless data.

Analysis: A sample of 300,000 objects was drawn from the more than one billion objects in BOS. Visualizations of the storage distribution and the update-traffic distribution showed that patch packages consume a disproportionate share of space while serving little update traffic, which makes them a prime target for cleanup.
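The sampling step can be illustrated with a minimal sketch. It assumes the sample is drawn uniformly at random and that each sampled object has already been labeled with a category and a size; the article does not say how the sample was drawn, and the function name `estimate_storage_share` and the category labels are hypothetical.

```python
from collections import defaultdict


def estimate_storage_share(sample, total_objects):
    """Estimate per-category storage from a uniform random sample.

    sample: iterable of (category, size_bytes) pairs (hypothetical input shape).
    total_objects: number of objects in the whole bucket.
    Returns {category: {"share": fraction of sampled bytes,
                        "estimated_bytes": extrapolated bytes in the bucket}}.
    """
    sample = list(sample)
    if not sample:
        raise ValueError("sample must not be empty")

    bytes_by_category = defaultdict(int)
    for category, size_bytes in sample:
        bytes_by_category[category] += size_bytes

    sampled_bytes = sum(bytes_by_category.values())
    # Each sampled object stands in for total_objects / len(sample) real objects.
    scale = total_objects / len(sample)

    return {
        category: {
            "share": (size / sampled_bytes) if sampled_bytes else 0.0,
            "estimated_bytes": size * scale,
        }
        for category, size in bytes_by_category.items()
    }


if __name__ == "__main__":
    toy_sample = [("apk", 100), ("patch", 300), ("patch", 100), ("unknown", 500)]
    for category, stats in estimate_storage_share(toy_sample, 4000).items():
        print(category, round(stats["share"], 2), int(stats["estimated_bytes"]))
```

An estimate like this is only as good as the sample: with heavily skewed object sizes, a few very large objects can dominate the result, so it is worth checking the shares against the totals that billing reports.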

Optimization ideas (three-step "elephant-in-the-fridge" approach):

1. Reconstruct the full picture: review internal documents, read the code, interview relevant people, and map product features.

2. Design a concrete plan: export the full object list from BOS via API, segment objects by type (application binaries, patch packages, unknown objects), and define ROI-based pruning rules (e.g., keep only the top-300 updated apps, limit patch generation per version).

3. Execute the plan: implement verification and rollback mechanisms, then delete the identified waste. As a result, storage dropped from several petabytes to a few hundred terabytes.
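The segmentation and pruning steps above can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the file-extension rules in `classify`, the `StoredObject` fields, and the rule "keep the newest N patches per top app" are all hypothetical stand-ins for the ROI rules the article mentions. The sketch only builds a deletion plan that can be reviewed before anything is removed; it makes no BOS calls.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredObject:
    key: str         # object key in the bucket
    size_bytes: int
    app_id: str      # hypothetical: owning app, resolved from business metadata
    created_at: int  # hypothetical: upload time as a Unix timestamp


def classify(key):
    """Segment an object by type. The extensions are assumptions."""
    if key.endswith(".apk"):
        return "apk"
    if key.endswith(".patch"):
        return "patch"
    return "unknown"


def plan_deletions(objects, top_app_ids, max_patches_per_app):
    """Return the patch objects that the hypothetical ROI rules would delete.

    Patches of apps outside top_app_ids are all marked for deletion; for apps
    inside it, only the newest max_patches_per_app patches are kept.
    APKs and unknown objects are never included in the plan.
    """
    patches_by_app = defaultdict(list)
    for obj in objects:
        if classify(obj.key) == "patch":
            patches_by_app[obj.app_id].append(obj)

    to_delete = []
    for app_id, patches in patches_by_app.items():
        if app_id not in top_app_ids:
            to_delete.extend(patches)
            continue
        newest_first = sorted(patches, key=lambda o: o.created_at, reverse=True)
        to_delete.extend(newest_first[max_patches_per_app:])
    return to_delete


def reclaimed_bytes(plan):
    """Total size of the objects in a deletion plan."""
    return sum(obj.size_bytes for obj in plan)


if __name__ == "__main__":
    inventory = [
        StoredObject("a/base.apk", 1000, "a", 1),
        StoredObject("a/p1.patch", 10, "a", 1),
        StoredObject("a/p2.patch", 20, "a", 2),
        StoredObject("a/p3.patch", 30, "a", 3),
        StoredObject("b/p1.patch", 50, "b", 1),
        StoredObject("misc.bin", 5, "", 1),
    ]
    plan = plan_deletions(inventory, top_app_ids={"a"}, max_patches_per_app=2)
    print(sorted(o.key for o in plan), reclaimed_bytes(plan))
```

Keeping plan generation separate from deletion is one simple way to support the verification step: the plan can be diffed, sampled, and signed off before it is executed. Unknown objects are deliberately left alone here, since they need investigation before any rule can be applied to them.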

Key lessons: Storage waste often stems from missing cleanup logic and a lack of documentation. Product and engineering should jointly define a clear ROI for incremental updates. Regular monitoring, documentation, and review are essential to prevent storage bloat.

Conclusion: The author reflects on five questions (why waste occurs, who should consider it, what the original system lacked, how to avoid it, and how to do better) and emphasizes the importance of aligning product requirements with engineering cost-benefit analysis.

Tags: Backend Development, data analysis, Storage Optimization, cloud storage, BOS, incremental updates