Cloud-Based Android Build System: Architecture, Optimization, and Distributed Compilation
Bilibili's cloud‑based Android build system leverages high‑performance remote servers, Dockerized build images, Git diff synchronization, and distributed compilation of heavy tasks like DexBuild to dramatically accelerate builds for its massive monorepo, providing parallelism and VIP priority while incurring a learning curve and added maintenance costs.
Background : Bilibili uses a monorepo for source code management, with over 620 Android sub-modules and more than 150 developers. Local development suffers from slow compilation, overheating machines, freezes, and builds that block other work. Because a large CI build cluster was already available, the team explored a cloud compilation approach.
Principle : High‑performance servers enable remote compilation. Developers sync local changes with the remote repository via Git, generate a diff, and send a build request. The remote server compiles the code and returns an APK identical to a local build.
Early Stage : Docker‑based custom Android build images allow rapid provisioning of multiple build instances. Developers use a command‑line tool that calculates the commit, creates a diff, and submits a packaging request to the server.
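The client side of this step can be sketched as follows. This is a minimal illustration, not Bilibili's actual tool: the class name `CloudBuildRequest`, its fields, and the use of `git merge-base` to pick the base commit are all assumptions. Only the general idea (compute a commit, create a diff, submit a request) comes from the text above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical request payload; names and fields are illustrative assumptions. */
final class CloudBuildRequest {
    final String baseCommit;       // commit the remote checkout is reset to
    final String patch;            // text produced by the diff command below
    final List<String> gradleArgs; // e.g. ":app:assembleDebug"

    CloudBuildRequest(String baseCommit, String patch, List<String> gradleArgs) {
        this.baseCommit = baseCommit;
        this.patch = patch;
        this.gradleArgs = Collections.unmodifiableList(new ArrayList<>(gradleArgs));
    }

    /** Command that finds the newest commit shared by HEAD and the remote branch (assumed strategy). */
    static List<String> mergeBaseCommand(String remoteBranch) {
        return Arrays.asList("git", "merge-base", "HEAD", remoteBranch);
    }

    /** Command that captures tracked changes, committed or not, relative to the base commit. */
    static List<String> diffCommand(String baseCommit) {
        return Arrays.asList("git", "diff", "--binary", baseCommit);
    }

    /** An empty patch means the server can build the base commit as-is. */
    boolean hasLocalChanges() {
        return !patch.trim().isEmpty();
    }
}
```

The command lists can be passed to `ProcessBuilder` to run Git. Note that `git diff` does not include untracked files, so a real tool would need to handle newly created files separately.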
Process Flow : The remote server receives the request, synchronizes code, applies patches, compiles, and returns the result. After a successful build, the local machine downloads the APK and installs it. This flow differs from local development where compilation occurs on the developer’s machine.
Advantages :
Correct environment
Parallel concurrency support
Accelerated compilation speed
Freeing local machines for logic development
Full source compilation (no cache mode)
Disadvantages :
Learning curve
Some tasks still require local compilation for code indexing
Local incremental compilation disabled, relying on remote cache
Increased maintenance cost for build machines
Random machine allocation can cause contention and waiting
Cache utilization not optimal, room for speed improvement
Continuous Optimization & VIP Mode : Initially, 10 build instances served a small group of users. As usage grew, contention for machines appeared, prompting a VIP mode that grants exclusive machine access or higher priority. Architecture changes added a second load balancer (SLB) layer and improved caching.
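The "higher priority" half of VIP mode can be illustrated with a priority queue. This is only a sketch under assumptions: the class `BuildQueue` and its ordering rule (VIP first, then submission order) are hypothetical, and the source does not describe how the real scheduler is implemented.

```java
import java.util.PriorityQueue;

/** Hypothetical build queue: VIP jobs are served first, ties are broken by arrival order. */
final class BuildQueue {
    private static final class Job implements Comparable<Job> {
        final String user;
        final boolean vip;
        final long seq; // submission counter, keeps FIFO order within a priority class

        Job(String user, boolean vip, long seq) {
            this.user = user;
            this.vip = vip;
            this.seq = seq;
        }

        @Override
        public int compareTo(Job other) {
            if (vip != other.vip) {
                return vip ? -1 : 1; // VIP sorts ahead of non-VIP
            }
            return Long.compare(seq, other.seq);
        }
    }

    private final PriorityQueue<Job> queue = new PriorityQueue<>();
    private long nextSeq = 0;

    void submit(String user, boolean vip) {
        queue.add(new Job(user, vip, nextSeq++));
    }

    /** Returns the user whose job should run next, or null when the queue is empty. */
    String next() {
        Job job = queue.poll();
        return job == null ? null : job.user;
    }
}
```

A strict VIP-first rule like this one can starve regular users under sustained VIP load, so a production scheduler would likely add aging or reserve some instances for non-VIP requests.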
Optimization Measures :
Increase instance count and concurrency (e.g., 10 cores / 50 GB RAM for single‑user instances, 30 cores / 100 GB RAM for multi‑user instances)
Use network proxies or internal mirrors for Docker images, Android SDK/NDK, Gradle, Maven
Reduce instance restarts to preserve caches
Implement hot‑update mechanisms for services
Pre‑warm build services and code repositories
Preserve working directories per user/machine
Leverage Gradle Remote Cache
Intelligent scheduling based on user frequency
Network optimization (e.g., wired connections instead of Wi‑Fi)
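Several of these measures (preserving working directories, avoiding restarts, smarter scheduling) share one idea: send a user back to the machine that already holds their warm checkout and caches. The sketch below shows that idea in its simplest form. The class `AffinityScheduler` and its fallback rule are assumptions for illustration, not a description of the production scheduler.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical scheduler that prefers the instance a user built on last time. */
final class AffinityScheduler {
    private final List<String> instances;
    private final Set<String> busy = new HashSet<>();
    private final Map<String, String> lastInstanceByUser = new HashMap<>();

    AffinityScheduler(List<String> instances) {
        this.instances = new ArrayList<>(instances);
    }

    /** Returns the assigned instance, or null when every instance is busy. */
    String assign(String user) {
        String preferred = lastInstanceByUser.get(user);
        if (preferred != null && !busy.contains(preferred)) {
            return take(user, preferred); // warm working directory and caches
        }
        for (String instance : instances) {
            if (!busy.contains(instance)) {
                return take(user, instance); // cold fallback: first idle instance
            }
        }
        return null;
    }

    /** Marks an instance idle again once its build has finished. */
    void release(String instance) {
        busy.remove(instance);
    }

    private String take(String user, String instance) {
        busy.add(instance);
        lastInstanceByUser.put(user, instance);
        return instance;
    }
}
```

Weighting the fallback by how often each user builds, as the "user frequency" item suggests, would be a natural extension of the cold path.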
Distributed Compilation : As the module count grew, tasks such as DexBuild and DexMerge became bottlenecks. The solution splits these heavy tasks across idle machines. For DexBuild, the hook point is com.android.builder.dexing.D8DexArchiveBuilder.convert, whose original implementation runs D8 locally:
package com.android.builder.dexing;

// Partial code omitted for brevity
final class D8DexArchiveBuilder extends DexArchiveBuilder {
    @Override
    public void convert(
            @NonNull Stream<ClassFileEntry> input,
            @NonNull Path output,
            @Nullable DependencyGraphUpdater<File> desugarGraphUpdater)
            throws DexArchiveBuilderException {
        D8DiagnosticsHandler d8DiagnosticsHandler = new InterceptingDiagnosticsHandler();
        try {
            D8Command.Builder builder = D8Command.builder(d8DiagnosticsHandler);
            // ... configuration ...
            D8.run(builder.build(), MoreExecutors.newDirectExecutorService());
        } catch (Throwable e) {
            throw getExceptionToRethrow(e, d8DiagnosticsHandler);
        }
    }
}

A Kotlin hook replaces the original method to forward the work to a remote executor:
/**
 * @see com.android.builder.dexing.D8DexArchiveBuilder.convert
 */
private fun hookBuilder() {
    val dst = pool.get("com.android.builder.dexing.D8DexArchiveBuilder")
    // A frozen class can no longer be modified, so the hook is skipped.
    if (dst.isFrozen) {
        log.error("clazz ${dst.simpleName} is frozen")
        return
    }
    // Replace convert() so the dexing work goes to XbuildDexBuilder instead of running D8 locally.
    dst.getDeclaredMethod("convert").aopReplace(object : MethodInvokeCallback {
        override fun invoke(self: Any, method: String, args: List<Any?>) {
            XbuildDexBuilder().convert(
                self,
                args[0] as Stream<ClassFileEntry>,
                args[1] as Path,
                args[2] as DependencyGraphUpdater<File>?
            )
        }
    })
}

Tests show that distributing DexBuild reduces task duration under normal load. Similar hooks can be applied to DexMerge, splitting large input sets (e.g., 1000 files) into groups and processing them remotely, achieving significant time savings.
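The DexMerge grouping described above can be sketched as a partition step followed by concurrent processing. This is an illustration under assumptions: `DexMergePartitioner`, the group size, and the `mergeGroup` function (which stands in for a call to a remote worker) are hypothetical, and a local thread pool is used here in place of real remote machines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper that splits merge inputs into groups and processes them concurrently. */
final class DexMergePartitioner {
    /** Splits inputs into consecutive groups of at most groupSize elements. */
    static <T> List<List<T>> partition(List<T> inputs, int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        List<List<T>> groups = new ArrayList<>();
        for (int start = 0; start < inputs.size(); start += groupSize) {
            int end = Math.min(start + groupSize, inputs.size());
            groups.add(new ArrayList<>(inputs.subList(start, end)));
        }
        return groups;
    }

    /** Applies mergeGroup to every group in parallel and returns results in group order. */
    static <T, R> List<R> mergeAll(List<List<T>> groups, Function<List<T>, R> mergeGroup, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (List<T> group : groups) {
                Callable<R> task = () -> mergeGroup.apply(group); // stand-in for a remote call
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for merge groups", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a merge group failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With 1000 inputs and a group size of 300, `partition` yields four groups, the last holding the remaining 100 files. The per-group outputs would still need a final merge step, which is omitted here.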
Results :
Random mode: average build time ~5 min (cold 5‑10 min, hot 3‑7 min)
VIP mode: average build time ~3 min (cold 4‑8 min, hot 1.5‑4 min, best case 20 s)
Demonstration : The local build command ./gradlew :app:assembleDebug -q -s corresponds to the cloud build command hub -b ":app:assembleDebug -q -s" --vip. Screenshots illustrate build logs, instance lists, and real‑time logs.
Future Plans :
Cloud simulators and devices for testing
Cloud‑based IDEs and remote development environments
The presentation concludes with an invitation for feedback and interaction.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.