APK Size Analysis by Library Dimension Using Gradle Intermediate Files
This article describes a method for analyzing Android APK size at the library level by extracting and parsing Gradle intermediate merge files, mapping resources, assets, native libraries and Java resources to their originating libraries, and linking them to maintenance teams for precise package‑size reporting.
Background : To reduce the size of the Beike Android app, the team needed a way to attribute each file in the APK to the library that produced it, which existing tools like Matrix could not provide.
APK file structure : An APK is a zip archive containing assets, res, lib, .dex files, resources.arsc, and other files such as those under META-INF.
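Because an APK is an ordinary zip archive, a first-pass size breakdown by top-level entry needs nothing beyond a zip reader. The following is a minimal sketch, not the article's tooling; the function name `size_by_top_level` is hypothetical, and it reports compressed bytes as stored in the archive.

```python
import io
import zipfile
from collections import defaultdict


def size_by_top_level(apk_file):
    """Sum compressed bytes per top-level entry of an APK (a zip archive)."""
    totals = defaultdict(int)
    with zipfile.ZipFile(apk_file) as apk:
        for info in apk.infolist():
            if info.is_dir():
                continue
            # "res/drawable/a.png" -> "res"; root files such as
            # "classes.dex" are reported under their own name.
            top = info.filename.split("/", 1)[0]
            totals[top] += info.compress_size
    return dict(totals)
```

This groups files only by directory; attributing each file to a library requires the merge data described in the following sections.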
Intermediate files : During the build, Gradle generates merge mapping files under app/build/intermediates/incremental for resources, assets, native libraries and Java resources. The article lists the exact paths for each type in a table.
Collecting intermediate files : A Gradle script is injected into app/build.gradle to capture the absolute paths of these files using the Android plugin API. Each path is appended as a key:path line to merge_files.txt in the merge_state folder of the build directory. The script is shown below:
```groovy
def extension_merge_state = project.extensions.getByName("android")
extension_merge_state.applicationVariants.all { variant ->
    def variant_name = variant.name

    // 1. res merge file
    def mergeResourcesTask = variant.getMergeResources()
    mergeResourcesTask.doLast {
        def mergeResFile = mergeResourcesTask.incrementalFolder.absolutePath + "/merger.xml"
        println(mergeResFile)
        appendFilePath("mergeResources.xml", mergeResFile)
    }

    // 2. assets merge file
    def mergeAssetsTask = variant.getMergeAssets()
    mergeAssetsTask.doLast {
        def assetsMergerFile = mergeAssetsTask.incrementalFolder.absolutePath + "/merger.xml"
        println(assetsMergerFile)
        appendFilePath("mergeAssets.xml", assetsMergerFile)
    }

    // 3. native lib merge state
    def container = variant.variantData.taskManager.taskFactory.taskContainer
    Task mergeNativeLibsTask = container.getByName("merge${variant_name.capitalize()}NativeLibs")
    mergeNativeLibsTask.doLast {
        def cache_merge_state = mergeNativeLibsTask.cacheDir.parent + "/merge-state"
        println(cache_merge_state)
        appendFilePath("mergeNativeLibs_merge_state", cache_merge_state)
    }

    // 4. java resource merge state
    Task mergeJavaResourceTask = container.getByName("merge${variant_name.capitalize()}JavaResource")
    mergeJavaResourceTask.doLast {
        def cache_merge_state = mergeJavaResourceTask.cacheDir.parent + "/merge-state"
        println(cache_merge_state)
        appendFilePath("mergeJavaRes_merge_state", cache_merge_state)
    }
}

// Appends one "key:path" line to <buildDir>/merge_state/merge_files.txt
void appendFilePath(file_key, file_path) {
    String input_dir = project.buildDir.toPath().toString() + "/merge_state"
    File dir = new File(input_dir)
    if (!dir.exists()) {
        dir.mkdirs()
    }
    File inputFile = new File(dir.absolutePath, "merge_files.txt")
    if (!inputFile.exists()) {
        inputFile.createNewFile()
    }
    inputFile.append("${file_key}:${file_path}\n")
}
```
After the build, these paths are uploaded to a Maven repository via a Python script for later analysis.
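The Gradle script writes one key:path line per intermediate file, so a post-build step can read merge_files.txt back into a lookup table. The sketch below assumes exactly that line format; the function name `read_merge_files` is hypothetical and this is not the article's actual Python script.

```python
def read_merge_files(text):
    """Parse 'key:path' lines into a dict.

    The Gradle script appends on every build, so a key may repeat;
    the last occurrence wins here.
    """
    paths = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so colons inside the path
        # (for example a Windows drive letter) are preserved.
        key, sep, path = line.partition(":")
        if sep:
            paths[key] = path
    return paths
```

Keeping the last entry per key is a design choice for this sketch: if the file is not cleaned between builds, the most recent build's paths take precedence.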
Parsing merge‑state files : The merge-state files are Java serialized objects of class com.android.builder.merge.IncrementalFileMergerState. A small Java utility, MergeStateParser, deserializes the object and writes it out as JSON:
```java
import com.android.builder.merge.IncrementalFileMergerState;
import com.google.gson.Gson;

import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.ObjectInputStream;

public class MergeStateParser {

    // Usage: MergeStateParser <merge-state path> <output JSON path>
    public static void main(String[] args) {
        parseObject(args[0], args[1]);
    }

    public static void parseObject(String mergeStatePath, String outputJsonPath) {
        // try-with-resources closes the input stream even if deserialization fails
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(mergeStatePath))) {
            IncrementalFileMergerState mergeState = (IncrementalFileMergerState) ois.readObject();
            String json = new Gson().toJson(mergeState);
            // The output file is only created after a successful read
            try (FileWriter fw = new FileWriter(outputJsonPath)) {
                fw.write(json);
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
```
In addition to Gson and the Android build library that provides IncrementalFileMergerState, the parser project must depend on com.google.guava:guava:27.0.1-jre.
Interpreting the data : The merger.xml files list each library (identified by group_id:artifact_id:version) and the files it contributed. In the deserialized merge-state, the byInput map links each library artifact path to the set of files that artifact contributed. Example snippets of the XML and the Java fields are shown in the original article.
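To make the merger.xml side concrete, the sketch below groups file paths by the library that contributed them. The XML shape is an assumption, since the original snippets are not reproduced here: it assumes each library is a `dataSet` element whose `config` attribute holds the coordinates, with nested `file` elements carrying a `path` attribute. The sample document, the `com.example` group, and the function name `files_by_library` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample; element and attribute names are assumptions.
SAMPLE_MERGER_XML = """<merger version="3">
  <dataSet config="com.example:lib_castscreen:1.1.1">
    <source path="/cache/lib_castscreen/res">
      <file name="ic_cast" path="/cache/lib_castscreen/res/drawable/ic_cast.png"/>
      <file path="/cache/lib_castscreen/res/values/strings.xml"/>
    </source>
  </dataSet>
  <dataSet config="main">
    <source path="/app/src/main/res">
      <file name="logo" path="/app/src/main/res/drawable/logo.png"/>
    </source>
  </dataSet>
</merger>"""


def files_by_library(xml_text):
    """Map each dataSet's config value to the file paths listed under it."""
    root = ET.fromstring(xml_text)
    result = defaultdict(list)
    for data_set in root.iter("dataSet"):
        library = data_set.get("config", "unknown")
        for file_node in data_set.iter("file"):
            path = file_node.get("path")
            if path:
                result[library].append(path)
    return dict(result)
```

If a real merger.xml differs from this assumed shape, only the element and attribute names in `files_by_library` should need adjusting.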
Mapping libraries to owners : Using the extracted group_id:artifact_id information, the team registers each library in the internal KeOnes CI system, linking it to the responsible maintenance team. Paths to the actual JAR/AAR files in the Gradle cache are also recorded (e.g., ~/.gradle/caches/transforms-2/files-2.1/.../jetified-lib_castscreen-1.1.1/jars/classes.jar).
Application in package‑size analysis : With the library‑to‑file mapping and the owner registry, the system can compute the size contribution of each team’s libraries, display per‑team percentages, and identify large files. Screenshots in the article illustrate the dashboard.
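The per-team computation reduces to two lookups and a sum. The sketch below assumes three inputs that the earlier steps would produce: file sizes in bytes, a file-to-library map, and a library-to-team registry. All names, including the "unowned" bucket, are hypothetical and do not describe the KeOnes implementation.

```python
from collections import defaultdict


def team_report(file_sizes, file_to_library, library_to_team):
    """Return {team: (bytes, percent_of_total)} for the given files."""
    totals = defaultdict(int)
    for path, size in file_sizes.items():
        library = file_to_library.get(path)
        # Files with no known library or owner land in an "unowned" bucket
        # so that the percentages still add up to the whole package.
        team = library_to_team.get(library, "unowned")
        totals[team] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        team: (size, round(100.0 * size / grand_total, 1))
        for team, size in totals.items()
    }
```

Keeping an explicit "unowned" bucket makes gaps in the registry visible in the report instead of silently shrinking the total.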
Pitfalls :
Only artifact_id and version are present in merge-state; the team resolves the missing group_id by matching against the full runtime dependency list.
Libraries placed directly in app/lib lack Maven coordinates; the recommendation is to publish them to Maven and assign coordinates manually.
Dex files cannot be split by library, but class and method counts per library can still be derived.
Older Gradle versions may not generate the intermediate files.
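The group_id lookup from the first pitfall can be sketched as a match of artifact_id:version against the full coordinates in the runtime dependency list. The input format and the function name `resolve_coordinates` are assumptions for illustration; the article does not show how the team implements the match.

```python
def resolve_coordinates(artifact_version, runtime_dependencies):
    """Find the full 'group:artifact:version' for an 'artifact:version' key.

    Returns None when there is no match or when several groups publish the
    same artifact_id and version, since the result would be ambiguous.
    """
    matches = [
        dep
        for dep in runtime_dependencies
        # Drop the group part and compare the remaining "artifact:version".
        if dep.split(":", 1)[1:] == [artifact_version]
    ]
    return matches[0] if len(matches) == 1 else None
```

Returning None on ambiguity, rather than picking the first match, avoids attributing size to the wrong team; such cases can then be reviewed by hand.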
Conclusion : By extracting and parsing Gradle’s intermediate merge files, and maintaining a registry of library ownership, the team achieves fine‑grained APK size analysis, enabling accurate responsibility allocation and supporting downstream use cases such as crash routing and dependency checks.
Beike Product & Technology
As Beike's official product and technology account, we are committed to building a platform for sharing Beike's product and technology insights, targeting internet/O2O developers and product professionals. We share high-quality original articles, tech salon events, and recruitment information weekly. Welcome to follow us.