
How We Built Tinker: Overcoming Android Hot‑Patch Challenges for High‑Availability

This article chronicles WeChat's journey developing the Tinker hot‑patch framework, detailing the technical hurdles of native vs. Java approaches, performance optimizations, platform‑specific issues, and the rigorous testing that enabled a high‑availability solution for hundreds of millions of Android devices.

Tencent TDS Service

Hot Patch Technology Background

Hot patching is not a simple solution; it has its own limitations and requires thorough understanding before use.

Hot patches are not a free lunch.

We share WeChat's experience to help developers decide whether to adopt hot‑patch technology and which solution fits their projects.

Android hot‑patch technology can be divided into two streams:

Native – represented by Alibaba's Dexposed, AndFix, and Tencent's internal KKFix.

Java – represented by Qzone's SuperPatch, Dianping's Nuwa, Baidu Financial's rocooFix, Ele.me's amigo, and Meituan's robust.

Both streams have their own advantages and drawbacks; there is no universally best solution, only the most suitable one.

For WeChat, a "high‑availability" patch framework must satisfy three conditions:

Stability & Compatibility – WeChat runs on hundreds of millions of devices; even a 1% error rate would affect millions of users.

Performance – The framework must not degrade app performance, and patch packages should be as small as possible to save bandwidth and improve success rates.

Ease of Use – The framework should be simple to integrate and support full‑feature releases.

WeChat evaluated two existing solutions:

Dexposed/AndFix – face stability and compatibility challenges, and their native crashes are hard to diagnose. They also cannot add new classes, which rules out feature‑level releases.

Qzone – suffered performance penalties on Dalvik due to instrumentation and produced large patch packages on ART because of address‑offset issues.

In March 2016, WeChat decided to build its own patch framework, Tinker. Tinker’s development proceeded in three stages, with v1.0 focusing on meeting performance requirements for a Dex‑based patch framework.

Tinker v1.0 – Pursuing Extreme Performance

We chose the Java stream for stability and compatibility. The main challenge was overcoming Qzone’s performance problem, which led us to study Instant Run’s cold‑swap and Buck’s exopackage concepts, both of which replace the entire Dex.

Shipping a complete new Dex avoids both the ART address‑misalignment problem and the Dalvik instrumentation penalty. To keep the patch small, we store only the differences between the old and new Dex. The methods we investigated include:

BsDiff – format‑agnostic but unstable for Dex and not ideal for our use case.

DexMerge – consumes excessive memory (a 12 MB Dex can peak at over 70 MB).

DexDiff – a custom algorithm that yields minimal diff size, low memory usage, and supports add/delete/modify operations.

Choosing DexDiff allowed us to achieve the required performance while keeping patch size minimal.

1. DexDiff Technical Practice

Deep analysis of the Dex format revealed three major difficulties:

Complex Dex structure – indexes (StringID, TypeID, etc.) and data sections use offsets that are heavily inter‑referenced; a tiny change can cascade into many index and offset modifications.

dexopt and dex2oat verification – the system performs checks such as 4‑byte alignment and sorting of StringID, TypeID, etc.

Low‑memory, fast processing – each Dex block must be read and written only once; we cannot afford full‑structure approaches like baksmali or DexMerge.
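The two verification rules named above (4‑byte alignment and sorted ID tables) can be sketched in a few lines of Java. This is only an illustration, not the runtime's verifier or Tinker's code: the class and method names are hypothetical, and ordering uses Java's natural comparison as a simplification of the Dex ordering rules.

```java
import java.util.List;

/** Hypothetical helper illustrating the layout rules; not part of Tinker or Android. */
final class DexLayoutChecks {
    private DexLayoutChecks() {}

    /** True when the offset sits on a 4-byte boundary. */
    static boolean isFourByteAligned(int offset) {
        return (offset & 3) == 0;
    }

    /** Rounds an offset up to the next 4-byte boundary (unchanged if already aligned). */
    static int alignUp4(int offset) {
        return (offset + 3) & ~3;
    }

    /** True when every item is strictly greater than its predecessor (sorted, no duplicates). */
    static <T extends Comparable<T>> boolean isStrictlySorted(List<T> items) {
        for (int i = 1; i < items.size(); i++) {
            if (items.get(i - 1).compareTo(items.get(i)) >= 0) {
                return false;
            }
        }
        return true;
    }
}
```

A patch writer that emits sections in order would need to apply something like alignUp4 before each aligned section and keep the ID tables sorted as it adds or removes entries, otherwise the rebuilt Dex would be rejected at load time.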

For the Index region, the diff algorithm generates the smallest operation sequence to transform the old sequence into the new one, e.g.:

Del 2 – delete element "b" at index 2, storing only the index to reduce patch size.

Elements "c", "d", and "e" shift forward automatically, so no operation is needed for them.

Add f(5) – insert element "f" at position 5.
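To make the operation sequence concrete, here is a minimal Java sketch of how such a patch could be applied in a single forward pass. It is not Tinker's actual DexDiff format: the Op type, the 1‑based indexing, and the assumed old sequence a, b, c, d, e are illustrative choices inferred from the example. Deletion indices refer to positions in the old sequence, insertion indices to positions in the new one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of an index-section patch; not Tinker's DexDiff format. */
final class IndexPatchSketch {
    /** One patch operation. A deletion stores only an index; an addition also stores the value. */
    static final class Op {
        final boolean isAdd;
        final int index;     // 1-based: old-sequence position for Del, new-sequence position for Add
        final String value;  // null for deletions

        private Op(boolean isAdd, int index, String value) {
            this.isAdd = isAdd;
            this.index = index;
            this.value = value;
        }

        static Op del(int index) { return new Op(false, index, null); }
        static Op add(int index, String value) { return new Op(true, index, value); }
    }

    /** Rebuilds the new sequence, reading each old element and each operation exactly once. */
    static List<String> apply(List<String> oldItems, List<Op> ops) {
        List<String> out = new ArrayList<>();
        int oldPos = 1; // next old element to consume (1-based)
        int opIdx = 0;  // next operation to consume; ops are assumed to be in application order
        while (oldPos <= oldItems.size() || opIdx < ops.size()) {
            Op op = opIdx < ops.size() ? ops.get(opIdx) : null;
            if (op != null && op.isAdd && op.index == out.size() + 1) {
                out.add(op.value);                 // insert at the current output position
                opIdx++;
            } else if (op != null && !op.isAdd && op.index == oldPos) {
                oldPos++;                          // skip the deleted old element
                opIdx++;
            } else if (oldPos <= oldItems.size()) {
                out.add(oldItems.get(oldPos - 1)); // unchanged element: copy, no operation needed
                oldPos++;
            } else {
                throw new IllegalArgumentException("operation does not match the sequence");
            }
        }
        return out;
    }
}
```

Applying del(2) followed by add(5, "f") to a, b, c, d, e copies "a", skips "b", copies "c", "d", "e" without any stored operation, and appends "f", giving a, c, d, e, f.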

For the Offset region, many sections exist, making it more complex. DexDiff processes each operation individually, avoiding loading the entire Dex into memory, which keeps memory consumption low.

The algorithm solves Dalvik’s performance loss and ART’s large patch‑size problems, at the cost of a modest increase in ROM usage (tens of MB), which WeChat deemed acceptable given modern device storage.

2. Challenges on Android N

After release, Huawei reported a crash that occurred only on Android N. The issue stemmed from Android N’s mixed‑mode compilation, which broke our Java‑based hot‑patch approach.

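We traced the problem to the app image that Android N generates under mixed‑mode compilation: classes recorded in the image are already loaded into the class loader before the patch is applied, so patched and unpatched code can end up with mismatched addresses.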

3. Vendor OTA Challenges

Mi devices reported black screens and ANRs during WeChat startup after OTA updates. The root cause was that an OTA changes the system image, which invalidates the existing odex files; on ART, where the image offsets differ after the update, the patched Dex then has to be recompiled when it is loaded.

We introduced a split‑platform synthesis strategy: full‑Dex synthesis on Dalvik and small‑Dex synthesis on ART, reducing ROM impact and avoiding large patches on ART.
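A split‑platform strategy needs a way to tell the two runtimes apart. One documented signal is the java.vm.version system property, which reports 2.0.0 or higher on ART and 1.x on Dalvik. The sketch below assumes that signal; the class name, the enum, and the choose method are hypothetical and only illustrate the decision described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of choosing a synthesis strategy per runtime. */
final class SynthesisStrategySketch {
    enum Strategy { FULL_DEX, SMALL_DEX }

    private static final Pattern VM_VERSION = Pattern.compile("(\\d+)\\.(\\d+)(\\.\\d+)?");

    private SynthesisStrategySketch() {}

    /** True when the version string looks like ART's (major version 2 or higher). */
    static boolean isVmArt(String vmVersion) {
        if (vmVersion == null) {
            return false;
        }
        Matcher m = VM_VERSION.matcher(vmVersion);
        if (!m.matches()) {
            return false; // unknown format: treat as not ART
        }
        try {
            return Integer.parseInt(m.group(1)) >= 2;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Full-Dex synthesis on Dalvik, small-Dex synthesis on ART, as described in the text. */
    static Strategy choose(String vmVersion) {
        return isVmArt(vmVersion) ? Strategy.SMALL_DEX : Strategy.FULL_DEX;
    }
}
```

On a device this would be called as SynthesisStrategySketch.choose(System.getProperty("java.vm.version")).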

4. Other Technical Challenges

We also dealt with issues such as Xposed plugins causing pre‑verified class crashes on Dalvik and stale code on ART, and Dex reflection failures on certain Samsung devices. Our mitigations included detecting Xposed and clearing patches, and adding a runtime flag to verify successful Dex reflection.
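The "detect Xposed, then clear patches" mitigation could look roughly like the following. This is a sketch under assumptions, not Tinker's implementation: it assumes Xposed can be detected by probing for its bridge class, de.robv.android.xposed.XposedBridge, and the class, method names, and patch directory parameter are hypothetical.

```java
import java.io.File;

/** Hypothetical guard that refuses to load patches when Xposed appears to be present. */
final class XposedGuardSketch {
    static final String XPOSED_BRIDGE = "de.robv.android.xposed.XposedBridge";

    private XposedGuardSketch() {}

    /** Probes for a class by name without initializing it. */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, ClassLoader.getSystemClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Deletes a file or directory tree; returns false if anything could not be removed. */
    static boolean deleteRecursively(File file) {
        boolean ok = true;
        File[] children = file.listFiles(); // null for plain files and unreadable directories
        if (children != null) {
            for (File child : children) {
                ok &= deleteRecursively(child);
            }
        }
        return file.delete() && ok;
    }

    /** Returns true if patches may be loaded; otherwise clears the patch directory first. */
    static boolean checkAndClear(File patchDir) {
        if (isClassPresent(XPOSED_BRIDGE)) {
            deleteRecursively(patchDir);
            return false;
        }
        return true;
    }
}
```

Clearing the patch rather than merely skipping it means the app falls back to the baseline code on every later launch, which is the conservative choice when the environment is known to interfere with class loading.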

Tinker v1.0 Summary

1. Performance

Patch size is typically under 10 KB.

Performance impact is negligible; the 2% overhead comes mainly from MD5 verification of the patch Dex.

On ART, split‑platform synthesis keeps ROM usage comparable to the Qzone solution.
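The MD5 verification mentioned above is a streaming hash comparison, which is why its cost grows with the size of the patch Dex. A minimal sketch using the standard java.security.MessageDigest API follows; the class and method names are hypothetical, and where the expected digest comes from is left open.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of verifying a patch Dex against an expected MD5 digest. */
final class DexMd5Sketch {
    private DexMd5Sketch() {}

    /** Streams the input through MD5 and returns the digest as lowercase hex. */
    static String md5Hex(InputStream in) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        byte[] buffer = new byte[8192];
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read); // hash chunk by chunk, never the whole file at once
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** True when the stream's digest equals the expected value (case-insensitive hex). */
    static boolean matches(InputStream dex, String expectedMd5) {
        return md5Hex(dex).equalsIgnoreCase(expectedMd5);
    }
}
```

In practice the stream would be a FileInputStream over the synthesized Dex, and a mismatch would cause the patch to be skipped rather than loaded.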

2. Success Rate

Success Rate = Number of users who upgraded to the patch version / Number of users on the baseline version

Three days after release, 94.1% of baseline users successfully upgraded to the patch version. The Qzone‑based approach achieved about 96.3% in the same period, indicating room for improvement.

Tinker v2.0 – Pursuing Stability

While v1.0 addressed performance, stability and compatibility under high‑availability remain open challenges. v2.0 will focus on comprehensive monitoring, automatic rollback, and robust exception handling.
