Optimizing Android Threads with Bytecode Instrumentation and Proxy Pools

This article presents a comprehensive approach to reducing thread overhead in Android applications by detecting, counting, and consolidating thread creation through bytecode instrumentation, proxy thread pools, and stack size trimming, ultimately improving memory usage, CPU load, and UI smoothness.

Huolala Tech
Background

Most apps depend on second-party and third-party libraries. These libraries can raise the thread count, cause thread contention, and block the main thread.

Thread count increase: libraries may start background threads.

Thread competition: multiple threads run simultaneously.

Thread blocking: libraries may block the main thread.

Overall Idea

To solve the thread pressure caused by these libraries, the following steps are taken:

Thread detection: evaluate the room for optimization.

Thread statistics: collect data.

Thread and thread-pool optimization: converge the number of threads.

Thread stack trimming: reduce per-thread memory.

Specific Solution

1. Thread Detection

The most common way to obtain thread information is to read the pseudo-file system under /proc/self/task, which contains one directory per thread. The following Kotlin function reads each thread's status file and extracts its name, pid, and state.

private fun getThreadInfoList(): List<ThreadInfo>? {
    val file = File("/proc/self/task")
    // iterate task directories, read status file, parse name, pid, state
    // (code omitted for brevity)
    return threadInfoList
}
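
Since the body of the function above is omitted, here is a minimal Java sketch of how the status files could be parsed. It is not the article's implementation: ThreadStatus and ProcTaskReader are hypothetical names, and the sketch assumes the usual Linux layout in which each status file holds "Key:<tab>value" lines such as Name, Pid, and State.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the three fields the article mentions.
final class ThreadStatus {
    final String name;
    final int pid;
    final String state;

    ThreadStatus(String name, int pid, String state) {
        this.name = name;
        this.pid = pid;
        this.state = state;
    }
}

final class ProcTaskReader {
    // Parses the "Key:\tvalue" lines of one status file.
    static ThreadStatus parseStatus(List<String> lines) {
        String name = "";
        String state = "";
        int pid = -1;
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) continue;
            String key = line.substring(0, colon);
            String value = line.substring(colon + 1).trim();
            switch (key) {
                case "Name": name = value; break;
                case "Pid": pid = Integer.parseInt(value); break;
                case "State": state = value; break;
                default: break;
            }
        }
        return new ThreadStatus(name, pid, state);
    }

    // Reads /proc/self/task/<tid>/status for every thread of this process.
    static List<ThreadStatus> readAll() {
        List<ThreadStatus> result = new ArrayList<>();
        Path taskDir = Paths.get("/proc/self/task");
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(taskDir)) {
            for (Path dir : dirs) {
                try {
                    result.add(parseStatus(Files.readAllLines(dir.resolve("status"))));
                } catch (IOException | NumberFormatException e) {
                    // The thread may have exited between listing and reading; skip it.
                }
            }
        } catch (IOException e) {
            // /proc is unavailable (for example, not running on Linux); return what was read.
        }
        return result;
    }
}
```

Keeping parseStatus separate from the file access makes the parsing easy to check with canned input, while readAll tolerates threads that disappear mid-scan.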

After the app starts, a polling task reads the pseudo‑files, writes data to a database, and updates the UI.

Thread detection diagram

2. Thread Statistics

The statistics track three counts: threads created, threads available, and threads running. Ideally the number of available threads stays close to the number of running threads; the gap observed between the two indicates how much room there is to optimize.
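
The article does not say how it classifies threads, so the following Java sketch is only one possible approximation: it tallies the threads visible to the runtime by Thread.State, treating RUNNABLE as "running". ThreadStateTally is a hypothetical name, and native threads that never attach to the runtime are not counted this way.

```java
import java.util.EnumMap;
import java.util.Map;

final class ThreadStateTally {
    // Counts live threads known to the runtime, grouped by state.
    static Map<Thread.State, Integer> tally() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    // Sums all buckets to get the total number of threads seen.
    static int total(Map<Thread.State, Integer> counts) {
        int sum = 0;
        for (int c : counts.values()) {
            sum += c;
        }
        return sum;
    }
}
```

Comparing the RUNNABLE bucket with the total gives a rough view of how many threads exist but are not doing work at the moment of sampling.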

Thread statistics diagram

3. Thread and Thread‑Pool Optimization

3.1 Thread Optimization

For business code and self‑developed SDKs, replace direct new Thread with thread‑pool usage and give threads meaningful names.

For third-party SDKs, rename threads via instrumentation. Keep each name shorter than 16 characters, because Linux stores at most 15 characters of a thread name.
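
As an illustration of the naming rule, the sketch below shows a thread factory that gives every thread a prefix plus a running index and trims the prefix so the full name fits in 15 characters. NamedThreadFactory and fit are hypothetical names, and the sketch assumes ASCII names (one byte per character).

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedThreadFactory implements ThreadFactory {
    // Linux keeps 15 characters of a thread name plus a terminator.
    private static final int MAX_NAME_LENGTH = 15;

    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    // Builds "<prefix>-<index>", shortening the prefix when the result would be too long.
    static String fit(String prefix, int index) {
        String suffix = "-" + index;
        int room = Math.max(0, MAX_NAME_LENGTH - suffix.length());
        String head = prefix.length() > room ? prefix.substring(0, room) : prefix;
        return head + suffix;
    }

    @Override
    public Thread newThread(Runnable r) {
        return new Thread(r, fit(prefix, counter.getAndIncrement()));
    }
}
```

Trimming the prefix rather than the suffix keeps the index intact, so threads from the same pool remain distinguishable in /proc and in traces.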

3.2 Thread‑Pool Optimization

Provide common thread pools (IO, CPU, Single, Cache) for the app layer.

Allow self‑developed SDKs to use a configurable thread pool.

For third‑party SDKs, first check if they expose a custom thread‑pool interface; otherwise, instrument them.
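
A set of shared pools like the ones listed above might look like the following Java sketch. The article does not give pool sizes or timeouts, so every number here is an assumption, as are the AppExecutors name and the "app-" thread-name prefix.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class AppExecutors {
    private static final int CORES = Runtime.getRuntime().availableProcessors();

    // Sizes are illustrative assumptions, not values from the article.
    static final ExecutorService CPU = fixed(CORES, "cpu");
    static final ExecutorService IO = fixed(2 * CORES + 1, "io");
    static final ExecutorService SINGLE = fixed(1, "single");
    static final ExecutorService CACHE = new ThreadPoolExecutor(
            0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
            new SynchronousQueue<>(), factory("cache"));

    private AppExecutors() {}

    // Fixed-size pool whose idle core threads are released after 30 seconds.
    private static ExecutorService fixed(int size, String name) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                size, size, 30L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(), factory(name));
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    // Names threads "app-<pool>-<n>" so they are easy to find in thread dumps.
    private static ThreadFactory factory(String name) {
        AtomicInteger next = new AtomicInteger(1);
        return r -> {
            Thread t = new Thread(r, "app-" + name + "-" + next.getAndIncrement());
            t.setDaemon(true); // lets a plain JVM demo exit; an assumption of this sketch
            return t;
        };
    }
}
```

Callers would submit work with, for example, AppExecutors.IO.submit(task). Letting idle core threads time out means an unused pool holds no threads at all.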

Instrumentation is performed with ASM. The following visitor detects thread‑pool creation:

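import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

class ThreadPoolDetectorClassVisitor extends ClassVisitor {
    private String className; // internal name of the class being visited

    ThreadPoolDetectorClassVisitor(ClassVisitor cv) { // constructor shape is an assumption
        super(Opcodes.ASM6, cv);
    }
    @Override
    public void visit(int version, int access, String name, String signature, String superName, String[] interfaces) {
        className = name;
        super.visit(version, access, name, signature, superName, interfaces);
    }
    @Override
    public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
        MethodVisitor mv = super.visitMethod(access, name, desc, signature, exceptions);
        // Leave the tracker's own classes and other excluded packages untouched.
        if (filterClass(className)) {
            return mv;
        }
        return new ProxyThreadPoolMethodVisitor(Opcodes.ASM6, mv, className);
    }
    private boolean filterClass(String className) {
        return className.contains("com/lalamove/threadtracker/")
            || className.contains("com/lalamove/plugins/thread")
            || className.contains("com/tencent/tinker/loader")
            || className.contains("com/lalamove/huolala/client/asm/HllPrivacyManager");
    }
}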

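The method visitor rewrites calls to the constructors of java/util/concurrent/ThreadPoolExecutor so that they construct the custom proxy class BaseProxyThreadPoolExecutor instead, passing the name of the calling class as an extra argument. In scan mode it also records every class that creates a thread pool.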

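import org.objectweb.asm.MethodVisitor;

class ProxyThreadPoolMethodVisitor extends MethodVisitor {
    // O_ThreadPoolExecutor is "java/util/concurrent/ThreadPoolExecutor" and
    // O_BaseProxyThreadPoolExecutor is the proxy's internal name. These constants,
    // mClassProxy, and PluginUtils are defined elsewhere in the plugin (not shown).
    private final String className;

    ProxyThreadPoolMethodVisitor(int api, MethodVisitor mv, String className) {
        super(api, mv);
        this.className = className;
    }

    @Override
    public void visitMethodInsn(int opcode, String owner, String name, String descriptor, boolean isInterface) {
        boolean createsPool = owner.equals(O_ThreadPoolExecutor) && name.equals("<init>");
        if (createsPool && PluginUtils.getScanProject()) {
            // Scan mode: log every class that constructs a ThreadPoolExecutor.
            PluginUtils.writeClassNameToFile("Create ThreadPoolExecutor class: " + className);
        }
        if (createsPool && mClassProxy) {
            if (descriptor.equals("(IIJLjava/util/concurrent/TimeUnit;Ljava/util/concurrent/BlockingQueue;)V")) {
                // Push the calling class name as an extra trailing argument, then call the
                // proxy constructor. The matching NEW instruction must name the proxy type
                // as well for the bytecode to verify; that part is not shown here.
                mv.visitLdcInsn(className);
                mv.visitMethodInsn(opcode, O_BaseProxyThreadPoolExecutor, name,
                    "(IIJLjava/util/concurrent/TimeUnit;Ljava/util/concurrent/BlockingQueue;Ljava/lang/String;)V", false);
            } else if (descriptor.equals("(IIJLjava/util/concurrent/TimeUnit;Ljava/util/concurrent/BlockingQueue;Ljava/util/concurrent/ThreadFactory;)V")) {
                // similar rewrite with ThreadFactory
            }
            // other constructors omitted for brevity
            return;
        }
        super.visitMethodInsn(opcode, owner, name, descriptor, isInterface);
    }
}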

The proxy thread‑pool class overrides constructors, decides whether to proxy based on a whitelist, and forwards execute, submit, and shutdown methods to the original pool when needed.

open class BaseProxyThreadPoolExecutor : ThreadPoolExecutor {
    private var mProxy = true
    private val threadPoolExecutor: ThreadPoolExecutor = TrackerUtils.getProxyNetThreadPool()
    constructor(corePoolSize: Int, maximumPoolSize: Int, keepAliveTime: Long,
                unit: TimeUnit?, workQueue: BlockingQueue<Runnable>?, className: String?) :
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue) {
        init(corePoolSize, maximumPoolSize, keepAliveTime, className)
    }
    private fun init(corePoolSize: Int, maximumPoolSize: Int, keepAliveTime: Long, className: String?) {
        if (className != null) {
            mProxy = TrackerUtils.isProxy(className)
        }
        if (corePoolSize == 1 || (corePoolSize == 0 && maximumPoolSize == 1)) {
            mProxy = false
        }
        if (!mProxy) return
        // adjust keep‑alive, enable core thread timeout, etc.
    }
    override fun submit(task: Runnable): Future<*> {
        return if (mProxy) threadPoolExecutor.submit(task) else super.submit(task)
    }
    // other overrides omitted for brevity
}

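A custom ThreadFactory passes a stack size of -512 KB when it creates threads. Because the runtime adds 1 MB to the requested size (see section 4), each thread ends up with a smaller stack, which trims memory consumption: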

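import java.util.concurrent.ThreadFactory
import java.util.concurrent.atomic.AtomicInteger

open class ProxyThreadFactory : ThreadFactory {
    // One counter per factory, so every thread gets a distinct name.
    private val atomic = AtomicInteger(1)
    override fun newThread(runnable: Runnable): Thread {
        // The negative stackSize is intentional; section 4 explains how it is adjusted.
        return Thread(null, runnable, "Thread-" + atomic.getAndIncrement(), -512L * 1024)
    }
}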

3.3 Implementation Steps

Create NewThreadTrackerPlugin to read a whitelist and register ThreadTrackerTransform.

Implement ThreadTrackerTransform to apply the class visitors to all .class files.

Implement ThreadTrackerClassVisitor and ProxyThreadPoolMethodVisitor as shown above.

Implement BaseProxyThreadPoolExecutor with proxy logic.

Provide a thread_tracker.gradle file at the project root to configure the whitelist and a “scanProject” flag.

Implementation diagram

4. Thread Stack Trimming

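Android threads default to a stack of about 1 MiB. The runtime computes the final size in FixStackSize: if the requested size is 0 it substitutes the runtime's default stack size, then it adds 1 MB, and finally it adds reserved bytes for stack-overflow detection (one amount when explicit checks are enabled, another for implicit checks).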

static size_t FixStackSize(size_t stack_size) {
    if (stack_size == 0) {
        stack_size = Runtime::Current()->GetDefaultStackSize();
    }
    stack_size += 1 * MB;
    if (Runtime::Current()->ExplicitStackOverflowChecks()) {
        stack_size += GetStackOverflowReservedBytes(kRuntimeISA);
    } else {
        stack_size += 8 * kStackOverflowImplicitCheckSize;
    }
    return stack_size;
}

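Because FixStackSize adds 1 MB to whatever size is requested, passing -512 KB from the custom ProxyThreadFactory yields an effective stack of 1 MB - 512 KB = 512 KB (plus the overflow-check reserve). This halves the stack reserved for each thread, saving memory without affecting functionality.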
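
The arithmetic can be checked with a small Java sketch that mirrors the FixStackSize excerpt. StackSizeMath is a hypothetical name; the default size and the overflow-check reserve are passed in as parameters because their real values depend on the runtime and the device ABI. The native code works on an unsigned size_t, but two's-complement addition gives the same result for a negative request.

```java
final class StackSizeMath {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    // Mirrors the excerpt: substitute the default for 0, add 1 MB, add the reserve.
    static long fixStackSize(long requested, long defaultSize, long reservedBytes) {
        long size = (requested == 0) ? defaultSize : requested;
        size += 1 * MB;
        size += reservedBytes;
        return size;
    }

    public static void main(String[] args) {
        // Reserve set to 0 here only to isolate the 1 MB adjustment.
        System.out.println(fixStackSize(0, 0, 0) / KB + " KB for a default request");
        System.out.println(fixStackSize(-512 * KB, 0, 0) / KB + " KB for a -512 KB request");
    }
}
```

With the reserve left out, a request of 0 (and a runtime default of 0) comes to 1024 KB, while a request of -512 KB comes to 512 KB.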

5. Benefits and Pitfalls

Benefits

Thread count reduced from 197 to 152 (45 fewer threads).

Memory usage decreased by ~20 MiB.

CPU usage lowered from 34.83 % to 31.51 %.

UI frame rate increased from 23.36 fps to 36.3 fps (≈13 fps gain).

Performance comparison

Pitfalls

Keep network tasks and local tasks on separate threads so that slow network calls do not block local work.

Avoid proxying inter‑dependent thread pools that could cause deadlocks.

Do not proxy pools with a core size of 1, as the optimization gain is minimal and may increase thread usage.