Why SkyWalking’s ThreadPool Plugin Failed and How I Fixed It
This article explains why the thread-pool plugin enhancement failed in SkyWalking. The root cause was duplicate AgentClassLoader instances. It covers the investigation steps, the code involved, and two practical solutions that ensure reliable instrumentation of ThreadPoolExecutor.
This article is based on a friend's work; he asked how to contribute code to SkyWalking, and I suggested writing an article after finishing.
1. Talk is cheap, show the code
The fix addresses a case where the thread-pool plugin enhancement failed in certain scenarios. If you are impatient, go straight to the issue and the code.
Below are the submitted issue and PR (already merged):
issue: https://github.com/apache/skywalking/issues/10925
pull request: https://github.com/apache/skywalking-java/pull/556
2. Background
Our company uses SkyWalking for full-link tracing, with the SkyWalking agent at its core. Synchronous trace coverage is mostly provided by plugins, and the open-source community already offers a thread-pool enhancement plugin (bootstrap-plugins\apm-jdk-threadpool-plugin.jar).
To enable it, first move bootstrap-plugins\apm-jdk-threadpool-plugin.jar into the plugins directory.
After the application starts, the thread‑pool enhancement fails: the traceId printed in logs from tasks submitted to the thread pool is empty.
Main environment:
JDK 1.8
SkyWalking agent 8.14.0
Spring Boot 2.2.6.RELEASE
3. Investigation Process
3.1 Discovery about AgentClassLoader
I noticed identical log lines for the JAR scan of the plugins directory, both in the console and in skywalking-api.log.
The scan is triggered from AgentClassLoader's findClass method. Surprisingly, it runs twice, each time with a different AgentClassLoader instance.
Because bytecode enhancement relies on interceptors loaded by AgentClassLoader, I suspected the duplicate instantiation caused the failure.
I modified the source so that the second instantiation reuses the first AgentClassLoader instance.
After repackaging the agent, restarting the app, and observing the logs, the scan log appears only once and the thread‑pool enhancement works, showing matching traceId values in both the main thread and the executor thread.
public class InterceptorInstanceLoader {
    // Static fields (INSTANCE_CACHE, INSTANCE_LOAD_LOCK, EXTEND_PLUGIN_CLASSLOADERS) are omitted from this excerpt.
    public static <T> T load(String className, ClassLoader targetClassLoader)
            throws IllegalAccessException, InstantiationException, ClassNotFoundException, AgentPackageNotFoundException {
        // A null target means the enhanced class was loaded by the bootstrap class loader.
        if (targetClassLoader == null) {
            targetClassLoader = InterceptorInstanceLoader.class.getClassLoader();
        }
        String instanceKey = className + "_OF_" + targetClassLoader.getClass().getName()
                + "@" + Integer.toHexString(targetClassLoader.hashCode());
        Object inst = INSTANCE_CACHE.get(instanceKey);
        if (inst == null) {
            INSTANCE_LOAD_LOCK.lock();
            ClassLoader pluginLoader;
            try {
                pluginLoader = EXTEND_PLUGIN_CLASSLOADERS.get(targetClassLoader);
                // ------------------- key code here -------------------
                // A new AgentClassLoader is created for each target class loader not seen before.
                if (pluginLoader == null) {
                    pluginLoader = new AgentClassLoader(targetClassLoader);
                    EXTEND_PLUGIN_CLASSLOADERS.put(targetClassLoader, pluginLoader);
                }
            } finally {
                INSTANCE_LOAD_LOCK.unlock();
            }
            inst = Class.forName(className, true, pluginLoader).newInstance();
            if (inst != null) {
                INSTANCE_CACHE.put(instanceKey, inst);
            }
        }
        return (T) inst;
    }
}

Test code:
@GetMapping(value = "testHello")
public String testHello(@RequestParam("msg") String msg) {
    log.info("testHello:{}, traceId:{}", msg, org.apache.skywalking.apm.toolkit.trace.TraceContext.traceId());
    ThreadPoolExecutor executor = new ThreadPoolExecutor(10, 10, 1000, TimeUnit.HOURS, new ArrayBlockingQueue<>(10));
    executor.submit(() -> {
        log.info("executor.execute:{}, traceId:{}", msg, org.apache.skywalking.apm.toolkit.trace.TraceContext.traceId());
    });
    return "ok";
}

Log output:
2023-06-21 20:38:55.359 INFO 20808 - [nio-8081-exec-1] c.ScController : testHello:111, traceId:753241bce942437a9cfd0daea7e21578.65.16873511352010001
2023-06-21 20:38:55.368 INFO 20808 - [pool-2-thread-2] c.ScController : executor.execute:111, traceId:753241bce942437a9cfd0daea7e21578.65.16873511352010001

Using a singleton AgentClassLoader resolved the issue.
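To make the difference concrete, here is a minimal toy model of the two behaviours. It is only a sketch, not SkyWalking code: `ToyPluginLoader` and `ToyLoaderRegistry` are hypothetical names, and a counter stands in for the JAR scan. With a per-target cache, the first lookup creates a second loader and triggers a second scan; when the default loader is reused, only one scan happens.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for AgentClassLoader: each new instance counts as one plugin JAR scan. */
final class ToyPluginLoader {
    static final AtomicInteger SCANS = new AtomicInteger();
    final ClassLoader parent;

    ToyPluginLoader(ClassLoader parent) {
        this.parent = parent;
        SCANS.incrementAndGet(); // stands in for scanning the plugins directory
    }
}

/** Hypothetical registry that either creates one loader per target or always reuses the default one. */
final class ToyLoaderRegistry {
    private final boolean reuseDefault;
    private final ToyPluginLoader defaultLoader;
    private final Map<ClassLoader, ToyPluginLoader> perTarget = new HashMap<>();

    ToyLoaderRegistry(boolean reuseDefault) {
        this.reuseDefault = reuseDefault;
        // The default loader is created once at "agent startup".
        this.defaultLoader = new ToyPluginLoader(ToyLoaderRegistry.class.getClassLoader());
    }

    ToyPluginLoader defaultLoader() {
        return defaultLoader;
    }

    synchronized ToyPluginLoader loaderFor(ClassLoader target) {
        if (reuseDefault) {
            return defaultLoader; // singleton behaviour: no second instance, no second scan
        }
        // Mirror the null handling in the excerpt above: fall back to this class's own loader.
        ClassLoader key = (target != null) ? target : ToyLoaderRegistry.class.getClassLoader();
        return perTarget.computeIfAbsent(key, ToyPluginLoader::new);
    }
}
```

In this model, `new ToyLoaderRegistry(false).loaderFor(null)` returns a loader different from the default one, while `new ToyLoaderRegistry(true).loaderFor(null)` returns the default loader itself.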
3.2 ThreadPoolExecutor class loading timing
With only the Tomcat plugin and the thread‑pool plugin, the enhancement succeeds, suggesting that other plugins may cause conflicts.
I added -XX:+TraceClassLoading to the JVM options and observed that ThreadPoolExecutor is loaded after the transformer is registered, so my earlier guess that it had been loaded before registration was wrong.
Further debugging showed that the first instantiation of ThreadPoolExecutor occurs in SkyWalking's logging component (FileWriter).
3.3 Guess about JVM transform logic
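The JVM appears to skip transform for a class B that is loaded as a dependency while class A's transform is still running, which prevents recursive transform calls. If ThreadPoolExecutor is first loaded in that window, it is never passed to the transformer and therefore never enhanced.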
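The guess can be illustrated with a small toy model. This is only a sketch of the suspected behaviour, not JVM code and not a real instrumentation API: `ToyInstrumentation` is a hypothetical class that uses a thread-local flag to skip any class "loaded" while another transform is still running on the same thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a re-entrancy guard around transform calls. */
final class ToyInstrumentation {
    private final ThreadLocal<Boolean> inTransform = ThreadLocal.withInitial(() -> false);
    final List<String> transformed = new ArrayList<>();
    final List<String> skipped = new ArrayList<>();

    /**
     * Simulates loading a class. transformerBody plays the role of the
     * transformer's work, which may itself cause further classes to load.
     */
    void load(String className, Runnable transformerBody) {
        if (inTransform.get()) {
            // Loaded while another transform is running: the transformer never sees it.
            skipped.add(className);
            return;
        }
        inTransform.set(true);
        try {
            transformed.add(className);
            transformerBody.run();
        } finally {
            inTransform.set(false);
        }
    }
}
```

For example, if transforming a class `A` causes `java.util.concurrent.ThreadPoolExecutor` to load, the model records `A` as transformed and `ThreadPoolExecutor` as skipped. A later, separate load of another class is transformed normally.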
3.4 Problem summary
All doubts are cleared. The two questions from section 3.1 are answered:
Q: Why do official test cases pass?
A: Tests only enhance ThreadPoolExecutor alone, without third‑party plugin interference.
Q: If duplicate AgentClassLoader caused the failure, why don't other classes suffer?
A: Only ThreadPoolExecutor enhancement depends on a third‑party class during transform; other classes do not have this dependency.
4. Solution
Two approaches were considered:
Approach 1: Make AgentClassLoader a singleton to avoid duplicate JAR scans and the resulting ThreadPoolExecutor dependency.
Approach 2: Remove the logging component's (FileWriter) dependency on ThreadPoolExecutor by replacing it with an alternative implementation (e.g., a simple thread loop).
The first approach was rejected after discussion, so the second approach was adopted, replacing the FileWriter thread‑pool with a custom thread‑loop implementation.
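The thread-loop idea can be sketched as follows. This is a minimal illustration under my own assumptions, not the code merged in the PR: `LoopLogWriter`, its buffer size, and the `Consumer<String>` sink are all hypothetical. The point is that a plain daemon Thread drains a bounded queue periodically, so no ThreadPoolExecutor is used by the writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Hypothetical log writer that flushes from a plain thread loop instead of an executor. */
final class LoopLogWriter implements AutoCloseable {
    private final BlockingQueue<String> buffer = new ArrayBlockingQueue<>(1024);
    private final Consumer<String> sink;
    private final Thread worker;
    private volatile boolean running = true;

    LoopLogWriter(Consumer<String> sink, long flushIntervalMillis) {
        this.sink = sink;
        this.worker = new Thread(() -> {
            while (running) {
                drain();
                try {
                    Thread.sleep(flushIntervalMillis);
                } catch (InterruptedException e) {
                    break; // close() interrupts the sleep to stop promptly
                }
            }
            drain(); // final flush of anything still buffered
        }, "toy-log-writer");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Buffers a message; returns false if the buffer is full and the message is dropped. */
    boolean write(String message) {
        return buffer.offer(message);
    }

    private void drain() {
        List<String> batch = new ArrayList<>();
        buffer.drainTo(batch);
        batch.forEach(sink);
    }

    /** Stops the loop and waits for the final flush. */
    @Override
    public void close() {
        running = false;
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
    }
}
```

A caller would create the writer with a sink that appends to a file, call `write` from application threads, and call `close` on shutdown. The bounded queue drops messages when full rather than blocking the caller, which is a design choice of this sketch.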
5. Related issues and discussions
https://github.com/apache/skywalking/issues/9425
https://github.com/apache/skywalking/issues/9850
https://github.com/apache/skywalking/issues/10374
https://github.com/apache/skywalking/issues/10685
https://github.com/apache/skywalking/discussions/10207
https://github.com/apache/skywalking/discussions/9888
Xiao Lou's Tech Notes
Backend technology sharing, architecture design, performance optimization, source code reading, troubleshooting, and pitfall practices
