Developing OpenTelemetry Instrumentation for PowerJob Using Java Agent and ByteBuddy
This article explains how to create OpenTelemetry instrumentation for the PowerJob distributed scheduler by implementing a Java agent with ByteBuddy, covering background, prerequisite knowledge, entry point discovery, version selection, implementation details, common pitfalls, Muzzle validation, and unit testing.
Background
Our company uses PowerJob as a distributed scheduling system and OpenTelemetry as the observability foundation, but OpenTelemetry does not yet provide support for PowerJob; only XXL‑JOB is supported. The article describes the development of PowerJob instrumentation to fill this gap.
Some developers in the company also have similar needs, which motivated the creation of the instrumentation.
The end result is a complete trace that links the gRPC consumer, the Pulsar message, and the gRPC provider.
From the diagram we can see that grpc-consumer provides the scheduling entry, sends a Pulsar message, and finally calls the gRPC interface of the provider. This allows us to capture the JobId, parameters, and other data from PowerJob for easier debugging.
Prerequisite Knowledge for Instrumentation
Before writing the instrumentation, you need to understand how existing instrumentations are organized.
Taking the existing gRPC instrumentation as an example, we can see that it includes a separate library module.
There are two main ways to instrument a library:
Library instrumentation
Java agent instrumentation
When instrumenting a framework or library, the first step is to locate its entry point. For gRPC, both client and server interceptors are available:
io.grpc.ClientInterceptor
io.grpc.ServerInterceptor
We can add tracing logic to these interceptors. For example, the client side uses io.opentelemetry.instrumentation.grpc.v1_6.TracingClientInterceptor.
The code resides in the grpc-1.6/library module, allowing users who do not want to use a javaagent to manually include the library and still get tracing.
implementation(project(":instrumentation:grpc-1.6:library"))
If a library does not expose an extension API, we must fall back to bytecode manipulation via a Java agent, which works for any Java code.
PowerJob does not provide extension interfaces, so only agent‑based instrumentation is possible.
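To make the agent approach concrete, here is a minimal sketch of a Java agent entry point that uses only the JDK's java.lang.instrument API. The names SketchAgent and RecordingTransformer, and the package-prefix check, are hypothetical and exist only for illustration; the OpenTelemetry agent relies on ByteBuddy for matching and rewriting rather than on a hand-written transformer like this one.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical agent class. It is package-private only to keep the sketch in
// one file; a real agent class would normally be public.
final class SketchAgent {

  // The JVM calls premain before the application's main method when the jar
  // is passed with -javaagent and its manifest names this class.
  public static void premain(String agentArgs, Instrumentation inst) {
    inst.addTransformer(new RecordingTransformer("tech/powerjob/"));
  }

  // Sees every class as it is loaded. Returning null means "leave the bytes
  // unchanged"; a real agent would return rewritten bytecode for matches.
  static final class RecordingTransformer implements ClassFileTransformer {
    private final String internalNamePrefix;
    final List<String> matched = new CopyOnWriteArrayList<>();

    RecordingTransformer(String internalNamePrefix) {
      this.internalNamePrefix = internalNamePrefix;
    }

    @Override
    public byte[] transform(
        ClassLoader loader,
        String className,
        Class<?> classBeingRedefined,
        ProtectionDomain protectionDomain,
        byte[] classfileBuffer) {
      // className uses the internal form, e.g. "tech/powerjob/worker/Foo".
      if (className != null && className.startsWith(internalNamePrefix)) {
        matched.add(className);
      }
      return null;
    }
  }
}
```

Because the transformer is invoked for classes the application never exposed for extension, this mechanism works even when a library such as PowerJob offers no interceptor API.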
Finding the Instrumentation Entry Point
Understanding the core logic of the target library is essential. For PowerJob, the core execution logic is in the process method of a class implementing BasicProcessor :
public class TestBasicProcessor implements BasicProcessor {
  @Override
  public ProcessResult process(TaskContext context) throws Exception {
    System.out.println("======== BasicProcessor#process ========");
    System.out.println("TaskContext: " + JsonUtils.toJSONString(context) + ";time = " + System.currentTimeMillis());
    return new ProcessResult(true, System.currentTimeMillis() + "success");
  }
}
This method is the ideal place to start and end a span, injecting the TaskContext data into OpenTelemetry.
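The idea of opening a span on entry and closing it on exit can be sketched in plain Java, without any agent. Every type below (Processor, Span, TracingProcessor) is a hypothetical stand-in rather than a PowerJob or OpenTelemetry class; the agent achieves a similar effect by weaving equivalent logic into process at class-load time instead of wrapping the object.

```java
import java.util.ArrayList;
import java.util.List;

final class SpanAroundProcessSketch {

  // Stand-in for PowerJob's BasicProcessor, reduced to what the sketch needs.
  interface Processor {
    boolean process(long jobId, String jobParams) throws Exception;
  }

  // Stand-in for a span: it only records what happened to it.
  static final class Span {
    final List<String> events = new ArrayList<>();
  }

  // Wraps a processor so that each call is bracketed by span start and end.
  static final class TracingProcessor implements Processor {
    private final Processor delegate;
    final Span span = new Span();

    TracingProcessor(Processor delegate) {
      this.delegate = delegate;
    }

    @Override
    public boolean process(long jobId, String jobParams) throws Exception {
      // Entry: capture the job data that should appear on the span.
      span.events.add("start jobId=" + jobId + " jobParams=" + jobParams);
      try {
        boolean ok = delegate.process(jobId, jobParams);
        span.events.add("result=" + ok);
        return ok;
      } catch (Exception e) {
        // Failure: record the error, then let it propagate unchanged.
        span.events.add("error=" + e.getMessage());
        throw e;
      } finally {
        // Exit: the span ends whether the call returned or threw.
        span.events.add("end");
      }
    }
  }

  public static void main(String[] args) throws Exception {
    TracingProcessor traced = new TracingProcessor((id, params) -> true);
    traced.process(1L, "abc");
    // prints [start jobId=1 jobParams=abc, result=true, end]
    System.out.println(traced.span.events);
  }
}
```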
Choosing the Supported Version
After locating the entry point, we must decide which PowerJob versions to support. The author chose PowerJob 4.0+ because version 4.0 was a large refactor, and the signature of the process method has remained stable since then.
Versions prior to 4.0 are not supported; interested readers can implement compatibility themselves.
Logic Implementation
The first step is to create an InstrumentationModule :
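import static java.util.Arrays.asList;

import com.google.auto.service.AutoService;
import io.opentelemetry.javaagent.extension.instrumentation.InstrumentationModule;
import io.opentelemetry.javaagent.extension.instrumentation.TypeInstrumentation;
import java.util.List;

@AutoService(InstrumentationModule.class)
public class PowerJobInstrumentationModule extends InstrumentationModule {
  public PowerJobInstrumentationModule() {
    super("powerjob", "powerjob-4.0");
  }

  @Override
  public List<TypeInstrumentation> typeInstrumentations() {
    return asList(new BasicProcessorInstrumentation());
  }
}
The core of the instrumentation is BasicProcessorInstrumentation, which matches the process method and applies advice: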
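import static io.opentelemetry.javaagent.extension.matcher.AgentElementMatchers.implementsInterface;
import static net.bytebuddy.matcher.ElementMatchers.isPublic;
import static net.bytebuddy.matcher.ElementMatchers.named;
import static net.bytebuddy.matcher.ElementMatchers.takesArguments;

import io.opentelemetry.javaagent.extension.instrumentation.TypeInstrumentation;
import io.opentelemetry.javaagent.extension.instrumentation.TypeTransformer;
import net.bytebuddy.description.type.TypeDescription;
import net.bytebuddy.matcher.ElementMatcher;

public class BasicProcessorInstrumentation implements TypeInstrumentation {
  // Select every class that implements PowerJob's BasicProcessor interface.
  @Override
  public ElementMatcher<TypeDescription> typeMatcher() {
    return implementsInterface(named("tech.powerjob.worker.core.processor.sdk.BasicProcessor"));
  }

  // Apply ProcessAdvice to the public, single-argument process method.
  @Override
  public void transform(TypeTransformer transformer) {
    transformer.applyAdviceToMethod(
        named("process").and(isPublic()).and(takesArguments(1)),
        BasicProcessorInstrumentation.class.getName() + "$ProcessAdvice");
  }
}
The advice creates a span before the method executes and ends it after the method returns or throws: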
// Nested inside BasicProcessorInstrumentation, which is why transform() refers to "$ProcessAdvice".
public static class ProcessAdvice {

  // Runs before process(): builds the request from TaskContext and starts the span.
  @Advice.OnMethodEnter(suppress = Throwable.class)
  public static void onSchedule(
      @Advice.This BasicProcessor handler,
      @Advice.Argument(0) TaskContext taskContext,
      @Advice.Local("otelRequest") PowerJobProcessRequest request,
      @Advice.Local("otelContext") Context context,
      @Advice.Local("otelScope") Scope scope) {
    Context parentContext = currentContext();
    request = PowerJobProcessRequest.createRequest(taskContext.getJobId(), handler, "process");
    request.setInstanceParams(taskContext.getInstanceParams());
    request.setJobParams(taskContext.getJobParams());
    context = helper().startSpan(parentContext, request);
    // A null context means no span was started, so there is no scope to open.
    if (context == null) {
      return;
    }
    scope = context.makeCurrent();
  }

  // Runs after process() returns or throws: ends the span and closes the scope.
  @Advice.OnMethodExit(onThrowable = Throwable.class, suppress = Throwable.class)
  public static void stopSpan(
      @Advice.Return ProcessResult result,
      @Advice.Thrown Throwable throwable,
      @Advice.Local("otelRequest") PowerJobProcessRequest request,
      @Advice.Local("otelContext") Context context,
      @Advice.Local("otelScope") Scope scope) {
    helper().stopSpan(result, request, throwable, scope, context);
  }
}
The PowerJobExperimentalAttributeExtractor adds job-specific attributes to the span:
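class PowerJobExperimentalAttributeExtractor implements AttributesExtractor<PowerJobProcessRequest, Void> {
  // Called when the span starts: copy the job data from the request onto the span.
  @Override
  public void onStart(AttributesBuilder attributes, Context parentContext, PowerJobProcessRequest req) {
    attributes.put(POWERJOB_JOB_ID, req.getJobId());
    attributes.put(POWERJOB_JOB_PARAM, req.getJobParams());
    attributes.put(POWERJOB_JOB_INSTANCE_PARAM, req.getInstanceParams());
    attributes.put(POWERJOB_JOB_INSTANCE_TRPE, req.getJobType());
  }

  // Nothing to add when the span ends; the interface still requires the method.
  @Override
  public void onEnd(AttributesBuilder attributes, Context context, PowerJobProcessRequest req, Void unused, Throwable error) {}
}
These attributes are registered when building the Instrumenter so that OpenTelemetry automatically records them.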
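The relationship between registering an extractor and having its attributes recorded on every span can be sketched with plain Java stand-ins. None of the types below (Extractor, JobRequest, MiniInstrumenter) are OpenTelemetry classes, and the attribute keys "job.id" and "job.param" are made up for the example; the sketch only assumes that registered extractors are invoked once per started span.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ExtractorRegistrationSketch {

  // Stand-in for an attributes extractor: copies request data into attributes.
  interface Extractor<R> {
    void onStart(Map<String, Object> attributes, R request);
  }

  // Stand-in for the request object that carries PowerJob data.
  static final class JobRequest {
    final long jobId;
    final String jobParams;

    JobRequest(long jobId, String jobParams) {
      this.jobId = jobId;
      this.jobParams = jobParams;
    }
  }

  // Stand-in for an instrumenter: extractors are registered once, then every
  // started span collects attributes from all of them.
  static final class MiniInstrumenter<R> {
    private final List<Extractor<R>> extractors = new ArrayList<>();

    MiniInstrumenter<R> addExtractor(Extractor<R> extractor) {
      extractors.add(extractor);
      return this;
    }

    Map<String, Object> startSpan(R request) {
      Map<String, Object> attributes = new LinkedHashMap<>();
      for (Extractor<R> extractor : extractors) {
        extractor.onStart(attributes, request);
      }
      return attributes;
    }
  }

  // Skips null values so that absent parameters do not produce attributes.
  static void putIfPresent(Map<String, Object> attributes, String key, Object value) {
    if (value != null) {
      attributes.put(key, value);
    }
  }

  public static void main(String[] args) {
    MiniInstrumenter<JobRequest> instrumenter =
        new MiniInstrumenter<JobRequest>()
            .addExtractor(
                (attrs, req) -> {
                  putIfPresent(attrs, "job.id", req.jobId);
                  putIfPresent(attrs, "job.param", req.jobParams);
                });
    // prints {job.id=1, job.param=abc}
    System.out.println(instrumenter.startSpan(new JobRequest(1L, "abc")));
  }
}
```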
Some Pitfalls
Although the instrumentation code is straightforward, submitting a PR requires passing a strict CI pipeline. Common issues include DSL mismatches (Groovy vs Kotlin), module naming conventions, and Muzzle validation.
Creating the Module
Use Kotlin as the Gradle DSL; otherwise, build errors may occur.
Module Naming
The module name must match the version string used in PowerJobInstrumentationModule , e.g., "powerjob-4.0" for version 4.0.
Muzzle Validation
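Muzzle checks that the classes, methods, and fields referenced by the instrumentation actually exist in each library version it claims to support, so the javaagent does not break the application at runtime. Example configuration: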
muzzle {
  pass {
    group.set("tech.powerjob")
    module.set("powerjob-worker")
    versions.set("[4.0.0,)")
    assertInverse.set(true)
    extraDependency("tech.powerjob:powerjob-official-processors:1.1.0")
  }
}
This configuration declares support for PowerJob 4.0 and later. Setting assertInverse to true additionally asserts that versions outside this range fail the Muzzle check, so earlier versions are explicitly excluded.
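As a rough illustration of what a range with an inclusive lower bound of 4.0.0 and no upper bound accepts, the sketch below compares dotted numeric versions. The class and method names are hypothetical and the comparison is deliberately simplified (numeric segments only, no qualifiers such as -SNAPSHOT); Muzzle resolves real version ranges through the build tooling, not through code like this.

```java
final class VersionRangeSketch {

  // Compares two dotted numeric versions segment by segment.
  // Missing segments count as 0, so "4.0" is treated as equal to "4.0.0".
  static int compare(String a, String b) {
    String[] left = a.split("\\.");
    String[] right = b.split("\\.");
    int length = Math.max(left.length, right.length);
    for (int i = 0; i < length; i++) {
      int l = i < left.length ? Integer.parseInt(left[i]) : 0;
      int r = i < right.length ? Integer.parseInt(right[i]) : 0;
      if (l != r) {
        return Integer.compare(l, r);
      }
    }
    return 0;
  }

  // True when the version is at or above the inclusive lower bound.
  static boolean atLeast(String version, String lowerBound) {
    return compare(version, lowerBound) >= 0;
  }

  public static void main(String[] args) {
    System.out.println(atLeast("5.1.0", "4.0.0")); // prints true
    System.out.println(atLeast("3.4.5", "4.0.0")); // prints false
  }
}
```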
Unit Testing
Unit tests simulate the core processor execution and assert that the expected spans and attributes are produced. The test also verifies compatibility with the latest PowerJob version allowed by versions.set("[4.0.0,)").
@Test
void testBasicProcessor() throws Exception {
  long jobId = 1;
  String jobParam = "abc";
  TaskContext taskContext = genTaskContext(jobId, jobParam);
  BasicProcessor testBasicProcessor = new TestBasicProcessor();
  testBasicProcessor.process(taskContext);
  testing.waitAndAssertTraces(trace -> {
    trace.hasSpansSatisfyingExactly(span -> {
      span.hasName(String.format("%s.process", TestBasicProcessor.class.getSimpleName()));
      span.hasKind(SpanKind.INTERNAL);
      span.hasStatus(StatusData.unset());
      span.hasAttributesSatisfying(attributeAssertions(TestBasicProcessor.class.getName(), jobId, jobParam, BASIC_PROCESSOR));
    });
  });
}
When the CI runs against newer PowerJob versions (e.g., 5.1.0), missing classes or changed signatures cause the tests to fail, and the test code has to be updated accordingly.
Conclusion
The entire instrumentation development process is relatively simple once you understand the target library's internals. The real challenge lies in passing the community's rigorous CI checks, but detailed logs and build scans make troubleshooting manageable.
References:
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/CONTRIBUTING.md
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/contributing/writing-instrumentation.md