Investigation of a Kafka Consumer Offset Anomaly Caused by Native C++ Memory Modification
This article details a Kafka consumer issue where the offset unexpectedly becomes a large timestamp due to a native C++ method altering JVM memory, explains the debugging steps, reproduces the bug with Java code, and highlights the risks of unsafe native interactions.
In this article we describe a Kafka consumer anomaly where the offset unexpectedly becomes a large timestamp value, leading to no messages being consumed.
The issue originated from a native C++ method invoked via JNI that directly modified a memory address, corrupting a JVM cached Integer object.
We first observed the problem through log output showing the offset set to 1590039403, then reproduced it with a simple Java test that prints an Integer variable which suddenly changes to the same value.
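The value 1590039403 is easier to recognise once it is decoded. The following is a minimal sketch, assuming the number is a Unix timestamp expressed in seconds; the class and method names are illustrative and do not come from the original investigation.

```java
import java.time.Instant;

public class OffsetAsTimestamp {
    // Interprets a suspicious offset as seconds since the Unix epoch.
    // Treating the value as epoch seconds is an assumption made for illustration.
    static Instant asEpochSeconds(long suspiciousOffset) {
        return Instant.ofEpochSecond(suspiciousOffset);
    }

    public static void main(String[] args) {
        // The offset value seen in the consumer log.
        System.out.println(asEpochSeconds(1590039403L)); // prints 2020-05-21T05:36:43Z
    }
}
```

An offset that decodes to a plausible wall-clock time suggests that some other code wrote a timestamp into memory the consumer later read as an offset.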
public void handleMessage() {
    Properties properties = new Properties();
    // add the consumer configuration here ....
    KafkaConsumer<String, String> consumer =
            new KafkaConsumer<>(properties, new StringDeserializer(), new StringDeserializer());
    String topic = "foo_bar";
    consumer.subscribe(Collections.singleton(topic));
    while (!stopMark) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.of(300, MILLIS));
        for (ConsumerRecord<String, String> record : records) {
            // process the message
        }
    }
}
To find out why the default offset of 0 turned into this strange value, we replaced the consumer code with the following test:
Thread t = new Thread(() -> {
    Integer a = 0;
    while (true) {
        System.out.println(a);
        try {
            TimeUnit.SECONDS.sleep(1);
        } catch (Throwable ignore) {
        }
    }
});
t.start();
The output showed the variable suddenly changing to 1590039403, matching the unexpected offset.
Further investigation traced the change to the following native method:
public native int GetConfFile(String var1, int var2, StringBuffer var4);
It returns its second argument by reference, and the C++ implementation writes directly to that memory, overwriting a cached Integer object.
To demonstrate the effect we used sun.misc.Unsafe to directly modify the internal value field of an Integer instance. This confirmed that overwriting the memory of one Integer boxed from 0 changes the value seen through every other Integer boxed from 0, because they all refer to the same cached object.
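import java.lang.reflect.Field;
import java.util.concurrent.TimeUnit;
import sun.misc.Unsafe;

// Wrapper class added so the snippet compiles; the class name is illustrative.
public class IntegerCacheCorruption {
    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException, InterruptedException {
        // Background thread: prints a boxed 0 taken from the Integer cache once per second
        Thread t = new Thread(() -> {
            Integer a = 0;
            while (true) {
                System.out.println(a);
                try {
                    TimeUnit.SECONDS.sleep(1);
                } catch (Throwable ignore) {
                }
            }
        });
        t.start();
        TimeUnit.SECONDS.sleep(5);
        // b refers to the same cached object as a
        Integer b = 0;
        // Obtain the Unsafe instance
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);
        // Look up the field of the object
        Field field = Integer.class.getDeclaredField("value");
        // Compute the field's offset within the object
        long offset = unsafe.objectFieldOffset(field);
        // Overwrite the field's value
        unsafe.putInt(b, offset, 1);
        System.out.println("+++++" + unsafe.getInt(b, offset));
    }
}
The root cause is that native code altered an object held in the JVM's Integer cache. The behaviour does not depend on the JDK version.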
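The sharing that makes this corruption visible everywhere can be observed without Unsafe. The sketch below relies only on the language guarantee that boxing an int between -128 and 127 yields the same object each time; the class and method names are illustrative.

```java
public class IntegerCacheDemo {
    // Returns true when two separately boxed copies of the same value
    // turn out to be one and the same object.
    static boolean boxesShareInstance(int value) {
        Integer first = value;   // autoboxing compiles to Integer.valueOf(value)
        Integer second = value;
        return first == second;  // reference comparison, deliberately not equals()
    }

    public static void main(String[] args) {
        // Values from -128 to 127 are always served from the Integer cache.
        System.out.println(boxesShareInstance(0));   // prints true
        System.out.println(boxesShareInstance(127)); // prints true
    }
}
```

Because every Integer boxed from 0 is the same object, a single stray write to that object's value field is seen by all code in the process that boxes 0, including library code such as the Kafka client.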
We conclude that interacting Java with native C++ requires extreme caution, as subtle bugs can corrupt JVM internals.
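One way to narrow down which native call does the damage is to check the cache before and after each suspect call. The following is a hedged sketch of such a check, not something taken from the original investigation; the class and method names are hypothetical. A JIT compiler is free to optimise boxing away, so an empty result is a hint rather than proof that the cache is intact.

```java
import java.util.ArrayList;
import java.util.List;

public class IntegerCacheCheck {
    // Scans the guaranteed cache range and reports every entry whose cached
    // object no longer unboxes to the value it was requested for.
    // Strings are used for the report so that building it never boxes an int
    // (boxing would hand back the possibly corrupted cached object).
    static List<String> findCorruptedEntries() {
        List<String> corrupted = new ArrayList<>();
        for (int expected = -128; expected <= 127; expected++) {
            Integer boxed = Integer.valueOf(expected);
            int actual = boxed.intValue();
            if (actual != expected) {
                corrupted.add(expected + " -> " + actual);
            }
        }
        return corrupted;
    }

    public static void main(String[] args) {
        // On a healthy JVM the list is empty.
        System.out.println("corrupted cache entries: " + findCorruptedEntries());
    }
}
```

Calling findCorruptedEntries() immediately after a suspect JNI call, and logging any non-empty result, ties the corruption to a specific call site instead of to a distant symptom such as a wrong consumer offset.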
In the Q&A section we summarize the four‑step reasoning behind the cache modification and confirm that the issue reproduces on JDK 17.
JD Retail Technology