How a Hidden NPE Revealed Deep Issues in Our Java Backend and RocketMQ Integration
After receiving a Sentry alert for a NullPointerException in a Java backend, the author traced the issue through user context handling with TransmittableThreadLocal, uncovered mismatched RocketMQ header propagation, multiple retry attempts, and a manual message injection, ultimately revealing how a missing header caused the NPE.
Preface
The company added extensive monitoring (interface response time, CPU, memory, error logs, etc.) and set up email alerts for abnormal situations, aiming to resolve online issues on the same day unless they are extremely tricky.
1. Cause
On a Monday morning, the author received an email forwarded by the manager about a NullPointerException (NPE) in production. The email, sent by Sentry, linked directly to the Sentry detail page, which showed key information such as the operation time, request interface, error location, and error message.
Visiting the Sentry page revealed the exact code line causing the NPE:
    notify.setName(CurrentUser.getCurrent().getUserName());
The author quickly located the line in the IDE, noted that the last modifier was a former colleague who had left a month ago, and concluded that the code lacked compatibility handling.
The problematic line simply retrieves the user name from the current user context and sets it into the notify entity, which is later persisted to the database. The notify field records the person who added the push notification, mainly for traceability during online issue investigations.
Note: The push notification mentioned here is a real‑time notification sent via a WebSocket long connection, different from the MQ message.
The CurrentUser class holds a ThreadLocal object to store user context. To ensure correct user information in thread pools, the project uses Alibaba's TransmittableThreadLocal:
    import com.alibaba.ttl.TransmittableThreadLocal;
    import lombok.Data;

    @Data
    public class CurrentUser {
        // Holds the user context per thread; TransmittableThreadLocal also carries it into pooled threads.
        private static final TransmittableThreadLocal<CurrentUser> THREAD_LOCAL = new TransmittableThreadLocal<>();
        private String id;
        private String userName;
        private String password;
        private String phone;
        ...
        public static void set(CurrentUser user) { THREAD_LOCAL.set(user); }
        public static CurrentUser getCurrent() { return THREAD_LOCAL.get(); }
    }
A global Spring MVC interceptor sets the user context into the ThreadLocal based on the request token:
    public class UserInterceptor extends HandlerInterceptorAdapter {
        @Override
        public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
            CurrentUser user = getUser(request);
            if (Objects.nonNull(user)) {
                CurrentUser.set(user);
            }
            return true;
        }
    }
In the API service layer, business methods retrieve the user via CurrentUser.getCurrent(). However, the same business layer is also used by an MQ consumer service, which does not have a logged‑in user, leading to a null user context and the NPE.
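The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the project's code: UserContext, NotifyService, and MissingContextDemo are hypothetical stand-ins, and a plain ThreadLocal replaces TransmittableThreadLocal so that the example runs without extra dependencies.

```java
// Hypothetical stand-in for the article's CurrentUser (plain ThreadLocal, no TTL dependency).
final class UserContext {
    private static final ThreadLocal<UserContext> HOLDER = new ThreadLocal<>();
    private final String userName;

    UserContext(String userName) { this.userName = userName; }

    String getUserName() { return userName; }
    static void set(UserContext user) { HOLDER.set(user); }
    static UserContext getCurrent() { return HOLDER.get(); }
    static void clear() { HOLDER.remove(); }
}

final class NotifyService {
    // Mirrors the failing line: dereferences the context without a null check.
    static String resolveNotifier() {
        return UserContext.getCurrent().getUserName();
    }
}

final class MissingContextDemo {
    // Runs the shared business method on a fresh thread and reports the outcome.
    static String callFrom(boolean loggedIn) {
        String[] outcome = new String[1];
        Thread worker = new Thread(() -> {
            try {
                if (loggedIn) {
                    UserContext.set(new UserContext("alice")); // what the MVC interceptor does
                }
                outcome[0] = NotifyService.resolveNotifier();
            } catch (NullPointerException e) {
                outcome[0] = "NPE"; // no user context on this thread
            } finally {
                UserContext.clear();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println("API thread: " + callFrom(true));          // prints "API thread: alice"
        System.out.println("MQ consumer thread: " + callFrom(false)); // prints "MQ consumer thread: NPE"
    }
}
```

The same method succeeds or fails depending only on whether the calling thread had its context populated, which is why the error never showed up in API requests.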
The initial fix was to add a null‑check and fall back to a system user:
    // Injected configuration that supplies the default (system) user
    @Autowired
    private BusinessConfig businessConfig;

    // Inside the business method
    CurrentUser user = CurrentUser.getCurrent();
    if (Objects.nonNull(user)) {
        entity.setUserId(user.getUserId());
        entity.setUserName(user.getUserName());
    } else {
        entity.setUserId(businessConfig.getDefaultUserId());
        entity.setUserName(businessConfig.getDefaultUserName());
    }
Repeating this check everywhere is cumbersome, prompting a search for a more elegant solution.
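One possible way to avoid repeating the check, shown here only as a sketch and not as the route the author took, is to centralize the fallback in the context class itself. UserContext and getCurrentOrDefault are hypothetical names, and a plain ThreadLocal stands in for TransmittableThreadLocal.

```java
// Hypothetical stand-in for the article's CurrentUser (plain ThreadLocal, no TTL dependency).
final class UserContext {
    private static final ThreadLocal<UserContext> HOLDER = new ThreadLocal<>();

    private final String userId;
    private final String userName;

    UserContext(String userId, String userName) {
        this.userId = userId;
        this.userName = userName;
    }

    String getUserId() { return userId; }
    String getUserName() { return userName; }

    static void set(UserContext user) { HOLDER.set(user); }
    static void clear() { HOLDER.remove(); }

    // Returns the thread's user, or the supplied system user when none is set.
    static UserContext getCurrentOrDefault(UserContext systemUser) {
        UserContext current = HOLDER.get();
        return current != null ? current : systemUser;
    }
}
```

A caller would then write something like `UserContext user = UserContext.getCurrentOrDefault(systemUser);` once, instead of an if/else block at every call site. The trade-off is that a silently applied default can hide a genuinely missing context.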
2. First Reversal
A global search for CurrentUser.set uncovered a RocketMQ AOP interceptor that sets user information from message headers before onMessage execution:
    @Aspect
    @Component
    public class RocketMqAspect {
        // Matches onMessage(..) in classes annotated with @RocketMQMessageListener
        @Pointcut("execution(* onMessage(..)) && @within(org.apache.rocketmq.spring.annotation.RocketMQMessageListener)")
        public void pointcut() {}

        @Around("pointcut()")
        public void around(ProceedingJoinPoint point) throws Throwable {
            if (point.getArgs().length == 1 && point.getArgs()[0] instanceof MessageExt) {
                MessageExt message = (MessageExt) point.getArgs()[0];
                String userId = message.getUserProperty("userId");
                String userName = message.getUserProperty("userName");
                // Only populate the context when both headers are present
                if (StringUtils.notEmpty(userId) && StringUtils.notEmpty(userName)) {
                    CurrentUser user = new CurrentUser();
                    user.setUserId(userId);
                    user.setUserName(userName);
                    CurrentUser.set(user);
                }
            }
            // ...
        }
    }
This interceptor injects user context into MQ consumers, but the problematic message lacked the required headers.
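The set-run-clear pattern behind such an interceptor can be sketched without Spring or RocketMQ on the classpath. Everything here is an assumption made for illustration: UserContext and ConsumerContextWrapper are hypothetical names, a Map stands in for the message's user properties, and a Supplier stands in for the listener invocation. The article's excerpt does not show whether or where the real aspect clears the context.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the article's CurrentUser (plain ThreadLocal, no TTL dependency).
final class UserContext {
    private static final ThreadLocal<UserContext> HOLDER = new ThreadLocal<>();
    private final String userId;
    private final String userName;

    UserContext(String userId, String userName) {
        this.userId = userId;
        this.userName = userName;
    }

    String getUserId() { return userId; }
    String getUserName() { return userName; }
    static void set(UserContext user) { HOLDER.set(user); }
    static UserContext getCurrent() { return HOLDER.get(); }
    static void clear() { HOLDER.remove(); }
}

final class ConsumerContextWrapper {
    // Populates the context from message properties, runs the handler, then always clears it.
    static <T> T runWithUser(Map<String, String> properties, Supplier<T> handler) {
        String userId = properties.get("userId");
        String userName = properties.get("userName");
        boolean hasUser = userId != null && !userId.isEmpty()
                && userName != null && !userName.isEmpty();
        if (hasUser) {
            UserContext.set(new UserContext(userId, userName));
        }
        try {
            return handler.get();
        } finally {
            // Consumer threads are pooled; clearing avoids leaking one message's user into the next.
            UserContext.clear();
        }
    }
}
```

When either header is missing, the handler runs with no user context at all, which is exactly the state that produced the NPE.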
3. Second Reversal
The team contacted the upstream system that produced the MQ message. They claimed their local tests passed, yet the message still missed the user headers.
Investigation revealed a custom RocketMQTemplate that overrides asyncSend to add user headers before sending:
    public class MyRocketMQTemplate extends RocketMQTemplate {
        @Override
        public void asyncSend(String destination, Message<?> message, SendCallback sendCallback, long timeout, int delayLevel) {
            // Copy the original payload and headers so that the user headers can be added
            MessageBuilder<?> builder = MessageBuilder.fromMessage(message);
            CurrentUser user = CurrentUser.getCurrent();
            if (user != null) {
                builder.setHeader("userId", user.getUserId());
                builder.setHeader("userName", user.getUserName());
            }
            // Send the rebuilt message; passing the original would discard the added headers
            super.asyncSend(destination, builder.build(), sendCallback, timeout, delayLevel);
        }
    }
This design elegantly propagates user information via message headers.
4. Third Reversal
Further analysis showed that the upstream team called a three‑parameter overload of asyncSend rather than the five‑parameter version that the custom template overrides. However, all overloads eventually delegate to the five‑parameter method, so calling the shorter overload does not bypass the header‑setting logic. The overload choice therefore could not explain the missing headers.
Reminder: Overloaded methods may delegate to a core implementation; overriding the core method affects all overloads.
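The delegation behavior can be checked in isolation. The sketch below uses hypothetical classes (BaseTemplate, HeaderAddingTemplate) with simplified signatures; it does not call RocketMQ, and appending a tag to the payload string merely stands in for setting headers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: the short overload delegates to the full one.
class BaseTemplate {
    final List<String> sent = new ArrayList<>();

    void asyncSend(String destination, String payload, Runnable callback) {
        // Dispatches virtually, so a subclass override of the five-parameter method is used.
        asyncSend(destination, payload, callback, 3000L, 0);
    }

    void asyncSend(String destination, String payload, Runnable callback, long timeout, int delayLevel) {
        sent.add(destination + ":" + payload);
        callback.run();
    }
}

// Hypothetical subclass: overrides only the five-parameter method.
class HeaderAddingTemplate extends BaseTemplate {
    @Override
    void asyncSend(String destination, String payload, Runnable callback, long timeout, int delayLevel) {
        // Stands in for adding userId/userName headers before sending.
        super.asyncSend(destination, payload + "[userId=42]", callback, timeout, delayLevel);
    }
}
```

Calling `new HeaderAddingTemplate().asyncSend("topic", "hello", () -> {})` records `topic:hello[userId=42]`, even though the caller never touched the five-parameter method directly.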
5. Fourth Reversal
Log inspection revealed that the problematic message was sent on 2021‑05‑21 but was only processed successfully on 2021‑05‑28, after five retry attempts. Once the configured retry limit was reached, RocketMQ’s retry mechanism had moved the message to a dead‑letter queue.
When manually sending MQ messages, ensure required headers are included, especially for RocketMQ.
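A small guard on the consumer side can make a missing header visible before the business code runs. This is a sketch under assumptions: RequiredHeaders is a hypothetical helper, the required keys are taken from the interceptor shown earlier, and a Map stands in for the message's user properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper that reports which required user headers a message lacks.
final class RequiredHeaders {
    static final List<String> REQUIRED =
            Collections.unmodifiableList(Arrays.asList("userId", "userName"));

    static List<String> missing(Map<String, String> properties) {
        List<String> result = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = properties.get(key);
            // Treat absent and blank values the same way.
            if (value == null || value.trim().isEmpty()) {
                result.add(key);
            }
        }
        return result;
    }
}
```

A consumer could log the returned list as a warning, or reject the message, rather than letting the gap surface later as an NPE deep in the business layer.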
6. Truth
It turned out that, to fix a failed approval status, someone had manually sent a message from the RocketMQ console without the user headers. The consumer processed the message and completed the business flow, but the final WebSocket notification failed with the NPE because no user context had been set. The notification code, added by the departed colleague, sits outside the main transaction, so its failure did not affect the core business flow.
Best practice: Keep non‑core code outside the main transaction to avoid affecting primary business logic.
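The idea can be sketched as follows, with the caveat that ApprovalFlow is a hypothetical class and the list operations only stand in for a committed database transaction and an error log. The core work completes first, and the notification runs afterwards inside its own try/catch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical flow: core work is finished before the non-core notification is attempted.
final class ApprovalFlow {
    final List<String> committed = new ArrayList<>();
    final List<String> notificationErrors = new ArrayList<>();

    void approve(String orderId, Runnable notifier) {
        // Core step: stands in for the transactional business update.
        committed.add(orderId);
        try {
            // Non-core step: stands in for the WebSocket push notification.
            notifier.run();
        } catch (RuntimeException e) {
            // Record the failure for later investigation instead of rethrowing it.
            notificationErrors.add(orderId + ": " + e);
        }
    }
}
```

With this shape, a notifier that throws a NullPointerException leaves the approval recorded and only adds an entry to the error list.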
Although the NPE affected only a single merchant’s notification, the incident provided valuable lessons and deeper familiarity with the system’s new features.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
