Mastering Log Practices: From Rookie Mistakes to Expert Observability
This article walks developers through common logging pitfalls, explains three maturity levels of log implementation, and provides concrete Java examples and best‑practice techniques such as structured JSON logs, MDC trace IDs, and log‑bomb avoidance to turn logs into a powerful observability tool.
Act 1: Rookie Logging Pitfalls
During a holiday the author received a "disk space insufficient" alert caused by an ever‑growing log file that had not been cleaned up. The incident sparked a reflection on how logging—seemingly trivial for front‑end, back‑end, or client code—can quickly become a root cause of outages if mishandled.
Problem 1: Exceptions Swallowed
In the catch block the exception is eaten without printing a stack trace or re‑throwing, leaving no trace for debugging.
Problem 2: No Key Information
A log like OrderService#order process error! provides no order ID, user ID, or business context, making it impossible to locate the offending request among thousands per second.
Problem 3: Exception Details Missing
The log only says error without indicating whether it is an NPE, timeout, or RPC failure.
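Taken together, the three problems look like this in code. This is a self-contained sketch: the `pay` method, the `lastLog` field, and the class name are illustrative stand-ins, with the "log" reduced to a string so the example runs without a logging framework.

```java
public class SwallowedException {
    // Hypothetical payment step that fails
    static void pay() { throw new IllegalStateException("card declined"); }

    static String lastLog = "";

    public static void main(String[] args) {
        try {
            pay();
        } catch (Exception e) {
            // Problem 1: swallowed — no log, no re-throw; the failure is invisible
        }
        try {
            pay();
        } catch (Exception e) {
            // Problems 2 & 3: no order ID or user ID, no exception type or stack trace
            lastLog = "OrderService#order process error!";
        }
        System.out.println(lastLog); // nothing here says what failed, or for whom
    }
}
```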
Act 2: The Three Levels of Logging
From junior engineers to senior experts, logging evolves through three distinct stages.
Level 1: P4 Junior – "Graffiti" Logs
Everything is printed with System.out.println() or e.printStackTrace().
Log messages are arbitrary, e.g., log.info("111") or log.info("Reached here").
String concatenation is used to build messages, wasting CPU even when the log level is disabled.
Performance killer: String concatenation executes regardless of log level, slowing high‑concurrency systems.
Information loss: Only e.getMessage() is logged, discarding the stack trace.
Zero value: Logs cannot be filtered, aggregated, or correlated, leaving engineers staring at raw text during incidents.
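The concatenation cost above can be shown without any logging framework. This pure-Java sketch mimics the difference between an eager `"msg" + obj` call and a parameterized, SLF4J-style call; `ConcatCost`, `debugEager`, and `debugLazy` are illustrative names, not a real API.

```java
public class ConcatCost {
    static int toStringCalls = 0;

    static class Order {
        @Override
        public String toString() {
            toStringCalls++; // count how often formatting actually happens
            return "Order{...}";
        }
    }

    static final boolean DEBUG_ENABLED = false; // debug level is off

    // Eager: the caller already concatenated, so toString() ran for nothing.
    static void debugEager(String msg) {
        if (DEBUG_ENABLED) System.out.println(msg);
    }

    // Deferred: the argument is only formatted if the level is enabled.
    static void debugLazy(String template, Object arg) {
        if (DEBUG_ENABLED) System.out.println(template.replace("{}", arg.toString()));
    }

    public static void main(String[] args) {
        Order order = new Order();
        debugEager("order=" + order); // toString() runs even though nothing is printed
        debugLazy("order={}", order); // toString() never runs
        System.out.println(toStringCalls); // → 1
    }
}
```

This is exactly why SLF4J's `log.debug("order={}", order)` form beats `log.debug("order=" + order)`: the formatting work is skipped entirely when the level is disabled.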
Level 2: P5 Intermediate – "Business Ledger" Logs
Engineers start using a logging façade (SLF4J/Logback) and differentiate INFO, WARN, ERROR, but still miss crucial context.
```java
@Service
public class OrderService {
    public void createOrder(OrderDTO order) {
        try {
            // ...business logic...
            String userName = null;
            userName.toLowerCase(); // NPE here
        } catch (Exception e) {
            // No log, and the original exception is not attached as the cause:
            // the root NPE vanishes behind a vague BizException
            throw new BizException("Create order failed");
        }
    }
}

@RestController
public class OrderController {
    @Autowired
    private OrderService orderService;

    @PostMapping("/orders")
    public void createOrder(@RequestBody OrderDTO order) {
        try {
            orderService.createOrder(order);
        } catch (BizException e) {
            // A stack trace is logged, but it points only at the BizException;
            // there is no order ID or user ID, and the root cause is already lost
            log.error("Handle create order request failed!", e);
        }
    }
}
```

Typical issues at this level include missing trace IDs and insufficient business identifiers, leaving isolated log fragments that cannot be correlated.
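The fix at this level is mechanical: put the business IDs in the message and attach the exception as the cause so the stack trace survives the re-throw. The sketch below uses java.util.logging to stay self-contained; with the article's SLF4J setup the equivalent call would be `log.error("createOrder failed, orderId={}, userId={}", orderId, userId, e)`. The class name and IDs are illustrative.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class OrderLogging {
    static final Logger LOG = Logger.getLogger("OrderService");

    public static void createOrder(String orderId, String userId) {
        try {
            String userName = null;
            userName.toLowerCase(); // NPE, as in the original example
        } catch (Exception e) {
            // Key points: business IDs in the message AND the exception object
            // itself, so the full stack trace is preserved in the log.
            LOG.log(Level.SEVERE,
                    "createOrder failed, orderId=" + orderId + ", userId=" + userId, e);
            // Re-throw with the cause attached instead of swallowing it
            throw new RuntimeException("Create order failed, orderId=" + orderId, e);
        }
    }

    public static void main(String[] args) {
        try {
            createOrder("O-1001", "U-42");
        } catch (RuntimeException ex) {
            System.out.println(ex.getMessage()); // → Create order failed, orderId=O-1001
        }
    }
}
```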
Level 3: P6/P7 Expert – "SkyNet" Logs
Experts treat logs as observability assets, aiming for structured, contextual, and actionable data.
Practice 1: Structured Logging
Logs are emitted as JSON, containing fields such as timestamp, trace_id, span_id, error_code, and business IDs, enabling powerful queries in SLS/ELK.
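Hand-assembling JSON inside a message template is fragile: a quote or backslash in a field value breaks the line for every downstream parser. In practice a JSON encoder (for example, logstash-logback-encoder) does this at the framework level; the pure-Java sketch below only illustrates the escaping such an encoder handles, and `jsonLine` and its field names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JsonLogLine {
    // Build one structured log line from key/value pairs. Real setups delegate
    // this to the logging framework's JSON encoder rather than the call site.
    static String jsonLine(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
        }
        return sb.append('}').toString();
    }

    // Minimal escaping: backslashes first, then quotes
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, String> f = new LinkedHashMap<>();
        f.put("event", "order_creation_failed");
        f.put("order_id", "O-20240101-001");
        f.put("error", "NullPointerException");
        System.out.println(jsonLine(f));
        // → {"event":"order_creation_failed","order_id":"O-20240101-001","error":"NullPointerException"}
    }
}
```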
```java
log.error("{\"event\":\"order_creation_failed\",\"order_id\":\"{}\",\"user_id\":\"{}\",\"error\":\"{}\"}",
        orderId, userId, e.getMessage());
```

Practice 2: MDC & trace_id
Using Mapped Diagnostic Context (MDC) to attach a trace_id to every log line, making it easy to reconstruct a request’s full call chain across micro‑services.
```java
public class TraceInterceptor implements HandlerInterceptor {
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        String traceId = UUID.randomUUID().toString();
        MDC.put("trace_id", traceId);
        return true;
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) {
        // Servlet threads are pooled: clear the MDC so this trace_id
        // does not leak into the next request handled by the same thread
        MDC.clear();
    }
}
```

Later logs automatically include the trace ID, e.g.,

```java
log.error("Order failed, orderId: {}", orderId); // trace_id added by MDC
```
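Under the hood, MDC is essentially a per-thread map that the log layout reads when formatting each line. This minimal sketch shows the mechanism (the real class is `org.slf4j.MDC`; `MiniMdc` and its layout string are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class MiniMdc {
    // One map per thread — this is why MDC values follow a request
    // through all log calls on the same thread without being passed around
    private static final ThreadLocal<Map<String, String>> CTX =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) { CTX.get().put(key, value); }
    static String get(String key) { return CTX.get().get(key); }
    static void clear() { CTX.remove(); }

    // A layout pattern like "%X{trace_id} %msg" reads the current thread's map
    static String format(String msg) {
        return "[trace_id=" + get("trace_id") + "] " + msg;
    }

    public static void main(String[] args) {
        put("trace_id", "3f2a-illustrative-id"); // set once at request entry
        System.out.println(format("Order failed, orderId: 42"));
        clear(); // always clear: threads are pooled
    }
}
```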
Practice 3: Avoid Log Bombs
Never log massive objects or call toString() on high‑frequency data; log only the essential IDs instead. Sample high‑volume INFO logs (e.g., keep 1% of events) to prevent log‑pipeline overload.
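The 1% sampling idea can be sketched with a counter that admits every Nth event, which keeps the rate exact and deterministic. `LogSampler` and `shouldLog` are illustrative names, not a library API:

```java
import java.util.concurrent.atomic.AtomicLong;

public class LogSampler {
    private final AtomicLong counter = new AtomicLong();
    private final int sampleEvery;

    LogSampler(int sampleEvery) { this.sampleEvery = sampleEvery; }

    // True for exactly 1 out of every `sampleEvery` calls; AtomicLong
    // keeps the count correct under concurrent logging
    boolean shouldLog() {
        return counter.getAndIncrement() % sampleEvery == 0;
    }

    public static void main(String[] args) {
        LogSampler sampler = new LogSampler(100); // keep ~1% of events
        int logged = 0;
        for (int i = 0; i < 10_000; i++) {
            if (sampler.shouldLog()) logged++; // the guarded log.info(...) would go here
        }
        System.out.println(logged); // → 100
    }
}
```

ERROR logs should never be sampled; sampling is for chatty INFO/DEBUG events where the aggregate trend matters more than each line.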
"Don't underestimate the impact of a single oversized log entry; in high‑traffic scenarios it can consume 95% of CPU and cripple the cluster."
When a log‑based alert detects an error‑rate spike, the system can automatically trigger DingTalk notifications or even run remediation scripts.
"A mature logging system is not just a recorder but a core pillar of system observability, providing global visibility, concise data presentation, and automated response capabilities."
In summary, progressing from ad‑hoc prints to structured, traceable, and automated logs transforms logging from a passive record into an active safeguard.
