10 Logging Best Practices to Diagnose Production Issues Efficiently
This article presents ten practical rules for writing high‑quality logs—covering format consistency, stack traces, log levels, parameter completeness, asynchronous handling, traceability, dynamic configuration, structured storage, and intelligent monitoring—to help engineers quickly pinpoint problems in high‑traffic systems.
Unified Log Format
Define a single pattern in logback.xml that adds timestamp, trace identifier, thread name, log level, logger name and the message. This makes every line self‑contained and searchable.
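Because the pattern in this section separates every field with " |", a line can be split back into its parts with a few lines of code. The following is a minimal sketch, not part of Logback: the class name LogLineParser, the field names, and the sample line are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Splits one log line written in the " |"-delimited layout into six named fields. */
final class LogLineParser {
    // Fields are separated by " |"; a limit of 6 keeps any " |" inside the message intact.
    private static final Pattern DELIMITER = Pattern.compile(" \\|");

    static Map<String, String> parse(String line) {
        String[] parts = DELIMITER.split(line, 6);
        if (parts.length != 6) {
            throw new IllegalArgumentException("Unexpected log line layout: " + line);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("timestamp", parts[0]);
        fields.put("traceId", parts[1]);
        fields.put("thread", parts[2]);
        fields.put("level", parts[3].trim()); // %-5level pads the level to five characters
        fields.put("logger", parts[4]);
        fields.put("message", parts[5]);
        return fields;
    }
}
```

Calling LogLineParser.parse on a line gives direct access to fields such as traceId and level, which is what makes filtering by request or severity straightforward.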
<!-- logback.xml core pattern -->
<pattern>%d{yy-MM-dd HH:mm:ss.SSS} |%X{traceId:-NO_ID} |%thread |%-5level |%logger{36} |%msg%n</pattern>
Exception Logging with Stack Traces
Always pass the caught exception object to the logging call so the full stack trace is recorded.
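To see what is lost when only the exception message is logged, the sketch below compares the two outputs using the JDK alone. The class StackTraceDemo and the sample failure are hypothetical. A logging framework renders the trace in its own layout, but the information it carries is comparable.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class StackTraceDemo {
    /** What reaches the log when only e.getMessage() is written. */
    static String messageOnly(Throwable t) {
        return String.valueOf(t.getMessage());
    }

    /** Roughly what is recorded when the Throwable itself is handed to the logger. */
    static String fullTrace(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    /** Builds a wrapped exception, similar to a failure deep inside order processing. */
    static RuntimeException sampleFailure() {
        Exception rootCause = new IllegalStateException("connection pool exhausted");
        return new RuntimeException("order processing failed", rootCause);
    }
}
```

For sampleFailure(), messageOnly returns just "order processing failed", while fullTrace also contains the "Caused by:" section naming the root cause and the frames where it was thrown.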
try {
    processOrder();
} catch (Exception e) {
    // Pass e as the last argument so the full stack trace is logged
    log.error("Order processing failed, orderId={}", orderId, e);
}
Appropriate Log Levels
FATAL – system is about to crash (OOM, disk full). SLF4J and Logback define no FATAL level, so with those frameworks such events are logged at ERROR.
ERROR – core business failure (payment failure, order creation error).
WARN – recoverable exception (retry succeeded, degradation triggered).
INFO – key process node (order status change).
DEBUG – debugging details (parameter flow, intermediate results).
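One way to apply these rules is to let the outcome of an operation decide the level. The sketch below is an illustration only: RetryLevelDemo and its local Level enum are hypothetical names, and the mapping (first-try success is INFO, success after a retry is WARN, exhausted retries are ERROR) is one reading of the list above.

```java
import java.util.function.BooleanSupplier;

final class RetryLevelDemo {
    /** Local stand-in for a logging framework's level type. */
    enum Level { DEBUG, INFO, WARN, ERROR }

    /**
     * Runs the action up to maxAttempts times and reports which level the
     * outcome deserves: INFO on first-try success, WARN when a retry was
     * needed, ERROR when every attempt failed.
     */
    static Level runWithRetry(BooleanSupplier action, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (action.getAsBoolean()) {
                return attempt == 1 ? Level.INFO : Level.WARN;
            }
        }
        return Level.ERROR;
    }
}
```

The caller would then log the result at the returned level, so a recovered failure stays visible without paging anyone.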
Complete Parameters & Data Masking
Log all relevant context (e.g., user ID, IP, reason) and mask sensitive fields before output.
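// Masking utility
public class LogMasker {
    // Keeps the first 3 and last 4 digits of an 11-digit mobile number
    public static String maskMobile(String mobile) {
        if (mobile == null) {
            return null;
        }
        return mobile.replaceAll("(\\d{3})\\d{4}(\\d{4})", "$1****$2");
    }
}
// Usage example
log.info("User registered, mobile={}", LogMasker.maskMobile("13812345678")); // logs mobile=138****5678
Asynchronous Logging for Performance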
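In high‑concurrency scenarios, synchronous logging blocks request threads: it causes frequent context switches and disk I/O bottlenecks, and can account for up to 25 % of total response time. An asynchronous appender moves the write off the request path by queueing log events for a background thread.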
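The mechanism can be sketched with a bounded queue and one worker thread. This is a simplified illustration written against the JDK only, not Logback's AsyncAppender implementation, and the class name AsyncLogSketch is hypothetical. It drops lines when the queue is full, whereas Logback blocks the caller by default unless neverBlock is enabled.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Minimal async logger sketch: callers enqueue, a single worker thread writes. */
final class AsyncLogSketch implements AutoCloseable {
    private static final String STOP = new String("STOP"); // unique sentinel instance
    private final BlockingQueue<String> queue;
    private final AtomicLong dropped = new AtomicLong();
    private final Thread worker;

    AsyncLogSketch(int queueSize, Consumer<String> sink) {
        this.queue = new ArrayBlockingQueue<>(queueSize);
        this.worker = new Thread(() -> {
            try {
                // Compare by reference so a real message "STOP" is not mistaken for the sentinel.
                for (String line = queue.take(); line != STOP; line = queue.take()) {
                    sink.accept(line); // the slow I/O happens here, off the caller's thread
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "async-log-worker");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Never blocks the caller: when the queue is full the line is counted and dropped. */
    void log(String line) {
        if (!queue.offer(line)) {
            dropped.incrementAndGet();
        }
    }

    long droppedCount() {
        return dropped.get();
    }

    /** Lets the worker drain what is already queued, then stops it. */
    @Override
    public void close() {
        try {
            queue.put(STOP);
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The caller pays only for an in-memory offer, and the drop counter is the kind of signal worth exporting to monitoring.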
<!-- AsyncAppender core configuration -->
<appender name="ASYNC" class="ch.qos.logback.classic.AsyncAppender">
    <!-- 0 = never discard events; by default TRACE/DEBUG/INFO are dropped once less than 20% of the queue is free -->
    <discardingThreshold>0</discardingThreshold>
    <!-- Queue depth, recommend maxThreads * 2 -->
    <queueSize>4096</queueSize>
    <!-- Reference the real file appender -->
    <appender-ref ref="FILE"/>
</appender>
// Logging code – let the framework handle null checks
log.debug("Received MQ message: {}", msg.toSimpleString()); // handed to the async queue
// WRONG: the expensive computation runs even when DEBUG is disabled
log.debug("Details: {}", computeExpensiveLog());
// Better: compute only when DEBUG is enabled
if (log.isDebugEnabled()) {
    log.debug("Details: {}", computeExpensiveLog());
}
// Performance sizing formula
MaxMemory ≈ QueueLength × AvgLogSize
RecommendedQueue = PeakTPS × TolerableDelay(s)
// Example: 10 000 TPS × 0.5 s → 5 000 queue size
Monitor queue usage and alert when > 80 %.
Avoid OOM by limiting large toString() calls.
Expose a JMX switch to revert to synchronous mode instantly.
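The sizing formulas and the 80 % alert rule translate directly into arithmetic. The helper below is a sketch. AsyncQueueSizing is a hypothetical name, and the 512-byte average log size in the usage note is an assumed figure, not a measurement.

```java
final class AsyncQueueSizing {
    /** RecommendedQueue = PeakTPS x TolerableDelay(s), rounded up to a whole slot. */
    static int recommendedQueueSize(int peakTps, double tolerableDelaySeconds) {
        return (int) Math.ceil(peakTps * tolerableDelaySeconds);
    }

    /** MaxMemory ~= QueueLength x AvgLogSize, in bytes. */
    static long estimatedMaxMemoryBytes(int queueLength, int avgLogSizeBytes) {
        return (long) queueLength * avgLogSizeBytes;
    }

    /** True when queue usage exceeds the alert ratio (0.8 for the 80 % rule). */
    static boolean shouldAlert(int queued, int capacity, double alertRatio) {
        return queued > capacity * alertRatio;
    }
}
```

With 10 000 TPS and a tolerable delay of 0.5 s, recommendedQueueSize returns 5 000. At an assumed 512 bytes per event, a full queue of that size holds about 2.56 MB.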
Traceability (Link Tracing)
Inject a traceId into MDC at the entry point of each request and include it in the log pattern. This enables end‑to‑end correlation across services.
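Two details matter in practice: reusing a trace ID that arrived from an upstream service, and clearing the value when the request ends, because pooled threads are reused. The sketch below shows both using a ThreadLocal as a stand-in for SLF4J's MDC so that it runs on the JDK alone. TraceContext and its methods are hypothetical names, and the comments mark where MDC.put and MDC.remove would go.

```java
import java.util.UUID;
import java.util.function.Supplier;

/** ThreadLocal stand-in for SLF4J's MDC, used so the sketch needs no logging library. */
final class TraceContext {
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    static String currentTraceId() {
        String id = TRACE_ID.get();
        return id != null ? id : "NO_ID"; // mirrors the %X{traceId:-NO_ID} default
    }

    /**
     * Runs one request with a trace ID bound to the current thread.
     * Reuses the caller's ID when present so the whole chain stays correlated,
     * and always clears it afterwards so it cannot leak into the next request.
     */
    static <T> T runWithTraceId(String incomingTraceId, Supplier<T> request) {
        String id = (incomingTraceId == null || incomingTraceId.isEmpty())
                ? UUID.randomUUID().toString().substring(0, 8)
                : incomingTraceId;
        TRACE_ID.set(id);      // with SLF4J: MDC.put("traceId", id)
        try {
            return request.get();
        } finally {
            TRACE_ID.remove(); // with SLF4J: MDC.remove("traceId")
        }
    }
}
```

An eight-character prefix keeps log lines short but raises the chance of collisions at high volume, so a longer ID may be preferable in busy systems.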
// Inject traceId
MDC.put("traceId", UUID.randomUUID().toString().substring(0,8));
// Pattern with traceId
<pattern>%d{HH:mm:ss} |%X{traceId}| %msg%n</pattern>
Dynamic Log Level Adjustment (Hot Update)
Expose a lightweight HTTP endpoint to change a logger’s level at runtime without restarting the service.
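@GetMapping("/logLevel")
public String changeLogLevel(@RequestParam String loggerName, @RequestParam String level) {
    // Logger and Level are Logback's ch.qos.logback.classic types; the SLF4J Logger interface has no setLevel()
    Logger logger = (Logger) LoggerFactory.getLogger(loggerName);
    // Level.valueOf falls back to DEBUG for unrecognized names, so validate the input if that is not acceptable
    logger.setLevel(Level.valueOf(level)); // takes effect immediately
    return "OK";
}
Structured JSON Logging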
Store logs as JSON objects so they are machine‑friendly and searchable.
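Producing such lines mainly requires correct escaping and consistent field order. The sketch below serializes maps, lists, numbers, booleans and strings with the JDK alone. JsonLogSketch is a hypothetical helper written for illustration. Production systems would normally rely on a JSON encoder or library, and this version does not handle special cases such as NaN.

```java
import java.util.Map;
import java.util.StringJoiner;

final class JsonLogSketch {
    /** Serializes maps, iterables, numbers, booleans, null and strings to JSON text. */
    static String toJson(Object value) {
        if (value == null) return "null";
        if (value instanceof Number || value instanceof Boolean) return value.toString();
        if (value instanceof Map) {
            StringJoiner joiner = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                joiner.add(quote(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()));
            }
            return joiner.toString();
        }
        if (value instanceof Iterable) {
            StringJoiner joiner = new StringJoiner(",", "[", "]");
            for (Object item : (Iterable<?>) value) joiner.add(toJson(item));
            return joiner.toString();
        }
        return quote(value.toString());
    }

    /** Wraps the text in quotes and escapes characters that would break the JSON. */
    private static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
                    else out.append(c);
            }
        }
        return out.append('"').toString();
    }
}
```

Passing a LinkedHashMap keeps the fields in insertion order, which makes the one-line output easier to compare by eye.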
{
    "event": "ORDER_CREATE",
    "orderId": 1001,
    "amount": 8999,
    "products": [{"name": "iPhone", "sku": "A123"}]
}
Intelligent Monitoring (ELK) and Alert Rules
Integrate logs with the ELK stack and define alert thresholds, for example:
ERROR logs > 100 within 5 minutes → phone alert
WARN logs persisting for 1 hour → email notification
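The ERROR rule is a sliding-window count. In an ELK setup it would normally be expressed as an alerting rule rather than application code, but the logic can be sketched in a few lines. SlidingWindowAlert is a hypothetical class, and timestamps are passed in explicitly so the behaviour is easy to test.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Counts events in a sliding time window and signals when a threshold is exceeded. */
final class SlidingWindowAlert {
    private final Deque<Long> timestamps = new ArrayDeque<>();
    private final long windowMillis;
    private final int threshold;

    SlidingWindowAlert(long windowMillis, int threshold) {
        this.windowMillis = windowMillis;
        this.threshold = threshold;
    }

    /**
     * Records one event at nowMillis and returns true when the window
     * now holds more than the threshold number of events.
     */
    synchronized boolean record(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Evict events that have fallen out of the window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size() > threshold;
    }
}
```

For the rule above, the window would be 300 000 ms and the threshold 100, so the 101st ERROR inside five minutes is the one that triggers the phone alert.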
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
