10 Logging Best Practices to Diagnose Production Issues Efficiently

This article presents ten practical rules for writing high‑quality logs—covering format consistency, stack traces, log levels, parameter completeness, asynchronous handling, traceability, dynamic configuration, structured storage, and intelligent monitoring—to help engineers quickly pinpoint problems in high‑traffic systems.

dbaplus Community

Unified Log Format

Define a single pattern in logback.xml that adds timestamp, trace identifier, thread name, log level, logger name and the message. This makes every line self‑contained and searchable.

<!-- logback.xml core pattern -->
<pattern>%d{yy-MM-dd HH:mm:ss.SSS} |%X{traceId:-NO_ID} |%thread |%-5level |%logger{36} |%msg%n</pattern>
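Because every field is separated by " |", a line written with this pattern can be split back into its parts by a script or an ingestion pipeline. The following is a minimal sketch of that idea, not part of Logback; the class name LogLineParser and the sample line in the usage note are made up for illustration.

```java
// Hypothetical helper: splits one line produced by the pattern above into six fields.
final class LogLineParser {

    /**
     * Returns {timestamp, traceId, thread, level, logger, message}.
     * The limit of 6 keeps any " |" inside the message itself intact.
     */
    static String[] parse(String line) {
        String[] parts = line.split(" \\|", 6);
        if (parts.length != 6) {
            throw new IllegalArgumentException("Unexpected log line: " + line);
        }
        // %-5level pads the level to five characters, so trim the metadata fields.
        for (int i = 0; i < 5; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }
}
```

For example, parsing "24-05-01 10:15:30.123 |a1b2c3d4 |http-nio-8080-exec-1 |INFO  |c.e.order.OrderService |Order created orderId=1001" yields a1b2c3d4 as the trace identifier and INFO as the level. The sketch assumes thread and logger names never contain " |".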

Exception Logging with Stack Traces

Always pass the caught exception object to the logging call so the full stack trace is recorded.

try {
    processOrder();
} catch (Exception e) {
    log.error("Order processing failed orderId={}", orderId, e); // e is the last argument and has no placeholder, so its stack trace is logged
}
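To see what is lost when only e.getMessage() is logged, the sketch below renders a full stack trace to a string using the JDK alone. It is an illustration of the information a logger receives when the exception object is passed, not the logging framework's own rendering code; the class name StackTraceDemo is hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical demo class: renders a Throwable the way printStackTrace() would.
final class StackTraceDemo {

    /** Returns the full stack trace, including any "Caused by:" chain, as text. */
    static String render(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }
}
```

For an IllegalStateException("order failed") wrapping an IOException("connection reset"), getMessage() returns only "order failed", while the rendered trace also contains the "Caused by: java.io.IOException: connection reset" line that usually points at the real fault.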

Appropriate Log Levels

FATAL – system is about to crash (OOM, disk full). SLF4J and Logback define no FATAL level, so with Logback these events are logged at ERROR.

ERROR – core business failure (payment failure, order creation error).

WARN – recoverable exception (retry succeeded, degradation triggered).

INFO – key process node (order status change).

DEBUG – debugging details (parameter flow, intermediate results).

Complete Parameters & Data Masking

Log all relevant context (e.g., user ID, IP, reason) and mask sensitive fields before output.

// Masking utility
public class LogMasker {
    public static String maskMobile(String mobile) {
        return mobile.replaceAll("(\\d{3})\\d{4}(\\d{4})", "$1****$2");
    }
}
// Usage example
log.info("User registered mobile={}", LogMasker.maskMobile("13812345678")); // logs mobile=138****5678
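The same approach extends to other sensitive fields. The sketch below is one possible null-safe variant; the class name LogMaskerSketch and the maskEmail helper are assumptions added for illustration and are not part of the original utility.

```java
// Hypothetical extension of the masking utility shown above.
final class LogMaskerSketch {

    /** Keeps the first 3 and last 4 digits of an 11-digit mobile number. */
    static String maskMobile(String mobile) {
        if (mobile == null) {
            return null; // nothing to mask; avoids a NullPointerException inside a log call
        }
        return mobile.replaceAll("(\\d{3})\\d{4}(\\d{4})", "$1****$2");
    }

    /** Keeps the first character of the local part and the whole domain. */
    static String maskEmail(String email) {
        if (email == null) {
            return null;
        }
        int at = email.indexOf('@');
        if (at < 1) {
            return "***"; // not a recognizable address: hide it entirely
        }
        return email.charAt(0) + "***" + email.substring(at);
    }
}
```

Returning null for null input keeps a masking call from throwing inside a log statement, where an exception would hide the event that was being recorded.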

Asynchronous Logging for Performance

Synchronous logging in high‑concurrency scenarios blocks request threads on disk I/O, causes frequent context switches, and can account for up to 25 % of total response time.

<!-- AsyncAppender core configuration -->
<appender name="ASYNC" class="ch.qos.logback.classic.AsyncAppender">
    <!-- 0 = never discard events; by default TRACE/DEBUG/INFO events are dropped once less than 20% of the queue capacity remains -->
    <discardingThreshold>0</discardingThreshold>
    <!-- Queue depth, recommend maxThreads * 2 -->
    <queueSize>4096</queueSize>
    <!-- Reference the real file appender -->
    <appender-ref ref="FILE"/>
</appender>
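// Cheap arguments: with {} placeholders the message is only formatted if DEBUG is enabled
log.debug("Received MQ message: {}", msg.toSimpleString());
// WRONG: the argument is computed even when DEBUG is disabled
log.debug("Details: {}", computeExpensiveLog());
// Better: guard expensive arguments with a level check
if (log.isDebugEnabled()) {
    log.debug("Details: {}", computeExpensiveLog());
}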
// Performance sizing formula
MaxMemory ≈ QueueLength × AvgLogSize
RecommendedQueue = PeakTPS × TolerableDelay(s)
// Example: 10 000 TPS × 0.5 s → 5 000 queue size

Monitor queue usage and alert when > 80 %.

Avoid OOM by limiting large toString() calls.

Expose a JMX switch to revert to synchronous mode instantly.
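The sizing formulas above can be turned into a small calculation. This is a sketch of the arithmetic only; the class name AsyncQueueSizing is hypothetical, and the 512-byte average log size in the usage note is an assumed figure, not a measurement.

```java
// Hypothetical helper that applies the two sizing formulas from this section.
final class AsyncQueueSizing {

    /** RecommendedQueue = PeakTPS x TolerableDelay(s), rounded up. */
    static long recommendedQueueSize(long peakTps, double tolerableDelaySeconds) {
        return (long) Math.ceil(peakTps * tolerableDelaySeconds);
    }

    /** MaxMemory ~ QueueLength x AvgLogSize, in bytes; throws on overflow. */
    static long maxQueueMemoryBytes(long queueLength, long avgLogSizeBytes) {
        return Math.multiplyExact(queueLength, avgLogSizeBytes);
    }
}
```

With 10 000 TPS and a tolerable delay of 0.5 s the recommended queue size is 5 000, matching the example above. At an assumed 512 bytes per event, a full queue of that size holds roughly 2.5 MB of log data.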

Traceability (Link Tracing)

Inject a traceId into MDC at the entry point of each request and include it in the log pattern. This enables end‑to‑end correlation across services.

// Inject traceId
MDC.put("traceId", UUID.randomUUID().toString().substring(0,8));
// Pattern with traceId
<pattern>%d{HH:mm:ss} |%X{traceId}| %msg%n</pattern>
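On pooled threads the traceId should be removed once the request finishes, otherwise the next request handled by the same thread inherits a stale value. The sketch below shows that set-then-clean-up shape with a plain ThreadLocal standing in for MDC, so it runs without any logging dependency; TraceContext and withTrace are hypothetical names. With SLF4J the equivalent would be MDC.put at the entry point and MDC.remove (or MDC.clear) in a finally block.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical stand-in for MDC: one trace id per thread, always cleaned up.
final class TraceContext {

    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /** Returns the current trace id, or NO_ID when none is set (like %X{traceId:-NO_ID}). */
    static String current() {
        String id = TRACE_ID.get();
        return id != null ? id : "NO_ID";
    }

    /** Reuses an upstream trace id if present, otherwise generates one, then runs the work. */
    static <T> T withTrace(String incomingTraceId, Supplier<T> work) {
        String id = (incomingTraceId != null && !incomingTraceId.isEmpty())
                ? incomingTraceId
                : UUID.randomUUID().toString().substring(0, 8);
        TRACE_ID.set(id);
        try {
            return work.get();
        } finally {
            TRACE_ID.remove(); // prevents leaking the id into the next request on this thread
        }
    }
}
```

Reusing an identifier passed in by the caller, rather than always generating a new one, is what keeps a single request traceable across services.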

Dynamic Log Level Adjustment (Hot Update)

Expose a lightweight HTTP endpoint to change a logger’s level at runtime without restarting the service.

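import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;
import org.slf4j.LoggerFactory;

@PostMapping("/logLevel") // changes state, so POST rather than GET; restrict access to operators
public String changeLogLevel(@RequestParam String loggerName, @RequestParam String level) {
    Level parsed = Level.toLevel(level, null); // null for an unrecognized name instead of a silent default
    if (parsed == null) return "Unknown level: " + level;
    Logger logger = (Logger) LoggerFactory.getLogger(loggerName); // cast to the Logback implementation
    logger.setLevel(parsed); // takes effect immediately
    return "OK";
}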

Structured JSON Logging

Store logs as JSON objects so they are machine‑friendly and searchable.

{
  "event": "ORDER_CREATE",
  "orderId": 1001,
  "amount": 8999,
  "products": [{"name": "iPhone", "sku": "A123"}]
}
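A structured line like this is normally produced by a JSON encoder attached to the logging framework. To show what such an encoder has to do, here is a minimal hand-rolled sketch for flat key/value fields only (no nested objects or arrays such as "products"); the class name JsonLogLine is hypothetical and this is not a replacement for a real JSON library.

```java
import java.util.Map;

// Hypothetical sketch: serializes flat log fields into a single JSON line.
final class JsonLogLine {

    /** Escapes quotes, backslashes and control characters for use inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Numbers and booleans are written bare; everything else becomes a quoted string. */
    static String toJson(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v == null) {
                sb.append("null");
            } else if (v instanceof Number || v instanceof Boolean) {
                sb.append(v);
            } else {
                sb.append('"').append(escape(String.valueOf(v))).append('"');
            }
        }
        return sb.append('}').toString();
    }
}
```

Escaping matters because an unescaped quote or newline in a field value would break the one-event-per-line contract that log shippers rely on. Passing a LinkedHashMap keeps the fields in insertion order.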

Intelligent Monitoring (ELK) and Alert Rules

Integrate logs with the ELK stack and define alert thresholds, e.g., trigger a phone alert when ERROR logs exceed 100 within 5 minutes, or send an email when WARN logs persist for an hour.

ERROR logs > 100 within 5 minutes → phone alert
WARN logs persisting for 1 continuous hour → email notification
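In an ELK deployment a rule like this would be configured in the alerting layer rather than written by hand. The sketch below only illustrates the sliding-window logic behind "more than 100 ERROR logs within 5 minutes"; the class name ErrorRateAlert is hypothetical, and timestamps are passed in explicitly so the behavior is easy to verify.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a sliding-window threshold check for ERROR events.
final class ErrorRateAlert {

    private final long windowMillis;
    private final int threshold;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    ErrorRateAlert(long windowMillis, int threshold) {
        this.windowMillis = windowMillis;
        this.threshold = threshold;
    }

    /** Records one ERROR event; returns true when the count inside the window exceeds the threshold. */
    synchronized boolean record(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Drop events that have fallen out of the window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size() > threshold;
    }
}
```

With new ErrorRateAlert(300_000, 100), the first 100 errors inside five minutes return false and the 101st returns true. Once the older events age out of the window, the check returns false again.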