Mastering Node.js Backend Logging: Design, Tools, and Full‑Trace Strategies
This article shares a comprehensive guide to building robust logging systems for Node.js backend services, covering log types, storage options, performance considerations, full‑trace design, custom field schemas, integration with cloud log platforms, and practical troubleshooting examples.
Preface
Based on common industry log service scenarios, there are four main use cases:
Log and big‑data analysis for product data tracking and business feedback.
Log audit for data recovery, such as restoring databases after failures.
Problem diagnosis using full‑chain logs to locate faults across DB, HTTP, and other modules.
Operations management of server logs from third‑party modules like Redis, MySQL, Kafka for alerts and rapid recovery.
The article will discuss solution choices for these scenarios and include incident case studies.
Project Background
The backend service faces various logging challenges. The architecture diagram below illustrates the flow involving frontend calls, server calls, third‑party services, and storage.
Early in the project, the lack of a proper logging system forced developers to rely on intuition for troubleshooting, making each release stressful and driving the need for a comprehensive log reporting system.
Conventional Log Technology Selection
Node.js’s ecosystem now offers mature options for log printing, reporting, and analysis. When selecting technologies, consider the factors shown in the diagram below.
Key technical factors include:
Storage location: Local logs aid debugging when containers or networks fail, while remote logs enable day-to-day issue location and integrate with tools like Elasticsearch and Kibana.
CPU usage: Log collection must be efficient; use log levels (trace, debug, info, warn, error, fatal) to filter output and avoid excessive CPU or bandwidth consumption.
Retention time: Local logs are typically large (GB-scale) and retained only briefly (roughly 24 h); periodically transferring them to remote storage extends retention to ~7 days. Critical business logs (e.g., payment conversion) should be persisted longer.
Memory/disk usage: Excessive log volume can fill a container's disk and crash it. Tune log size and frequency accordingly.
Loss rate: Network instability can cause log loss; aim for minimal loss even under high-frequency logging.
Analysis capability: Choose log analysis frameworks that integrate smoothly with your reporting system.
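The level-based filtering mentioned above can be sketched as a minimal logger; the names here are illustrative, not the project's actual implementation. Messages below the configured threshold are dropped before any serialization or I/O happens, which is what keeps CPU and bandwidth costs bounded.

```javascript
// Minimal level-threshold logger: messages below the threshold are
// dropped before serialization, so filtered calls cost almost nothing.
const LEVELS = ['trace', 'debug', 'info', 'warn', 'error', 'fatal'];

function createLogger(threshold = 'info') {
  const min = LEVELS.indexOf(threshold);
  const logger = {};
  for (const [i, level] of LEVELS.entries()) {
    logger[level] = (msg) => {
      if (i < min) return null; // filtered: no stringify, no write
      const line = JSON.stringify({ level, msg, ts: Date.now() });
      process.stdout.write(line + '\n'); // or a local file / remote sink
      return line;
    };
  }
  return logger;
}

const log = createLogger('warn');
log.debug('verbose detail');  // dropped
log.error('something broke'); // written
```

In production the threshold would typically come from an environment variable, so a pod can be flipped to debug-level logging without a redeploy.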
Local Logs
The project stores detailed logs (database, request, pod runtime) in the dist/log directory, one file per worker with timestamps. Files are cleared every 24 h.
Local logs are the most detailed but consume significant CPU, I/O, and disk space, so developers should log only essential information to reduce resource usage.
Because local logs are not persistent, they are periodically uploaded to Tencent Cloud Log Service (CLS) via the internal container platform (TKEx‑CSIG). The platform’s log rules configure log sets, topics, and file directories.
Logs are then searchable and analyzable in the CLS console.
Operations Logs
Modern container platforms provide robust operations‑log systems. The internal platform offers a Grafana dashboard for real‑time monitoring of pod status, CPU, memory, disk, and HPA.
For deeper Node.js runtime insights (heap, flame graphs, snapshots), custom monitoring tools can be integrated.
Full‑Chain Front‑Back Log Design
Full‑chain logging links all request stages (0‑6) using a unique trace_id. Each request carries this ID in the X-Request-Id header, generated via UUID.
Middleware intercepts all requests and third‑party calls to propagate the trace_id. The diagram below shows the flow.
With this mechanism, developers can quickly locate issues across services.
Notably, the full chain covers both frontend and backend logs, which many backend services do not achieve.
The project uses the imserver-monitor SDK for automatic reporting, integrating trace propagation, async request wrappers, queue management, and monitoring.
Scientific Use of Logs
Beyond troubleshooting, logs support business analysis and alerting. For example, container status alerts can be configured on the internal platform.
The internal IMLOG SDK integrates with Kibana for log retrieval and Grafana for visualization and alerts.
Custom Log Development
When designing custom logs, follow the field conventions of mature platforms to reduce analysis friction. Below is an example of a core field mapping used by the team.
{
  "mapping": {
    "k12": {
      "properties": {
        "additional": { "type": "text" },
        "cmd": { "type": "keyword" },
        "cost_time": { "type": "integer" },
        "error_msg": { "type": "text" },
        "ext": { "type": "keyword" },
        "geoip": { "properties": { "location": { "type": "geo_point" } } },
        "log_level": { "type": "keyword" },
        "log_timestamp": { "type": "date", "format": "epoch_millis" },
        "parent_span_id": { "type": "keyword" },
        "project": { "type": "keyword" },
        "request_ip": { "type": "ip" },
        "return_code": { "type": "integer" },
        "return_code_2": { "type": "keyword" },
        "server_ip": { "type": "ip" },
        "source": { "type": "keyword" },
        "span_id": { "type": "keyword" },
        "trace_id": { "type": "keyword" },
        "uin": { "type": "long" },
        "url": { "type": "keyword" },
        "wns_code": { "type": "keyword" }
      }
    },
    "_default_": {
      "properties": {
        "additional": { "type": "text" },
        "cmd": { "type": "keyword" },
        "cost_time": { "type": "integer" },
        "error_msg": { "type": "text" },
        "ext": { "type": "keyword" },
        "geoip": { "properties": { "location": { "type": "geo_point" } } },
        "log_timestamp": { "type": "date", "format": "epoch_millis" },
        "project": { "type": "keyword" },
        "request_ip": { "type": "ip" },
        "return_code": { "type": "integer" },
        "return_code_2": { "type": "keyword" },
        "server_ip": { "type": "ip" },
        "trace_id": { "type": "keyword" },
        "uin": { "type": "long" },
        "url": { "type": "keyword" },
        "wns_code": { "type": "keyword" }
      }
    }
  }
}

Performance tuning is crucial: early use of JSON.stringify on large log objects became a bottleneck. Switching to fast-json-stringify and implementing toJSON methods on logged objects improved log processing speed.
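Both optimizations can be sketched as follows; the schema and class names are illustrative, not the project's actual code. fast-json-stringify is an external npm package, so only the schema it would compile is shown here, while the toJSON technique runs on plain Node.js.

```javascript
// (1) Schema mirroring a few fields of the mapping above; passing it to
// require('fast-json-stringify') yields a compiled serializer that
// skips the per-call key discovery that makes JSON.stringify slow.
const logSchema = {
  type: 'object',
  properties: {
    trace_id: { type: 'string' },
    cmd: { type: 'string' },
    cost_time: { type: 'integer' },
    log_timestamp: { type: 'integer' },
  },
};

// (2) toJSON: JSON.stringify calls this instead of walking every
// property, so bulky payloads (rawRows here) never enter the log line.
class DbCallLog {
  constructor(cmd, costTime, rawRows) {
    this.cmd = cmd;
    this.cost_time = costTime;
    this.rawRows = rawRows; // potentially huge; excluded below
  }
  toJSON() {
    return { cmd: this.cmd, cost_time: this.cost_time };
  }
}

console.log(JSON.stringify(new DbCallLog('user.get', 8, new Array(1e4))));
// → {"cmd":"user.get","cost_time":8}
```

The schema approach trades flexibility for speed: any field not declared is silently dropped, which doubles as a guard against oversized log entries.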
Conclusion
This concludes the author's notes on Node.js backend logging. Topics such as persistent storage of core logs, data export, and advanced analysis remain open for further exploration.
I hope this guide serves as a useful reference for Node.js developers; feel free to discuss and share your own experiences in the comments.
Tencent IMWeb Frontend Team
The IMWeb Frontend Community brings together frontend development enthusiasts. Follow us for refined live courses by top experts and cutting-edge technical posts to sharpen your frontend skills.
