Mastering Log Engineering: From Standards to ELK Visualization
This article explains why systematic logging is essential for production debugging, introduces a practical log classification and field schema, describes trace‑ID propagation and performance instrumentation, and walks through building an ELK‑based log collection, storage, and real‑time visualization platform for reliable observability.
Why Logging Matters and Common Pain Points
Logs are indispensable for troubleshooting production failures, yet developers often neglect proper log design, leading to chaotic output and missed critical information.
What Readers Should Gain
The article provides a concise, actionable log‑usage guide, covering log standards, classification, field definitions, traceability, performance instrumentation, and an end‑to‑end ELK implementation.
Log Classification (Reference)
System logs (OS kernel, daemon messages)
Language‑level logs (Java, PHP, Go, Python, etc.)
Application logs (framework internal logs and business logs)
Other logs (MySQL slow query, Redis command logs, middleware logs)
Log Standards (Reference)
All logs should share one unified format: JSON, with a fixed set of mandatory fields, so that downstream tools can parse them easily.
Key Log Fields
time : required timestamp (e.g., 2018-03-15T13:41:08.123456Z)
level : optional log level
cat : log category (t_req for request, t_rsp for response, log for others)
app_name : application name plus version
trace_id : unique request identifier for end‑to‑end tracing
upspan : upstream service name
span_id : hierarchical call identifier (e.g., 0, 0.1, 0.1.1)
rt : response time in ms
e_uri : entry URI of the request
uri : request URI
code : HTTP status code
protocol : protocol name (http, thrift, …)
protocol_code : protocol‑level status code
verb : HTTP method
query : optional query string
ext : optional custom extension fields (e.g., ext.mem, ext.user_agent, ext.remote_addr)
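As a rough illustration of the schema above, the following Python sketch builds one record and serializes it as a single JSON line. The function names `build_log_record` and `to_log_line` are hypothetical, and the sample values (including the `remote_addr` address) are made up for the example; only the field names come from the list above.

```python
import json
from datetime import datetime, timezone


def build_log_record(cat, app_name, trace_id, span_id, uri, code, rt_ms, **ext):
    """Return one log record as a dict keyed by the field names listed above.

    Hypothetical helper: the article defines the fields, not this function.
    """
    record = {
        # UTC timestamp with microseconds and a trailing "Z", matching the
        # example given for the `time` field.
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
        "cat": cat,
        "app_name": app_name,
        "trace_id": trace_id,
        "span_id": span_id,
        "uri": uri,
        "code": code,
        "rt": rt_ms,
    }
    if ext:
        # Custom extension fields are grouped under `ext`.
        record["ext"] = ext
    return record


def to_log_line(record):
    """Serialize a record as one compact JSON line."""
    return json.dumps(record, separators=(",", ":"), ensure_ascii=False)


line = to_log_line(
    build_log_record(
        "t_rsp", "sf-express-v1", "7f015a2567f5004a4d3f", "0.1",
        "/CarGps/Playback", 200, 12.5, remote_addr="10.0.0.1",
    )
)
print(line)
```

Emitting one JSON object per line keeps each record independently parseable, which is what lets a collector forward lines without knowing anything about the application.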
Trace Headers
Four headers must be propagated for full request‑chain tracing: x-g7-trace-id , x-g7-upspan , x-g7-span-id , and x-g7-e-uri .
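The propagation step might look like the sketch below. The article names the four headers but does not spell out how their values are derived, so the rules here are assumptions: a missing trace ID is generated, a missing span ID starts at `0`, each outgoing call appends its sequence number to the parent span ID, the upstream name is the calling service, and the entry URI is passed through unchanged. The function `downstream_headers` and the service names are hypothetical.

```python
import uuid

TRACE_ID = "x-g7-trace-id"
UPSPAN = "x-g7-upspan"
SPAN_ID = "x-g7-span-id"
E_URI = "x-g7-e-uri"


def downstream_headers(incoming, service_name, request_uri, call_index):
    """Build the four trace headers for the call_index-th outgoing call.

    Assumed rules (not stated in the article): see the paragraph above.
    """
    # HTTP header names are case-insensitive, so normalize to lower case.
    inc = {k.lower(): v for k, v in incoming.items()}
    trace_id = inc.get(TRACE_ID) or uuid.uuid4().hex
    parent_span = inc.get(SPAN_ID) or "0"
    return {
        TRACE_ID: trace_id,
        UPSPAN: service_name,
        # Child span = parent span + "." + sequence number, e.g. 0.1 -> 0.1.1
        SPAN_ID: f"{parent_span}.{call_index}",
        # The entry URI is fixed at the edge and forwarded unchanged.
        E_URI: inc.get(E_URI) or request_uri,
    }


# An entry service with no incoming trace headers starts a new trace.
edge = downstream_headers({}, "gateway", "/CarGps/Playback", 1)
# A downstream service continues the trace it received.
inner = downstream_headers(edge, "order-svc", "/orders", 1)
print(inner[SPAN_ID])
```

Under these assumptions the span IDs form the same hierarchy shown for the span_id field (0, 0.1, 0.1.1), so the call tree can be rebuilt by sorting a trace's records on that field.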
Performance Instrumentation (Tracing)
Any network I/O (REST API, MySQL, Redis, etc.) should be instrumented with generic SDK calls such as mq_kafka_consume or cache_redis_zCard to record component, operation name, total latency and call count.
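A minimal sketch of such an instrumentation point, assuming a name like cache_redis_zCard splits into a component (`cache_redis`) and an operation (`zCard`). The `traced` context manager and `snapshot` function are hypothetical stand-ins for the SDK, which the article does not show; the `time.sleep` call stands in for the real network call.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# (component, operation) -> call count and accumulated latency in ms
_stats = defaultdict(lambda: {"count": 0, "total_ms": 0.0})


@contextmanager
def traced(component, operation):
    """Time the wrapped block and add it to the per-operation totals."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the call even if the wrapped block raised.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        entry = _stats[(component, operation)]
        entry["count"] += 1
        entry["total_ms"] += elapsed_ms


def snapshot():
    """Return the totals keyed as component_operation."""
    return {f"{c}_{op}": dict(v) for (c, op), v in _stats.items()}


with traced("cache_redis", "zCard"):
    time.sleep(0.005)  # placeholder for the actual Redis call

print(snapshot())
```

Accumulating per request and writing the totals once, for example into the ext field of the response log, keeps the log volume independent of how many I/O calls a request makes.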
ELK‑Based Log Platform Implementation
Overall Architecture
Logs are collected by agents (Filebeat, Logstash, Flume) into a Kafka cluster, processed by stream systems (Logstash, Storm, Spark), indexed into Elasticsearch for search and Kibana visualization, and optionally persisted to HDFS for offline analysis.
Log Data Flow
Log Collection
Docker containers write logs to stdout; a Logstash instance on each host captures the stdout stream and forwards it to Kafka.
Log Storage
All logs are pushed to Kafka, then consumed and indexed into Elasticsearch (for real‑time search) or written to HDFS (for batch analysis).
Log Visualization
Kibana dashboards provide real‑time monitoring, Top‑N slow‑interface charts, per‑node timeout statistics, service‑level availability, and custom charts for performance and quality analysis.
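To make the Top-N slow-interface chart concrete, the sketch below computes the same ranking offline from a handful of parsed log records. In the platform itself this would be an aggregation run by Elasticsearch behind a Kibana chart; the function `top_n_slow` and the sample records are purely illustrative, and ranking by mean rt is an assumption (a percentile may be a better choice in practice).

```python
from collections import defaultdict


def top_n_slow(records, n=3):
    """Rank URIs by mean response time (rt, in ms), slowest first."""
    totals = defaultdict(lambda: [0.0, 0])  # uri -> [sum of rt, count]
    for r in records:
        if "uri" not in r or "rt" not in r:
            continue  # skip records without timing information
        bucket = totals[r["uri"]]
        bucket[0] += r["rt"]
        bucket[1] += 1
    averages = [(uri, total / count) for uri, (total, count) in totals.items()]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)[:n]


# Made-up sample data using the uri and rt fields from the log schema.
records = [
    {"uri": "/a", "rt": 100},
    {"uri": "/a", "rt": 300},
    {"uri": "/b", "rt": 50},
    {"uri": "/c", "rt": 400},
]
print(top_n_slow(records, 2))  # [('/c', 400.0), ('/a', 200.0)]
```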
Practical Query Examples
Typical Lucene query strings include:
app_name:"sf-express-v1" AND uri:"/CarGps/Playback"
trace_id:"7f015a2567f5004a4d3f"
These queries enable fast retrieval of all logs related to a specific request or service.
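Query strings of this shape can be assembled programmatically. The helpers below are a hypothetical sketch, not part of any Elasticsearch client: they quote each value as a phrase and escape the two characters that are special inside a quoted phrase (backslash and double quote).

```python
def lucene_term(field, value):
    """Return field:"value" with backslashes and quotes escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def lucene_and(**fields):
    """Join one quoted term per keyword argument with AND."""
    return " AND ".join(lucene_term(k, v) for k, v in fields.items())


print(lucene_and(app_name="sf-express-v1", uri="/CarGps/Playback"))
print(lucene_term("trace_id", "7f015a2567f5004a4d3f"))
```

Quoting every value avoids surprises with characters such as `/` and `-`, which carry special meaning in unquoted Lucene query terms.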
Conclusion
Adopting a unified log specification and an ELK‑based pipeline makes fault isolation, service‑level metric collection, and automated observability straightforward across micro‑services architectures.
G7 EasyFlow Tech Circle
