Request Log Analysis System: Collected Fields, Derived Data, and Metrics
This article outlines a request log analysis system that records core request fields, adds proxy‑related data, derives IP‑based ASN and geographic information, parses user‑agent details, and provides comprehensive metrics such as PV/QPS, UV, traffic, latency, status monitoring, and business‑specific insights, all visualized via an ELK‑Kafka architecture.
The request log analysis system records a set of core fields for each request, including time_local (request time), remote_addr (client IP), request_method, request_schema (http/https), request_host, request_path, request_query, request_size, referer, user_agent, status, request_time, and bytes_sent.
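As a rough illustration of how these core fields might be carried through a pipeline, the sketch below parses one log line into a typed record. It assumes the gateway writes one JSON object per line with keys named exactly like the fields above, and that request_time is in seconds; the article does not state the log format or units, so both are assumptions. The names RequestLog and parse_log_line are hypothetical, and the sample values are made up.

```python
import json
from dataclasses import dataclass


@dataclass
class RequestLog:
    """One request, using the core field names from the article."""
    time_local: str
    remote_addr: str
    request_method: str
    request_schema: str
    request_host: str
    request_path: str
    request_query: str
    request_size: int
    referer: str
    user_agent: str
    status: int
    request_time: float  # assumed to be seconds
    bytes_sent: int


def parse_log_line(line: str) -> RequestLog:
    """Parse a JSON-formatted log line (assumed format) into a RequestLog."""
    raw = json.loads(line)
    return RequestLog(
        time_local=str(raw["time_local"]),
        remote_addr=str(raw["remote_addr"]),
        request_method=str(raw["request_method"]),
        request_schema=str(raw["request_schema"]),
        request_host=str(raw["request_host"]),
        request_path=str(raw["request_path"]),
        request_query=str(raw.get("request_query", "")),
        request_size=int(raw["request_size"]),
        referer=str(raw.get("referer", "")),
        user_agent=str(raw.get("user_agent", "")),
        # Numeric fields are coerced because log writers often emit strings.
        status=int(raw["status"]),
        request_time=float(raw["request_time"]),
        bytes_sent=int(raw["bytes_sent"]),
    )


# Made-up sample line for illustration only.
sample_line = json.dumps({
    "time_local": "2024-01-01T12:00:00+08:00",
    "remote_addr": "203.0.113.7",
    "request_method": "GET",
    "request_schema": "https",
    "request_host": "example.com",
    "request_path": "/album/42",
    "request_query": "",
    "request_size": "512",
    "referer": "",
    "user_agent": "curl/8.0",
    "status": "200",
    "request_time": "0.123",
    "bytes_sent": "2048",
})
record = parse_log_line(sample_line)
```

Coercing the numeric fields at parse time keeps later aggregations (traffic sums, latency percentiles) free of string handling.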
When a load‑balancing gateway proxies the request to a backend service, additional fields are logged: upstream_host, upstream_addr, upstream_url, upstream_status, and proxy_time.
Derived data from the client IP includes ASN information (asn_asn: autonomous system number, as_org: organization) and geographic details such as geo_location (latitude/longitude), geo_country, geo_country_code, geo_region (province), and geo_city.
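The enrichment step can be pictured as a lookup from the client IP to a record of derived fields. The sketch below uses only the standard library and a tiny in-memory table; a real deployment would presumably query a GeoIP/ASN database instead, which the article does not name. The table contents (a documentation IP range, a documentation ASN, placeholder location values) and the function name enrich_ip are hypothetical.

```python
import ipaddress

# Hypothetical lookup table with placeholder data. 203.0.113.0/24 is a
# documentation-only range and AS64500 is a documentation-only ASN.
ENRICHMENT_TABLE = [
    (
        ipaddress.ip_network("203.0.113.0/24"),
        {
            "asn_asn": 64500,
            "as_org": "Example Org",
            "geo_location": (31.23, 121.47),  # (latitude, longitude)
            "geo_country": "China",
            "geo_country_code": "CN",
            "geo_region": "Shanghai",
            "geo_city": "Shanghai",
        },
    ),
]


def enrich_ip(remote_addr: str) -> dict:
    """Return derived ASN/geo fields for an IP, or {} if nothing matches."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed addresses are left unenriched rather than raising.
        return {}
    for network, info in ENRICHMENT_TABLE:
        if ip in network:
            return dict(info)  # copy so callers cannot mutate the table
    return {}
```

Returning an empty dict for unknown or malformed addresses lets the caller merge the result into the log record unconditionally.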
The user_agent string can be parsed to obtain ua_device (device type), ua_os (operating system), and ua_name (browser).
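A deliberately simplified sketch of that parsing follows. It relies on substring checks against common user-agent tokens rather than a maintained parser library, so it will misclassify many real-world strings (for example, Chrome on iOS identifies itself with a CriOS token and would be reported here as Safari). The function name parse_user_agent and the category labels are assumptions, not the article's implementation.

```python
import re


def parse_user_agent(user_agent: str) -> dict:
    """Very rough user-agent classification into ua_device, ua_os, ua_name."""
    ua = user_agent or ""

    # OS: test specific tokens first, since Android UAs also contain "Linux"
    # and iOS UAs also contain "Mac OS X".
    if "Android" in ua:
        ua_os = "Android"
    elif re.search(r"iPhone|iPad|iPod", ua):
        ua_os = "iOS"
    elif "Windows" in ua:
        ua_os = "Windows"
    elif "Mac OS X" in ua:
        ua_os = "macOS"
    elif "Linux" in ua:
        ua_os = "Linux"
    else:
        ua_os = "Other"

    # Device type.
    if re.search(r"bot|crawler|spider", ua, re.IGNORECASE):
        ua_device = "bot"
    elif "iPad" in ua or ("Android" in ua and "Mobile" not in ua):
        ua_device = "tablet"
    elif "Mobile" in ua or "iPhone" in ua:
        ua_device = "mobile"
    elif ua:
        ua_device = "desktop"
    else:
        ua_device = "unknown"

    # Browser: order matters because Edge UAs contain "Chrome/" and
    # Chrome UAs contain "Safari/".
    if "Edg/" in ua:
        ua_name = "Edge"
    elif "Firefox/" in ua:
        ua_name = "Firefox"
    elif "Chrome/" in ua:
        ua_name = "Chrome"
    elif "Safari/" in ua:
        ua_name = "Safari"
    else:
        ua_name = "Other"

    return {"ua_device": ua_device, "ua_os": ua_os, "ua_name": ua_name}


desktop_chrome = parse_user_agent(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
iphone_safari = parse_user_agent(
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 "
    "Mobile/15E148 Safari/604.1"
)
```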
Key analysis metrics include:
PV/QPS – page views and queries per second.
UV – unique visitors, identified by the combination of IP and user‑agent.
IP count – number of distinct source IPs.
Network traffic – inbound traffic calculated from request_size, outbound traffic from bytes_sent.
Referer source analysis.
Geographic analysis using the derived geo fields.
Device analysis based on parsed user‑agent data.
Request latency statistics using request_time (p90, p95, and p99 percentiles), plus alerts on long‑latency requests.
Response status monitoring using status (distribution of status codes, 5xx error count).
Business‑specific analysis: correlating request_path and request_query to understand actions such as album access (/album/:id) or site search (/search?q=<keyword>).
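To make the metrics listed above concrete, the following sketch computes several of them over a small batch of records held in memory. It is only an illustration under stated assumptions: records are plain dicts keyed by the field names above, the time window is passed in explicitly, and percentiles use the nearest-rank method, which the article does not specify. The helper names (percentile, summarize, classify_action) and the sample data are hypothetical.

```python
import math
import re
from collections import Counter
from urllib.parse import parse_qs


def percentile(values, pct):
    """Nearest-rank percentile; returns None for an empty input."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records, window_seconds):
    """Compute PV, QPS, UV, IP count, traffic, latency, and status metrics."""
    pv = len(records)
    latencies = [r["request_time"] for r in records]
    status_counts = Counter(r["status"] for r in records)
    return {
        "pv": pv,
        "qps": pv / window_seconds,
        # UV: distinct (IP, user-agent) pairs, as defined in the list above.
        "uv": len({(r["remote_addr"], r["user_agent"]) for r in records}),
        "ip_count": len({r["remote_addr"] for r in records}),
        "inbound_bytes": sum(r["request_size"] for r in records),
        "outbound_bytes": sum(r["bytes_sent"] for r in records),
        "p90": percentile(latencies, 90),
        "p95": percentile(latencies, 95),
        "p99": percentile(latencies, 99),
        "status_counts": dict(status_counts),
        "errors_5xx": sum(c for s, c in status_counts.items() if 500 <= s <= 599),
    }


def classify_action(request_path, request_query):
    """Map a path/query pair to a business action (album access or search)."""
    album = re.fullmatch(r"/album/([^/]+)", request_path)
    if album:
        return ("album", album.group(1))
    if request_path == "/search":
        keywords = parse_qs(request_query.lstrip("?")).get("q")
        if keywords:
            return ("search", keywords[0])
    return ("other", None)


def _rec(ip, ua, size, sent, seconds, status):
    return {"remote_addr": ip, "user_agent": ua, "request_size": size,
            "bytes_sent": sent, "request_time": seconds, "status": status}


# Made-up sample batch covering a 2-second window.
sample = [
    _rec("203.0.113.1", "ua-x", 100, 1000, 0.1, 200),
    _rec("203.0.113.1", "ua-x", 200, 2000, 0.2, 200),
    _rec("203.0.113.1", "ua-y", 300, 3000, 0.3, 404),
    _rec("203.0.113.2", "ua-x", 400, 4000, 2.0, 502),
]
summary = summarize(sample, window_seconds=2)
```

For this sample batch the summary reports 4 page views at 2.0 QPS, 3 unique visitors from 2 distinct IPs, and one 5xx error. At production volumes these aggregations would more likely run in the storage layer than in application code.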
The typical architecture combines the ELK stack with Kafka: Beats and Logstash collect and forward logs, Kafka buffers them for consumption, Elasticsearch indexes and aggregates the data, and Grafana/Kibana provide visual dashboards.
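As one possible view of the aggregation step, the sketch below builds an Elasticsearch search request body as a plain Python dict, using standard aggregation types (date_histogram, cardinality, percentiles, terms, sum). It assumes the documents are indexed with the field names used above, that a @timestamp field holds the request time, and that the field mappings support these aggregations; none of this is confirmed by the article. The function name build_metrics_query is hypothetical, and no client library call is shown.

```python
def build_metrics_query(start: str, end: str) -> dict:
    """Build a search body that aggregates request metrics over a time range."""
    return {
        "size": 0,  # only aggregations are needed, not individual hits
        "query": {"range": {"@timestamp": {"gte": start, "lt": end}}},
        "aggs": {
            # PV per minute; dividing each bucket count by 60 approximates QPS.
            "pv_per_minute": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": "1m"}
            },
            # Approximate count of distinct source IPs.
            "ip_count": {"cardinality": {"field": "remote_addr"}},
            # Latency percentiles over request_time.
            "latency": {
                "percentiles": {"field": "request_time", "percents": [90, 95, 99]}
            },
            # Distribution of response status codes.
            "status_codes": {"terms": {"field": "status"}},
            # Outbound traffic.
            "outbound_bytes": {"sum": {"field": "bytes_sent"}},
        },
    }


query_body = build_metrics_query("2024-01-01T00:00:00Z", "2024-01-01T01:00:00Z")
```

A dashboard tool such as Kibana or Grafana would typically generate comparable queries itself; writing one out by hand mainly helps when checking what a panel is actually measuring.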
System Architect Go
Programming, architecture, application development, message queues, middleware, databases, containerization, big data, image processing, machine learning, AI, personal growth.
