
Design and Implementation of a MySQL Traffic Sniffer for Full‑stack Access Logging

This article describes the background, functional requirements, architecture, packet‑capture implementation using libpcap, MySQL protocol parsing, result processing, and storage strategy (Kafka + ClickHouse) of a custom MySQL sniffer developed at Qunar to provide complete database access logs for auditing, performance analysis, and operations.

Qunar Tech Salon

Background – DBAs frequently receive requests such as “who modified this row?” or “who accessed this table?” and need full‑visibility logs. Traditional MySQL general/slow logs or audit plugins either miss critical fields (user, client IP) or impose heavy performance penalties, making comprehensive logging a challenge.

Functional requirements – The sniffer must achieve high capture success rate, auto‑discover new MySQL instances, store data in a query‑friendly format, be lightweight, easy to maintain, and run reliably on production servers.

Open‑source research – Existing MySQL packet‑capture tools were evaluated and found unsuitable, prompting a custom solution.

Program structure – The system captures traffic on a MySQL port, decodes the MySQL protocol, extracts fields such as client IP, username, database, query, execution time, rows affected, etc., and forwards the JSON records to Kafka.

Packet capture – Implemented with libpcap (the library behind tcpdump/wireshark). The capture flow is: NIC → driver → data link → IP → transport → application.
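The layered walk described above (data link → IP → transport → application) can be sketched in pure Python over one raw Ethernet frame of the kind libpcap hands to its callback. This is an illustrative decoder only, not the production sniffer, which the article does not show:

```python
import struct

def extract_tcp_payload(frame: bytes):
    """Walk a raw Ethernet frame down the stack (data link -> IP -> TCP)
    and return (src, dst, payload) for IPv4/TCP traffic, else None."""
    # Data link layer: 14-byte Ethernet header; EtherType 0x0800 = IPv4.
    if len(frame) < 14 or struct.unpack("!H", frame[12:14])[0] != 0x0800:
        return None
    # IP layer: header length is the low nibble of byte 0, in 32-bit words.
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4
    if ip[9] != 6:          # protocol number 6 = TCP
        return None
    src = ".".join(map(str, ip[12:16]))
    dst = ".".join(map(str, ip[16:20]))
    # Transport layer: TCP data offset is the high nibble of byte 12.
    tcp = ip[ihl:]
    sport, dport = struct.unpack("!HH", tcp[:4])
    doff = (tcp[12] >> 4) * 4
    payload = tcp[doff:]    # application layer: the MySQL packet stream
    return (f"{src}:{sport}", f"{dst}:{dport}", payload)
```

In the real sniffer, libpcap applies a BPF filter (e.g. `tcp port 3306`) in the kernel so only MySQL traffic reaches this stage, and TCP segments must additionally be reassembled in order before the application payload is parsed.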

MySQL protocol parsing – The decoder handles login packets, command packets, and server response packets, extracting command type, query text, timestamps, and other metadata.
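Every MySQL packet begins with a 3-byte little-endian payload length and a 1-byte sequence id; for a client command packet, the first payload byte is the command type (0x03 = COM_QUERY, whose argument is the SQL text). A minimal decoder sketch for the classic protocol (query attributes in newer protocol versions add fields not handled here):

```python
COM_NAMES = {0x01: "COM_QUIT", 0x03: "COM_QUERY", 0x0e: "COM_PING",
             0x16: "COM_STMT_PREPARE", 0x17: "COM_STMT_EXECUTE"}

def parse_command_packet(data: bytes):
    """Decode one client-side MySQL packet: 3-byte LE payload length,
    1-byte sequence id, then the command byte and its argument."""
    if len(data) < 5:
        return None
    length = int.from_bytes(data[:3], "little")
    seq = data[3]
    payload = data[4:4 + length]
    cmd = payload[0]
    record = {"seq": seq, "command": COM_NAMES.get(cmd, hex(cmd))}
    if cmd == 0x03:                      # COM_QUERY carries the SQL text
        record["query"] = payload[1:].decode("utf-8", "replace")
    return record
```

The request timestamp is taken when the command packet is captured and the response timestamp when the server's first reply packet arrives, which is how the sniffer derives `duration` without touching the server.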

Result handling – After parsing, a JSON object containing fields such as client_host, user, db, query, and duration is produced. Example:

{
    "client_host": "192.168.225.219",
    "client_addr": "192.168.225.219:59154",
    "user": "h_qta_rw",
    "db": "qta_product_baseinfo",
    "rows": 1,
    "bytes_send": 625,
    "query": "select id, wrapper_id, real_phone, phone_type, did_phone, did_prefix from wrapper_did_phone where wrapper_id='hta10850s5i'",
    "thread_id": 3243074,
    "request_time": "2021-09-26T02:40:24.459756+08:00",
    "response_time": "2021-09-26T02:40:24.459874+08:00",
    "duration": "0.118 ms",
    "db_addr": "10.88.133.221:3306",
    "port": 3306,
    "host_name": "l-xxx.h.cn5",
    "tables": "qta_product_baseinfo.wrapper_did_phone",
    "checksum": "33353139414136414438343442394533",
    "fingerPrint": "select id, wrapper_id, real_phone, phone_type, did_phone, did_prefix from wrapper_did_phone where wrapper_id=?"
}
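The fingerPrint field above normalizes literals to `?` so that structurally identical queries can be grouped for SQL statistics, and the checksum serves as a stable key derived from that fingerprint. The article does not show Qunar's actual algorithm; a rough sketch of the idea (tools such as pt-fingerprint do this far more thoroughly):

```python
import hashlib
import re

def fingerprint(query: str) -> str:
    """Collapse literals so structurally identical queries compare equal."""
    fp = re.sub(r"'(?:[^'\\]|\\.)*'", "?", query)      # quoted string literals
    fp = re.sub(r"\b\d+\b", "?", fp)                   # bare numeric literals
    return re.sub(r"\s+", " ", fp).strip()

def checksum(fp: str) -> str:
    """Hex digest of the fingerprint, usable as a stable group key."""
    return hashlib.md5(fp.encode()).hexdigest().upper()
```

Queries that differ only in their bound values then share one fingerprint and one checksum, which is what makes per-statement aggregation in ClickHouse cheap.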

Storage strategy – The initial pipeline of Kafka + Elasticsearch + Flink suffered from ingestion latency and slow queries, so it was replaced with Kafka + ClickHouse: a Kafka engine table consumes the topic, a materialized view moves rows into a MergeTree table, and a 7‑day TTL bounds storage.

CREATE TABLE sniffer.sniffer_kafka_queue_cn1 (
    `client_host` String,
    `client_addr` String,
    `user` String,
    `db` String,
    `rows` Int32,
    `bytes_send` UInt32,
    `query` String,
    `thread_id` UInt32,
    `request_time` String,
    `response_time` String,
    `error` String,
    `duration` String,
    `db_addr` String,
    `port` UInt32,
    `host_name` String,
    `tables` String,
    `checksum` String,
    `fingerPrint` String
) ENGINE = Kafka()
SETTINGS kafka_broker_list = 'l-kafka1.data.cn1:9092,...',
         kafka_topic_list = 'logs.logger.d_dba_prod',
         kafka_group_name = 'clickhouse-reader',
         kafka_format = 'JSONEachRow',
         kafka_max_block_size = 524288;

Subsequent ClickHouse tables and materialized view definitions aggregate the data for fast querying.
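Those definitions are not reproduced in the article; a sketch consistent with the Kafka engine table above would look like the following (table and column choices here are illustrative):

```sql
-- Destination table: same columns as the Kafka engine table above,
-- plus a parsed event time used for ordering and the 7-day TTL.
CREATE TABLE sniffer.sniffer_local_cn1 (
    `event_time` DateTime,
    `client_host` String,
    `user` String,
    `db` String,
    `query` String,
    `duration` String
    -- ... remaining columns as in sniffer_kafka_queue_cn1
) ENGINE = MergeTree()
ORDER BY (event_time, db, user)
TTL event_time + INTERVAL 7 DAY;

-- Materialized view: continuously drains the Kafka consumer table.
CREATE MATERIALIZED VIEW sniffer.sniffer_mv_cn1 TO sniffer.sniffer_local_cn1 AS
SELECT
    parseDateTimeBestEffort(request_time) AS event_time,
    client_host, user, db, query, duration
FROM sniffer.sniffer_kafka_queue_cn1;
```

In ClickHouse, the Kafka engine table itself is only a consumer: rows are readable once, so the materialized view is what actually persists every record into the MergeTree table.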

Development & Operations – Features such as automatic port discovery, heartbeat monitoring, runtime statistics, and garbage collection enable zero‑maintenance deployment across thousands of MySQL instances.
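Automatic port discovery can be approximated by probing candidate ports for the MySQL server greeting: the server speaks first, sending a handshake packet whose payload starts with protocol version 10. A hypothetical probe sketch (the article does not describe Qunar's discovery mechanism, which may instead read local instance configuration):

```python
import socket

def discover_mysql_ports(host="127.0.0.1", candidates=range(3306, 3317)):
    """Probe candidate ports; a MySQL server greets connecting clients
    unprompted, so connect + read of the handshake header is a strong hint."""
    found = []
    for port in candidates:
        try:
            with socket.create_connection((host, port), timeout=0.2) as s:
                banner = s.recv(5)
                # Handshake header: 3-byte length, seq id 0, protocol version 10.
                if len(banner) == 5 and banner[3] == 0 and banner[4] == 10:
                    found.append(port)
        except OSError:
            pass
    return found
```

Each discovered port then gets its own capture filter, and the heartbeat and runtime statistics report whether capture on that port is still healthy.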

Problem analysis – Issues encountered include packet loss, memory leaks, parsing performance, and incomplete SQL parsing; these are being iteratively resolved.

Conclusion & Outlook – The sniffer now covers ~95% of production MySQL instances with >98% capture accuracy, supporting resource ownership analysis, SQL statistics, security auditing, and proactive DBA operations, effectively turning database management into data‑driven, intelligent processes.

Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
