How to Build a Scalable Kubernetes Log System with ClickHouse and Fluent‑Bit
This article explains why Stone Docs switched from SLS/ES to ClickHouse for Kubernetes log storage, outlines the four‑stage architecture (collection, transmission, storage, management), compares DaemonSet, network, and SideCar collection methods, and provides concrete ClickHouse table definitions and Fluent‑Bit configurations for a production‑grade logging pipeline.
Background
Stone Docs runs all its applications on Kubernetes, generating massive log volumes. The original stack (SLS + Elasticsearch) suffered from high indexing costs, lack of per‑index cost visibility, excessive storage consumption, and insufficient timestamp precision (seconds only). After evaluating alternatives, ClickHouse was chosen for its low storage cost, millisecond precision, and ability to handle high‑throughput logs.
Architecture Overview
The logging system is divided into four components:
Log Collection: A DaemonSet‑deployed LogCollector mounts host log directories (e.g., /var/log/containers, /var/lib/docker/containers) and reads application, system, and K8s audit logs.
Log Transmission: Collected logs are routed to Kafka topics based on their structure.
Log Storage: ClickHouse stores logs using two table engines (Kafka and MergeTree) connected by materialized views.
Log Management: The open‑source Mogo UI provides querying, index settings, LogCollector configuration, ClickHouse table management, and alerting.
Log Collection Details
Three collection approaches are considered:
DaemonSet: Deploy LogCollector on every node, mount host directories, and read logs directly.
Network: Use an application‑level SDK to push logs to the collector.
SideCar: Run LogCollector inside each pod to capture only that pod’s logs.
The advantages of DaemonSet collection from files include better performance and richer metadata (pod, namespace, container name, container ID). However, many file‑tailing agents (Fluent‑Bit, Filebeat) cannot append these labels when running as DaemonSets, so the team opted for a DaemonSet that collects standard output.
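One way the metadata listed above can be recovered for standard‑output logs: the kubelet names the symlinks under /var/log/containers as <pod>_<namespace>_<container>-<container ID>.log, so the labels can be read from the file name. The following is a minimal Python sketch of that idea, not the LogCollector's actual code; the function name and the sample path are hypothetical.

```python
import re
from typing import Optional

# Kubelet symlink convention: <pod>_<namespace>_<container>-<64-hex container ID>.log
_CONTAINER_LOG_RE = re.compile(
    r"^(?P<pod_name>[^_]+)_(?P<namespace>[^_]+)_"
    r"(?P<container_name>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)


def parse_container_log_path(path: str) -> Optional[dict]:
    """Return pod, namespace, container name and ID, or None if the name does not match."""
    filename = path.rsplit("/", 1)[-1]
    match = _CONTAINER_LOG_RE.match(filename)
    return match.groupdict() if match else None


# Hypothetical sample path; the container ID is a placeholder of 64 hex characters.
sample = "/var/log/containers/ingress-7d9f_kube-system_controller-" + "a" * 64 + ".log"
meta = parse_container_log_path(sample)
```

Files that do not follow the convention yield None, so a collector built this way would have to ship them without Kubernetes labels or skip them.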
Log Transmission
Fluent‑Bit forwards logs to Kafka. Different log structures are mapped to distinct Kafka topics, and corresponding ClickHouse tables are created for each topic (e.g., ingress_stdout_stream for the ingress‑stdout topic).
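The topic‑to‑table naming used above can be sketched in Python as follows. This is an illustration under assumptions, not the team's routing code: only the ingress-stdout topic and its ingress_stdout_stream table come from the article, while the other topic names, the default topic, and both function names are hypothetical.

```python
# Log source -> Kafka topic. Only "ingress-stdout" appears in the article;
# the remaining entries are assumed examples.
TOPIC_BY_SOURCE = {
    "ingress": "ingress-stdout",
    "app": "app-stdout",  # assumption
}
DEFAULT_TOPIC = "unstructured-stdout"  # assumption


def topic_for(source: str) -> str:
    """Pick the Kafka topic for a log source, falling back to a catch-all topic."""
    return TOPIC_BY_SOURCE.get(source, DEFAULT_TOPIC)


def stream_table_for(topic: str, database: str = "logger") -> str:
    """Derive the ClickHouse Kafka engine table name that consumes a topic."""
    return f"{database}.{topic.replace('-', '_')}_stream"
```

Keeping one topic per log structure means each Kafka engine table can declare a fixed column list for the rows it consumes.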
Log Storage in ClickHouse
Three table types are used:
Kafka Engine Table: Ingests raw JSON logs from Kafka.
Materialized View: Parses _log_ JSON fields and extracts columns such as status and url.
Result Table: Stores the final, query‑ready data with a TTL of 7 days.
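To make the materialized view step concrete, here is a simplified Python model of what it does to a single Kafka row. It is a sketch under assumptions rather than the actual ClickHouse behavior in every detail: the function name is hypothetical, timestamps are handled in UTC instead of Asia/Shanghai, and only the status and url fields are extracted.

```python
import json
from datetime import datetime, timezone


def transform(row: dict) -> dict:
    """Model one Kafka row passing through the materialized view (simplified)."""
    ts = float(row["_time_"])
    try:
        payload = json.loads(row["_log_"])
    except ValueError:
        payload = {}  # non-JSON logs yield no extracted fields
    if not isinstance(payload, dict):
        payload = {}
    status = payload.get("status")
    url = payload.get("url")
    return {
        # Second precision, used for partitioning, ordering and TTL.
        "_time_second_": datetime.fromtimestamp(int(ts), tz=timezone.utc),
        # Nanoseconds since the epoch; a Float64 input cannot carry every
        # nanosecond value exactly, so the low digits are approximate.
        "_time_nanosecond_": int(ts * 1_000_000_000),
        "_raw_log_": row["_log_"],
        # ClickHouse's JSONExtractInt returns 0 and JSONExtractString returns ''
        # when the key is missing or has the wrong type.
        "status": status if isinstance(status, int) and not isinstance(status, bool) else 0,
        "url": url if isinstance(url, str) else "",
    }


parsed = transform({"_time_": 1642212000.5, "_log_": '{"status": 200, "url": "/health"}'})
plain = transform({"_time_": 1642212000.5, "_log_": "not json"})
```

The raw line is kept in _raw_log_ either way, so a row whose JSON lacks the expected fields is still stored.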
-- 1. Kafka engine table: consumes raw JSON rows from the ingress-stdout topic.
CREATE TABLE logger.ingress_stdout_stream (
    _source_ String,
    _pod_name_ String,
    _namespace_ String,
    _node_name_ String,
    _container_name_ String,
    _cluster_ String,
    _log_agent_ String,
    _node_ip_ String,
    _time_ Float64,
    _log_ String
) ENGINE = Kafka SETTINGS
    kafka_broker_list = 'kafka:9092',
    kafka_topic_list = 'ingress-stdout',
    kafka_group_name = 'logger_ingress_stdout',
    kafka_format = 'JSONEachRow',
    kafka_num_consumers = 1;

-- 2. Result table: query-ready data, partitioned by day and expired after 7 days.
--    Created before the materialized view because the view's TO target must already exist.
CREATE TABLE logger.ingress_stdout (
    _time_second_ DateTime,
    _time_nanosecond_ DateTime64(9, 'Asia/Shanghai'),
    _source_ String,
    _cluster_ String,
    _log_agent_ String,
    _namespace_ String,
    _node_name_ String,
    _node_ip_ String,
    _container_name_ String,
    _pod_name_ String,
    _raw_log_ String,
    status Nullable(Int64),
    url Nullable(String)
) ENGINE = MergeTree
PARTITION BY toYYYYMMDD(_time_second_)
ORDER BY _time_second_
TTL toDateTime(_time_second_) + INTERVAL 7 DAY
SETTINGS index_granularity = 8192;

-- 3. Materialized view: parses each row from the Kafka table and writes it to the result table.
CREATE MATERIALIZED VIEW logger.ingress_stdout_view TO logger.ingress_stdout AS
SELECT
    toDateTime(toInt64(_time_)) AS _time_second_,
    fromUnixTimestamp64Nano(toInt64(_time_*1000000000), 'Asia/Shanghai') AS _time_nanosecond_,
    _pod_name_, _namespace_, _node_name_, _container_name_, _cluster_, _log_agent_, _node_ip_, _source_,
    _log_ AS _raw_log_,
    JSONExtractInt(_log_, 'status') AS status,
    JSONExtractString(_log_, 'url') AS url
FROM logger.ingress_stdout_stream
WHERE 1=1;

Fluent‑Bit Configuration
The LogCollector uses Fluent‑Bit (a CNCF project), configured through a ConfigMap that is managed via Mogo. The default data schema includes:
@timestamp (string/float): original log time, renamed to _time_ to avoid conflicts.
log (string): raw log content, renamed to _log_.
Because ClickHouse cannot handle fields prefixed with @, the pipeline maps these fields to _time_ and _log_. Example output:
{"_time_":"2022-01-15...","_log_":"{\"id\":1}"}
For non‑JSON logs, a Fluent‑Bit parser must be configured according to the official documentation.
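The field renaming described above can be sketched in Python as follows. This models the effect of the pipeline's mapping and is not Fluent‑Bit configuration; the function name and the sample record are hypothetical.

```python
import json

# Field names from the default schema mapped to the names ClickHouse expects.
RENAMES = {"@timestamp": "_time_", "log": "_log_"}


def to_clickhouse_row(record: dict) -> str:
    """Rename the default fields and emit one JSON object per line (JSONEachRow style)."""
    renamed = {RENAMES.get(key, key): value for key, value in record.items()}
    return json.dumps(renamed, separators=(",", ":"))


# Hypothetical record: a float timestamp plus a raw JSON log line.
line = to_clickhouse_row({"@timestamp": 1642212000.5, "log": "{\"id\":1}"})
```

Fields other than @timestamp and log pass through unchanged, which is how metadata such as _pod_name_ would survive the mapping.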
End‑to‑End Flow Summary
Fluent‑Bit collects logs (via DaemonSet or network mode) and writes _time_ and _log_ fields to Kafka.
Kafka topics feed ClickHouse Kafka engine tables.
Materialized views parse JSON fields and populate result tables.
Mogo UI queries the result tables, allowing users to search, set indexes, and configure collectors.
Mogo UI Screenshots
Log query interface and collector configuration panels are shown below (images omitted for brevity).
References
GitHub repository: https://github.com/shimohq/mogo
Fluent‑Bit documentation: https://docs.fluentbit.io/
ClickHouse official site: https://clickhouse.com/
Additional articles on Kubernetes logging best practices are listed in the original source.
dbaplus Community
