Build a ClickHouse Log Visualization Pipeline with ClickVisual, Kafka & Categraf
This guide walks through installing and configuring an end‑to‑end log collection and visualization stack using ClickHouse for storage, ClickVisual for UI, Kafka as the transport, and Categraf as the collector, covering component setup, configuration files, and query examples.
Introduction
Many internet companies store logs in ClickHouse but lack a visual analysis tool. ClickVisual, an open‑source lightweight log query, analysis, and alerting platform, fills this gap. This article documents a hands‑on experience of building a complete log pipeline with ClickHouse, ClickVisual, Kafka, and Categraf.
Architecture Overview
ClickVisual is a web front‑end that queries ClickHouse tables; it does not handle log ingestion. The ingestion pipeline consists of a log agent (Categraf), Kafka, and ClickHouse. ClickVisual only adjusts ClickHouse table structures to control how logs are processed.
Component Installation
Kafka & Zookeeper
Download Kafka 3.6.0 (includes Zookeeper) and extract it.
./bin/zookeeper-server-start.sh -daemon config/zookeeper.propertiesStart Kafka:
nohup bin/kafka-server-start.sh config/server.properties &> kafka.stdout &Verify both processes with jps or netstat -tunlp.
Kowl (Kafka UI)
Run Kowl in a host‑network container to view topics:
docker run --network=host -d -e KAFKA_BROKERS=localhost:9092 quay.io/cloudhut/kowl:masterKowl listens on port 8080.
ClickHouse
Install ClickHouse via the official script:
curl https://clickhouse.com/ | sh
./clickhouse install
clickhouse startSet the initial password (e.g., 1234).
ClickVisual
Download the latest release (v1.0.0‑rc9) and extract it.
mkdir clickvisual && cd clickvisual
wget https://github.com/clickvisual/clickvisual/releases/download/v1.0.0-rc9/clickvisual-v1.0.0-rc9-linux-amd64.tar.gz
tar zxvf clickvisual-v1.0.0-rc9-linux-amd64.tar.gz
./clickvisual --helpClickVisual depends on MySQL and Redis; set them up beforehand.
Configure ClickVisual (example config/default.toml) with MySQL and Redis credentials:
[redis]
debug = true
addr = "127.0.0.1:6379"
writeTimeout = "3s"
password = ""
[mysql]
debug = true
dsn = "root:1234@tcp(127.0.0.1:3307)/clickvisual?charset=utf8mb4&collation=utf8mb4_general_ci&parseTime=True&loc=Local&readTimeout=1s&timeout=1s&writeTimeout=3s"
level = "debug"
maxIdleConns = 5
maxOpenConns = 10
connMaxLifetime = "300s"Start ClickVisual server:
nohup ./clickvisual server &> stdout.log &It listens on port 19011.
Categraf (Log Collector)
Download Categraf v0.3.38, extract, and keep only the binary and conf/logs.toml. Remove all metric‑related inputs and traces.yaml.
Disable heartbeat:
[heartbeat]
enable = falseConfigure logs.toml to collect a JSON log file and forward it to Kafka:
[logs]
api_key = "x"
enable = true
send_to = "127.0.0.1:9092"
send_type = "kafka"
topic = "categraf"
use_compress = false
send_with_tls = false
batch_wait = 5
run_path = "/opt/categraf/run"
open_files_limit = 100
scan_period = 10
frame_size = 9000
collect_container_all = false
[[logs.items]]
type = "file"
path = "/root/works/catpaw/stdout.log"
source = "app"
service = "catpaw"
topic = "catpaw"
accuracy = "s"Start Categraf:
nohup ./categraf &> categraf.log &ClickVisual Log Library Configuration
In ClickVisual’s system management, add a ClickHouse instance (e.g., 127.0.0.1:9000) and create a database db01. Then add a log library whose underlying ClickHouse tables consist of a Kafka engine table, a materialized view, and a result table.
Define the source JSON example (only field names are needed):
{
"message": "x",
"timestamp": 1698829486,
"status": "y"
}ClickVisual infers field types from this sample.
Table Structures
Kafka stream table ( catpaw_stream) created automatically:
CREATE TABLE db01.catpaw_stream (
`status` String,
`timestamp` Float64,
`message` String CODEC(ZSTD(1))
) ENGINE = Kafka SETTINGS kafka_broker_list = '127.0.0.1:9092', kafka_topic_list = 'catpaw', kafka_group_name = 'db01_catpaw', kafka_format = 'JSONEachRow', kafka_num_consumers = 1, kafka_skip_broken_messages = 0;Materialized view ( catpaw_view) transforms the stream into the result table:
CREATE MATERIALIZED VIEW db01.catpaw_view TO db01.catpaw (
`status` String,
`_time_second_` DateTime,
`_time_nanosecond_` DateTime64(9),
`_raw_log_` String
) AS SELECT
status,
toDateTime(toInt64(timestamp)) AS _time_second_,
fromUnixTimestamp64Nano(toInt64(timestamp) * 1000000000) AS _time_nanosecond_,
message AS _raw_log_
FROM db01.catpaw_stream
WHERE 1 = 1;Result table ( catpaw) stores the final logs:
CREATE TABLE db01.catpaw (
`status` String,
`_time_second_` DateTime,
`_time_nanosecond_` DateTime64(9),
`_raw_log_` String CODEC(ZSTD(1)),
INDEX idx_raw_log _raw_log_ TYPE tokenbf_v1(30720, 2, 0) GRANULARITY 1
) ENGINE = MergeTree PARTITION BY toYYYYMMDD(_time_second_) ORDER BY _time_second_ TTL toDateTime(_time_second_) + toIntervalDay(1) SETTINGS index_granularity = 8192;Extracting JSON Fields
To enable fast filtering on a JSON field (e.g., level), add a derived column in the materialized view:
..., toNullable(toString(replaceAll(JSONExtractRaw(message, 'level'), '"', ''))) AS levelAfter altering the view, the result table gains a level column, allowing direct queries without LIKE on the raw log.
Querying Logs
Example query to fetch recent error logs:
SELECT status, _time_second_, level, _raw_log_
FROM db01.catpaw
WHERE level = 'error'
ORDER BY _time_second_ DESC
LIMIT 100;The materialized view ensures that the level column is indexed, providing quick response.
Conclusion
The ClickVisual‑ClickHouse stack offers a compact, query‑friendly solution for log storage and visualization. By leveraging Kafka as the transport and Categraf as a lightweight collector, the pipeline remains simple yet powerful. Users can further customize field extraction and indexing to suit their observability needs.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
