Why ClickHouse Beats Elasticsearch for Log Analytics – Performance, Cost & Deployment
This article compares ClickHouse and Elasticsearch for log analytics, highlighting ClickHouse’s superior write throughput, query speed, and lower server costs, then details a cost‑effective deployment architecture—including Zookeeper, Kafka, FileBeat, and ClickHouse setup—and shares optimization tips and visualization using ClickVisual.
Background
SaaS services face data-security and compliance challenges. To stay competitive, the company needed a private-deployment capability and a data system for operational analysis, without incurring large server costs; the architecture described below is the resulting compromise.
Elasticsearch vs ClickHouse
ClickHouse is a high‑performance columnar distributed DBMS. Tests show:
Write throughput: 50‑200 MB/s per server (over 600,000 records/s), more than 5x that of Elasticsearch.
Query speed: 2‑30 GB/s when data is served from the page cache, 5‑30x faster than Elasticsearch.
Lower server cost: higher compression (1/3 to 1/30 of the disk space) and lower memory/CPU usage can roughly halve server costs.
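The compression claim is easy to spot-check on a running ClickHouse instance by comparing on-disk and uncompressed sizes in system.parts. A minimal sketch, where the logs database name is a placeholder for your own:

```sql
-- Spot-check compression ratios per table; only active (current) parts count.
SELECT
    table,
    formatReadableSize(sum(data_compressed_bytes))   AS on_disk,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio
FROM system.parts
WHERE active AND database = 'logs'
GROUP BY table;
```

Ratios in the 3‑30x range are common for repetitive log data, which is where the disk-cost savings come from.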
Cost Analysis
The cost estimate is based on Alibaba Cloud list prices, without discounts.
Environment Deployment
Zookeeper Cluster
<code>yum install java-1.8.0-openjdk-devel.x86_64
# configure /etc/profile, set timezone, create directories, download and extract Zookeeper
export ZOOKEEPER_HOME=/usr/zookeeper/apache-zookeeper-3.7.1-bin
export PATH=$ZOOKEEPER_HOME/bin:$PATH
cd $ZOOKEEPER_HOME/conf
vi zoo.cfg
# sample zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/zookeeper/data
dataLogDir=/usr/zookeeper/logs
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
# create myid on each node (1 on zk1, 2 on zk2, 3 on zk3)
echo "1" > /usr/zookeeper/data/myid
# start
sh zkServer.sh start
# verify: each node should report leader or follower
sh zkServer.sh status</code>
Kafka Cluster
<code>mkdir -p /usr/kafka
chmod 777 -R /usr/kafka
wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/apache/kafka/3.2.0/kafka_2.12-3.2.0.tgz
tar -zvxf kafka_2.12-3.2.0.tgz -C /usr/kafka
# broker configuration (example)
broker.id=1
listeners=PLAINTEXT://ip:9092
# other settings …
mkdir -p /usr/kafka/logs
nohup /usr/kafka/kafka_2.12-3.2.0/bin/kafka-server-start.sh /usr/kafka/kafka_2.12-3.2.0/config/server.properties >/usr/kafka/logs/kafka.log 2>&1 &
# verify the broker is reachable
/usr/kafka/kafka_2.12-3.2.0/bin/kafka-topics.sh --bootstrap-server ip:9092 --list</code>
FileBeat Deployment
<code>sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
# create elastic.repo in /etc/yum.repos.d/
[elastic-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
yum install filebeat
systemctl enable filebeat</code>
FileBeat configuration highlights: set keys_under_root: true and define a Kafka output.
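The two highlights above can be sketched as a minimal /etc/filebeat/filebeat.yml; the log paths, topic name, and broker addresses are placeholders for your environment (in Filebeat 8 the filestream input with an ndjson parser is the newer equivalent of the json options shown here):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log
    json.keys_under_root: true   # lift parsed JSON fields to the event root
    json.overwrite_keys: true

output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092", "kafka3:9092"]
  topic: "app-logs"
  compression: gzip
  required_acks: 1
```

With keys_under_root the log's JSON fields land at the top level of the event, which keeps the downstream ClickHouse table schema flat.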
ClickHouse Deployment
<code># check SSE4.2 support
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
mkdir -p /data/clickhouse
# add hosts entries for clickhouse nodes
# set CPU governor to performance
echo 'performance' | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# heuristic memory overcommit (vm.overcommit_memory=0, recommended for ClickHouse)
echo 0 | tee /proc/sys/vm/overcommit_memory
# disable transparent huge pages
echo 'never' | tee /sys/kernel/mm/transparent_hugepage/enabled
# install from official repo
yum install yum-utils
rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
yum -y install clickhouse-server clickhouse-client
# adjust config.xml log level to information
systemctl start clickhouse-server
clickhouse-client --query "SELECT version()"   # smoke test</code>
Visualization with ClickVisual
ClickVisual is an open‑source lightweight log query, analysis, and alerting UI that supports ClickHouse as a backend, offering histogram panels, index management, proxy authentication, and real‑time alerts.
Optimization Methods
Log Query Optimization
TraceID scenario: use a tokenbf_v1 skip index with hasToken for fast hits.
Unstructured logs: replace LIKE scans with the inverted-index support available in newer ClickHouse versions.
Aggregation: leverage the ClickHouse Projection feature.
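The tokenbf_v1 and Projection ideas can be sketched together; the table name, column layout, and bloom-filter parameters below are illustrative, not tuned values:

```sql
-- Log table with token bloom-filter skip indexes on trace_id and message.
CREATE TABLE logs.app_log_local
(
    ts       DateTime,
    trace_id String,
    message  String,
    INDEX idx_trace trace_id TYPE tokenbf_v1(30720, 3, 0) GRANULARITY 1,
    INDEX idx_msg   message  TYPE tokenbf_v1(30720, 3, 0) GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(ts)
ORDER BY ts;

-- hasToken lets the bloom filter skip granules that cannot contain the token:
SELECT * FROM logs.app_log_local
WHERE hasToken(trace_id, 'a1b2c3d4e5');

-- Aggregation via a projection, precomputed at merge time:
ALTER TABLE logs.app_log_local
    ADD PROJECTION p_hourly (SELECT toStartOfHour(ts) AS h, count() GROUP BY h);
ALTER TABLE logs.app_log_local MATERIALIZE PROJECTION p_hourly;
```

hasToken only matches whole alphanumeric tokens, which is exactly the TraceID case; for substring matches it falls back to a full scan, which is where the inverted-index support helps.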
Local vs Distributed Tables
For high‑frequency log writes, prefer local tables to avoid network overhead, part explosion, and Zookeeper pressure.
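One common compromise, sketched below, is to write directly to each node's local table while still reading through a Distributed table; the cluster name my_cluster and the table names are placeholders:

```sql
-- Distributed table for reads only; writes bypass it to avoid the
-- network hop, extra parts, and Zookeeper coordination on insert.
CREATE TABLE logs.app_log_all AS logs.app_log_local
ENGINE = Distributed(my_cluster, logs, app_log_local, rand());

-- Writers (e.g. the Kafka consumer on each node) insert locally:
INSERT INTO logs.app_log_local (ts, trace_id, message)
VALUES (now(), 't1', 'hello');

-- Queries fan out across all shards through the distributed table:
SELECT count() FROM logs.app_log_all;
```

This keeps insert latency and part counts per node under control while preserving a single query endpoint.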
ClickHouse Limits
Set user‑level limits in users.xml to prevent runaway queries, e.g., max_memory_usage, max_rows_to_read, max_result_rows, and max_bytes_to_read.
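A minimal sketch of such a profile in users.xml; the numeric limits are examples to adapt to your hardware, not recommendations:

```xml
<profiles>
    <log_reader>
        <max_memory_usage>10000000000</max_memory_usage>      <!-- ~10 GB per query -->
        <max_rows_to_read>1000000000</max_rows_to_read>       <!-- cap scanned rows -->
        <max_result_rows>100000</max_result_rows>             <!-- cap result size -->
        <max_bytes_to_read>50000000000</max_bytes_to_read>    <!-- cap scanned bytes -->
    </log_reader>
</profiles>
```

Assign the profile to the account your query UI uses, so an unbounded dashboard query fails fast instead of starving log ingestion.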
Conclusion
The deployment involved many pitfalls, especially FileBeat configuration. Future posts will detail additional ClickHouse tuning experiences.
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes original technical articles. We focus on operations transformation and aim to accompany you throughout your operations career.