
Why ClickHouse Beats Elasticsearch for Log Analytics – Performance, Cost & Deployment

This article compares ClickHouse and Elasticsearch for log analytics, highlighting ClickHouse's higher write throughput, faster queries, and lower server costs. It then walks through a cost-effective deployment architecture (Zookeeper, Kafka, FileBeat, and ClickHouse) and closes with optimization tips and visualization via ClickVisual.


Background

SaaS services face data security and compliance challenges. To stay competitive, the company needed a privately deployable offering plus a data system for operational analysis, without incurring a large server bill; the architecture below is the resulting compromise.

Elasticsearch vs ClickHouse

ClickHouse is a high-performance columnar distributed DBMS. Our tests show:

Write throughput: 50–200 MB/s per server (over 600,000 records/s), more than 5× Elasticsearch.

Query speed: 2–30 GB/s scanned from the page cache, 5–30× faster than Elasticsearch.

Lower server cost: higher compression (1/3 to 1/30 of the disk space) and lower memory/CPU usage, potentially halving server costs.

Cost Analysis

The cost estimate is based on Alibaba Cloud list prices, without discounts.

Environment Deployment

Zookeeper Cluster

<code>yum install java-1.8.0-openjdk-devel.x86_64
# configure /etc/profile, set the timezone, create directories, download and extract Zookeeper
export ZOOKEEPER_HOME=/usr/zookeeper/apache-zookeeper-3.7.1-bin
export PATH=$ZOOKEEPER_HOME/bin:$PATH
cd $ZOOKEEPER_HOME/conf
vi zoo.cfg
# sample zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/zookeeper/data
dataLogDir=/usr/zookeeper/logs
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
# create myid on each node (use 2 and 3 on the other two nodes)
echo "1" > /usr/zookeeper/data/myid
# start on every node
zkServer.sh start</code>

Kafka Cluster

<code>mkdir -p /usr/kafka
chmod 777 -R /usr/kafka
wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/apache/kafka/3.2.0/kafka_2.12-3.2.0.tgz
tar -zvxf kafka_2.12-3.2.0.tgz -C /usr/kafka
# broker configuration (example; broker.id must be unique per node)
broker.id=1
listeners=PLAINTEXT://ip:9092
# other settings …
mkdir -p /usr/kafka/logs
nohup /usr/kafka/kafka_2.12-3.2.0/bin/kafka-server-start.sh /usr/kafka/kafka_2.12-3.2.0/config/server.properties >/usr/kafka/logs/kafka.log 2>&1 &</code>
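Before pointing FileBeat at Kafka, the log topic needs to exist. A minimal sketch using the Kafka 3.2 CLI (the topic name, partition count, and replication factor here are illustrative, not from the original setup):

<code># create the log topic once the brokers are up
/usr/kafka/kafka_2.12-3.2.0/bin/kafka-topics.sh --create \
    --bootstrap-server kafka1:9092 \
    --topic app-logs --partitions 6 --replication-factor 2</code>

More partitions allow more parallel consumers on the ClickHouse side; size the count to your expected write volume.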

FileBeat Deployment

<code>sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
# create /etc/yum.repos.d/elastic.repo with:
[elastic-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
# then install and enable the service
yum install filebeat
systemctl enable filebeat</code>

FileBeat configuration highlights: set keys_under_root: true in the JSON parsing options so log fields are promoted to the top level of each event, and define a Kafka output pointing at your log topic.
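A minimal filebeat.yml sketch for this pipeline (log paths, broker addresses, and the topic name are illustrative assumptions):

<code>filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log
    # promote parsed JSON fields to the event root
    json.keys_under_root: true
    json.add_error_key: true

output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092", "kafka3:9092"]
  topic: "app-logs"
  required_acks: 1
  compression: gzip</code>

Buffering through Kafka decouples log producers from ClickHouse ingestion, which is what lets ClickHouse batch writes efficiently.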

ClickHouse Deployment

<code># check SSE4.2 support (required by the official ClickHouse builds)
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
mkdir -p /data/clickhouse
# add hosts entries for clickhouse nodes
# set CPU governor to performance
echo 'performance' | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# set vm.overcommit_memory to 0 (heuristic overcommit)
echo 0 | tee /proc/sys/vm/overcommit_memory
# disable transparent huge pages
echo 'never' | tee /sys/kernel/mm/transparent_hugepage/enabled
# install from the official repo
yum install yum-utils
rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
yum -y install clickhouse-server clickhouse-client
# adjust config.xml log level to information, then start
systemctl enable clickhouse-server
systemctl start clickhouse-server</code>
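With all components running, logs can flow from Kafka into ClickHouse through a Kafka engine table plus a materialized view. A minimal sketch of this wiring (table, topic, consumer-group, and column names are illustrative):

<code>-- staging table that consumes JSON events from Kafka
CREATE TABLE logs_queue
(
    timestamp DateTime,
    level     String,
    message   String
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka1:9092,kafka2:9092,kafka3:9092',
         kafka_topic_list  = 'app-logs',
         kafka_group_name  = 'clickhouse-logs',
         kafka_format      = 'JSONEachRow';

-- storage table
CREATE TABLE logs
(
    timestamp DateTime,
    level     String,
    message   String
) ENGINE = MergeTree
ORDER BY (level, timestamp);

-- the materialized view continuously moves rows from Kafka into storage
CREATE MATERIALIZED VIEW logs_mv TO logs AS
SELECT timestamp, level, message FROM logs_queue;</code>

The Kafka engine batches reads internally, which keeps the number of generated parts low.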

Visualization with ClickVisual

ClickVisual is an open‑source lightweight log query, analysis, and alerting UI that supports ClickHouse as a backend, offering histogram panels, index management, proxy authentication, and real‑time alerts.

Optimization Methods

Log Query Optimization

TraceID scenario: add a tokenbf_v1 skip index and query with hasToken for fast hits.
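A sketch of this index, assuming a logs table with a message column (names and bloom-filter parameters are illustrative):

<code>-- token bloom-filter skip index over the log message
ALTER TABLE logs
    ADD INDEX idx_message message TYPE tokenbf_v1(30720, 2, 0) GRANULARITY 1;

-- hasToken can use the index to skip granules; LIKE '%…%' cannot
SELECT *
FROM logs
WHERE hasToken(message, '6185a926b4a65d53')
LIMIT 10;</code>

tokenbf_v1 splits strings on non-alphanumeric characters, so it works well for whole tokens such as a TraceID, but not for arbitrary substrings.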

Unstructured logs: replace LIKE full scans with the inverted (full-text) index support available in newer ClickHouse versions.
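A sketch of the experimental full-text index as it appears in ClickHouse 23.x (the feature and its syntax may differ in your version; table and column names are illustrative):

<code>-- enable the experimental feature for this session
SET allow_experimental_inverted_index = 1;

-- add an inverted index on the message column
ALTER TABLE logs
    ADD INDEX idx_ft message TYPE inverted(0) GRANULARITY 1;

-- token searches on message can now skip non-matching granules
SELECT count() FROM logs WHERE hasToken(message, 'timeout');</code>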

Aggregation: leverage ClickHouse's Projection feature to precompute frequent aggregations.
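A sketch of a projection that precomputes an hourly count per log level (table and column names are illustrative):

<code>-- define and backfill the projection
ALTER TABLE logs ADD PROJECTION p_level_hourly
(
    SELECT level, toStartOfHour(timestamp), count()
    GROUP BY level, toStartOfHour(timestamp)
);
ALTER TABLE logs MATERIALIZE PROJECTION p_level_hourly;

-- matching aggregations are answered from the projection automatically
SELECT level, toStartOfHour(timestamp) AS h, count()
FROM logs
GROUP BY level, h;</code>

Unlike a separate rollup table, a projection is maintained by ClickHouse inside each part, so it stays consistent with the base data.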

Local vs Distributed Tables

For high-frequency log writes, prefer writing to local tables: this avoids the network overhead, "too many parts" errors, and Zookeeper pressure caused by writing through a distributed table.
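A sketch of this layout, assuming a cluster named log_cluster is defined in config.xml (cluster, database, and table names are illustrative): writers target the local table on each node, while queries go through a Distributed table.

<code>-- local storage table on every node
CREATE TABLE logs_local ON CLUSTER log_cluster
(
    timestamp DateTime,
    level     String,
    message   String
) ENGINE = MergeTree
ORDER BY (level, timestamp);

-- read-only fan-out table for queries
CREATE TABLE logs_all ON CLUSTER log_cluster AS logs_local
ENGINE = Distributed(log_cluster, default, logs_local, rand());

-- writers (e.g. a per-node Kafka materialized view) insert into logs_local;
-- analysts query logs_all</code>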

ClickHouse Limits

Set user‑level limits in

users.xml

to prevent runaway queries, e.g.,

max_memory_usage

,

max_rows_to_read

,

max_result_rows

,

max_bytes_to_read

.
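A sketch of these limits in a users.xml profile (the values are illustrative and should be sized to your hardware):

<code><!-- per-profile query limits in users.xml -->
<profiles>
    <default>
        <max_memory_usage>10000000000</max_memory_usage>   <!-- ~10 GB per query -->
        <max_rows_to_read>1000000000</max_rows_to_read>
        <max_result_rows>100000</max_result_rows>
        <max_bytes_to_read>100000000000</max_bytes_to_read>
    </default>
</profiles></code>

Queries that exceed a limit fail fast instead of degrading the whole node, which matters when analysts share the cluster with the ingestion pipeline.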

Conclusion

The deployment involved many pitfalls, especially around FileBeat configuration. Future posts will detail further ClickHouse tuning experience.

Tags: Big Data, deployment, elasticsearch, Kafka, cost optimization, ClickHouse, Log Analytics
Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
