Big Data 11 min read

Why ClickHouse Beats Elasticsearch: Performance, Cost, and Deployment Guide

This article compares ClickHouse and Elasticsearch, highlighting ClickHouse's superior write throughput, query speed, and lower server costs, then provides detailed deployment steps for ClickHouse, Zookeeper, Kafka, and FileBeat to build a cost‑effective big‑data analytics platform.

MaGe Linux Operations

Feb 16, 2024

Why ClickHouse Beats Elasticsearch: Performance, Cost, and Deployment Guide

Background

SaaS services will face data security and compliance issues. The company needs a private‑deployment capability to improve competitiveness and build a data system for operational analysis.

Deploying a full big‑data stack would incur high server costs, so a compromise solution was chosen.

Elasticsearch vs ClickHouse

ClickHouse is a high‑performance column‑oriented distributed DBMS. Tests show the following advantages:

Write throughput: 50‑200 MB/s per server, over 600 k records/s, more than 5× Elasticsearch. Rejected writes and latency are rare.

Query speed: data cached in pagecache yields 2‑30 GB/s per server; even without cache, ClickHouse is 5‑30× faster than Elasticsearch.

Cost: ClickHouse’s compression is 1/3‑1/30 of Elasticsearch, reducing disk space and I/O. It also uses less memory and CPU, potentially halving server costs.

Cost Analysis

Based on Alibaba Cloud pricing without discounts.

Environment Deployment

Zookeeper Cluster Deployment

yum install java-1.8.0-openjdk-devel.x86_64
/etc/profile configure environment variables
yum install ntpdate
ntpdate asia.pool.ntp.org

mkdir zookeeper
mkdir ./zookeeper/data
mkdir ./zookeeper/logs

wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.7.1/apache-zookeeper-3.7.1-bin.tar.gz
tar -zvxf apache-zookeeper-3.7.1-bin.tar.gz -C /usr/zookeeper

export ZOOKEEPER_HOME=/usr/zookeeper/apache-zookeeper-3.7.1-bin
export PATH=$ZOOKEEPER_HOME/bin:$PATH

cd $ZOOKEEPER_HOME/conf
vi zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/zookeeper/data
dataLogDir=/usr/zookeeper/logs
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888

echo "1" > /usr/zookeeper/data/myid
echo "2" > /usr/zookeeper/data/myid
echo "3" > /usr/zookeeper/data/myid

cd $ZOOKEEPER_HOME/bin
sh zkServer.sh start

Kafka Cluster Deployment

mkdir -p /usr/kafka
chmod 777 -R /usr/kafka
wget --no-check-certificate https://mirrors.tuna.tsinghua.edu.cn/apache/kafka/3.2.0/kafka_2.12-3.2.0.tgz
tar -zvxf kafka_2.12-3.2.0.tgz -C /usr/kafka

# broker.id, listeners, etc.
broker.id=1
listeners=PLAINTEXT://ip:9092
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dir=/usr/kafka/logs
num.partitions=5
num.recovery.threads.per.data.dir=3
offsets.topic.replication.factor=2
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=3
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
zookeeper.connection.timeout.ms=30000
group.initial.rebalance.delay.ms=0

nohup /usr/kafka/kafka_2.12-3.2.0/bin/kafka-server-start.sh /usr/kafka/kafka_2.12-3.2.0/config/server.properties >/usr/kafka/logs/kafka.log 2>&1 &
/usr/kafka/kafka_2.12-3.2.0/bin/kafka-server-stop.sh
$KAFKA_HOME/bin/kafka-topics.sh --list --bootstrap-server ip:9092
$KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server ip:9092 --topic test --from-beginning
$KAFKA_HOME/bin/kafka-topics.sh --create --bootstrap-server ip:9092 --replication-factor 2 --partitions 3 --topic xxx_data

FileBeat Deployment

sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch

# create elastic.repo in /etc/yum.repos.d/
[elastic-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

yum install filebeat
systemctl enable filebeat
chkconfig --add filebeat

# filebeat.yml key settings
keys_under_root: true
output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092", "kafka3:9092"]
  topic: 'xxx_data_clickhouse'
  partition.round_robin:
    reachable_only: false
    required_acks: 1
    compression: gzip
processors:
  - drop_fields:
      fields: ["input","agent","ecs","log","metadata","timestamp"]
      ignore_missing: false

nohup ./filebeat -e -c /etc/filebeat/filebeat.yml > /user/filebeat/filebeat.log &

ClickHouse Deployment

# Check SSE 4.2 support
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

mkdir -p /data/clickhouse
# add hosts entries for clickhouse nodes
10.190.85.92 bigdata-clickhouse-01
10.190.85.93 bigdata-clickhouse-02

# Performance tuning
echo 'performance' | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
echo 0 | tee /proc/sys/vm/overcommit_memory
echo 'never' | tee /sys/kernel/mm/transparent_hugepage/enabled

yum install yum-utils
rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
yum list | grep clickhouse
yum -y install clickhouse-server clickhouse-client

# config.xml: set <level>information</level>
# logs: /var/log/clickhouse-server/clickhouse-server.log
# error log: /var/log/clickhouse-server/clickhouse-server.err.log

clickhouse-server --version
clickhouse-client --password

sudo clickhouse stop
sudo clickhouse start

Conclusion

The deployment involved many pitfalls, especially FileBeat configuration. ClickHouse configuration details will be updated later. Continuous learning and output are essential for building a personal moat, whether as a technical or business expert.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Big Data Elasticsearch ClickHouse Data Warehouse

Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.