Performance and Feature Comparison between Elasticsearch and ClickHouse for Log Analytics
This article compares Elasticsearch and ClickHouse for log analytics in terms of architecture, query capabilities, and performance. It presents the test setup, Docker Compose configurations, example queries, and benchmark results, which show that ClickHouse outperforms Elasticsearch in most basic query scenarios.
Elasticsearch is a real‑time distributed search and analytics engine built on Lucene, often used together with Logstash and Kibana (the ELK stack) for log processing. ClickHouse, developed by Yandex, is a column‑oriented relational database designed for OLAP workloads and has become a popular alternative for large‑scale log analytics.
Architecture and Design Comparison
Elasticsearch relies on inverted indexes and Bloom filters to support fast full‑text search, using a shard‑and‑replica model for scalability and high availability. Its node roles include client, data, and master nodes.
ClickHouse follows an MPP (massively parallel processing) architecture in which each node processes its portion of the data independently. It stores data column-wise, uses vectorized execution and log-structured merge (MergeTree) tables with sparse indexes, relies on ZooKeeper for replication coordination, and also supports Bloom-filter indexes for search.
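The practical effect of column-wise storage can be sketched with a toy Python example (illustrative only, not how either system stores data internally): an aggregation over one field only needs to touch that field's values in a column layout, whereas a row layout must walk every full record.

```python
# Toy illustration of row-oriented vs column-oriented layouts.
# (Illustrative only -- not Elasticsearch or ClickHouse internals.)

rows = [  # row store: one dict per log record
    {"priority": 13, "hostname": "web.local", "message": "a"},
    {"priority": 11, "hostname": "db.local",  "message": "b"},
    {"priority": 13, "hostname": "web.local", "message": "c"},
]

columns = {  # column store: one array per field
    "priority": [13, 11, 13],
    "hostname": ["web.local", "db.local", "web.local"],
    "message":  ["a", "b", "c"],
}

# Row layout: every record is visited even though only 'priority' is needed.
row_sum = sum(r["priority"] for r in rows)

# Column layout: only the 'priority' array is read -- less data scanned,
# and contiguous values are friendly to vectorized execution.
col_sum = sum(columns["priority"])

print(row_sum, col_sum)  # both compute the same aggregate
```

This is the intuition behind why aggregation-heavy log queries tend to favor a columnar engine.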
Test Setup
A Docker‑Compose environment was created with four stacks: an Elasticsearch stack (single‑node Elasticsearch container and Kibana), a ClickHouse stack (single‑node ClickHouse container and TabixUI client), a data‑ingestion stack using Vector.dev, and a test‑control stack using Jupyter notebooks and the Python SDKs for both systems.
Elasticsearch stack deployment:
version: '3.7'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.4.0
    container_name: elasticsearch
    environment:
      - xpack.security.enabled=false
      - discovery.type=single-node
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    cap_add:
      - IPC_LOCK
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 4096M
        reservations:
          memory: 4096M
  kibana:
    container_name: kibana
    image: docker.elastic.co/kibana/kibana:7.4.0
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - 5601:5601
    depends_on:
      - elasticsearch
volumes:
  elasticsearch-data:
    driver: local

ClickHouse stack deployment:
version: "3.7"
services:
  clickhouse:
    container_name: clickhouse
    image: yandex/clickhouse-server
    volumes:
      - ./data/config:/var/lib/clickhouse
    ports:
      - "8123:8123"
      - "9000:9000"
      - "9009:9009"
      - "9004:9004"
    ulimits:
      nproc: 65535
      nofile:
        soft: 262144
        hard: 262144
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "localhost:8123/ping"]
      interval: 30s
      timeout: 5s
      retries: 3
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 4096M
        reservations:
          memory: 4096M
  tabixui:
    container_name: tabixui
    image: spoonest/clickhouse-tabix-web-client
    environment:
      - CH_NAME=dev
      - CH_HOST=127.0.0.1:8123
      - CH_LOGIN=default
    ports:
      - "18080:80"
    depends_on:
      - clickhouse
    deploy:
      resources:
        limits:
          cpus: '0.1'
          memory: 128M
        reservations:
          memory: 128M

Data ingestion uses Vector.dev to generate synthetic syslog data (100,000 records) and send it simultaneously to both Elasticsearch and ClickHouse. The table creation in ClickHouse is performed with the following SQL:
CREATE TABLE default.syslog(
application String,
hostname String,
message String,
mid String,
pid String,
priority Int16,
raw String,
timestamp DateTime('UTC'),
version Int16
) ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY timestamp
TTL timestamp + toIntervalMonth(1);

The Vector configuration (vector.toml) defines sources, transforms, and sinks that route data to both back-ends, handling field extraction, type coercion, and output formatting.
[sources.in]
type = "generator"
format = "syslog"
interval = 0.01
count = 100000
[transforms.clone_message]
type = "add_fields"
inputs = ["in"]
fields.raw = "{{ message }}"
[transforms.parser]
type = "regex_parser"
inputs = ["clone_message"]
field = "message"
patterns = ['^<(?P<priority>\d*)>(?P<version>\d) (?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z) (?P<hostname>\w+\.\w+) (?P<application>\w+) (?P<pid>\d+) (?P<mid>ID\d+) - (?P<message>.*)$']
[transforms.coercer]
type = "coercer"
inputs = ["parser"]
types.timestamp = "timestamp"
types.version = "int"
types.priority = "int"
[sinks.out_console]
type = "console"
inputs = ["coercer"]
target = "stdout"
encoding.codec = "json"
[sinks.out_clickhouse]
host = "http://host.docker.internal:8123"
inputs = ["coercer"]
table = "syslog"
type = "clickhouse"
encoding.only_fields = ["application","hostname","message","mid","pid","priority","raw","timestamp","version"]
encoding.timestamp_format = "unix"
[sinks.out_es]
type = "elasticsearch"
inputs = ["coercer"]
compression = "none"
endpoint = "http://host.docker.internal:9200"
index = "syslog-%F"
healthcheck.enabled = true

Query Comparison
Both systems were queried using equivalent requests for common operations such as match‑all, single‑field match, multi‑field match, term search, range queries, existence checks, regex searches, and aggregations. Example queries include:
# ES match_all
{ "query": { "match_all": {} } }

# ClickHouse match_all
SELECT * FROM syslog;

# ES term query
{ "query": { "term": { "message": "pretty" } } }

# ClickHouse term query
SELECT * FROM syslog WHERE lowerUTF8(raw) LIKE '%pretty%';

Performance tests were run ten times per query using the Python SDKs, and response-time distributions were recorded.
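A minimal version of such a timing harness can be sketched in Python (the function below is illustrative, not the article's actual notebook code; the real tests used the official elasticsearch and clickhouse-driver SDKs):

```python
import statistics
import time


def benchmark(run_query, runs=10):
    """Run a query callable `runs` times and return response-time stats in seconds.

    `run_query` is any zero-argument callable, e.g. a lambda wrapping
    es.search(...) or clickhouse_client.execute(...).
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


# Usage with a stand-in workload (replace with a real SDK call):
stats = benchmark(lambda: sum(range(10_000)))
print(stats)
```

Recording min/median/max per query is enough to compare the two systems' latency distributions side by side.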
Results
The benchmark shows ClickHouse consistently achieving lower latency than Elasticsearch across most query types, including regex and term queries. Aggregation queries especially benefit from ClickHouse’s columnar storage, delivering significantly faster results.
Even without tuning (e.g., Bloom filters disabled), ClickHouse demonstrated superior performance, indicating its suitability for many log‑search scenarios, while Elasticsearch still offers richer query features for more complex use cases.
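As a concrete (hypothetical) example of the aggregation case, the same "events per host" question can be expressed as an Elasticsearch terms aggregation and as a ClickHouse GROUP BY; the payloads below are illustrative and were not part of the original benchmark:

```python
# Hypothetical "count events per hostname" aggregation in both dialects.

# Elasticsearch request body: a terms aggregation, no hits returned.
es_agg_body = {
    "size": 0,
    "aggs": {
        "per_host": {"terms": {"field": "hostname.keyword"}}
    },
}

# Equivalent ClickHouse SQL over the syslog table from the test setup.
ch_agg_sql = """
SELECT hostname, count() AS events
FROM syslog
GROUP BY hostname
ORDER BY events DESC
"""

print(es_agg_body["aggs"]["per_host"]["terms"]["field"])
print(ch_agg_sql.strip())
```

In ClickHouse this query scans only the hostname column, which is why aggregations benefit so strongly from the columnar layout.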
Conclusion
The comparative study shows that for basic log-analytics queries ClickHouse matches Elasticsearch in functionality while clearly outperforming it in speed, which helps explain why many companies are migrating from Elasticsearch to ClickHouse for such workloads.