
Why ClickHouse Outperforms Elasticsearch for Log Storage and Analytics

This article compares ClickHouse and Elasticsearch for API log storage, detailing development activity, schema handling, query performance, statistical functions, MySQL integration, new features, and practical drawbacks, while providing concrete SQL examples and migration tips.

dbaplus Community

Background and Motivation

In 2018 the author wrote a popular article about ClickHouse; two years later the project remains very active, with over 800 merged PRs in a single month. Elasticsearch is similarly active, with 1076 merged PRs in the same period. The author uses ClickHouse at ApiRoad.net, an API marketplace, to store and analyze HTTP request/response logs, with an emphasis on observability for API services.

Why Choose ClickHouse Over Elasticsearch and MySQL

The main reasons for preferring ClickHouse are:

SQL support with JSON and array types as first‑class citizens.

Ability to balance strict schema enforcement with flexible JSON storage.

Fast SELECT performance and superior storage efficiency (reportedly 5-6× more compact than Elasticsearch, with queries about an order of magnitude faster).

Rich statistical functions (quantileTiming, quantile, etc.) that simplify analytics.

For OLTP workloads the team still uses MySQL, but ClickHouse handles large‑scale log analytics.

SQL Support, JSON, and Arrays

ClickHouse provides arrays as a native column type and ships built-in functions for extracting values from JSON strings, so individual fields can be pulled into materialized columns with little effort. Functions such as arrayJoin, groupArray, arrayMap, and arrayFilter provide powerful array manipulation. Compared with MySQL (JSON support only in recent versions) and PostgreSQL (limited before version 12), ClickHouse's JSON handling is more convenient for log data.
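To make the semantics of the four array functions concrete, here is a rough Python model of what each one does. It is only an analogy, not ClickHouse code: the row layout, the header values, and the helper names array_join, group_array, array_map, and array_filter are made up for illustration.

```python
from collections import defaultdict

# Hypothetical log rows: each request carries an array of header names.
rows = [
    {"path": "/users", "headers": ["accept", "x-api-key"]},
    {"path": "/users", "headers": ["accept"]},
    {"path": "/orders", "headers": ["x-api-key", "x-trace-id"]},
]

def array_join(rows, column):
    """Model of arrayJoin: emit one output row per array element."""
    for row in rows:
        for element in row[column]:
            yield {**row, column: element}

def group_array(rows, key, column):
    """Model of groupArray: collect a column's values into one array per group.

    Values are kept in input order in this sketch.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return dict(groups)

def array_map(func, array):
    """Model of arrayMap: apply func to every element."""
    return [func(x) for x in array]

def array_filter(func, array):
    """Model of arrayFilter: keep the elements for which func is true."""
    return [x for x in array if func(x)]

# Unfold the arrays into rows, then fold them back per path.
flat = list(array_join(rows, "headers"))
headers_by_path = group_array(flat, "path", "headers")

# Keep only custom "x-" headers seen on /users and upper-case them.
custom = array_filter(lambda h: h.startswith("x-"), headers_by_path["/users"])
upper = array_map(str.upper, custom)
```

In this sketch flat holds five rows (one per header) and upper ends up as ['X-API-KEY'].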

Flexible Schema

Both Elasticsearch and ClickHouse accept large JSON blobs. In ClickHouse you can add materialized columns that extract values from JSON for fast filtering on TB‑scale data. The author references an Altinity video comparing JSON vs. table formats for log storage.
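The idea behind a materialized column can be modeled in a few lines of Python: the raw JSON blob is stored untouched, while one field is extracted once at insert time so that later filters never have to parse JSON again. This is a toy analogy under stated assumptions, not how ClickHouse is implemented; the LogTable class, its method names, and the sample documents are hypothetical.

```python
import json

class LogTable:
    """Toy table: keeps the raw JSON blob plus one column extracted on insert."""

    def __init__(self):
        self.raw = []      # full JSON document per row, kept for flexibility
        self.status = []   # "materialized" column, computed once at insert time

    def insert(self, blob: str) -> None:
        doc = json.loads(blob)
        self.raw.append(blob)
        # Rows without a status are still accepted; they get 0 as a default.
        self.status.append(int(doc.get("status", 0)))

    def count_by_status_scan(self, status: int) -> int:
        """Slow path: re-parse every blob at query time."""
        return sum(1 for blob in self.raw if json.loads(blob).get("status") == status)

    def count_by_status_column(self, status: int) -> int:
        """Fast path: filter on the pre-extracted column only."""
        return sum(1 for s in self.status if s == status)

table = LogTable()
for blob in (
    '{"status": 200, "path": "/users"}',
    '{"status": 404, "path": "/missing"}',
    '{"status": 404, "path": "/gone", "extra": {"retry": true}}',
    '{"path": "/no-status"}',
):
    table.insert(blob)

# Both paths agree; only the second avoids touching the JSON at query time.
slow = table.count_by_status_scan(404)
fast = table.count_by_status_column(404)
```

The design point is the trade-off: new fields can appear in the blob at any time without a schema change, and only the fields that are filtered on frequently need a dedicated column.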

Storage and Query Efficiency

ClickHouse's columnar storage yields higher compression and faster scans than row-based MySQL. Reported figures suggest ClickHouse can be 5-6× more storage-efficient than Elasticsearch and an order of magnitude faster in query speed; direct head-to-head benchmarks are scarce, so these numbers are best treated as indicative.
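Why a columnar layout tends to compress better can be illustrated with a small experiment: serialize the same synthetic log rows once record by record and once column by column, then compress both. This is a minimal sketch, assuming made-up data and using zlib as a stand-in codec; it says nothing about the actual ratios either database achieves.

```python
import random
import zlib

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical log rows: (status, method, duration_ms).
# Status and method are highly repetitive, duration is noisy.
rows = [
    (404 if i % 10 == 0 else 200, "GET", rng.randint(1, 5000))
    for i in range(20_000)
]

# Row-oriented layout: the fields of one record sit next to each other,
# so the noisy duration interrupts the repetitive fields on every row.
row_bytes = "\n".join(f"{s},{m},{d}" for s, m, d in rows).encode()

# Column-oriented layout: all values of one column sit next to each other,
# so repetitive columns form long, highly compressible runs.
columns = zip(*rows)
col_bytes = "\n".join(",".join(str(v) for v in col) for col in columns).encode()

row_size = len(zlib.compress(row_bytes))
col_size = len(zlib.compress(col_bytes))
print(f"row layout: {row_size} bytes, column layout: {col_size} bytes")
```

Both layouts hold exactly the same values; only the ordering differs, which is what lets the compressor exploit the low-cardinality columns.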

Statistical Functions

Example query to compute median and percentiles for 404 responses:

SELECT count(*) as cnt,
       quantileTiming(0.5)(duration) as duration_median,
       quantileTiming(0.9)(duration) as duration_90th,
       quantileTiming(0.99)(duration) as duration_99th
FROM logs
WHERE status = 404

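The quantileTiming function is optimized for distributions such as page load times and backend response times, which makes it a natural fit for request durations. ClickHouse also offers linear regression, weighted averages, and many other aggregate functions (see the official reference list).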
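For readers who want to see what the query above computes, the following Python sketch reproduces the same report on a handful of made-up records using the exact nearest-rank method. The sample data and the nearest_rank helper are assumptions for illustration; ClickHouse's quantile functions may approximate or interpolate, so their results on real data can differ slightly from an exact calculation like this one.

```python
import math

# Hypothetical log records: (HTTP status, request duration in milliseconds).
logs = [
    (200, 12), (404, 35), (404, 8), (500, 950), (404, 120),
    (404, 15), (200, 40), (404, 60), (404, 22), (404, 300),
    (404, 18), (404, 45),
]

def nearest_rank(sorted_values, q):
    """Return the q-quantile (0 < q <= 1) of a sorted list, nearest-rank method."""
    rank = max(1, math.ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]

# Equivalent of: FROM logs WHERE status = 404
durations = sorted(d for status, d in logs if status == 404)

report = {
    "cnt": len(durations),
    "duration_median": nearest_rank(durations, 0.5),
    "duration_90th": nearest_rank(durations, 0.9),
    "duration_99th": nearest_rank(durations, 0.99),
}
print(report)
```

With only nine matching rows the 90th and 99th percentiles collapse onto the slowest request, which is a reminder that tail percentiles are only meaningful on large samples.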

MySQL‑ClickHouse Integration

Multiple integration methods are listed:

MySQL as an external dictionary.

Binlog replication to ClickHouse.

MySQL database engine, which exposes MySQL tables dynamically without binlog replication.

The MySQL table function and the MySQL table engine, for querying or defining a specific table.

ClickHouse using the MySQL protocol.

These approaches allow copying MySQL tables into ClickHouse without re‑indexing, though performance characteristics need benchmarking.

New Features

ClickHouse now supports external S3-backed CSV tables and provides an ALTER TABLE … DELETE WHERE … statement for row-level deletions. Such deletes run as mutations that rewrite the affected data parts, so they remain expensive and should be used sparingly in production. The general form is:

ALTER TABLE db.table [ON CLUSTER cluster] DELETE WHERE filter_expr

Drawbacks

Compared with Elasticsearch, ClickHouse lacks a mature GUI; users typically rely on Grafana, Redash, or Metabase for visualization. The ecosystem of data ingestion tools is also smaller. Logstash works but has higher memory usage, so the author built a custom Node.js log shipper that batches inserts.
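The batching idea behind such a shipper can be sketched as follows. This is not the author's Node.js tool: it is a minimal Python illustration, and the BatchingShipper class, its parameters, and the in-memory sink are all hypothetical. Rows are buffered and handed to the sink in one call once the buffer is large enough or old enough, so the database sees a few large inserts instead of many tiny ones.

```python
import time
from typing import Callable, List, Optional

class BatchingShipper:
    """Buffers log rows and flushes them in batches (hypothetical helper)."""

    def __init__(self, sink: Callable[[List[dict]], None],
                 max_rows: int = 1000, max_age_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self.sink = sink            # e.g. a function that runs one bulk INSERT
        self.max_rows = max_rows    # flush when this many rows are buffered
        self.max_age_s = max_age_s  # or when the oldest buffered row is this old
        self.clock = clock
        self.buffer: List[dict] = []
        self.first_row_at: Optional[float] = None

    def add(self, row: dict) -> None:
        if not self.buffer:
            self.first_row_at = self.clock()
        self.buffer.append(row)
        too_big = len(self.buffer) >= self.max_rows
        # Age is only checked when a row arrives; there is no background timer.
        too_old = self.clock() - self.first_row_at >= self.max_age_s
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        self.first_row_at = None
        self.sink(batch)

# Usage: collect batches in memory instead of talking to a real database.
batches: List[List[dict]] = []
shipper = BatchingShipper(sink=batches.append, max_rows=3, max_age_s=60.0)
for i in range(7):
    shipper.add({"request_id": i, "status": 200})
shipper.flush()  # ship the remainder, e.g. on shutdown
```

Seven rows with max_rows=3 produce batches of 3, 3, and 1 rows. A production shipper would also need retries and a timer-driven flush, which are left out here to keep the sketch short.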

Some function names (e.g., visitParamHas()) feel unintuitive, and the standards-compliant JSON functions (the JSONExtract family) can be slower than the simpler visitParam functions.

Conclusion

Elasticsearch excels at full‑text search and large‑scale clustering, but ClickHouse offers a simpler SQL interface, strong performance on TB‑scale log data, and flexible schema handling. Both systems are memory‑hungry (ClickHouse ~4 GB, Elasticsearch ~16 GB). The choice depends on whether you prioritize search capabilities or high‑throughput analytical queries.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
