Master ClickHouse Write Performance: Proven Optimization Strategies
This comprehensive guide walks through ClickHouse write‑performance optimization, covering hardware choices, system and application‑level tuning, async insert settings, Buffer engine configuration, storage compression, real‑world case studies, monitoring queries, and actionable best‑practice recommendations.
Overall Optimization Approach
Performance Goals
Support high‑concurrency writes
Achieve low write latency
Guarantee data consistency
Minimize resource consumption
Optimization Dimensions
System layer
Hardware selection
OS kernel parameters
Network configuration
Application layer
Parameter tuning
Engine selection
Table schema design
Business layer
Partition strategy
Batch write design
Asynchronous processing
System Architecture Optimization
Hardware Recommendations
CPU: high‑frequency multi‑core, >=32 cores
Memory: minimum 64 GB, recommended 128 GB+, production 256 GB+
Disk: NVMe SSD, e.g., Intel P4510 series or better
Network: 10 GbE NIC
System Parameter Tuning
# Kernel parameters
sysctl -w vm.max_map_count=655360
sysctl -w vm.swappiness=10
sysctl -w vm.dirty_background_ratio=50
sysctl -w vm.dirty_ratio=80
# Network parameters
sysctl -w net.core.somaxconn=65535
sysctl -w net.ipv4.tcp_max_syn_backlog=65535
sysctl -w net.ipv4.tcp_fin_timeout=30
sysctl -w net.ipv4.tcp_keepalive_time=300
# Filesystem parameters
sysctl -w fs.file-max=65535
sysctl -w fs.aio-max-nr=1048576
# File descriptor limit
ulimit -n 65535Concurrent Write Optimization
Async Insert Configuration
<clickhouse>
<!-- Async insert basic config -->
<async_insert>1</async_insert>
<async_insert_threads>16</async_insert_threads>
<async_insert_max_memory_usage>10000000000</async_insert_max_memory_usage>
<!-- Timeout and retry -->
<async_insert_busy_timeout_ms>200</async_insert_busy_timeout_ms>
<async_insert_stale_timeout_ms>1000</async_insert_stale_timeout_ms>
<!-- Concurrency control -->
<max_concurrent_queries>200</max_concurrent_queries>
<max_concurrent_inserts>100</max_concurrent_inserts>
<background_pool_size>32</background_pool_size>
<background_schedule_pool_size>32</background_schedule_pool_size>
<!-- Lock timeout -->
<lock_acquire_timeout>60</lock_acquire_timeout>
</clickhouse>Memory Management
<clickhouse>
<!-- Memory limits for a 128 GB server -->
<max_memory_usage>80000000000</max_memory_usage> <!-- ~80 GB (60‑70% of RAM) -->
<max_memory_usage_for_user>40000000000</max_memory_usage_for_user> <!-- per‑user limit -->
<!-- Background task memory -->
<background_merge_memory_usage>4000000000</background_merge_memory_usage>
<background_fetches_memory_usage>2000000000</background_fetches_memory_usage>
<!-- Write memory control -->
<max_bytes_before_external_sort>1000000000</max_bytes_before_external_sort>
<max_bytes_before_remerge_sort>1000000000</max_bytes_before_remerge_sort>
</clickhouse>Distributed Write Configuration
-- Create local table
CREATE TABLE local_table (
date Date,
id UInt32,
value String,
version UInt32
) ENGINE = ReplacingMergeTree(version)
PARTITION BY (toYYYYMM(date), cityHash64(id) % 10) -- composite partition
ORDER BY (id, date);
-- Create distributed table
CREATE TABLE distributed_table ON CLUSTER 'cluster_name' AS local_table
ENGINE = Distributed('cluster_name', database, local_table, cityHash64(id));Buffer Engine Optimization
Full Buffer Table Configuration
-- Create target table
CREATE TABLE events (
event_date Date,
event_type String,
user_id UInt32,
version UInt32
) ENGINE = ReplacingMergeTree(version)
PARTITION BY (toYYYYMM(event_date), cityHash64(user_id) % 10)
ORDER BY (user_id, event_date);
-- Create Buffer table
CREATE TABLE events_buffer AS events
ENGINE = Buffer(default, events, 16, -- 16 buffers
10, -- min retain 10 s
100, -- max retain 100 s
10000, -- min rows to flush
100000, -- max rows in buffer
10000000, -- min 10 MB to flush
100000000); -- max 100 MBImportant Buffer Engine Notes
Usage limitations:
Delete operations are not supported
Restarting loses unflushed data
Direct queries are discouraged
Best practices:
Prefer batch writes over single rows
Flush manually at regular intervals
Monitor memory usage continuously
Configuration suggestions:
<clickhouse>
<background_buffer_flush_schedule_pool_size>16</background_buffer_flush_schedule_pool_size>
<buffer_flush_schedule_delay_milliseconds>1000</buffer_flush_schedule_delay_milliseconds>
</clickhouse>Parameter Configuration Optimization
Batch Insert Tuning
-- Increase insert block size
SET max_insert_block_size = 1048576;
SET max_block_size = 1048576;
-- Optimize insert thresholds
SET min_insert_block_size_rows = 10000;
SET min_insert_block_size_bytes = 10000000;Merge Settings Tuning
<merge_tree>
<!-- Merge control -->
<max_background_merges>16</max_background_merges>
<merge_max_block_size>8192</merge_max_block_size>
<max_bytes_to_merge_at_max_space_in_pool>161061273600</max_bytes_to_merge_at_max_space_in_pool>
<!-- Merge selection strategy -->
<merge_selecting_sleep_ms>1000</merge_selecting_sleep_ms>
<merge_tree_clear_old_parts_interval_seconds>60</merge_tree_clear_old_parts_interval_seconds>
</merge_tree>Write Queue and WAL Configuration
<clickhouse>
<!-- Queue control -->
<max_queue_size>1000000</max_queue_size>
<queue_max_wait_ms>30000</queue_max_wait_ms>
<!-- WAL settings -->
<max_wal_size>10000000000</max_wal_size>
<wal_dir>/clickhouse/wal</wal_dir>
<wal_cleanup_period_seconds>30</wal_cleanup_period_seconds>
</clickhouse>Storage and Compression Optimization
Compression Settings
<clickhouse>
<!-- Compression configuration -->
<min_compress_block_size>8192</min_compress_block_size>
<max_compress_block_size>1048576</max_compress_block_size>
<compression>
<case>
<min_part_size>10000000000</min_part_size>
<min_part_size_ratio>0.01</min_part_size_ratio>
<method>lz4</method>
</case>
</compression>
</clickhouse>Advanced Storage Optimization
-- Optimize table storage structure
CREATE TABLE optimized_table (
date Date,
id UInt32,
value String,
version UInt32
) ENGINE = ReplacingMergeTree(version)
PARTITION BY (toYYYYMM(date), cityHash64(id) % 10)
ORDER BY (id, date)
SETTINGS index_granularity = 8192,
enable_mixed_granularity_parts = 1,
min_bytes_for_wide_part = 10000000;Practical Case Studies
Case 1 – Log Ingestion System
Scenario:
- Data volume: 100 k rows/sec
- Concurrency: 100 writer threads
- Row size: ~1 KB
Optimization steps:
1. Use Buffer engine with 16 buffers
2. Batch size 10 k rows
3. Enable async inserts
4. Partition by hour
Results:
- QPS ↑ from 100 k to 300 k
- Latency ↓ 70%
- CPU usage ↓ 40%
- Memory usage more stableCase 2 – Real‑time Analytics
Scenario:
- Minute‑level data updates
- Multi‑dimensional aggregations
- High write‑read concurrency
Optimization steps:
1. Deploy distributed tables
2. Composite partition strategy
3. Use ReplacingMergeTree
4. Enable async inserts
5. Create materialized pre‑aggregation views
Results:
- Write latency < 100 ms
- Query performance ↑ 5×
- Resource utilization more balancedMonitoring and Operations
Core Monitoring Metrics
-- Write performance metrics
SELECT * FROM system.metrics WHERE metric LIKE '%Insert%';
-- View write queue status
SELECT * FROM system.metrics WHERE metric LIKE '%Queue%';
-- Monitor merge activity
SELECT * FROM system.merges;
-- Partition health
SELECT table, partition, active_parts FROM system.parts WHERE active = 1;
-- Error monitoring
SELECT * FROM system.errors WHERE error LIKE '%Insert%';Performance Diagnosis
-- Slow insert query analysis
SELECT query, query_duration_ms, read_rows, written_rows
FROM system.query_log
WHERE type = 'InsertQuery' AND query_duration_ms > 1000
ORDER BY query_duration_ms DESC
LIMIT 10;
-- Resource usage analysis
SELECT metric, value FROM system.metrics
WHERE metric LIKE '%Memory%' OR metric LIKE '%CPU%';Operations Experience Summary
Regular maintenance:
Check parts count
Monitor merge status
Clean up expired data
Optimize table schema
Common issue handling:
Adopt distributed tables
Refine partition strategy
Adjust lock timeouts
Tune memory limits
Fine‑tune Buffer settings
Validate batch size and insert_block_size
Inspect background merges
Analyze system resource pressure
Address high write latency, excessive memory use, concurrency conflicts
Optimization recommendations:
Design sensible partitions
Prefer batch writes over single rows
Make async inserts the default path
Continuously monitor key metrics
Best‑Practice Summary
Core Configuration Pillars
Hardware selection
SSD storage
Ample RAM
Multi‑core CPU
Parameter tuning
Memory management
Concurrency control
Compression optimization
Architecture design
Distributed deployment
Buffer engine usage
Async write pipeline
Usage Recommendations
Write strategy
Batch inserts
Appropriate partitioning
Asynchronous processing
Monitoring & maintenance
Track core metrics
Perform performance analysis
Schedule regular optimizations
Precautions
Maintain data consistency
Balance resource consumption
Consider cost‑benefit trade‑offs
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
BirdNest Tech Talk
Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
