
Investigation and Optimization of Ceph Slow‑Request Issues Related to Scrub in the Luminous Release

This article analyzes a frequent slow‑request problem in a Ceph cluster caused by heavy write‑induced RocksDB compaction combined with deep‑scrub activity, explains the underlying scrub mechanisms, and presents parameter‑tuning and custom scheduling strategies to mitigate performance impact while preserving data consistency.

JD Tech Talk

Background

Ceph is an open‑source distributed storage system offering block (RBD), object (RadosGW) and file (CephFS) services, widely used in private‑cloud environments. In JD Digital’s infrastructure, Ceph supports core storage needs, and as data volume grows, slow‑request alerts become common, especially during deep‑scrub operations.

Slow‑request analysis

An alert at 19:13 showed request latencies around 50 seconds. Cluster health was OK, but two PGs were performing deep‑scrub between 23:00 and 06:00. OSD logs revealed intensive RocksDB compaction, indicating heavy write traffic. The root cause was identified as massive user writes triggering RocksDB compaction, which together with deep‑scrub caused I/O contention and request timeouts.

Initial mitigation

The immediate fix was to disable deep‑scrub, which stopped the slow requests. However, deep‑scrub is essential for data consistency, so the longer‑term solution focuses on controlling its speed and schedule rather than leaving it off.

Ceph scrub overview

Scrub checks for silent data corruption. Two types exist:

Scrub – compares metadata of object replicas.

Deep‑scrub – reads object data and verifies checksums, consuming more I/O and time.

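Scrub operates on a per‑PG basis and works through the PG in chunks of objects: each replica builds a ScrubMap, and the maps are compared across replicas. Writes to objects in the chunk being scrubbed are blocked until that chunk completes. Deep‑scrub holds each chunk longer because it reads the object data, so it has the larger effect on write performance.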

Scrub parameters

Important OSD configuration items include:

osd_max_scrubs – maximum number of concurrent scrubs per OSD.

osd_scrub_min_interval, osd_scrub_max_interval – the minimum and maximum time between scrubs of a PG; a PG becomes eligible after the minimum interval and is overdue after the maximum.

osd_scrub_begin_hour, osd_scrub_end_hour – the hours of the day during which scrubs may start.

osd_scrub_load_threshold – system load limit above which new scrubs are not started.

osd_deep_scrub_interval and osd_deep_scrub_randomize_ratio – decide when a scrub becomes a deep‑scrub.
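How these settings interact can be sketched in a few lines. The model below is a simplification and not Ceph's actual code: the struct and function names are hypothetical, the real load check is more involved than a single comparison, and it assumes that a PG past osd_scrub_max_interval is scrubbed regardless of the time window and load.

```cpp
#include <cassert>

// Hypothetical container for the settings discussed above.
struct ScrubConfig {
    int    begin_hour;      // osd_scrub_begin_hour (0-23)
    int    end_hour;        // osd_scrub_end_hour (0-24)
    double load_threshold;  // osd_scrub_load_threshold
    double min_interval;    // osd_scrub_min_interval, in seconds
    double max_interval;    // osd_scrub_max_interval, in seconds
};

// Hypothetical snapshot of one PG and its OSD at decision time.
struct ScrubCheck {
    double seconds_since_last_scrub;
    double load_per_cpu;
    int    hour;  // local hour of day, 0-23
};

// True if `hour` lies in [begin_hour, end_hour). A window such as
// 23 -> 6 wraps past midnight, so both cases must be handled.
bool in_scrub_window(int begin_hour, int end_hour, int hour) {
    if (begin_hour < end_hour) {
        return hour >= begin_hour && hour < end_hour;
    }
    return hour >= begin_hour || hour < end_hour;
}

// Simplified eligibility rule (assumption, see the text above).
bool may_start_scrub(const ScrubConfig& cfg, const ScrubCheck& s) {
    if (s.seconds_since_last_scrub < cfg.min_interval) {
        return false;  // not due yet
    }
    if (s.seconds_since_last_scrub > cfg.max_interval) {
        return true;   // overdue: window and load are no longer consulted
    }
    return in_scrub_window(cfg.begin_hour, cfg.end_hour, s.hour) &&
           s.load_per_cpu < cfg.load_threshold;
}
```

In this model, with a 23 to 6 window, a PG that is overdue at 19:00 is still eligible to scrub, which is the situation the max‑interval setting is meant to avoid.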

Example code that decides deep‑scrub timing:

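// Deep-scrub is due once deep_scrub_interval has elapsed since the last one.
scrubber.time_for_deep = ceph_clock_now() >= info.history.last_deep_scrub_stamp + deep_scrub_interval;
// Otherwise, a random fraction of scheduled scrubs is promoted to deep-scrub.
bool deep_coin_flip = (rand() % 100) < cct->_conf->osd_deep_scrub_randomize_ratio * 100;
scrubber.time_for_deep = scrubber.time_for_deep || deep_coin_flip;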
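To experiment with this rule outside of Ceph, the same decision can be written as a standalone function. This is a minimal sketch rather than the Ceph implementation: the function name and parameters are hypothetical, and the random draw is passed in as `roll` so the result is deterministic.

```cpp
#include <cassert>

// Standalone restatement of the excerpt above.
// `roll` is a value in [0, 100) drawn by the caller, e.g. rand() % 100.
bool time_for_deep_scrub(double now,
                         double last_deep_scrub_stamp,
                         double deep_scrub_interval,
                         double randomize_ratio,
                         int roll) {
    // Due by age: the configured interval has elapsed.
    bool interval_elapsed =
        now >= last_deep_scrub_stamp + deep_scrub_interval;
    // Due by chance: roughly randomize_ratio of all calls succeed.
    bool coin_flip = roll < randomize_ratio * 100;
    return interval_elapsed || coin_flip;
}
```

With a ratio of 0.15, about 15% of scheduled scrubs become deep‑scrubs even when the interval has not elapsed; with a ratio of 0, deep‑scrubs are driven by the interval alone.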

Parameter‑tuning recommendations

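Set osd_scrub_max_interval long enough (e.g., > 1 month) that PGs rarely become overdue; an overdue PG is what causes a scrub to run outside the allowed window.

Set osd_scrub_end_hour early enough that scrubs started late in the window still finish before peak business hours.

Raise osd_scrub_load_threshold if the current limit is so low that scrubs are rarely allowed to start within the window.

Limit the chunk size (the number of objects scrubbed at a time) via osd_scrub_chunk_min / osd_scrub_chunk_max to reduce the impact on bucket‑level objects.

Tune osd_scrub_sleep to pause between chunks, spreading scrub I/O over time and reducing the latency that clients observe.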
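Smaller chunks and a longer sleep both lengthen the scrub, so it helps to estimate the added time before changing them. The sketch below is a back‑of‑the‑envelope model, assuming one sleep per chunk and every chunk filled to osd_scrub_chunk_max; the function names are hypothetical and real chunk sizes vary between the min and max settings.

```cpp
#include <cassert>
#include <cstdint>

// Number of chunks needed to cover `objects` when each chunk holds
// at most `chunk_max` objects (ceiling division).
std::uint64_t chunk_count(std::uint64_t objects, std::uint64_t chunk_max) {
    if (chunk_max == 0) {
        return 0;  // invalid setting; avoid dividing by zero
    }
    return (objects + chunk_max - 1) / chunk_max;
}

// Rough lower bound on the pause time that osd_scrub_sleep adds to
// one PG scrub, assuming one sleep per chunk.
double added_sleep_seconds(std::uint64_t objects,
                           std::uint64_t chunk_max,
                           double scrub_sleep) {
    return static_cast<double>(chunk_count(objects, chunk_max)) * scrub_sleep;
}
```

For example, a PG with 100,000 objects, a chunk maximum of 25 and a sleep of 0.1 s gives 4,000 chunks and about 400 s of added pause; cutting the chunk maximum to 5 raises that to about 2,000 s.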

Custom scrub scheduler

For critical periods (e.g., shopping festivals), JD Digital disables the built‑in scrub with ceph osd set noscrub; ceph osd set nodeep-scrub and runs a self‑developed scheduler that:

Connects to the cluster via RADOS and skips execution on special dates.

Checks time windows and maximum concurrent tasks before launching scrubs.

Selects PGs whose last_deep_scrub_stamp is older than one week and whose primary OSD is idle.

Ensures deep‑scrub tasks stay within configured limits, then sleeps before the next cycle.
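The selection step of such a scheduler can be sketched as follows. This is not JD Digital's implementation: the `PgState` struct, the function name, the oldest‑first ordering and the rule of at most one new task per primary OSD per cycle are all assumptions made for illustration, and gathering the PG state from the cluster is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical view of one PG as seen by the scheduler.
struct PgState {
    std::string pgid;
    double last_deep_scrub_stamp;  // seconds since epoch
    int    primary_osd;
    bool   primary_busy;           // primary OSD is already scrubbing or loaded
};

// Pick PGs to deep-scrub in this cycle without exceeding max_concurrent.
std::vector<std::string> pick_pgs_to_deep_scrub(std::vector<PgState> pgs,
                                                double now,
                                                double max_age_seconds,
                                                std::size_t running,
                                                std::size_t max_concurrent) {
    std::vector<std::string> picked;
    if (running >= max_concurrent) {
        return picked;  // no free slots in this cycle
    }
    const std::size_t budget = max_concurrent - running;

    // Assumption: serve the PGs with the oldest deep-scrub stamp first.
    std::sort(pgs.begin(), pgs.end(),
              [](const PgState& a, const PgState& b) {
                  return a.last_deep_scrub_stamp < b.last_deep_scrub_stamp;
              });

    std::vector<int> used_osds;  // primaries given a task in this cycle
    for (const PgState& pg : pgs) {
        if (picked.size() >= budget) break;
        if (now - pg.last_deep_scrub_stamp <= max_age_seconds) continue;
        if (pg.primary_busy) continue;
        if (std::find(used_osds.begin(), used_osds.end(), pg.primary_osd) !=
            used_osds.end()) continue;  // that primary is no longer idle
        picked.push_back(pg.pgid);
        used_osds.push_back(pg.primary_osd);
    }
    return picked;
}
```

A caller would pass one week (604800 seconds) as `max_age_seconds`, launch a deep‑scrub for each returned PG, and then sleep until the next cycle.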

Overall, the optimization strategy ensures scrubs do not run outside the configured osd_scrub_begin_hour / osd_scrub_end_hour window and minimizes business impact while preserving data integrity.
