
Understanding InfluxDB Retention Policies and Shard Duration: A Deep Dive

This article explains InfluxDB's retention policy components—duration, replication, and shard duration—clarifies the concepts of shards and shard groups, describes default configurations, offers recommendations for shard group duration, and outlines practical considerations for performance and data management.

System Architect Go

1. Retention Policy Overview

An InfluxDB retention policy (RP) consists of three parts: DURATION, how long data is kept; REPLICATION, the number of copies of each point stored in a cluster (it has no effect on a single node); and SHARD DURATION, the time range covered by each shard group.
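
The three parts map directly onto the clauses of the InfluxQL CREATE RETENTION POLICY statement. The sketch below only builds the statement as a string; the helper name create_rp_statement, the database name "metrics", and the policy name "one_month" are made up for illustration, and how you send the statement to the server is left open.

```python
def create_rp_statement(name, database, duration, replication=1,
                        shard_duration=None, default=False):
    """Build an InfluxQL CREATE RETENTION POLICY statement as a string."""
    parts = [
        f'CREATE RETENTION POLICY "{name}" ON "{database}"',
        f"DURATION {duration}",        # how long data is kept, e.g. "30d"
        f"REPLICATION {replication}",  # copies per point; only matters in a cluster
    ]
    if shard_duration is not None:
        # Time range covered by one shard group, e.g. "1d".
        parts.append(f"SHARD DURATION {shard_duration}")
    if default:
        # Make this RP the database's default RP.
        parts.append("DEFAULT")
    return " ".join(parts)


# Hypothetical names, for illustration only.
statement = create_rp_statement("one_month", "metrics", "30d",
                                shard_duration="1d", default=True)
print(statement)
```

Leaving out shard_duration omits the SHARD DURATION clause, in which case InfluxDB picks a shard group duration based on the RP duration.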

2. What Is a Shard?

In the storage hierarchy, a database corresponds to a folder on disk, and each RP creates its own sub‑folder beneath it. A shard group is a logical construct that contains one or more shards. Each shard maps to a physical directory that stores data for a specific time range, defined by the shard duration.

All points belonging to the same series within a shard group are stored in the same shard, as .tsm data inside that shard's directory.

InfluxDB data hierarchy
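
As a rough illustration of that hierarchy, the sketch below composes the on-disk location of one shard. It assumes the default data directory of a Linux package install of InfluxDB 1.x (/var/lib/influxdb/data); the helper name, database name, and shard ID are hypothetical.

```python
from pathlib import PurePosixPath

# Assumption: default data directory of an InfluxDB 1.x Linux package install.
DATA_DIR = PurePosixPath("/var/lib/influxdb/data")


def shard_directory(database, retention_policy, shard_id, data_dir=DATA_DIR):
    """Return the directory that holds one shard's .tsm files.

    Layout: <data dir>/<database>/<retention policy>/<shard id>/
    """
    return data_dir / database / retention_policy / str(shard_id)


# Hypothetical database name and shard ID, for illustration only.
path = shard_directory("metrics", "autogen", 42)
print(path)
```

Note that shard groups do not appear in the path: they are a logical grouping, while databases, RPs, and shards are the levels that exist as directories.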

3. Shard Duration and Shard Group Duration

Shard duration and shard group duration are the same concept: the SHARD DURATION clause of an RP sets the shard group duration, the time interval used to split data into separate shard groups. By default, the shard group duration is derived from the RP duration, as shown in the diagram.
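
To make the splitting concrete, here is a minimal sketch that computes which time window a point falls into for a given shard group duration. The alignment rule is an assumption: it mirrors the behavior of Go's time.Truncate (rounding down to a multiple of the duration counted from the year-1 zero time), which is how InfluxDB 1.x appears to align shard groups. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: windows are aligned like Go's time.Truncate, i.e. to multiples
# of the duration counted from 0001-01-01T00:00:00Z.
GO_ZERO_TIME = datetime(1, 1, 1, tzinfo=timezone.utc)


def shard_group_window(timestamp, shard_group_duration):
    """Return the (start, end) of the window containing `timestamp`."""
    elapsed = timestamp - GO_ZERO_TIME
    whole_windows = elapsed // shard_group_duration  # integer number of windows
    start = GO_ZERO_TIME + whole_windows * shard_group_duration
    return start, start + shard_group_duration


point_time = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
day_start, day_end = shard_group_window(point_time, timedelta(days=1))
week_start, week_end = shard_group_window(point_time, timedelta(days=7))
print(day_start, day_end)
print(week_start, week_end)
```

Under this assumption, one-day windows start at midnight UTC and seven-day windows start on Mondays.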

Default mapping of RP duration to shard group duration
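
Since the figure is not reproduced here, the mapping can be sketched in code. The thresholds below follow the defaults as documented for InfluxDB 1.x to the best of my knowledge (under 2 days: 1 hour; 2 days to 6 months: 1 day; longer or infinite: 7 days). Treating "6 months" as 180 days is an approximation, so behavior exactly at that boundary may differ, and the function name is hypothetical.

```python
from datetime import timedelta


def default_shard_group_duration(rp_duration):
    """Approximate the default shard group duration for an RP duration.

    Pass None or timedelta(0) for an infinite RP duration.
    """
    if rp_duration is None or rp_duration == timedelta(0):
        return timedelta(days=7)    # infinite retention
    if rp_duration < timedelta(days=2):
        return timedelta(hours=1)
    if rp_duration <= timedelta(days=180):  # "6 months", approximated
        return timedelta(days=1)
    return timedelta(days=7)


print(default_shard_group_duration(timedelta(days=1)))
print(default_shard_group_duration(timedelta(days=30)))
print(default_shard_group_duration(timedelta(days=365)))
```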

4. Default RP and Recommended Settings

If no RP is defined, InfluxDB creates an autogen RP with an infinite duration (data never expires) and a default shard group duration of 7 days.

Choosing an appropriate shard group duration depends on workload:

Longer durations keep more data in the same shard, which reduces duplication, improves compression, and can speed up queries.

Shorter durations increase flexibility: InfluxDB enforces an RP by dropping whole shard groups rather than individual points, so shorter shard groups let expired data be removed closer to its actual expiry time.
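
The expiry trade-off comes down to simple arithmetic. Because a shard group is dropped only once its end time is older than the RP duration, the oldest point in a group can outlive the RP duration by up to one shard group duration. The sketch below illustrates this; both function names are hypothetical and the model ignores how often the retention check runs.

```python
from datetime import datetime, timedelta, timezone


def shard_group_expired(group_end, now, rp_duration):
    """A shard group is droppable once its end time is older than the RP duration."""
    return group_end < now - rp_duration


def max_point_age_before_drop(rp_duration, shard_group_duration):
    """Upper bound on the age of the oldest point when its group is dropped."""
    return rp_duration + shard_group_duration


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
rp = timedelta(days=30)

# The group ending 2024-02-05 still holds data inside the retention window.
still_kept = shard_group_expired(datetime(2024, 2, 5, tzinfo=timezone.utc), now, rp)
# The group ending 2024-01-29 lies entirely outside it.
dropped = shard_group_expired(datetime(2024, 1, 29, tzinfo=timezone.utc), now, rp)

print(still_kept, dropped)
print(max_point_age_before_drop(rp, timedelta(days=7)))
```

With a 30-day RP and 7-day shard groups, some points can therefore stay on disk for up to 37 days; with 1-day shard groups the overshoot shrinks to one day.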

5. Official Recommendations

For most scenarios the default works well, but high‑throughput or long‑running instances benefit from longer shard group durations. The official guidance suggests the following configuration.

Official recommended shard group duration settings
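
The figure is not reproduced here, so the sketch below encodes the recommendations as I understand them from the InfluxDB 1.x documentation: 6 hours for RPs up to 1 day, 1 day for RPs up to 7 days, 7 days for RPs up to 3 months, 30 days beyond that, and 52 weeks or longer for infinite retention. "3 months" is approximated as 90 days, and the function name is hypothetical.

```python
from datetime import timedelta


def recommended_shard_group_duration(rp_duration):
    """Approximate the recommended shard group duration for an RP duration.

    Pass None or timedelta(0) for an infinite RP duration.
    """
    if rp_duration is None or rp_duration == timedelta(0):
        return timedelta(weeks=52)   # infinite retention: 52 weeks or longer
    if rp_duration <= timedelta(days=1):
        return timedelta(hours=6)
    if rp_duration <= timedelta(days=7):
        return timedelta(days=1)
    if rp_duration <= timedelta(days=90):  # "3 months", approximated
        return timedelta(days=7)
    return timedelta(days=30)


for days in (1, 7, 30, 365):
    print(days, recommended_shard_group_duration(timedelta(days=days)))
```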

6. Practical Considerations

The shard group duration should be at least twice the longest time range of your most frequent queries.

Each shard group should contain more than 100 000 points.

Each series inside a shard group should contain more than 1 000 points.

Bulk-inserting historical data that spans a long time range can trigger the creation of many shards at once, leading to performance degradation and memory exhaustion. In such cases, temporarily increasing the shard group duration (e.g., to 52 weeks) is advisable.
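
The rules of thumb above lend themselves to a quick back-of-the-envelope check. The sketch below assumes a uniform write rate per series, which real workloads rarely have, and all function and parameter names are hypothetical. The second helper estimates how many shard groups a backfill touches; window alignment can add one more.

```python
import math
from datetime import timedelta


def check_rules_of_thumb(shard_group_duration, longest_frequent_query,
                         series_count, points_per_series_per_second):
    """Evaluate the three rules of thumb for one shard group duration."""
    seconds = shard_group_duration.total_seconds()
    points_per_series = points_per_series_per_second * seconds
    points_per_group = points_per_series * series_count
    return {
        "covers_2x_query_range": shard_group_duration >= 2 * longest_frequent_query,
        "over_100k_points_per_group": points_per_group > 100_000,
        "over_1k_points_per_series": points_per_series > 1_000,
    }


def shard_groups_for_backfill(history_span, shard_group_duration):
    """Estimate how many shard groups a backfill over `history_span` creates."""
    return math.ceil(history_span / shard_group_duration)


# 50 series, one point every 10 seconds each, most queries cover 6 hours.
result = check_rules_of_thumb(timedelta(days=1), timedelta(hours=6), 50, 0.1)
print(result)

# Backfilling five years of history with 1-day versus 52-week shard groups.
five_years = timedelta(days=5 * 365)
daily_groups = shard_groups_for_backfill(five_years, timedelta(days=1))
yearly_groups = shard_groups_for_backfill(five_years, timedelta(weeks=52))
print(daily_groups, yearly_groups)
```

The difference between the two backfill estimates is the reason a long temporary shard group duration helps: the server writes to a handful of shards instead of well over a thousand.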

7. Dynamic RP Management

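Retention policies can be created, altered, and dropped on a running instance. A changed shard group duration applies only to shard groups created after the change, and dropping an RP removes all data stored under it.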

RP management diagram
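
As a sketch of what such adjustments look like, the helpers below build InfluxQL ALTER RETENTION POLICY and DROP RETENTION POLICY statements as strings. The helper names and the "metrics" database are hypothetical, and executing the statements against a server is left out.

```python
def alter_rp_statement(name, database, duration=None, replication=None,
                       shard_duration=None, default=False):
    """Build an InfluxQL ALTER RETENTION POLICY statement as a string."""
    clauses = []
    if duration is not None:
        clauses.append(f"DURATION {duration}")
    if replication is not None:
        clauses.append(f"REPLICATION {replication}")
    if shard_duration is not None:
        clauses.append(f"SHARD DURATION {shard_duration}")
    if default:
        clauses.append("DEFAULT")
    if not clauses:
        # InfluxQL requires at least one attribute to change.
        raise ValueError("nothing to alter")
    return f'ALTER RETENTION POLICY "{name}" ON "{database}" ' + " ".join(clauses)


def drop_rp_statement(name, database):
    """Build a DROP RETENTION POLICY statement; this deletes the RP's data."""
    return f'DROP RETENTION POLICY "{name}" ON "{database}"'


# Widen shard groups before a backfill, then restore the previous setting.
before_backfill = alter_rp_statement("autogen", "metrics", shard_duration="52w")
after_backfill = alter_rp_statement("autogen", "metrics", shard_duration="7d")
cleanup = drop_rp_statement("one_month", "metrics")
print(before_backfill)
print(after_backfill)
print(cleanup)
```

Because the new shard duration affects only shard groups created afterwards, the ALTER statement should run before the backfill starts writing.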