
Mastering Billion-Row Time-Series Data in SQL Server: Bulk Insert, Partitioning, Index Tuning

Faced with a monitoring project that had to store more than 400 million records per day in SQL Server 2012, the author walks through the journey step by step: bulk‑copy insertion, removing indexes during load, hourly tables, deliberate index design, and query tuning, which together brought query times under one second.

ITPUB

Sep 20, 2016


Project Background

The author, a programmer rather than a DBA, describes a high‑pressure monitoring project for a data center that demanded real‑time storage of massive telemetry data.

Requirements

The system had to support at least 100,000 monitoring metrics, with each metric updated at intervals of no more than 20 seconds and storage latency under 120 seconds. This translates to roughly 300,000 rows per minute, 18 million rows per hour, and about 432 million rows per day (plus ~5 % overhead).
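The arithmetic behind these figures can be checked directly. The following T-SQL is a minimal sketch that assumes the worst case of the requirement, one new value per metric every 20 seconds; the variable names are illustrative.

```sql
-- Worst-case ingestion rate: every metric reports once per interval.
DECLARE @metrics int = 100000;        -- monitored metrics (from the requirement)
DECLARE @interval_seconds int = 20;   -- update interval per metric

SELECT
    @metrics / @interval_seconds                         AS rows_per_second,  -- 5,000
    @metrics * 60 / @interval_seconds                    AS rows_per_minute,  -- 300,000
    @metrics * 3600 / @interval_seconds                  AS rows_per_hour,    -- 18,000,000
    CAST(@metrics AS bigint) * 86400 / @interval_seconds AS rows_per_day;     -- 432,000,000
```

The cast to bigint matters only for the daily figure, whose intermediate product exceeds the int range.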

Hardware & SQL Server Version

CPU: Intel Xeon E5‑2609 (4 cores, 2.40 GHz)

Memory: 4 GB DDR3 ECC

Disk: 500 GB 7200 RPM SATA3, RAID‑5

Database: SQL Server 2012 Standard Edition

Initial Write Bottleneck

Using the original program, SQL Server could not keep up with the ingestion rate; CPU usage stayed above 80 % and the system ran out of memory because data accumulated faster than it could be written.

Original Storage Structure

-- One table per day; every metric sample becomes one row.
CREATE TABLE [dbo].[His20140822] (
    [No] bigint IDENTITY(1,1) NOT NULL,
    [Dtime] datetime NOT NULL,
    [MgrObjId] varchar(36) NOT NULL,
    [Id] varchar(50) NOT NULL,
    [Value] varchar(50) NOT NULL,
    CONSTRAINT [PK_His20140822] PRIMARY KEY CLUSTERED ([No] ASC)
        WITH (
            PAD_INDEX = OFF,
            STATISTICS_NORECOMPUTE = OFF,
            IGNORE_DUP_KEY = OFF,
            ALLOW_ROW_LOCKS = ON,
            ALLOW_PAGE_LOCKS = ON
        ) ON [PRIMARY]
) ON [PRIMARY];

The table stored one row per metric value, resulting in hundreds of millions of rows per day.

Bulk Insert Implementation

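using System.Data;
using System.Data.SqlClient;

public static class BulkLoader // wrapper class name is illustrative
{
    // Bulk-copies every row of dt into desTable and returns the number of rows submitted.
    public static int BatchInsert(string connectionString, string desTable, DataTable dt, int batchSize = 500)
    {
        // UseInternalTransaction: each batch is committed in its own transaction.
        using (var sbc = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.UseInternalTransaction))
        {
            sbc.BulkCopyTimeout = 300;        // seconds
            sbc.NotifyAfter = dt.Rows.Count;  // only takes effect if a SqlRowsCopied handler is attached
            sbc.BatchSize = batchSize;
            sbc.DestinationTableName = desTable;
            // Map columns by name so the DataTable's column order need not match the table's.
            foreach (DataColumn column in dt.Columns)
                sbc.ColumnMappings.Add(column.ColumnName, column.ColumnName);
            sbc.WriteToServer(dt);
        }
        return dt.Rows.Count;
    }
}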

BulkCopy was expected to write millions of rows per second, but performance remained far below the target.

Problem Diagnosis

Memory overflow occurred because incoming data piled up faster than it could be persisted. The team performed unit‑level timing tests to locate the slowest steps.

Optimization Attempts

Adjusted BulkCopyTimeout and BatchSize – no significant gain.

Stored metric values as XML per device to reduce row count – modest improvement.

Considered table partitioning but lacked time to learn the technique.

Stopped unrelated services – minor effect.

Investigated whether SQL Server I/O was the bottleneck.

Removing Indexes

All non‑clustered indexes on MgrObjId and Id were dropped. After the change, inserting 100,000 rows took 7–9 seconds, meeting the 20‑second requirement.
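A minimal T-SQL sketch of this step follows. The index name is hypothetical, since the source does not give one; the catalog query lists whatever non-clustered indexes actually exist on the table. The IF EXISTS guard is written out because SQL Server 2012 does not support DROP INDEX IF EXISTS.

```sql
-- List the non-clustered indexes currently defined on the daily table.
SELECT i.name, i.type_desc
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.His20140822')
  AND i.type = 2;  -- 2 = NONCLUSTERED

-- Drop one of them before the load.
-- IX_His20140822_MgrObjId is a hypothetical name used for illustration.
IF EXISTS (SELECT 1
           FROM sys.indexes
           WHERE object_id = OBJECT_ID(N'dbo.His20140822')
             AND name = N'IX_His20140822_MgrObjId')
    DROP INDEX IX_His20140822_MgrObjId ON dbo.His20140822;
```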

Query Challenges

With a single table holding >4 × 10⁸ rows per day, queries without indexes were impractically slow.

Hourly Partitioning Solution

The team switched from daily to hourly tables, creating 24 tables per day (e.g., His_001_2014112615). This reduced each table to ~18 million rows, enabling acceptable query performance.
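One way to produce such tables is to derive the name from the hour being loaded and create the table on demand. The sketch below assumes the name pattern His_001_yyyyMMddHH inferred from the example name above and reuses the four data columns of the original table. Whether the hourly tables kept the [No] identity column is not stated in the source, so it is omitted here.

```sql
DECLARE @hour datetime = '2014-11-26T15:00:00';

-- Build the table name, e.g. His_001_2014112615.
DECLARE @table sysname =
    N'His_001_'
    + CONVERT(char(8), @hour, 112)                                -- yyyyMMdd
    + RIGHT('0' + CAST(DATEPART(hour, @hour) AS varchar(2)), 2);  -- HH, zero-padded

-- No indexes at creation time, so that bulk loads stay fast.
DECLARE @sql nvarchar(max) =
    N'CREATE TABLE dbo.' + QUOTENAME(@table) + N' (
        [Dtime]    datetime    NOT NULL,
        [MgrObjId] varchar(36) NOT NULL,
        [Id]       varchar(50) NOT NULL,
        [Value]    varchar(50) NOT NULL
    );';

-- Create the table only if it does not exist yet.
IF OBJECT_ID(N'dbo.' + QUOTENAME(@table), N'U') IS NULL
    EXEC sys.sp_executesql @sql;
```

Running this shortly before each hour begins would keep table creation out of the ingestion path.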

Query Optimization Experiments

The team tried reordering WHERE clauses, clearing caches with DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS, and measuring I/O statistics. The order of predicates had negligible impact without indexes.

-- Empty string literals stand in for the actual filter values.

-- Before optimization (time-range predicates first)
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
SET STATISTICS IO ON;
SELECT Dtime, Value FROM dbo.his20140825
WHERE Dtime >= '' AND Dtime <= '' AND MgrObjId = '' AND Id = '';
SET STATISTICS IO OFF;

-- After optimization (equality predicates first)
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
SET STATISTICS IO ON;
SELECT Dtime, Value FROM dbo.his20140825
WHERE MgrObjId = '' AND Id = '' AND Dtime >= '' AND Dtime <= '';
SET STATISTICS IO OFF;

Index Experiments

Created a non‑clustered index on MgrObjId alone – slower.

Created a composite index (MgrObjId, Id, Dtime) – query speed doubled.

Final optimal index:

CREATE NONCLUSTERED INDEX Idx_His20141008 ON dbo.his20141008(MgrObjId, Id) INCLUDE (Value, Dtime);

With this index, the query completed in under one second against 11 million rows.
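The index is fast because it covers the query: MgrObjId and Id locate the rows, and the included Dtime and Value columns let SQL Server answer from the index alone, without looking up each row in the base table. A sketch of a query that benefits is shown below; the parameter values are placeholders, not data from the source.

```sql
DECLARE @mgr  varchar(36) = 'device-placeholder';  -- hypothetical device id
DECLARE @id   varchar(50) = 'metric-placeholder';  -- hypothetical metric id
DECLARE @from datetime = '2014-10-08T00:00:00';
DECLARE @to   datetime = '2014-10-08T01:00:00';

SET STATISTICS IO ON;

-- Seek on (MgrObjId, Id); the Dtime filter and both output columns
-- are read from the index leaf pages.
SELECT Dtime, Value
FROM dbo.his20141008
WHERE MgrObjId = @mgr
  AND Id = @id
  AND Dtime >= @from
  AND Dtime < @to;

SET STATISTICS IO OFF;
```

Comparing the logical reads reported with and without the index gives a rough measure of how much work the covering index saves.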

Further Optimizations

The author suggests read/write separation: a real‑time database holding the most recent hour, and a read‑only database for older data whose indexes are rebuilt periodically. If physical partitioning is not used, rebuilding indexes on the read‑only database suffices.
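On the read-only side, a periodic rebuild could look like the sketch below. The table and index names are reused from the index example above; the FILLFACTOR and SORT_IN_TEMPDB settings are assumptions suited to data that no longer changes, not settings taken from the source.

```sql
-- Rebuild the covering index on a table that no longer receives writes.
-- FILLFACTOR = 100 packs pages fully, which is reasonable for read-only data.
ALTER INDEX Idx_His20141008 ON dbo.his20141008
REBUILD WITH (FILLFACTOR = 100, SORT_IN_TEMPDB = ON);
```

The rebuild runs offline here. Online rebuilds require Enterprise Edition, and the hardware section lists Standard Edition.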

Final Recommendations

Drop all indexes before bulk loading.

Use SqlBulkCopy for fast insertion.

Partition or shard tables to keep each table size manageable.

Re‑create indexes after the load completes.

Choose index columns wisely; include frequently returned columns.

Only return needed columns in queries.
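The drop-and-re-create cycle in the list above can also be approximated by disabling and rebuilding a non-clustered index, which keeps the index definition in the catalog. This is a variation, not the procedure described in the source, and the table and index names below are reused from the earlier index example purely for illustration.

```sql
-- Before the load: disable the non-clustered index so inserts skip its maintenance.
-- (Do not disable a clustered index: that makes the table inaccessible.)
ALTER INDEX Idx_His20141008 ON dbo.his20141008 DISABLE;

-- ... bulk load runs here, e.g. via SqlBulkCopy from the application ...

-- After the load: rebuilding re-enables the index and builds it in one pass.
ALTER INDEX Idx_His20141008 ON dbo.his20141008 REBUILD;
```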

