How to Handle Billions of Rows in SQL Server: Bulk Insert, Partitioning, and Index Tuning
This article walks through a real‑world monitoring project that must store over 400 million rows per day in SQL Server 2012, detailing the performance bottlenecks, bulk‑copy tuning, schema redesign, partitioning strategies, index removal and creation, and query‑optimisation techniques that finally achieve sub‑second query times and meet strict latency requirements.
Project Requirements and Environment
The monitoring system must store at least 100 000 metrics, each updated at least once every 20 seconds, with a storage latency under 120 seconds. This works out to roughly 300 000 rows per minute, 18 million rows per hour, and about 432 million rows per day (plus ≈5 % overhead).
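The ingest rate implied by 100 000 metrics reporting every 20 seconds can be checked with a quick query. This is only a sanity check of the arithmetic; the variable names are illustrative and not taken from the project.

```sql
-- Back-of-the-envelope ingest rate from the stated requirements.
DECLARE @metrics INT = 100000;          -- number of metrics
DECLARE @intervalSeconds INT = 20;      -- each metric reports every 20 s

SELECT
    @metrics / @intervalSeconds                          AS RowsPerSecond,  -- 5000
    @metrics * 60 / @intervalSeconds                     AS RowsPerMinute,  -- 300000
    @metrics * 3600 / @intervalSeconds                   AS RowsPerHour,    -- 18000000
    -- Cast to BIGINT first: 100000 * 86400 would overflow INT.
    CAST(@metrics AS BIGINT) * 86400 / @intervalSeconds  AS RowsPerDay;     -- 432000000
```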
All services run on a single server (SQL Server 2012 Standard, Intel Xeon E5‑2609 4‑core 2.40 GHz, 4 GB DDR3 ECC RAM, 500 GB 7200 RPM RAID‑5). CPU usage often exceeds 80 % before optimisation.
Initial Write Bottleneck
The original table schema stored each metric value as a separate row:
    CREATE TABLE [dbo].[His20140822](
        [No] BIGINT IDENTITY(1,1) NOT NULL,
        [Dtime] DATETIME NOT NULL,
        [MgrObjId] VARCHAR(36) NOT NULL,
        [Id] VARCHAR(50) NOT NULL,
        [Value] VARCHAR(50) NOT NULL,
        CONSTRAINT [PK_His20140822] PRIMARY KEY CLUSTERED ([No] ASC)
    ) ON [PRIMARY];
Bulk insertion was performed with SqlBulkCopy:
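    using System.Data;
    using System.Data.SqlClient;

    public static class BulkCopyHelper
    {
        // Bulk-copies every row of dt into destTable and returns the number of rows sent.
        public static int BatchInsert(string connectionString, string destTable, DataTable dt, int batchSize = 500)
        {
            // UseInternalTransaction: each batch is committed in its own transaction.
            using (var sbc = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.UseInternalTransaction))
            {
                sbc.BulkCopyTimeout = 300;          // seconds
                sbc.NotifyAfter = dt.Rows.Count;    // notification interval; no handler is attached here
                sbc.BatchSize = batchSize;          // rows sent to the server per batch
                sbc.DestinationTableName = destTable;
                // Map columns by name so the DataTable column order need not match the table.
                foreach (DataColumn column in dt.Columns)
                {
                    sbc.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                }
                sbc.WriteToServer(dt);
            }
            return dt.Rows.Count;
        }
    }
Even after tuning BulkCopyTimeout and BatchSize, inserting 10 k–20 k rows took about 5 seconds, well short of the 100 k rows every 20 seconds that the requirements imply.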
Schema Redesign Attempts
Storing metric values as XML per device reduced row count but did not close the performance gap. Removing indexes on MgrObjId and Id dramatically improved bulk load speed: 100 k rows inserted in 7–9 seconds, meeting the ingestion target.
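Removing the secondary indexes before a load might look like the following sketch. The index names IX_His20140822_MgrObjId and IX_His20140822_Id are hypothetical, since the article does not give the real ones; the existence check is written for SQL Server 2012, which has no DROP INDEX IF EXISTS.

```sql
-- Hypothetical index names; the real names are not given in the article.
IF EXISTS (SELECT 1 FROM sys.indexes
           WHERE object_id = OBJECT_ID(N'dbo.His20140822')
             AND name = N'IX_His20140822_MgrObjId')
    DROP INDEX IX_His20140822_MgrObjId ON dbo.His20140822;

IF EXISTS (SELECT 1 FROM sys.indexes
           WHERE object_id = OBJECT_ID(N'dbo.His20140822')
             AND name = N'IX_His20140822_Id')
    DROP INDEX IX_His20140822_Id ON dbo.His20140822;
```

The clustered primary key on [No] is left in place here because it is part of the schema shown earlier; only the non-clustered indexes on MgrObjId and Id are dropped.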
Query Challenges Without Indexes
With billions of rows, queries without indexes become impractically slow. The team partitioned the data manually by hour, creating 24 separate tables per day (≈18 million rows each). Further splitting by collector device produced up to 240 tables per day, keeping each table at a manageable size.
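One way to create a per-hour table is with dynamic SQL, sketched below. The naming scheme (His + yyyymmdd + hh) is an assumption that extends the daily names used in the article, and the column list simply repeats the original schema.

```sql
DECLARE @hourStart DATETIME = '2014-10-08T15:00:00';  -- hour to create a table for

-- Assumed naming scheme: His + yyyymmdd + hh, e.g. His2014100815.
DECLARE @tableName SYSNAME =
    N'His' + CONVERT(CHAR(8), @hourStart, 112)
           + RIGHT('0' + CAST(DATEPART(HOUR, @hourStart) AS VARCHAR(2)), 2);

DECLARE @sql NVARCHAR(MAX) = N'
CREATE TABLE dbo.' + QUOTENAME(@tableName) + N' (
    [No]       BIGINT IDENTITY(1,1) NOT NULL,
    [Dtime]    DATETIME    NOT NULL,
    [MgrObjId] VARCHAR(36) NOT NULL,
    [Id]       VARCHAR(50) NOT NULL,
    [Value]    VARCHAR(50) NOT NULL,
    CONSTRAINT ' + QUOTENAME(N'PK_' + @tableName) + N' PRIMARY KEY CLUSTERED ([No] ASC)
);';

-- Create the table only if it does not exist yet.
IF OBJECT_ID(N'dbo.' + QUOTENAME(@tableName), N'U') IS NULL
    EXEC sys.sp_executesql @sql;
```

A per-collector split could append a collector identifier to the table name in the same way.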
Index Creation Experiments
Single‑column index on MgrObjId: 550 MB, 5 min 25 s build time, but query performance degraded.
Multi‑column non‑clustered index (MgrObjId, Id, Dtime): doubled query speed, index size 1.1 GB, build time 7 min 25 s.
Non‑clustered index with INCLUDE:
    CREATE NONCLUSTERED INDEX Idx_His20141008
    ON dbo.his20141008 (MgrObjId, Id)
    INCLUDE (Value, Dtime);
This index was built in ~6 min, took 903 MB, and returned results in under 1 second for 11 million rows.
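A query of the following shape is the kind that an index keyed on (MgrObjId, Id) with Value and Dtime included can cover, meaning every column it touches is available in the index itself. The parameter values are hypothetical, and whether the optimizer actually picks the index depends on the data and statistics.

```sql
-- Hypothetical lookup values, for illustration only.
DECLARE @mgrObjId VARCHAR(36) = 'device-0001';
DECLARE @id       VARCHAR(50) = 'cpu.usage';

-- Filters on the index key columns and returns only the INCLUDE columns,
-- so no lookup into the base table should be needed.
SELECT Dtime, Value
FROM dbo.his20141008
WHERE MgrObjId = @mgrObjId
  AND Id = @id
ORDER BY Dtime;
```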
Practical Index Application
Data for the most recent hour is loaded without indexes. After the hour’s bulk load completes, the appropriate index is created. This avoids index maintenance overhead during high‑throughput inserts.
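The "index after load" step could be scheduled as a job that runs shortly after each hour ends. The sketch below assumes hourly tables named His + yyyymmdd + hh (a hypothetical scheme) and reuses the key and INCLUDE columns from the index experiment above.

```sql
-- Start of the previous, already completed hour.
DECLARE @hourStart DATETIME = DATEADD(HOUR, DATEDIFF(HOUR, 0, GETDATE()) - 1, 0);

-- Assumed naming scheme: His + yyyymmdd + hh.
DECLARE @tableName SYSNAME =
    N'His' + CONVERT(CHAR(8), @hourStart, 112)
           + RIGHT('0' + CAST(DATEPART(HOUR, @hourStart) AS VARCHAR(2)), 2);
DECLARE @indexName SYSNAME = N'Idx_' + @tableName;

-- Build the index only if the table exists and the index is not there yet.
IF OBJECT_ID(N'dbo.' + QUOTENAME(@tableName), N'U') IS NOT NULL
   AND NOT EXISTS (SELECT 1 FROM sys.indexes
                   WHERE object_id = OBJECT_ID(N'dbo.' + QUOTENAME(@tableName))
                     AND name = @indexName)
BEGIN
    DECLARE @sql NVARCHAR(MAX) =
        N'CREATE NONCLUSTERED INDEX ' + QUOTENAME(@indexName) +
        N' ON dbo.' + QUOTENAME(@tableName) +
        N' (MgrObjId, Id) INCLUDE (Value, Dtime);';
    EXEC sys.sp_executesql @sql;
END;
```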
Further Optimisation Strategies
Read/write separation: a real‑time database for the latest hour and a read‑only database for older data.
Periodic index rebuilds on the read‑only database.
If physical partitioning is undesirable, schedule regular index maintenance instead.
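For the periodic index maintenance mentioned above, a common approach is to check fragmentation first and rebuild only where it is high. The 30 % threshold below is a widely used rule of thumb rather than a figure from the article.

```sql
-- List indexes in the current database with high fragmentation.
SELECT OBJECT_NAME(ips.object_id)        AS TableName,
       i.name                            AS IndexName,
       ips.avg_fragmentation_in_percent  AS FragmentationPct
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id
WHERE ips.index_id > 0                           -- skip heaps
  AND ips.avg_fragmentation_in_percent > 30      -- assumed threshold
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Offline rebuild of one index; ONLINE = ON requires Enterprise edition in SQL Server 2012.
ALTER INDEX Idx_His20141008 ON dbo.his20141008 REBUILD;
```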
Key Recommendations
Drop all non‑essential indexes before bulk loading.
Use SqlBulkCopy for high‑speed inserts.
Partition tables (by hour or by collector) to keep each table size reasonable.
Build indexes only after a partition’s data load is finished.
Select highly selective columns for the index key and place frequently returned columns in the INCLUDE list.
Return only required columns in queries to minimise I/O.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
