
Achieving Sub‑Second Queries on 1.2 B‑Row PostgreSQL Using BRIN, pg_cron & Query Folding

The article recounts how a PostgreSQL instance on a modest 2‑CPU, 4 GB VM handling 1.2 billion rows was dramatically accelerated by adding BRIN indexes, scheduling maintenance with pg_cron, applying query folding and tuning memory and parallel settings, achieving sub‑second query times without additional hardware.

DevOps Coach

Background: A PostgreSQL server running on a tiny 2‑core, 4 GB virtual machine was tasked with analyzing a table containing more than 1.2 billion rows. Initial performance was poor – a simple COUNT(*) took 27 seconds and complex joins exceeded one minute.

Why hardware upgrade isn’t the only answer

The team’s instinct was to provision a larger server, but the author demonstrated that PostgreSQL’s built‑in features can unlock massive speedups on the existing hardware.

Using BRIN indexes

Traditional B-Tree indexes become bulky and memory-hungry on tables with hundreds of millions of rows. BRIN (Block Range Index) stores only a compact summary (by default, the minimum and maximum value) for each range of consecutive table blocks rather than one entry per row, so the index occupies megabytes instead of gigabytes.

CREATE INDEX idx_logs_brin_ts ON logs USING brin(timestamp);

With this statement in place of a B-Tree index, the index size fell from 24 GB to 32 MB. A count over a timestamp range dropped from 11.8 seconds to 0.9 seconds, and after further tuning stabilized around 0.7 seconds.
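To see why a BRIN index stays so small, it helps to count index entries. The following is a rough sketch, assuming PostgreSQL's default 8 kB block size and the default pages_per_range of 128. The 80 GiB heap size is a hypothetical figure chosen for illustration, not a measurement from the article.

```python
BLOCK_SIZE = 8192        # PostgreSQL default block size in bytes
PAGES_PER_RANGE = 128    # BRIN default pages_per_range storage parameter


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def brin_range_count(heap_bytes, pages_per_range=PAGES_PER_RANGE,
                     block_size=BLOCK_SIZE):
    """Number of block ranges (one BRIN summary each) covering the heap."""
    pages = ceil_div(heap_bytes, block_size)
    return ceil_div(pages, pages_per_range)


heap_bytes = 80 * 1024 ** 3   # hypothetical 80 GiB table (assumption)
rows = 1_200_000_000          # row count quoted in the article

ranges = brin_range_count(heap_bytes)
print(f"BRIN summaries needed: {ranges:,}")
print(f"Rows covered per summary: {rows // ranges:,}")
```

Under these assumptions a B-Tree would need one entry per row (1.2 billion), while BRIN needs one summary per 1 MiB of table data. Lowering pages_per_range makes the index more selective at the cost of a proportionally larger index.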

Automating maintenance with pg_cron

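Out-of-date statistics and bloated tables hurt the planner. pg_cron is a separately packaged extension, not part of core PostgreSQL. After installing the package, add pg_cron to shared_preload_libraries in postgresql.conf and restart the server before creating the extension: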

sudo apt install postgresql-15-cron
CREATE EXTENSION pg_cron;

A nightly VACUUM ANALYZE keeps the table lean and statistics fresh, and a weekly job (Sundays at 03:00) rebuilds the table's indexes:

SELECT cron.schedule('vacuum_logs', '0 2 * * *', 'VACUUM ANALYZE logs');
SELECT cron.schedule('reindex_logs', '0 3 * * 0', 'REINDEX TABLE logs');

Query folding for earlier filtering

Even with indexes, the planner sometimes chooses a sequential scan. Rewriting the query so that the selective filter is applied before the large table is touched can improve performance dramatically. The original query

SELECT date_trunc('day', l.timestamp), COUNT(*)
FROM logs l
JOIN users u ON l.user_id = u.id
WHERE u.country = 'US'
  AND l.timestamp >= now() - interval '30 days'
GROUP BY 1;

was replaced with:

WITH filtered_users AS (
  SELECT id FROM users WHERE country = 'US'
)
SELECT date_trunc('day', l.timestamp), COUNT(*)
FROM logs l
WHERE l.user_id IN (SELECT id FROM filtered_users)
  AND l.timestamp >= now() - interval '30 days'
GROUP BY 1;

The rewrite returns the same rows as long as users.id is unique (for example, a primary key). Execution time fell from 4.5 seconds to 0.73 seconds on the same hardware.
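Before swapping one query shape for another, it is worth checking that both return identical results. The sketch below does this on a tiny hand-made dataset, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. It only checks semantic equivalence and says nothing about PostgreSQL's planner or timings. The sample rows, the ts column name, and the use of date() in place of date_trunc are assumptions made for this illustration, and the 30-day time filter is omitted for brevity.

```python
import sqlite3

# SQLite stands in for PostgreSQL purely to compare result sets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE logs (id INTEGER PRIMARY KEY, ts TEXT, user_id INTEGER);
    INSERT INTO users VALUES (1, 'US'), (2, 'DE'), (3, 'US');
    INSERT INTO logs (ts, user_id) VALUES
        ('2025-01-01 08:00:00', 1),
        ('2025-01-01 09:30:00', 3),
        ('2025-01-01 10:00:00', 2),
        ('2025-01-02 11:00:00', 1);
""")

# Original shape: join first, filter on the joined table.
JOIN_FORM = """
    SELECT date(l.ts) AS day, COUNT(*)
    FROM logs l
    JOIN users u ON l.user_id = u.id
    WHERE u.country = 'US'
    GROUP BY day ORDER BY day
"""

# Rewritten shape: filter users first, then probe logs with IN.
FOLDED_FORM = """
    WITH filtered_users AS (SELECT id FROM users WHERE country = 'US')
    SELECT date(l.ts) AS day, COUNT(*)
    FROM logs l
    WHERE l.user_id IN (SELECT id FROM filtered_users)
    GROUP BY day ORDER BY day
"""

join_rows = conn.execute(JOIN_FORM).fetchall()
folded_rows = conn.execute(FOLDED_FORM).fetchall()
print("identical results:", join_rows == folded_rows)
```

On the real database, comparing EXPLAIN (ANALYZE, BUFFERS) output for both forms is the way to confirm that the rewrite actually changes the plan.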

Memory and parallelism tuning

shared_buffers = 1GB
work_mem = 64MB
maintenance_work_mem = 256MB
effective_cache_size = 3GB
max_parallel_workers_per_gather = 2

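effective_cache_size allocates no memory; it only tells the planner how much data is likely to be cached, which makes its cost estimates for index scans more realistic. max_parallel_workers_per_gather = 2 lets a single query spread a scan across both cores.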
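Because work_mem is a limit per sort or hash operation in each backend process, not a server-wide cap, it is worth doing the arithmetic before raising it. The following is a deliberately pessimistic sketch using the settings above. The session count and the number of memory-hungry operations per query are hypothetical values, and the model ignores per-process overhead and other settings.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def worst_case_memory(shared_buffers, work_mem, sessions,
                      ops_per_query, workers_per_gather):
    """Rough upper bound: every process runs every operation at full work_mem."""
    # Each session may use a leader process plus its parallel workers.
    processes = sessions * (1 + workers_per_gather)
    return shared_buffers + processes * ops_per_query * work_mem


estimate = worst_case_memory(
    shared_buffers=1 * GIB,   # from the configuration above
    work_mem=64 * MIB,        # from the configuration above
    sessions=10,              # hypothetical concurrent queries
    ops_per_query=2,          # hypothetical sorts/hashes per query
    workers_per_gather=2,     # from the configuration above
)

ram = 4 * GIB                 # the VM described in the article
print(f"worst case: {estimate / GIB:.2f} GiB of {ram / GIB:.0f} GiB RAM")
print("over budget" if estimate > ram else "within budget")
```

With these assumed numbers the worst case exceeds the 4 GB of RAM, which is why a value like 64MB is only safe when concurrency is low.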

Native declarative partitioning

Partitioning the logs table by month lets PostgreSQL prune irrelevant partitions automatically.

CREATE TABLE logs (
  id BIGSERIAL,
  timestamp TIMESTAMP,
  user_id BIGINT,
  message TEXT
) PARTITION BY RANGE (timestamp);

CREATE TABLE logs_2025_01 PARTITION OF logs
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

Monthly queries became five times faster.
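Monthly partitions have to exist before rows for that month arrive, so the DDL is usually generated rather than typed by hand. Below is a minimal sketch of such a generator. The helper name month_partition_ddl is hypothetical, and the naming scheme simply follows the logs_2025_01 example above.

```python
from datetime import date


def month_partition_ddl(parent, year, month):
    """Build the CREATE TABLE statement for one monthly range partition."""
    start = date(year, month, 1)
    # The upper bound is exclusive: the first day of the following month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {parent}\n"
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


# Print the statements for the first quarter of 2025.
for m in (1, 2, 3):
    print(month_partition_ddl("logs", 2025, m))
    print()
```

The generated statements could be run from a scheduled job ahead of each month. The year rollover in December is the case most worth checking.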

Performance results

Daily aggregation: 4.5 s → 0.73 s

Range query: 11.8 s → 0.9 s

Full‑table count: 27 s → 3.2 s

Disk usage: 89 GB → 42 GB

Trade‑offs and cautions

BRIN indexes excel on naturally ordered, append-only data, where the physical row order tracks the indexed column. They perform poorly once frequent updates or out-of-order inserts break that correlation.

pg_cron tasks should be spaced to avoid I/O contention.

Setting work_mem too high on a small server can cause OOM.

Always run ANALYZE after bulk loads or partition changes.

Key takeaways

Use BRIN indexes for cold, ordered data.

Leverage pg_cron for proactive maintenance.

Apply query folding to help the planner filter early.

Tune memory parameters and enable parallel workers.

Adopt declarative partitioning to reduce scanned data.

Result: a PostgreSQL database with 1.2 billion rows delivering sub-second responses for range and aggregation queries on a 2-core, 4 GB machine (the full-table count still takes about 3.2 seconds), evidence that careful tuning can rival distributed solutions.
