How We Achieved 50,000 TPS with PostgreSQL & Node.js: 6 Proven Optimizations
By systematically applying connection pooling, batch processing, WAL tuning, table partitioning, query refactoring, and OS-level tweaks, we transformed a PostgreSQL‑Node.js stack to sustain 50,000 transactions per second with sub‑100 ms latency, while maintaining data consistency on modest hardware.
Challenges
We needed to ingest tens of thousands of events per second, keep query latency below 100 ms, guarantee data consistency, and run on a budget‑constrained server.
Process >50k events per second
Maintain sub‑100 ms query response
Ensure durability and consistency
Stay within reasonable hardware limits
Initial Simple Implementation
The first version handled each incoming event with a separate transaction.
app.post('/event', async (req, res) => {
  const client = await pool.connect();
  try {
    // Three round-trips per event: BEGIN, INSERT, COMMIT
    await client.query('BEGIN');
    await client.query('INSERT INTO events(data) VALUES($1)', [req.body]);
    await client.query('COMMIT');
    res.status(200).send('Event recorded');
  } catch (e) {
    await client.query('ROLLBACK');
    res.status(500).send('Error recording event');
  } finally {
    client.release();
  }
});
This approach capped out at roughly 2,000 TPS due to connection overhead and resource contention.
Optimization #1: Proper Connection Pooling
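We configured pg-pool (exposed as Pool by the pg package) to reuse connections efficiently, and dropped the explicit BEGIN/COMMIT around the single INSERT, since a lone statement already runs in its own transaction.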
const { Pool } = require('pg');
const pool = new Pool({
  max: 100, // tuned to CPU cores
  idleTimeoutMillis: 30000, // close clients idle for 30 s
  connectionTimeoutMillis: 2000, // fail fast if no connection is free within 2 s
});
app.post('/event', async (req, res) => {
  const client = await pool.connect();
  try {
    await client.query('INSERT INTO events(data) VALUES($1)', [req.body]);
    res.status(200).send('Event recorded');
  } catch (e) {
    console.error('Error recording event:', e);
    res.status(500).send('Error recording event');
  } finally {
    client.release();
  }
});
This change raised throughput to about 4,000 TPS.
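A pool's max setting only helps if the database can actually accept that many connections. The following is a rough sketch, not part of the original setup: the helper name and its parameters are hypothetical, and it assumes PostgreSQL's default of 3 superuser_reserved_connections. It estimates how large each Node.js process's pool can be for a given max_connections value.

```javascript
// Hypothetical helper: the largest pool `max` each Node.js process can use
// without exhausting PostgreSQL's connection slots.
// Assumption: reservedConnections defaults to 3, PostgreSQL's default for
// superuser_reserved_connections.
function poolMaxPerProcess({ maxConnections, reservedConnections = 3, otherClients = 0, processes }) {
  if (!Number.isInteger(processes) || processes < 1) {
    throw new RangeError('processes must be a positive integer');
  }
  // Slots left after reserved connections and any other clients (cron jobs, psql, ...)
  const usable = maxConnections - reservedConnections - otherClients;
  return Math.max(0, Math.floor(usable / processes));
}

// A single process against max_connections = 200
console.log(poolMaxPerProcess({ maxConnections: 200, processes: 1 })); // 197

// Four processes sharing the server with 10 other clients
console.log(poolMaxPerProcess({ maxConnections: 200, processes: 4, otherClients: 10 })); // 46
```

If the application runs as several processes, each one's pool max has to shrink accordingly, otherwise connection attempts start failing once the server's limit is reached.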
Optimization #2: Batch Processing
We aggregated events in memory and wrote them in bulk, reducing round‑trips.
const eventQueue = [];
const BATCH_SIZE = 1000;
const MAX_BATCH_WAIT_MS = 50;

app.post('/event', (req, res) => {
  // Acknowledge immediately; the event is only persisted by a later batch.
  eventQueue.push(req.body);
  res.status(202).send('Event queued');
});

async function processBatch() {
  if (eventQueue.length === 0) return;
  // Take up to BATCH_SIZE events off the front of the queue
  const batch = eventQueue.splice(0, BATCH_SIZE);
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    // One multi-row INSERT: VALUES ($1),($2),...,($n)
    const query = 'INSERT INTO events(data) VALUES ' +
      batch.map((_, i) => `($${i + 1})`).join(',');
    await client.query(query, batch);
    await client.query('COMMIT');
  } catch (e) {
    await client.query('ROLLBACK');
    // Note: the failed batch is dropped here; re-queue or persist it if events must not be lost.
    console.error('Batch processing error:', e);
  } finally {
    client.release();
  }
}

setInterval(processBatch, MAX_BATCH_WAIT_MS);
Throughput jumped to roughly 15,000 TPS.
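The placeholder-building step generalizes to rows with more than one column. Here is a minimal sketch of that idea, assuming a hypothetical buildBulkInsert helper and a hypothetical user_id column alongside data. Neither comes from the original code.

```javascript
// Builds a parameterized multi-row INSERT for node-postgres.
// Table and column names are interpolated into the SQL text, so they must
// come from trusted code, never from user input.
function buildBulkInsert(table, columns, rows) {
  if (rows.length === 0) throw new Error('rows must not be empty');
  const values = [];
  const tuples = rows.map((row) => {
    const placeholders = columns.map((col) => {
      values.push(row[col]);
      return `$${values.length}`; // placeholders are numbered across the whole statement
    });
    return `(${placeholders.join(', ')})`;
  });
  const text = `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${tuples.join(', ')}`;
  return { text, values };
}

// Usage: user_id is a hypothetical column used only for illustration.
const { text, values } = buildBulkInsert('events', ['user_id', 'data'], [
  { user_id: 1, data: { type: 'click' } },
  { user_id: 2, data: { type: 'view' } },
]);
console.log(text);
// INSERT INTO events (user_id, data) VALUES ($1, $2), ($3, $4)
console.log(values.length); // 4
```

The result can be passed to client.query(text, values). Because every row adds one bind parameter per column, the batch size should be chosen so that rows times columns stays well below the limit of 65,535 parameters per statement.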
Optimization #3: WAL Tuning
We adjusted PostgreSQL’s write‑ahead log settings to favor speed.
# postgresql.conf
wal_level = replica
fsync = on
synchronous_commit = off
wal_writer_delay = 10ms
wal_buffers = 16MB
checkpoint_timeout = 15min
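max_wal_size = 4GB
Disabling synchronous_commit boosted throughput to about 25,000 TPS. The cost is a small risk of losing the most recently acknowledged commits if the server crashes (the database itself stays consistent), which we mitigated with replication.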
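To put a number on that risk: the PostgreSQL documentation bounds the window in which acknowledged commits can be lost at three times wal_writer_delay. The sketch below is illustrative arithmetic only. The function name is hypothetical, and it treats the quoted throughput as a uniform write rate.

```javascript
// Worst-case exposure with synchronous_commit = off.
// The risk window is at most 3 x wal_writer_delay (per the PostgreSQL docs).
function writesAtRisk(walWriterDelayMs, writesPerSecond) {
  const windowMs = 3 * walWriterDelayMs;
  // Multiply before dividing to keep the arithmetic in integers
  const writes = Math.ceil((windowMs * writesPerSecond) / 1000);
  return { windowMs, writes };
}

// wal_writer_delay = 10ms at roughly 25,000 writes per second
console.log(writesAtRisk(10, 25000)); // { windowMs: 30, writes: 750 }
```

Under these assumptions a crash could lose up to about 30 ms of acknowledged writes, on the order of 750 events. When writes are batched, each lost commit is a whole batch, so the exposure is better counted in batches than in single events.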
Optimization #4: Table Partitioning
We partitioned the events table by timestamp, allowing PostgreSQL to prune irrelevant partitions.
CREATE TABLE events (
  id SERIAL,
  timestamp TIMESTAMPTZ DEFAULT NOW(),
  data JSONB
) PARTITION BY RANGE (timestamp);

-- Example daily partition
CREATE TABLE events_y2023_m09_d01 PARTITION OF events
  FOR VALUES FROM ('2023-09-01') TO ('2023-09-02');

-- Trigger function to create future partitions (simplified)
CREATE OR REPLACE FUNCTION create_partition_if_not_exists()
RETURNS trigger AS $$
BEGIN
  -- implementation omitted
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;
This improved both insert speed and query latency.
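Since the trigger body is omitted, here is one possible alternative, sketched under assumptions: a hypothetical dailyPartitionDDL helper that a scheduled job could call to create upcoming partitions ahead of time. It follows the events_yYYYY_mMM_dDD naming used above and computes day boundaries in UTC. This is not the original implementation.

```javascript
// Generates the DDL for one daily partition of a range-partitioned table.
// `parent` is interpolated into the SQL text, so it must be a trusted identifier.
function dailyPartitionDDL(day, parent = 'events') {
  // Truncate to midnight UTC, then add one day for the exclusive upper bound
  const start = new Date(Date.UTC(day.getUTCFullYear(), day.getUTCMonth(), day.getUTCDate()));
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  const iso = (d) => d.toISOString().slice(0, 10); // YYYY-MM-DD
  const [y, m, d] = iso(start).split('-');
  const name = `${parent}_y${y}_m${m}_d${d}`;
  return `CREATE TABLE IF NOT EXISTS ${name} PARTITION OF ${parent} ` +
    `FOR VALUES FROM ('${iso(start)}') TO ('${iso(end)}');`;
}

console.log(dailyPartitionDDL(new Date('2023-09-01T00:00:00Z')));
// CREATE TABLE IF NOT EXISTS events_y2023_m09_d01 PARTITION OF events FOR VALUES FROM ('2023-09-01') TO ('2023-09-02');
```

The date literals are interpreted in the session's time zone when compared against a TIMESTAMPTZ column, so the job that runs this DDL should use the same time zone setting as the existing partitions.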
Optimization #5: Eliminating N+1 Queries
We rewrote the metadata fetch to use a single join with JSON aggregation.
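async function getEventsWithMetadata(userId) {
  // One query instead of one metadata lookup per event.
  // FILTER + COALESCE return [] rather than [null] for events without metadata.
  // SELECT e.* with GROUP BY e.id is only valid when e.id is the table's primary key;
  // otherwise list every selected column in GROUP BY.
  const result = await pool.query(`
    SELECT e.*,
           COALESCE(json_agg(m.*) FILTER (WHERE m.event_id IS NOT NULL), '[]') AS metadata
    FROM events e
    LEFT JOIN metadata m ON e.id = m.event_id
    WHERE e.user_id = $1
    GROUP BY e.id
  `, [userId]);
  return result.rows;
}
Response times dropped from seconds to milliseconds.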
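When a single join with JSON aggregation is awkward, another common way to avoid N+1 is two queries: one for the events, and one that fetches all their metadata at once (for example with WHERE event_id = ANY($1)), stitched together in memory. The sketch below shows only the stitching step. The attachMetadata name and the sample rows are hypothetical.

```javascript
// Groups metadata rows by event_id once, then attaches them to each event.
// Runs in O(events + metadata) instead of issuing one query per event.
function attachMetadata(events, metadataRows) {
  const byEvent = new Map();
  for (const m of metadataRows) {
    if (!byEvent.has(m.event_id)) byEvent.set(m.event_id, []);
    byEvent.get(m.event_id).push(m);
  }
  // Events without metadata get an empty array
  return events.map((e) => ({ ...e, metadata: byEvent.get(e.id) || [] }));
}

// Sample rows standing in for the results of the two queries
const events = [{ id: 1 }, { id: 2 }];
const metadataRows = [
  { event_id: 1, key: 'source', value: 'web' },
  { event_id: 1, key: 'lang', value: 'en' },
];
const merged = attachMetadata(events, metadataRows);
console.log(merged[0].metadata.length, merged[1].metadata.length); // 2 0
```

This costs two round-trips regardless of how many events are returned, which keeps the main benefit of the join while leaving the SQL simple.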
Optimization #6: Hardware & OS Tweaks
We tuned Linux kernel parameters and PostgreSQL memory settings.
# /etc/sysctl.conf
fs.file-max = 100000
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 4096
net.core.netdev_max_backlog = 4096
net.ipv4.ip_local_port_range = 10000 65535

# postgresql.conf
max_connections = 200
shared_buffers = 8GB
effective_cache_size = 24GB
maintenance_work_mem = 1GB
work_mem = 50MB
Result
After applying all six optimizations, the PostgreSQL‑Node.js stack consistently handled 50,000 transactions per second with stable latency and acceptable reliability on the target hardware.
Lessons Learned
Connection management is critical: reuse connections to avoid overhead.
Batching wins: group operations to amortize network and transaction costs.
Durability trade-offs matter: disabling synchronous_commit gains throughput at a controlled durability cost.
Partitioning pays off: it improves both write and read performance.
Avoid N+1 queries: use joins or aggregation to fetch related data in one round-trip.
Hardware matters: SSDs, ample RAM, and proper OS tuning are essential for high-throughput workloads.
Conclusion
Achieving 50k TPS with PostgreSQL and Node.js requires a holistic approach; no single tweak is sufficient. By combining connection pooling, batching, WAL tuning, partitioning, query refactoring, and system‑level adjustments, we reached the performance target while preserving data integrity.
Code Mala Tang
Read source code together, write articles together, and enjoy spicy hot pot together.
