
Why We Dropped SQL for NoSQL: 5× Traffic Boost and Zero Downtime

Facing severe query latency, frequent deadlocks, and costly vertical scaling on a textbook‑perfect PostgreSQL setup, our team first tried extensive SQL optimization, Redis caching, and read replicas. When none of that was enough, we migrated our critical order services to MongoDB, achieving five‑fold capacity, zero downtime, and significant cost savings.

dbaplus Community

Background and Problem

Our flagship e‑commerce application crashed during traffic spikes, with query latency soaring to 2.5 seconds, order processing failures, and frequent deadlocks. Critics argued that NoSQL would break data integrity or that simple SQL tuning would suffice, but the system was already on the brink of failure.

Initial Architecture

PostgreSQL RDS for transactional data

Redis as a cache layer

Elasticsearch for search

Multiple read‑only replicas

Heavily indexed and optimized queries

Full‑stack monitoring with DataDog

Why Change Was Necessary

The PostgreSQL stack hit several bottlenecks:

Complex JOINs taking >1.5 seconds under load

Row‑level locks causing persistent deadlocks

Uncontrolled cost of vertical scaling

Frequent outages during traffic peaks

Engineering time spent firefighting instead of building features

Critical Failure Points

Monitoring revealed alarming metrics:

Average query time: 1.5 s+ (vs. 200 ms originally)

CPU usage: 89 %

IOPS: saturated

Cache hit rate: 65 % (down from 87 %)

Deadlock frequency: 6‑7 per minute
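
To put the cache hit rate in perspective, here is a back-of-the-envelope sketch. It assumes every cache miss results in exactly one PostgreSQL query, which is a simplification; the function name and the 1,000-request sample are illustrative and not taken from the original system. Under that assumption, the drop from 87 % to 65 % sends roughly 2.7 times as many reads to the database.

```javascript
// Illustrative arithmetic only: assumes each cache miss triggers exactly one database query.
const dbQueriesPer = (requests, hitRate) => Math.round(requests * (1 - hitRate));

const before = dbQueriesPer(1000, 0.87); // misses at the original 87 % hit rate
const after = dbQueriesPer(1000, 0.65);  // misses at the degraded 65 % hit rate
const increase = after / before;         // relative growth in database reads

console.log({ before, after, increase: increase.toFixed(1) });
```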

Failed Solutions

Attempt #1 – Query Optimization

We added composite indexes, materialized views and rewrote queries:

-- Added composite indexes
CREATE INDEX idx_orders_status_created ON orders(status, created_at);
CREATE INDEX idx_order_items_order_product ON order_items(order_id, product_id);

-- Materialized view for common queries
CREATE MATERIALIZED VIEW order_summaries AS
SELECT o.id,
       COUNT(i.id) AS items_count,
       SUM(p.price * i.quantity) AS total_amount
FROM orders o
JOIN order_items i ON o.id = i.order_id
JOIN products p ON i.product_id = p.id
GROUP BY o.id;

-- Query rewrite using CTE
WITH order_data AS (
  SELECT o.id, o.status, o.created_at,
         c.name, c.email
  FROM orders o
  JOIN customers c ON o.customer_id = c.id
  WHERE o.status = 'processing'
    AND o.created_at > NOW() - INTERVAL '24 HOURS'
)
SELECT od.*, os.items_count, os.total_amount
FROM order_data od
JOIN order_summaries os ON od.id = os.id;

Result: query time improved to ~800 ms, still insufficient.

Attempt #2 – Redis Caching

We introduced aggressive caching with a 5‑minute TTL and cache warm‑up jobs:

// Redis caching layer
const getOrderDetails = async (orderId) => {
  const cacheKey = `order:${orderId}:details`;
  let orderDetails = await redis.get(cacheKey);
  if (orderDetails) return JSON.parse(orderDetails);
  orderDetails = await db.query(ORDER_DETAILS_QUERY, [orderId]);
  await redis.setex(cacheKey, 300, JSON.stringify(orderDetails));
  return orderDetails;
};

// Cache invalidation on updates
const updateOrder = async (orderId, data) => {
  await db.query(UPDATE_ORDER_QUERY, [data, orderId]);
  await redis.del(`order:${orderId}:details`);
};

// Warm cache for active orders
const warmOrderCache = async () => {
  const activeOrders = await db.query(`SELECT id FROM orders WHERE status IN ('processing','shipped') AND created_at > NOW() - INTERVAL '24 HOURS'`);
  await Promise.all(activeOrders.map(order => getOrderDetails(order.id)));
};
cron.schedule('*/5 * * * *', warmOrderCache);

Result: latency improved, but cache misses under high load created a new bottleneck.

Attempt #3 – Read Replicas

We expanded to five read‑only replicas and added a simple load‑balancer:

// Database connection pools with read‑write split (assuming node-postgres)
const { Pool } = require('pg');

const replicaHosts = [
  'replica1.database.aws',
  'replica2.database.aws',
  'replica3.database.aws',
  'replica4.database.aws',
  'replica5.database.aws'
];

const pool = {
  write: new Pool({ host: 'master.database.aws', max: 20, min: 5 }),
  // A pool targets a single host, so each replica gets its own pool
  // (the original 50/10 read budget is split evenly across five replicas)
  read: replicaHosts.map(host => new Pool({ host, max: 10, min: 2 }))
};

// Pick a replica pool at random and check out a connection from it
const getReadConnection = () => {
  const replicaIndex = Math.floor(Math.random() * pool.read.length);
  return pool.read[replicaIndex].connect();
};

const executeQuery = async (query, params, queryType = 'read') => {
  const connection = queryType === 'write' ? await pool.write.connect() : await getReadConnection();
  try { return await connection.query(query, params); }
  finally { connection.release(); }
};

Result: replication lag during peak traffic made the approach untenable.

Switch to NoSQL (MongoDB)

After three months of failed attempts, we migrated the most complex order‑processing service to MongoDB, designing a document model that captures order, customer, items, payment and shipping information.

// MongoDB order document model
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  status: "processing",
  created_at: ISODate("2024-02-07T10:00:00Z"),
  customer: {
    _id: ObjectId("507f1f77bcf86cd799439012"),
    name: "John Doe",
    email: "john.doe@example.com", // placeholder address
    shipping_address: {
      street: "123 Main St",
      city: "San Francisco",
      country: "USA"
    }
  },
  items: [{
    product_id: ObjectId("507f1f77bcf86cd799439013"),
    title: "Gaming Laptop",
    price: 1299.99,
    quantity: 1,
    variants: { color: "black", size: "15-inch" }
  }],
  payment: { method: "credit_card", status: "completed", amount: 1299.99 },
  shipping: { method: "express", tracking_number: "1Z999AA1234567890", estimated_delivery: ISODate("2024-02-10T10:00:00Z") },
  metadata: { user_agent: "Mozilla/5.0...", ip_address: "192.168.1.1" }
}

Result: the same query that took 2.3 seconds in PostgreSQL now executes in ~200 ms.
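
As a rough illustration of why the embedded model avoids JOINs, the sketch below derives the same two figures the order_summaries materialized view computed (item count and total amount) from a single order document. It works on a plain JavaScript object and needs no driver. The function name summarizeOrder, the string stand-in for the ObjectId, and the second line item are assumptions made for the example.

```javascript
// Sketch: compute items_count and total_amount from one embedded order document.
const summarizeOrder = (order) => ({
  id: order._id,
  items_count: order.items.length,
  // Sum in integer cents to avoid floating-point drift when adding prices.
  total_amount: order.items.reduce(
    (cents, item) => cents + Math.round(item.price * 100) * item.quantity, 0) / 100
});

const sampleOrder = {
  _id: '507f1f77bcf86cd799439011', // string stand-in for an ObjectId
  status: 'processing',
  items: [
    { title: 'Gaming Laptop', price: 1299.99, quantity: 1 },
    { title: 'Laptop Sleeve', price: 25.5, quantity: 2 } // hypothetical second item
  ]
};

console.log(summarizeOrder(sampleOrder));
```

Because the item prices are stored inside the order, the total reflects the price at purchase time rather than the current price in a products table.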

We enforced data integrity with JSON schema validation:

db.createCollection("orders", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["customer", "items", "status", "created_at"],
      properties: {
        customer: {
          bsonType: "object",
          required: ["name", "email"],
          properties: {
            name: { bsonType: "string" },
            email: { bsonType: "string" }
          }
        },
        items: {
          bsonType: "array",
          items: {
            bsonType: "object",
            required: ["product_id", "price", "quantity"],
            properties: {
              product_id: { bsonType: "objectId" },
              price: { bsonType: "double" },
              quantity: { bsonType: "int" }
            }
          }
        }
      }
    }
  }
});
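
To make the rules in the validator concrete, the following is a hypothetical application-side check that mirrors its required fields and basic types. It is only a sketch for illustration: checkOrder is not part of MongoDB or of the original code, it approximates bsonType checks with JavaScript types, and the server-side validator remains the source of truth.

```javascript
// Hypothetical mirror of the $jsonSchema rules; returns a list of violations.
const checkOrder = (order) => {
  const errors = [];
  for (const field of ['customer', 'items', 'status', 'created_at']) {
    if (order[field] === undefined) errors.push(`missing ${field}`);
  }
  for (const field of ['name', 'email']) {
    if (order.customer && typeof order.customer[field] !== 'string') {
      errors.push(`customer.${field} must be a string`);
    }
  }
  if (order.items !== undefined && !Array.isArray(order.items)) {
    errors.push('items must be an array');
  }
  (Array.isArray(order.items) ? order.items : []).forEach((item, i) => {
    if (item.product_id === undefined) errors.push(`items[${i}] missing product_id`);
    if (typeof item.price !== 'number') errors.push(`items[${i}].price must be a number`);
    if (!Number.isInteger(item.quantity)) errors.push(`items[${i}].quantity must be an integer`);
  });
  return errors;
};

// A document that breaks three rules: no email, string price, fractional quantity.
const badOrder = {
  status: 'processing',
  created_at: new Date(),
  customer: { name: 'John Doe' },
  items: [{ product_id: 'p1', price: '1299.99', quantity: 1.5 }]
};
console.log(checkOrder(badOrder));
```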

We added indexes to support common queries:

db.orders.createIndex({ created_at: 1, status: 1 });
db.orders.createIndex({ "customer.email": 1 });
db.orders.createIndex({ "items.product_id": 1 });

Migration Strategy – Dual‑Write

To keep PostgreSQL and MongoDB in sync during migration, we implemented a dual‑write service. It writes to both stores, holds the MongoDB write in a transaction that commits only after the PostgreSQL write succeeds, and verifies consistency via checksums:

class OrderService {
  async createOrder(orderData) {
    const session = await mongoose.startSession();
    session.startTransaction();
    try {
      const mongoOrder = await this.createMongoOrder(orderData, session);
      const pgOrder = await this.createPostgresOrder(orderData);
      // verifyOrderConsistency is async, so its result must be awaited
      if (!(await this.verifyOrderConsistency(mongoOrder, pgOrder))) {
        throw new Error('Data inconsistency detected');
      }
      await session.commitTransaction();
      return mongoOrder;
    } catch (err) {
      // Roll back the MongoDB write; the PostgreSQL row lives outside this transaction
      if (session.inTransaction()) await session.abortTransaction();
      throw err;
    } finally {
      await session.endSession();
    }
  }

  private async verifyOrderConsistency(mongoOrder, pgOrder) {
    const checksums = await Promise.all([
      this.calculateChecksum(mongoOrder),
      this.calculateChecksum(pgOrder)
    ]);
    return checksums[0] === checksums[1];
  }
}
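
The calculateChecksum helper is not shown in the source. One possible way to implement it, sketched here under assumptions, is to serialize each record with sorted keys and hash the result with SHA-256, so that property order does not affect the comparison. The canonicalize function is hypothetical, and the sketch assumes both records have already been mapped to the same field names and value types before hashing.

```javascript
const { createHash } = require('crypto');

// Hypothetical normalizer: serialize with sorted keys so that key order
// (which can differ between a MongoDB document and a PostgreSQL row) is ignored.
const canonicalize = (value) => {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(',')}]`;
  if (value instanceof Date) return JSON.stringify(value.toISOString());
  if (value !== null && typeof value === 'object') {
    const body = Object.keys(value).sort()
      .map(key => `${JSON.stringify(key)}:${canonicalize(value[key])}`)
      .join(',');
    return `{${body}}`;
  }
  // JSON.stringify(undefined) returns undefined, so treat it as null.
  return JSON.stringify(value) ?? 'null';
};

// SHA-256 hex digest of the canonical form.
const calculateChecksum = (record) =>
  createHash('sha256').update(canonicalize(record)).digest('hex');

// Same content, different key order: the checksums match.
const fromMongo = { status: 'processing', items: [{ quantity: 1, price: 1299.99 }] };
const fromPostgres = { items: [{ price: 1299.99, quantity: 1 }], status: 'processing' };
console.log(calculateChecksum(fromMongo) === calculateChecksum(fromPostgres));
```

A synchronous helper like this still works inside the Promise.all call above, since Promise.all accepts plain values as well as promises.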

Real‑time Monitoring

We leveraged MongoDB change streams and periodic stats collection to feed DataDog and alert on critical conditions:

// Change‑stream monitoring
const monitorOrderChanges = async () => {
  const changeStream = db.collection('orders').watch();
  changeStream.on('change', async change => {
    // Change events carry no execution time. clusterTime is a BSON Timestamp whose
    // high 32 bits hold the oplog time in seconds, so we report delivery lag instead.
    const lagMs = Date.now() - change.clusterTime.getHighBits() * 1000;
    // Assumes the metrics client accepts (name, value, tags)
    await datadog.gauge('mongodb.change_stream.lag_ms', lagMs, [
      `operation_type:${change.operationType}`,
      'collection:orders'
    ]);
    const updatedFields = change.updateDescription?.updatedFields ?? {};
    if (change.operationType === 'update' && updatedFields.status === 'failed') {
      await slack.sendAlert({ channel: '#db-alerts', text: `Order ${change.documentKey._id} failed processing`, level: 'critical' });
    }
  });
};

// Periodic performance metrics
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

const monitorPerformance = async () => {
  while (true) {
    const stats = await db.collection('orders').stats();
    await Promise.all([
      datadog.gauge('mongodb.size', stats.size),
      datadog.gauge('mongodb.count', stats.count),
      datadog.gauge('mongodb.avgObjSize', stats.avgObjSize)
    ]);
    await sleep(60000); // collect once per minute
  }
};

Results

After three months of migration:

Zero downtime during Black Friday, handling >3× normal traffic.

Development velocity increased by 57 %.

Customer satisfaction score rose by 42 %.

Eliminated $110 k/month revenue loss and added $75 k/month new revenue.

Engineering morale dramatically improved.

The previous PostgreSQL setup cost $5,750/month: 56 % for instances, 20 % for storage, 16 % for network, and 8 % for ancillary services.
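
For reference, the percentages translate into dollar amounts as follows. This is simple arithmetic on the figures stated above, and the variable names are illustrative. It works out to $3,220 for instances, $1,150 for storage, $920 for network, and $460 for ancillary services.

```javascript
// Convert the stated percentage shares of the $5,750 monthly bill into dollars.
const monthlyTotal = 5750;
const shares = { instances: 0.56, storage: 0.20, network: 0.16, ancillary: 0.08 };

const breakdown = Object.fromEntries(
  Object.entries(shares).map(([name, share]) => [name, Math.round(monthlyTotal * share)])
);

// Sanity check: the parts should add back up to the monthly total.
const sum = Object.values(breakdown).reduce((a, b) => a + b, 0);
console.log(breakdown, sum);
```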

Lessons Learned

Key takeaways:

Start with a smaller, less critical service when experimenting with a new data model.

Invest early in team training for the new paradigm.

Build robust monitoring from day one to catch performance regressions early.

NoSQL is not a magic bullet; it solved our specific read‑heavy, transaction‑intensive workload.

SQL remains valuable, but a hybrid approach can yield the best results for large‑scale systems.
