How to Quickly Resolve Massive Message Queue Backlogs and Expiration Issues

This article analyzes common production problems such as delayed or expired messages, full queues, and massive backlogs in message‑queue systems, and provides step‑by‑step emergency scaling and recovery strategies, including temporary consumer deployment and data re‑injection techniques.

Java Backend Technology

1. Interview Question

How do you handle message delay and expiration in a message queue? What do you do when the queue is full, or when millions of messages have been backlogged for hours?

2. Interviewer's Perspective

The question targets scenarios where the consumer side fails or processes extremely slowly, so that messages pile up, the queue's disk fills, or messages expire before they are consumed (for example, under a RabbitMQ TTL). Such situations are common in production.

3. Analysis and Solutions

Assume the consumer crashes and a huge number of messages accumulates in the MQ.

Problem 1: Massive backlog for several hours

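Example: tens of millions of messages are stuck in the queue from 4 pm to 10 pm. Even after the consumer is fixed, draining the backlog at normal speed takes hours. A typical consumer processes about 1,000 messages per second, so three consumers handle 3,000 per second, which is 180,000 per minute or roughly 10.8 million per hour. A backlog of tens of millions therefore needs several hours to clear.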
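The arithmetic above can be checked with a small calculation. This is only a sketch: the backlog of 30 million is a hypothetical figure standing in for "tens of millions", the class and method names are made up for illustration, and the estimate ignores new messages arriving while the backlog drains.

```java
import java.time.Duration;

class BacklogDrainEstimate {
    /**
     * Estimated time to drain a backlog at a constant rate,
     * assuming no new messages arrive in the meantime.
     */
    static Duration drainTime(long backlog, int consumers, long perConsumerRatePerSec) {
        long totalRate = consumers * perConsumerRatePerSec;
        long seconds = (backlog + totalRate - 1) / totalRate; // ceiling division
        return Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        long backlog = 30_000_000L; // hypothetical backlog size

        // Normal setup from the example: 3 consumers at 1,000 msg/s each.
        System.out.println("3 consumers:  " + drainTime(backlog, 3, 1_000));

        // Tenfold scaling: 30 consumers at the same per-consumer rate.
        System.out.println("30 consumers: " + drainTime(backlog, 30, 1_000));
    }
}
```

Under these assumptions the normal setup needs close to three hours, while a tenfold increase in consumers brings it down to well under half an hour.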

Solution: temporary emergency scaling, carried out in the following steps.

Fix the consumer issue, then stop all existing consumers.

Create a new topic with ten times (or twenty times) as many partitions as the original, which gives you ten times as many temporary queues.

Deploy a temporary consumer that reads the backlogged data and writes it directly to the new enlarged queues without time‑consuming processing.

Temporarily provision ten times as many machines and deploy the real consumers on them, with each batch of consumers reading from one of the temporary queues.

This expands both queue and consumer resources tenfold, so the backlog is consumed at roughly ten times the normal rate.

After the backlog is cleared, revert to the original architecture and consumers.
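The forwarding step in this procedure can be sketched as follows. This is a minimal in-memory illustration, not a broker client: `BlockingQueue` instances stand in for the original queue and the temporary queues, and the class name `BacklogForwarder` and the round-robin distribution are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BacklogForwarder {
    /**
     * Moves every message currently in the source queue into the temporary
     * queues, round-robin, without any business processing.
     * Returns the number of messages forwarded.
     */
    static int forward(BlockingQueue<String> source, List<BlockingQueue<String>> tempQueues) {
        int forwarded = 0;
        String msg;
        while ((msg = source.poll()) != null) {
            tempQueues.get(forwarded % tempQueues.size()).add(msg);
            forwarded++;
        }
        return forwarded;
    }

    public static void main(String[] args) {
        // Stand-in for the backlogged original queue.
        BlockingQueue<String> original = new LinkedBlockingQueue<>();
        for (int i = 0; i < 100; i++) {
            original.add("order-" + i);
        }

        // Stand-ins for the ten temporary queues (partitions of the new topic).
        List<BlockingQueue<String>> temp = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            temp.add(new LinkedBlockingQueue<>());
        }

        int moved = forward(original, temp);
        System.out.println("forwarded=" + moved + ", per temp queue=" + temp.get(0).size());
    }
}
```

Round-robin spreads the load evenly but does not keep related messages together. If per-key ordering matters, choosing the target queue from a hash of the message key would be one alternative.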

Problem 2: Message expiration (TTL) loss

If you use RabbitMQ with a TTL, messages that stay in the queue longer than the configured time are discarded. In that case the problem is not a growing backlog but lost data, and adding consumers does not help because the expired messages are already gone. A possible remedy is bulk re‑injection: once the peak has passed, run a temporary program that looks up the lost data and writes it back into the queue.

Example: 10,000 orders were backlogged in the MQ and 1,000 of them expired. Write a program that looks up those 1,000 orders and sends them to the MQ again.
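A re‑injection program of this kind can be sketched as below. The sketch assumes that the full list of created order IDs and the set of already processed IDs can both be obtained (for instance from the order store), which the article does not specify. The class `LostMessageReinjector` and the `publisher` callback are hypothetical stand‑ins for the real lookup and the real MQ producer.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

class LostMessageReinjector {
    /** Orders that were created but never processed are treated as lost. */
    static List<String> findLost(List<String> createdOrderIds, Set<String> processedOrderIds) {
        // LinkedHashSet keeps creation order and drops duplicate IDs.
        Set<String> lost = new LinkedHashSet<>(createdOrderIds);
        lost.removeAll(processedOrderIds);
        return new ArrayList<>(lost);
    }

    /** Re-sends each lost order through the publisher (stand-in for an MQ producer). */
    static int reinject(List<String> lostOrderIds, Consumer<String> publisher) {
        lostOrderIds.forEach(publisher);
        return lostOrderIds.size();
    }

    public static void main(String[] args) {
        List<String> created = List.of("o1", "o2", "o3", "o4");
        Set<String> processed = Set.of("o1", "o3");

        List<String> resent = new ArrayList<>(); // collects what would be published
        int count = reinject(findLost(created, processed), resent::add);
        System.out.println(count + " re-injected: " + resent);
    }
}
```

Because a message might be re‑sent after it was in fact processed, this approach is safest when the consumer handles duplicates idempotently.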

Problem 3: Queue nearing full disk

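If the queue’s disk is almost full and there is no time to scale out, write a temporary program that consumes messages and discards them immediately, draining the queue as fast as possible. Then follow the approach from Problem 2 and re‑inject the discarded data during off‑peak hours.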
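The consume‑and‑discard step might look like the following in‑memory sketch. A `BlockingQueue` stands in for the real broker queue, and the class name `DrainAndDiscard` and the batch size are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class DrainAndDiscard {
    /**
     * Pulls messages off the queue in batches and throws them away without
     * processing. Returns how many messages were discarded.
     */
    static long drain(BlockingQueue<String> queue, int batchSize) {
        long discarded = 0;
        List<String> batch = new ArrayList<>(batchSize);
        while (queue.drainTo(batch, batchSize) > 0) {
            discarded += batch.size();
            batch.clear(); // discard: no business logic runs on these messages
        }
        return discarded;
    }

    public static void main(String[] args) {
        // Stand-in for a queue that is close to filling the disk.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 1_000; i++) {
            queue.add("msg-" + i);
        }

        long discarded = drain(queue, 100);
        System.out.println("discarded=" + discarded + ", remaining=" + queue.size());
    }
}
```

Discarding is only acceptable here because the data can be recovered from another source later; without such a source, the messages would be lost for good.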

Feel free to share additional ideas in the comments.
