Backend Development 30 min read

Optimizing Distributed Queue Programming: Producer, Consumer & Middleware Tips

This article examines practical optimization techniques for distributed queue systems, covering producer throughput and persistence improvements, cache and batch‑write strategies, middleware selection criteria, consumer ordering and serialization challenges, singleton service trade‑offs, leader‑election designs, and deduplication state‑machine implementations.

Meituan Technology Team

Aug 5, 2016

Optimizing Distributed Queue Programming: Producer, Consumer & Middleware Tips

Producer Optimization

In a distributed queue, the producer often acts as a process‑forward node, receiving upstream data, processing it, and forwarding the result to the queue. This pattern creates two main concerns: throughput and persistence.

Throughput Issue

The producer must balance incoming request rate, processing capacity, and outbound queue speed to avoid overflow. When the queue becomes a bottleneck, threads accumulate, consuming memory and potentially causing a cascade failure.

Cache Optimization

To mitigate overflow, a memory cache can be introduced between request receipt and processing. Requests are stored in memory (sacrificing immediate consistency) and processed by a pool of background threads, dramatically reducing per‑request memory overhead and increasing overall throughput. This follows a CAP‑inspired trade‑off: lower consistency for higher availability.

Batch Write Optimization

When a producer writes to the queue, network latency, queue performance, and acknowledgment overhead can limit throughput. Batching multiple messages into a single write reduces interaction count, network traffic, and acknowledgment cost, especially for high‑throughput, low‑payload messages.

Persistence Optimization

Because systems may crash, sensitive data (e.g., billing or ticket orders) must be persisted. Two common persistence triggers are time‑based (periodic) and volume‑based (when cached request count exceeds a threshold). The choice depends on data sensitivity, request volume, and persistence latency.

Middleware Selection

Message middleware provides the core capabilities needed for distributed queues: message reception, distribution, storage, and retrieval. When choosing a middleware, engineers should consider functionality, performance (throughput and latency), reliability (high‑availability broker replication, persistent storage), client language support, and delivery policies (at‑most‑once, at‑least‑once, exactly‑once). Configuration options such as acknowledgment mechanisms, batch processing, partitioning, and persistence directly affect these metrics.

Consumer Optimization

Consumers are the true data‑processing side of a distributed queue and face challenges around ordering, serialization, frequency control, and integrity.

Ordering

When multiple consumers process messages that must be applied in strict order, a naïve multi‑consumer design can cause newer states to be overwritten by older ones. Ensuring ordered processing often requires an active/passive leader election so that only one instance handles ordered streams at a time.

Serialization

Serializability guarantees atomic, recoverable operations. Achieving it across distributed consumers is complex and typically requires external coordination rather than reinventing the wheel.

Frequency Control

Limiting consumption frequency can reduce cost (e.g., paid downstream services) and protect downstream systems from overload. Strategies include rate limiting and batching.

Integrity and Consistency

Multi‑threaded or multi‑process consumers increase the risk of data races and inconsistency. Proper locking, idempotent processing, and careful state management are essential.

Singleton Service Optimization

Deploying a single consumer instance eliminates ordering, serialization, and integrity problems, but reduces availability. Redundancy via active/passive leader election (e.g., ZooKeeper‑based) can restore high availability while preserving the benefits of a singleton.

Leader Election Architecture

Typical algorithms include Paxos and ZooKeeper’s ZAB. A common pattern is to create an EPHEMERAL SEQUENCE znode under an election path, watch the predecessor node, and assume leadership when the smallest node exists.

Let ELECTION be a path of choice of the application. To volunteer to be a leader:
1. Create znode z with path "ELECTION/guid-n_" with both SEQUENCE and EPHEMERAL flags;
2. Let C be the children of "ELECTION", and i be the sequence number of z;
3. Watch for changes on "ELECTION/guid-n_j", where j is the largest sequence number such that j < i and n_j is a znode in C;
Upon receiving a notification of znode deletion:
1. Let C be the new set of children of ELECTION;
2. If z is the smallest node in C, then execute leader procedure;
3. Otherwise, watch for changes on "ELECTION/guid-n_j" ...

Leader Handover Architecture

During a handover, learners (non‑leader instances) receive the new leader token. The former leader must finish in‑flight work before stepping down, while the new leader performs initialization. This three‑phase process (Inauguration, Execution, HandOver) mirrors real‑world election transitions.

An interface such as ILeaderCareer can encapsulate these phases:

public interface ILeaderCareer {
    public void inaugurate();
    public void handOver();
    public boolean execute();
}

Deduplication Optimization

High‑frequency duplicate requests waste resources. Deduplication targets the high‑repeat consumption model where many consumers may process the same request simultaneously.

Model

Requests are classified by repeat length: no repeat, sparse repeat (repeat length > queue length), and high repeat (repeat length < queue length). Only the high‑repeat model benefits from deduplication.

Deduplication State Machine

The state machine has four states: Init, Process, Block, Decline. It enforces two goals: processing uniqueness (only one consumer handles a request) and waiting uniqueness (only one consumer waits for the same request).

State transitions:

Init → Process on first enQueue (head request).

Process → Block when a second consumer attempts enQueue for the same request.

Block → Process when the head finishes ( deQueue).

Block → Decline if a third consumer tries to enqueue, which is then rejected.

synchronized ActionEnum enQueue(long id) {
    if (tail != INIT_QUEUE_ID) {
        return DECLINE;
    }
    if (head == INIT_QUEUE_ID) {
        head = id;
        return ACCEPT;
    } else {
        tail = id;
        return BLOCK;
    }
}

synchronized boolean deQueue(long id) {
    head = tail;
    tail = INIT_QUEUE_ID;
    return true;
}

Implementation

A QueueCoordinator<T> manages all per‑request state machines using a ConcurrentHashMap. Enqueue retrieves or creates the state machine, then invokes its enQueue. Depending on the returned enum, the business layer either processes, discards, or waits for the request.

public boolean enQueue(T key) {
    _loggingStastic();
    TrafficAutomate trafficAutomate = key2Status.get(key);
    if (trafficAutomate == null) {
        trafficAutomate = new TrafficAutomate();
        TrafficAutomate old = key2Status.putIfAbsent(key, trafficAutomate);
        if (old != null) trafficAutomate = old;
    }
    long threadId = Thread.currentThread().getId();
    ActionEnum action = trafficAutomate.enQueue(threadId);
    if (action == DECLINE) return false;
    if (action == ACCEPT) return true;
    long start = System.currentTimeMillis();
    while (System.currentTimeMillis() - start <= timeout) {
        _nonExceptionSleep(NAP_TIME_IN_MILL);
        if (trafficAutomate.isHead(threadId)) return true;
    }
    trafficAutomate.evictHeadByForce(threadId);
    return true;
}

public void deQueue(T key) {
    TrafficAutomate trafficAutomate = key2Status.get(key);
    if (trafficAutomate == null) {
        logger.error("key {} doesn't exist ", key);
        return;
    }
    long threadId = Thread.currentThread().getId();
    trafficAutomate.deQueue(threadId);
}

Illustrative Diagrams

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Java Performance Optimization State Machine Leader Election Middleware Selection distributed queues

Written by

Meituan Technology Team

Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.