How TDMQ Pulsar’s Cluster‑Level and Topic‑Partition Throttling Keeps Your Messaging System Stable
This article explains why high‑throughput producers and consumers can saturate CPU, memory, network and disk I/O in TDMQ Pulsar clusters, describes the built‑in cluster‑level distributed and topic‑partition rate‑limiting mechanisms, and provides practical guidance for configuration, monitoring, and troubleshooting.
Why Throttling Is Needed
In high‑throughput scenarios, producers and consumers can saturate CPU, memory, network, and disk I/O, causing latency spikes, message loss, or even cluster‑wide unavailability. Throttling protects cluster resources and maintains overall stability.
Throttling Mechanisms
Cluster‑level Distributed Throttling
Applies to professional clusters. A 1‑second sliding window limits the number of messages processed per second. For example, a TPS limit of 100 means at most 100 messages are processed in any 1‑second window; excess requests are delayed until the window resets.
Producer impact: increased send latency, possible timeouts.
Consumer impact: end‑to‑end delay, possible backlog.
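The window logic described above can be pictured with a minimal sketch in Java. It uses a simplified fixed 1‑second window rather than TDMQ Pulsar's actual sliding‑window implementation, and the class and field names are purely illustrative.

```java
// Illustrative only: a simplified fixed 1-second window counter that mimics
// the throttling behavior described above (not TDMQ Pulsar source code).
final class OneSecondTpsLimiter {
    private final long tpsLimit;          // e.g. 100 messages per second
    private long windowStartMillis = System.currentTimeMillis();
    private long countInWindow = 0;

    OneSecondTpsLimiter(long tpsLimit) {
        this.tpsLimit = tpsLimit;
    }

    // Returns true if the message fits into the current 1-second window,
    // false if it must wait until the window resets.
    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStartMillis >= 1000) { // window elapsed: reset the quota
            windowStartMillis = now;
            countInWindow = 0;
        }
        if (countInWindow < tpsLimit) {
            countInWindow++;
            return true;
        }
        return false;                          // quota exhausted: request is delayed
    }
}
```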
Topic‑partition Throttling
Runs on every topic partition. A timer fires every 50 ms and checks whether the current 1‑second quota has been exceeded. When the quota is exhausted, the producer's read channel is closed for up to one second; dispatch to consumers is paused in the same way.
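The 50 ms check loop can be sketched as follows. This is only an illustration of the mechanism described above, not TDMQ Pulsar's implementation; the class, field, and quota names are hypothetical.

```java
// Illustrative sketch of the per-partition check described above: a timer
// fires every 50 ms, compares the messages published in the current 1-second
// window against the partition quota, and toggles reads on the producer channel.
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class PartitionPublishLimiter {
    private final long msgQuotaPerSecond;
    private final AtomicLong publishedInWindow = new AtomicLong();
    private volatile boolean channelReadable = true;

    PartitionPublishLimiter(long msgQuotaPerSecond, ScheduledExecutorService timer) {
        this.msgQuotaPerSecond = msgQuotaPerSecond;
        // Check the quota every 50 ms, as described in the text.
        timer.scheduleAtFixedRate(this::checkQuota, 50, 50, TimeUnit.MILLISECONDS);
        // Reset the window and re-enable reads once per second.
        timer.scheduleAtFixedRate(this::resetWindow, 1, 1, TimeUnit.SECONDS);
    }

    void recordPublish(long numMessages) {
        publishedInWindow.addAndGet(numMessages);
    }

    private void checkQuota() {
        if (publishedInWindow.get() >= msgQuotaPerSecond) {
            channelReadable = false;   // stop reading publish requests from this producer
        }
    }

    private void resetWindow() {
        publishedInWindow.set(0);
        channelReadable = true;        // quota restored for the next 1-second window
    }

    boolean isChannelReadable() {
        return channelReadable;
    }
}
```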
Practical Guidance
Choose appropriate cluster specifications: analyze peak production and consumption rates, set throttling thresholds in proportion to them, and perform load testing before launch.
Avoid incorrect delayed‑message settings: do not set DeliverAfter or DeliverAt on non‑delayed messages; any value set there triggers delayed‑message accounting (see the first sketch after this list).
Configure alerts: trigger warnings when production/consumption rates or bandwidth exceed 80% of the quota, and when the throttling count increases.
Scale partitions: if a topic's TPS or bandwidth approaches the per‑partition limit, increase the number of partitions (see the second sketch after this list).
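A minimal sketch of the delayed‑message guidance, assuming the standard Pulsar Java client; the service URL and topic names are placeholders.

```java
// Only call deliverAfter()/deliverAt() for messages that genuinely need
// delayed delivery; any value set there is counted as a delayed message.
import org.apache.pulsar.client.api.*;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

public class DelayedMessageExample {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://broker.example.com:6650") // placeholder address
                .build();
        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://tenant/ns/notifications")   // placeholder topic
                .create();

        // Normal message: do NOT set deliverAfter/deliverAt.
        producer.newMessage()
                .value("immediate".getBytes(StandardCharsets.UTF_8))
                .send();

        // Genuinely delayed message: deliver 10 minutes from now.
        producer.newMessage()
                .value("delayed".getBytes(StandardCharsets.UTF_8))
                .deliverAfter(10, TimeUnit.MINUTES)
                .send();

        producer.close();
        client.close();
    }
}
```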
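And a minimal sketch of increasing the partition count, assuming direct access to the Pulsar admin API (on TDMQ Pulsar this is typically done from the console); the admin endpoint, topic name, and target partition count are placeholders.

```java
// Raise the partition count of a partitioned topic; partitions can only be
// increased, never decreased.
import org.apache.pulsar.client.admin.PulsarAdmin;

public class ScalePartitions {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://broker.example.com:8080") // placeholder admin endpoint
                .build();

        admin.topics().updatePartitionedTopic(
                "persistent://tenant/ns/orders", 8);              // placeholder topic and target count

        admin.close();
    }
}
```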
Common Phenomena and Q&A
Q1: Why does throttling occur even when minute‑level metrics are below the instance spec? Throttling is enforced per second, so a traffic spike within a single second can exceed the 1‑second quota even though the minute average stays low. For example, with a 1,000 TPS quota, a burst of 3,000 messages in one second is throttled even if only 6,000 messages arrive over the entire minute (an average of 100 messages per second).
Q2: Why can instantaneous production/consumption exceed the instance spec? In a distributed cluster, each broker enforces its quota locally, so the combined instantaneous throughput may temporarily exceed the overall spec.
Q3: How to detect throttling? In the cluster monitoring console, a throttling count greater than zero indicates that throttling has occurred.
Implementation Details
Producer Side
The server tracks a 1‑second window per producer connection. When the quota is exhausted, the server closes the producer's read channel, i.e. it stops reading new requests from the connection. The read channel is reopened at the start of the next window, and the pending requests are then processed.
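From the client's point of view this shows up as slower or timed‑out sends. Below is a minimal sketch of handling that case, assuming the standard Pulsar Java client; the service URL and topic are placeholders.

```java
// When the broker pauses reads under throttling, in-flight sends simply take
// longer; if throttling outlasts sendTimeout, the future completes exceptionally.
import org.apache.pulsar.client.api.*;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

public class ThrottleAwareProducer {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://broker.example.com:6650") // placeholder address
                .build();

        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://tenant/ns/orders")          // placeholder topic
                .sendTimeout(30, TimeUnit.SECONDS)               // give throttled sends room to complete
                .blockIfQueueFull(true)                          // apply back-pressure instead of failing locally
                .create();

        producer.sendAsync("hello".getBytes(StandardCharsets.UTF_8))
                .thenAccept(msgId -> System.out.println("persisted as " + msgId))
                .exceptionally(ex -> {
                    // Under sustained throttling, sends can exceed sendTimeout.
                    System.err.println("send failed, consider retry with backoff: " + ex);
                    return null;
                });

        producer.flush();
        producer.close();
        client.close();
    }
}
```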
Consumer Side
When the consumption quota is exhausted, the server stops pushing messages for up to one second. Consumers experience increased end‑to‑end latency and possible message backlog.
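On the consumer side, one rough way to surface this is to compare each message's publish time with the local clock at receive time. Below is a minimal sketch, assuming the standard Pulsar Java client and reasonably synchronized clocks; the service URL, topic, and subscription name are placeholders.

```java
// When the broker pauses dispatch under throttling, receive() simply blocks
// longer; rising end-to-end latency is a signal of throttling or backlog.
import org.apache.pulsar.client.api.*;
import java.util.concurrent.TimeUnit;

public class ThrottleAwareConsumer {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://broker.example.com:6650") // placeholder address
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://tenant/ns/orders")          // placeholder topic
                .subscriptionName("order-processor")             // placeholder subscription
                .subscribe();

        while (true) {
            Message<byte[]> msg = consumer.receive(5, TimeUnit.SECONDS);
            if (msg == null) {
                continue;                                        // dispatch paused or no traffic
            }
            long e2eMillis = System.currentTimeMillis() - msg.getPublishTime();
            if (e2eMillis > 1000) {
                System.err.println("end-to-end latency " + e2eMillis + " ms, possible throttling/backlog");
            }
            consumer.acknowledge(msg);
        }
    }
}
```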
Summary
Understanding the 1‑second sliding‑window, soft‑limit throttling model for both cluster‑level and topic‑partition dimensions enables users to configure appropriate limits, set proactive alerts, and scale resources to maintain stable, high‑performance messaging under heavy load.