How TDMQ Pulsar’s Cluster‑Level and Topic‑Partition Throttling Keeps Your Messaging System Stable
This article explains why high‑throughput producers and consumers can saturate CPU, memory, network and disk I/O in TDMQ Pulsar clusters, describes the built‑in cluster‑level distributed and topic‑partition rate‑limiting mechanisms, and provides practical guidance for configuration, monitoring, and troubleshooting.
Why Throttling Is Needed
In high‑throughput scenarios, producers and consumers can saturate CPU, memory, network, and disk I/O, causing latency spikes, message loss, or even cluster‑wide unavailability. Throttling protects cluster resources and maintains overall stability.
Throttling Mechanisms
Cluster‑level Distributed Throttling
Applies to professional clusters. A 1‑second sliding window limits the number of messages processed per second. For example, a TPS limit of 100 means at most 100 messages are processed in any 1‑second window; excess requests are delayed until the window resets.
Producer impact: increased send latency, possible timeouts.
Consumer impact: end‑to‑end delay, possible backlog.
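The window logic described above can be pictured with a minimal sketch in Java. It uses a simplified fixed 1‑second window rather than TDMQ Pulsar's actual sliding‑window implementation, and the class and field names are purely illustrative.

```java
// Illustrative only: a simplified fixed 1-second window counter that mimics
// the throttling behavior described above (not TDMQ Pulsar source code).
final class OneSecondTpsLimiter {
    private final long tpsLimit;          // e.g. 100 messages per second
    private long windowStartMillis = System.currentTimeMillis();
    private long countInWindow = 0;

    OneSecondTpsLimiter(long tpsLimit) {
        this.tpsLimit = tpsLimit;
    }

    // Returns true if the message fits into the current 1-second window,
    // false if it must wait until the window resets.
    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStartMillis >= 1000) { // window elapsed: reset the quota
            windowStartMillis = now;
            countInWindow = 0;
        }
        if (countInWindow < tpsLimit) {
            countInWindow++;
            return true;
        }
        return false;                          // quota exhausted: request is delayed
    }
}
```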
Topic‑partition Throttling
Runs on every topic partition. A timer fires every 50 ms and checks whether the current 1‑second quota has been exceeded. When the quota is exhausted, the producer's read channel is closed for up to one second; dispatch to consumers is paused in the same way.
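The 50 ms check loop can be sketched as follows. This is only an illustration of the mechanism described above, not TDMQ Pulsar's implementation; the class, field, and quota names are hypothetical.

```java
// Illustrative sketch of the per-partition check described above: a timer
// fires every 50 ms, compares the messages published in the current 1-second
// window against the partition quota, and toggles reads on the producer channel.
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class PartitionPublishLimiter {
    private final long msgQuotaPerSecond;
    private final AtomicLong publishedInWindow = new AtomicLong();
    private volatile boolean channelReadable = true;

    PartitionPublishLimiter(long msgQuotaPerSecond, ScheduledExecutorService timer) {
        this.msgQuotaPerSecond = msgQuotaPerSecond;
        // Check the quota every 50 ms, as described in the text.
        timer.scheduleAtFixedRate(this::checkQuota, 50, 50, TimeUnit.MILLISECONDS);
        // Reset the window and re-enable reads once per second.
        timer.scheduleAtFixedRate(this::resetWindow, 1, 1, TimeUnit.SECONDS);
    }

    void recordPublish(long numMessages) {
        publishedInWindow.addAndGet(numMessages);
    }

    private void checkQuota() {
        if (publishedInWindow.get() >= msgQuotaPerSecond) {
            channelReadable = false;   // stop reading publish requests from this producer
        }
    }

    private void resetWindow() {
        publishedInWindow.set(0);
        channelReadable = true;        // quota restored for the next 1-second window
    }

    boolean isChannelReadable() {
        return channelReadable;
    }
}
```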
Practical Guidance
Choose appropriate cluster specifications: analyze peak production and consumption rates, set throttling thresholds in proportion to them, and perform load testing before launch.
Avoid incorrect delayed‑message settings: do not set DeliverAfter or DeliverAt on non‑delayed messages; any value set there triggers delayed‑message accounting (see the first sketch after this list).
Configure alerts: trigger warnings when production/consumption rates or bandwidth exceed 80% of the quota, and when the throttling count increases.
Scale partitions: if a topic's TPS or bandwidth approaches the per‑partition limit, increase the number of partitions (see the second sketch after this list).
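A minimal sketch of the delayed‑message guidance, assuming the standard Pulsar Java client; the service URL and topic names are placeholders.

```java
// Only call deliverAfter()/deliverAt() for messages that genuinely need
// delayed delivery; any value set there is counted as a delayed message.
import org.apache.pulsar.client.api.*;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

public class DelayedMessageExample {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://broker.example.com:6650") // placeholder address
                .build();
        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://tenant/ns/notifications")   // placeholder topic
                .create();

        // Normal message: do NOT set deliverAfter/deliverAt.
        producer.newMessage()
                .value("immediate".getBytes(StandardCharsets.UTF_8))
                .send();

        // Genuinely delayed message: deliver 10 minutes from now.
        producer.newMessage()
                .value("delayed".getBytes(StandardCharsets.UTF_8))
                .deliverAfter(10, TimeUnit.MINUTES)
                .send();

        producer.close();
        client.close();
    }
}
```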
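And a minimal sketch of increasing the partition count, assuming direct access to the Pulsar admin API (on TDMQ Pulsar this is typically done from the console); the admin endpoint, topic name, and target partition count are placeholders.

```java
// Raise the partition count of a partitioned topic; partitions can only be
// increased, never decreased.
import org.apache.pulsar.client.admin.PulsarAdmin;

public class ScalePartitions {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://broker.example.com:8080") // placeholder admin endpoint
                .build();

        admin.topics().updatePartitionedTopic(
                "persistent://tenant/ns/orders", 8);              // placeholder topic and target count

        admin.close();
    }
}
```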
Common Phenomena and Q&A
Q1: Why does throttling occur even when minute‑level metrics are below the instance spec? Throttling is enforced per second, so a traffic spike within a single second can exceed the 1‑second quota even though the minute average stays low. For example, with a 1,000 TPS quota, a burst of 3,000 messages in one second is throttled even if only 6,000 messages arrive over the entire minute (an average of 100 messages per second).
Q2: Why can instantaneous production/consumption exceed the instance spec? In a distributed cluster, each broker enforces its quota locally, so the combined instantaneous throughput may temporarily exceed the overall spec.
Q3: How to detect throttling? In the cluster monitoring console, a throttling count greater than zero indicates that throttling has occurred.
Implementation Details
Producer Side
The server tracks a 1‑second window per producer connection. When the quota is exhausted, the server closes the producer's read channel, i.e. it stops reading new requests from the connection. The read channel is reopened at the start of the next window, and the pending requests are then processed.
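From the client's point of view this shows up as slower or timed‑out sends. Below is a minimal sketch of handling that case, assuming the standard Pulsar Java client; the service URL and topic are placeholders.

```java
// When the broker pauses reads under throttling, in-flight sends simply take
// longer; if throttling outlasts sendTimeout, the future completes exceptionally.
import org.apache.pulsar.client.api.*;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

public class ThrottleAwareProducer {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://broker.example.com:6650") // placeholder address
                .build();

        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://tenant/ns/orders")          // placeholder topic
                .sendTimeout(30, TimeUnit.SECONDS)               // give throttled sends room to complete
                .blockIfQueueFull(true)                          // apply back-pressure instead of failing locally
                .create();

        producer.sendAsync("hello".getBytes(StandardCharsets.UTF_8))
                .thenAccept(msgId -> System.out.println("persisted as " + msgId))
                .exceptionally(ex -> {
                    // Under sustained throttling, sends can exceed sendTimeout.
                    System.err.println("send failed, consider retry with backoff: " + ex);
                    return null;
                });

        producer.flush();
        producer.close();
        client.close();
    }
}
```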
Consumer Side
When the consumption quota is exhausted, the server stops pushing messages for up to one second. Consumers experience increased end‑to‑end latency and possible message backlog.
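On the consumer side, one rough way to surface this is to compare each message's publish time with the local clock at receive time. Below is a minimal sketch, assuming the standard Pulsar Java client and reasonably synchronized clocks; the service URL, topic, and subscription name are placeholders.

```java
// When the broker pauses dispatch under throttling, receive() simply blocks
// longer; rising end-to-end latency is a signal of throttling or backlog.
import org.apache.pulsar.client.api.*;
import java.util.concurrent.TimeUnit;

public class ThrottleAwareConsumer {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://broker.example.com:6650") // placeholder address
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://tenant/ns/orders")          // placeholder topic
                .subscriptionName("order-processor")             // placeholder subscription
                .subscribe();

        while (true) {
            Message<byte[]> msg = consumer.receive(5, TimeUnit.SECONDS);
            if (msg == null) {
                continue;                                        // dispatch paused or no traffic
            }
            long e2eMillis = System.currentTimeMillis() - msg.getPublishTime();
            if (e2eMillis > 1000) {
                System.err.println("end-to-end latency " + e2eMillis + " ms, possible throttling/backlog");
            }
            consumer.acknowledge(msg);
        }
    }
}
```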
Summary
Understanding the 1‑second sliding‑window, soft‑limit throttling model for both cluster‑level and topic‑partition dimensions enables users to configure appropriate limits, set proactive alerts, and scale resources to maintain stable, high‑performance messaging under heavy load.