
How to Throttle Read and Write Traffic in an Elasticsearch Cluster

The article explains why native Elasticsearch throttling is insufficient, introduces node‑level traffic control provided by Infinilabs Gateway, shows detailed configuration examples, parameter meanings, FAQ solutions, advanced tuning tips, and performance comparisons to protect clusters from overload.

Mingyi World Elasticsearch

Why node‑level traffic control?

Limitations of native throttling

Elasticsearch's built-in throttling is weak. Developers usually adjust the bulk size and the number of concurrent threads, which limits traffic only indirectly. This approach cannot precisely control bandwidth, since large documents can saturate the network instantly, and it lacks a global view, so overload on a single node can cause cluster-wide performance swings.

Node‑level traffic control model

Infinilabs Gateway introduces a node-level throttling model built on four core dimensions:

Traffic rate (QPS or bandwidth)

Concurrent connections

Circuit breaking on abnormal conditions

Wait‑queue management

Official documentation: https://docs.infinilabs.com/gateway/main/zh/docs/references/elasticsearch/

Configuration example (validated)

elasticsearch:
- name: prod
  enabled: true
  endpoint: https://127.0.0.1:9200
  basic_auth:
    username: elastic
    password: changeme
  traffic_control:
    enabled: true
    max_qps_per_node: 100   # per‑node max requests per second (test)
    max_bytes_per_node: 104857600   # 100 MB/s per node
    max_connection_per_node: 500   # max connections per node
    max_wait_time_in_ms: 5000   # queue timeout 5 s
entry:
- name: my_es_entry
  enabled: true
  router: my_router
  max_concurrency: 10000
  network:
    binding: 0.0.0.0:8000
    tls:
      enabled: true
- name: my_unsecure_es_entry
  enabled: true
  router: my_router
  max_concurrency: 10000
  network:
    binding: 0.0.0.0:8001
    tls:
      enabled: false
router:
- name: my_router
  enabled: true

Parameter reference

max_qps_per_node (int): node-level request-rate ceiling; the recommended value is calculated from the node specifications.

max_bytes_per_node (int): node-level bandwidth limit in bytes per second; a typical setting is 70-80 % of the network capacity.

max_connection_per_node (int): limits concurrent connections to avoid connection storms; suggested value is CPU cores × 200.

max_wait_time_in_ms (int): maximum time a request may stay in the buffer queue; the default is 10000 ms and can be reduced (e.g., to 5000 ms) according to business tolerance.
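The rules of thumb above can be turned into starting values with a few lines of arithmetic. The following is a minimal sketch, not part of the gateway: the function name `suggest_traffic_control` and its arguments are hypothetical, and it assumes the network capacity is given in bits per second. It leaves out `max_qps_per_node` because the article gives no formula for it.

```python
def suggest_traffic_control(cpu_cores: int,
                            nic_bits_per_sec: int,
                            bandwidth_fraction: float = 0.75,
                            max_wait_ms: int = 10000) -> dict:
    """Derive starting values from the rules of thumb in the parameter reference.

    bandwidth_fraction is an assumption: the midpoint of the 70-80 % range.
    """
    nic_bytes_per_sec = nic_bits_per_sec // 8  # NIC speed is usually quoted in bits
    return {
        "max_bytes_per_node": int(nic_bytes_per_sec * bandwidth_fraction),
        "max_connection_per_node": cpu_cores * 200,
        "max_wait_time_in_ms": max_wait_ms,
    }


if __name__ == "__main__":
    # Example: a 16-core node on a 1 Gbit/s link.
    for key, value in suggest_traffic_control(16, 1_000_000_000).items():
        print(f"{key}: {value}")
```

These numbers are only a starting point; the article's own recommendation is to refine them through pressure testing.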

Frequently asked questions

How to precisely control write rate?

traffic_control:
  max_qps_per_node: 500   # limit bulk operations per second
  max_bytes_per_node: 52428800   # limit to 50 MB/s write bandwidth

The two limits complement each other: the QPS cap contains high-frequency bursts of small documents, while the byte cap contains sudden bursts of large documents.
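To see which of the two limits takes effect for a given workload, it helps to compare them at the average request size. The sketch below is illustrative only. It assumes that a request is admitted only when both limits are satisfied, which the article implies but does not state, and `effective_request_rate` is a hypothetical helper name.

```python
from typing import Tuple


def effective_request_rate(max_qps: int,
                           max_bytes: int,
                           avg_request_bytes: int) -> Tuple[float, str]:
    """Return the sustainable requests/s and the name of the limit that binds."""
    # Requests per second that fit into the byte budget at this request size.
    bytes_bound = max_bytes / avg_request_bytes
    if bytes_bound < max_qps:
        return bytes_bound, "max_bytes_per_node"
    return float(max_qps), "max_qps_per_node"


if __name__ == "__main__":
    # Values from the configuration above: 500 QPS and 50 MB/s.
    for size in (10 * 1024, 1024 * 1024):  # 10 KiB and 1 MiB bulk bodies
        rate, limit = effective_request_rate(500, 52428800, size)
        print(f"{size} bytes/request: {rate:.0f} req/s, bound by {limit}")
```

Under these assumptions, 10 KiB bulk requests are held back by the QPS cap, while 1 MiB bulk requests hit the byte cap at about 50 requests per second.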

What happens after throttling is triggered?

Requests enter a buffer queue (default maximum wait 10 s).

The traffic_control.max_wait_time_in_ms parameter defines the longest wait; setting it to 5000 yields a 5 s timeout.

Requests that exceed the timeout receive 429 Too Many Requests (customizable).

Circuit breaking on abnormal conditions can be combined with allow_access_when_master_not_found for node-level failover.
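On the client side, requests rejected after the queue timeout are usually worth retrying with a delay instead of immediately. The following is a minimal sketch of that idea, assuming the rejection status is left at the default 429. `send_with_backoff` and the `send` callable are hypothetical names; `send` stands for whatever function issues one request through the gateway and returns the HTTP status code.

```python
import random
import time
from typing import Callable


def send_with_backoff(send: Callable[[], int],
                      max_attempts: int = 5,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it returns a status other than 429 or attempts run out."""
    status = 429
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter, so that throttled clients
            # do not all retry at the same moment.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    return status


if __name__ == "__main__":
    # Simulated gateway: throttled twice, then accepted.
    responses = iter([429, 429, 200])
    print(send_with_backoff(lambda: next(responses), base_delay=0.01))
```

The `sleep` parameter is injectable only to make the function easy to test without real waiting.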

Advanced tuning techniques

Dynamic weight allocation

Different workloads can be assigned differentiated quotas (example, not verified):

# Log cluster – focus on throughput
- name: logs-cluster
  traffic_control:
    max_bytes_per_node: 209715200

# Transaction cluster – focus on low latency
- name: order-cluster
  traffic_control:
    max_qps_per_node: 2000
    max_wait_time_in_ms: 2000

Monitoring integration

Metrics exposed by the gateway can be visualized in external dashboards to observe queue length, QPS, and bandwidth usage.

Effect comparison

Test scenario: a Python script generates high‑concurrency search requests (50 threads, 60 s) against index test_index_0618, targeting 1000 QPS. The script records successful and failed request counts, actual QPS, and achievement rate.
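The test script itself is not included in the article. A minimal sketch of such a load generator might look like the following. Everything in it is an assumption: the names `run_load_test` and `send` are hypothetical, actual QPS is defined here as successful requests divided by elapsed time, and achievement rate as actual QPS divided by target QPS.

```python
import threading
import time
from typing import Callable


def run_load_test(send: Callable[[], bool],
                  threads: int = 50,
                  duration_s: float = 60.0,
                  target_qps: float = 1000.0) -> dict:
    """Drive send() from several threads at a paced rate and summarize the run."""
    interval = threads / target_qps  # seconds between requests in each thread
    counts = {"ok": 0, "failed": 0}
    lock = threading.Lock()
    deadline = time.monotonic() + duration_s

    def worker() -> None:
        next_at = time.monotonic()
        while next_at < deadline:
            time.sleep(max(0.0, next_at - time.monotonic()))
            try:
                ok = send()
            except Exception:
                ok = False  # timeouts and connection errors count as failures
            with lock:
                counts["ok" if ok else "failed"] += 1
            next_at += interval

    start = time.monotonic()
    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    elapsed = time.monotonic() - start

    actual_qps = counts["ok"] / elapsed
    return {
        "ok": counts["ok"],
        "failed": counts["failed"],
        "actual_qps": actual_qps,
        "achievement_rate": actual_qps / target_qps,
    }


if __name__ == "__main__":
    # Stand-in for a real search request; replace with an HTTP call.
    print(run_load_test(lambda: True, threads=4, duration_s=1.0, target_qps=100))
```

In a real run, `send` would issue a search against test_index_0618 through the gateway entry (port 8000 or 8001 in the configuration above) and return whether the response was successful.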

Without throttling, the cluster exhibits QPS spikes and occasional timeouts. After node-level throttling is enabled, the QPS stabilizes, queue lengths stay within the configured limit, and latency improves.

Recommendation: perform baseline pressure testing in production, then iteratively fine‑tune the four parameters to match the observed capacity.

Written by

Mingyi World Elasticsearch

The leading WeChat public account for Elasticsearch fundamentals, advanced topics, and hands‑on practice. Join us to dive deep into the ELK Stack (Elasticsearch, Logstash, Kibana, Beats).
