Improving Load Balancing for a Compute‑Intensive Ticket Query Engine with a Pooling Strategy

The article analyzes why a round‑robin load‑balancing approach caused severe response‑time spikes in Ctrip's compute‑intensive international ticket query engine and demonstrates how switching to a proactive pooling model using a Redis‑backed queue eliminated the spikes and reduced average latency by about 20%.

Ctrip Technology

Background

In a compute‑intensive service, a single request can saturate every CPU core on its server. When two requests land on the same server, they compete for CPU and both take longer to finish, which raises the average processing time. Traditional load‑balancing methods such as round‑robin or random selection ignore server load, so they regularly place multiple concurrent requests on a single machine and degrade service quality.
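The effect of contention on average processing time can be illustrated with a small idealized calculation. This is a sketch under stated assumptions, not a measurement from the article: all requests arrive together, each needs the same amount of full-machine CPU time, and concurrent requests split the CPU evenly with no context-switch overhead. The function names are hypothetical.

```python
def shared_finish_times(n_requests, work):
    """Finish times when n equal requests share the CPU evenly.

    Assumption: each request gets 1/n of the machine, so every request
    finishes at n * work.
    """
    return [n_requests * work] * n_requests


def serial_finish_times(n_requests, work):
    """Finish times when the same requests run one at a time."""
    return [(i + 1) * work for i in range(n_requests)]


def mean(values):
    return sum(values) / len(values)


# Two requests, each needing 1.0 s of full-machine CPU time.
shared_avg = mean(shared_finish_times(2, 1.0))
serial_avg = mean(serial_finish_times(2, 1.0))
```

In this model, running two requests concurrently gives an average completion time of twice the single-request time, while running them one after another gives 1.5 times. Real contention adds scheduling and cache overhead on top, so the gap is unlikely to be smaller in practice.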

After a recent refactor, Ctrip's international ticket query engine became fully compute‑intensive, so each server should handle at most one request at a time (a maximum concurrency of one). The load balancer, however, was still round‑robin. Monitoring showed persistent response‑time spikes: a small number of long‑running “A‑type” requests (several seconds each) blocked dozens of short‑running “B‑type” requests (tens of milliseconds each).
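The blocking effect can be reproduced with a toy simulation. The sketch below is illustrative only: it models each server as handling one request at a time in FIFO order (queueing rather than CPU sharing), and the workload numbers (one 3-second request, then 20 ms requests every 100 ms, two servers) are assumptions chosen for the example, not figures from Ctrip.

```python
import heapq


def simulate_round_robin(arrivals, n_servers):
    """Assign request i to server i % n_servers, regardless of load.

    arrivals: list of (arrival_time, service_time), sorted by arrival_time.
    Returns the response time (finish - arrival) of each request.
    """
    free_at = [0.0] * n_servers
    latencies = []
    for i, (arrival, service) in enumerate(arrivals):
        server = i % n_servers
        start = max(arrival, free_at[server])  # wait if the server is busy
        free_at[server] = start + service
        latencies.append(free_at[server] - arrival)
    return latencies


def simulate_pooling(arrivals, n_servers):
    """Serve requests in arrival order on whichever server frees up first."""
    free_at = [0.0] * n_servers
    heapq.heapify(free_at)
    latencies = []
    for arrival, service in arrivals:
        earliest = heapq.heappop(free_at)  # next server to become idle
        finish = max(arrival, earliest) + service
        heapq.heappush(free_at, finish)
        latencies.append(finish - arrival)
    return latencies


def spike_workload(n_short=20):
    """One long A-type request at t=0, then short B-type requests every 100 ms."""
    return [(0.0, 3.0)] + [(0.1 * (i + 1), 0.02) for i in range(n_short)]


workload = spike_workload()
rr_latencies = simulate_round_robin(workload, n_servers=2)
pool_latencies = simulate_pooling(workload, n_servers=2)
```

With this workload, every second short request under round-robin lands behind the long request and waits more than two seconds, while under the pull-based model all short requests are served by the idle server without waiting.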

Pooling Solution

To address the issue, a new load‑balancing strategy called pooling was introduced. Instead of passively receiving requests, servers actively pull requests from a global queue, ensuring that each server processes at most one request at a time.

The pooling architecture consists of three roles:

submitor: receives external calls, enqueues requests to the queue, and forwards worker results back to the caller.

queue: a globally unique Redis list used as a buffer; lpush adds requests, brpop blocks workers until a request is available.

worker: continuously loops to brpop a request, processes it, and returns the result to the submitor.

This design guarantees that a server is either processing a request or waiting for one, eliminating the contention that caused spikes.
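The three roles can be sketched roughly as follows. This is a minimal illustration, not Ctrip's implementation. The article does not say how a worker returns its result, so the per-request reply list used here is an assumption, as are the key names, the JSON message format, and the function names. To keep the example runnable without a Redis server, it uses a small in-memory stand-in that exposes only lpush and brpop with the same call shape as redis-py; a real redis.Redis client could be passed in its place.

```python
import json
import threading
import uuid
from collections import defaultdict, deque

REQUEST_QUEUE = "ticket:requests"  # hypothetical key name


class FakeRedis:
    """In-memory stand-in for a Redis client; supports only lpush and brpop."""

    def __init__(self):
        self._lists = defaultdict(deque)
        self._cond = threading.Condition()

    def lpush(self, name, *values):
        with self._cond:
            for value in values:
                self._lists[name].appendleft(value)
            self._cond.notify_all()
            return len(self._lists[name])

    def brpop(self, keys, timeout=0):
        # Simplification: a single key only; timeout=0 blocks indefinitely.
        key = keys if isinstance(keys, str) else keys[0]
        with self._cond:
            ready = self._cond.wait_for(lambda: self._lists[key], timeout or None)
            if not ready:
                return None
            return key, self._lists[key].pop()


def submit(client, payload, timeout=5):
    """Submitor: enqueue one request, then wait for the worker's reply."""
    reply_key = "ticket:reply:" + uuid.uuid4().hex  # assumed reply channel
    message = json.dumps({"reply_to": reply_key, "payload": payload})
    client.lpush(REQUEST_QUEUE, message)
    item = client.brpop(reply_key, timeout=timeout)
    if item is None:
        raise TimeoutError("no worker answered in time")
    return json.loads(item[1])


def worker_loop(client, handler, stop):
    """Worker: pull one request, process it, reply, repeat.

    The worker only asks for the next request after finishing the current
    one, so it never holds more than one request at a time.
    """
    while not stop.is_set():
        item = client.brpop(REQUEST_QUEUE, timeout=1)
        if item is None:
            continue  # nothing queued; check the stop flag and wait again
        message = json.loads(item[1])
        result = handler(message["payload"])
        client.lpush(message["reply_to"], json.dumps(result))
```

A worker would typically run worker_loop in its main thread with the real query function as handler, and the submitor would call submit once per incoming call. Because lpush adds at the head and brpop removes from the tail, requests are served in arrival order.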

Effect

After switching from round‑robin to pooling, the average response time dropped by roughly 20% and the long‑tail spikes disappeared completely.

[Figures: response‑time charts for round‑robin mode and pooling mode; images not available.]
