
Evolution and Design of Bilibili Customer Service Seat Scheduling System

The article traces Bilibili’s customer‑service seat scheduling system from its initial balanced‑distribution algorithm and Redis‑based priority queues for live chat and ticket handling, through fairness‑focused saturation limits and virtual‑queue mechanisms, to planned dynamic tuning and expertise‑aware routing for future scalability.

Bilibili Tech

This article provides a comprehensive technical analysis of the evolution of Bilibili's customer service seat scheduling system, covering both online chat and ticket‑based support. It explains why sophisticated scheduling is required, especially during large‑scale events that generate sudden spikes in traffic.

Background and Challenges

The core goal is to deliver a better user experience with fewer service agents while avoiding wasted agent time. The challenges include heterogeneous agent skills, varying service durations, unpredictable traffic bursts, and the need to respect agents' work/rest cycles.

Online Seat Scheduling – Phase 1

Four primary allocation strategies were evaluated:

Balanced distribution – evenly spreads incoming requests among agents.

Familiar‑customer priority – routes a user to an agent who previously handled them.

Last‑service priority – assigns the request to the agent who served the user most recently.

Designated allocation – assigns based on custom business rules.

Balanced distribution was chosen as the default and implemented with the Strategy pattern, so that other policies can be swapped in easily.
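To make the policy switch concrete, here is a minimal Python sketch of the Strategy pattern applied to allocation. The article does not show the real implementation or its language, so every name here (Agent, AllocationStrategy, Scheduler, the fallback argument) is hypothetical, and only two of the four policies are sketched.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class Agent:
    agent_id: str
    active_sessions: int
    saturation_limit: int

    def has_capacity(self) -> bool:
        return self.active_sessions < self.saturation_limit


class AllocationStrategy(ABC):
    """Common interface that every allocation policy implements."""

    @abstractmethod
    def pick(self, agents: Sequence[Agent], user_id: str) -> Optional[Agent]:
        ...


class BalancedDistribution(AllocationStrategy):
    """Pick the least-loaded agent that still has capacity."""

    def pick(self, agents: Sequence[Agent], user_id: str) -> Optional[Agent]:
        available = [a for a in agents if a.has_capacity()]
        if not available:
            return None
        return min(available, key=lambda a: a.active_sessions)


class LastServicePriority(AllocationStrategy):
    """Prefer the agent who served this user most recently.

    Falling back to another policy when that agent is unavailable
    is an assumption of this sketch, not something the article states.
    """

    def __init__(self, last_agent_by_user: Dict[str, str],
                 fallback: AllocationStrategy) -> None:
        self.last_agent_by_user = last_agent_by_user
        self.fallback = fallback

    def pick(self, agents: Sequence[Agent], user_id: str) -> Optional[Agent]:
        wanted = self.last_agent_by_user.get(user_id)
        for agent in agents:
            if agent.agent_id == wanted and agent.has_capacity():
                return agent
        return self.fallback.pick(agents, user_id)


class Scheduler:
    """Delegates to whichever strategy is currently configured."""

    def __init__(self, strategy: AllocationStrategy) -> None:
        self.strategy = strategy

    def assign(self, agents: Sequence[Agent], user_id: str) -> Optional[Agent]:
        return self.strategy.pick(agents, user_id)
```

Because the scheduler only depends on the interface, changing policy is a matter of assigning a different object to scheduler.strategy; the calling code stays the same.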

Balanced Distribution Logic

Consider two agents, A and B, in the same skill group. The system compares their current loads and applies the following rules:

If A’s active sessions are fewer than B’s and both are below their saturation limits, new requests go to A.

If the loads are equal and both are under the limits, the request is assigned randomly.

If an agent has reached its saturation limit, the request goes to any agent still below the limit.

If all agents are saturated, the request enters a queue.
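The four rules above can be sketched as a single function that generalizes from two agents to any number in a skill group. This is an illustration under assumptions: the Agent fields, the allocate name, and the in-memory deque standing in for the real queue are all hypothetical.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Agent:
    agent_id: str
    active_sessions: int
    saturation_limit: int


def allocate(agents: List[Agent], request_id: str,
             waiting: Deque[str]) -> Optional[Agent]:
    """Assign a request to an agent, or queue it if everyone is saturated."""
    # Rule 3: saturated agents are never candidates.
    available = [a for a in agents if a.active_sessions < a.saturation_limit]

    # Rule 4: nobody has capacity, so the request waits in the queue.
    if not available:
        waiting.append(request_id)
        return None

    # Rule 1: prefer the lowest current load.
    lowest = min(a.active_sessions for a in available)
    candidates = [a for a in available if a.active_sessions == lowest]

    # Rule 2: equal loads are resolved by a random pick.
    chosen = random.choice(candidates)
    chosen.active_sessions += 1
    return chosen
```

A return value of None signals that the request was queued rather than assigned, which keeps the caller's handling of the two outcomes explicit.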

Queue Management with Redis Zset

The queue is implemented with Redis sorted sets (Zset), which keep members ordered by a numeric score (here, a weight derived from the enqueue type and time) and support fast rank lookups. Typical commands are:

ZADD tobeallocated_ticket_list_{group_id} {weight} {ticket_id}

ZRANGE tobeallocated_ticket_list_{group_id} 0 N

ZREM tobeallocated_ticket_list_{group_id} {ticket_id}

ZSCORE tobeallocated_ticket_list_{group_id} {ticket_id}

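The weight, which is used as the Zset score, has two parts: a high‑order type component (release = 1, transfer = 2, explicit = 3, auto = 4) multiplied by 10¹⁰, plus a low‑order component equal to the seconds elapsed since a fixed start time. Because ZRANGE returns members in ascending score order, a lower type value is served first, and entries of the same type are served first in, first out.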
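As a worked example of the weight formula, the following Python sketch computes and decodes a weight. The type values and the 10¹⁰ multiplier come from the text; the QUEUE_EPOCH value and the function names are assumptions chosen only for illustration.

```python
from enum import IntEnum
from typing import Tuple

TYPE_MULTIPLIER = 10**10          # from the article: type component * 10^10
QUEUE_EPOCH = 1_600_000_000       # hypothetical fixed start time (Unix seconds)


class EnqueueType(IntEnum):
    RELEASE = 1
    TRANSFER = 2
    EXPLICIT = 3
    AUTO = 4


def queue_weight(kind: EnqueueType, enqueued_at: int,
                 epoch: int = QUEUE_EPOCH) -> int:
    """Return type * 10^10 + seconds elapsed since the fixed start time."""
    elapsed = int(enqueued_at) - epoch
    # The low-order part must stay below the multiplier, otherwise it
    # would spill into the type component and break the ordering.
    if not 0 <= elapsed < TYPE_MULTIPLIER:
        raise ValueError("elapsed seconds out of range for the low-order part")
    return int(kind) * TYPE_MULTIPLIER + elapsed


def split_weight(weight: int) -> Tuple[EnqueueType, int]:
    """Decode a weight back into (type, elapsed seconds)."""
    kind, elapsed = divmod(weight, TYPE_MULTIPLIER)
    return EnqueueType(kind), elapsed
```

For instance, a release enqueued 100 seconds after the start time gets weight 10000000100, which sorts ahead of any auto entry regardless of when that entry was enqueued. The largest possible weight here is below 5 × 10¹⁰, well inside the range in which Redis's double‑precision scores represent integers exactly (up to 2⁵³).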

Virtual Queue (Virtual Waiting Area)

To avoid wasting agent capacity during massive spikes, users beyond a configurable rank are moved to a virtual waiting area. The system probes these users; if they respond within a timeout, they are promoted back to the normal queue, otherwise they are dropped.
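The probe-and-promote flow might look roughly like the in-memory Python sketch below. The article does not say when probes are sent, where promoted users re-enter the queue, or how state is stored, so the WaitingArea class, its method names, and the choice to append promoted users to the tail of the normal queue are all assumptions.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class WaitingArea:
    def __init__(self, rank_threshold: int, probe_timeout: float) -> None:
        self.rank_threshold = rank_threshold   # configurable rank cutoff
        self.probe_timeout = probe_timeout     # seconds allowed for a reply
        self.real: Deque[str] = deque()        # the normal queue
        # user_id -> time the probe was sent (None if not probed yet)
        self.virtual: Dict[str, Optional[float]] = {}

    def enqueue(self, user_id: str) -> str:
        """Users beyond the rank cutoff go to the virtual waiting area."""
        if len(self.real) < self.rank_threshold:
            self.real.append(user_id)
            return "real"
        self.virtual[user_id] = None
        return "virtual"

    def probe(self, user_id: str, now: float) -> None:
        """Record that a liveness probe was sent to a virtual-area user."""
        if user_id in self.virtual:
            self.virtual[user_id] = now

    def on_reply(self, user_id: str, now: float) -> bool:
        """Promote the user if the reply arrived within the timeout."""
        sent = self.virtual.get(user_id)
        if sent is None or now - sent > self.probe_timeout:
            return False
        del self.virtual[user_id]
        self.real.append(user_id)
        return True

    def drop_expired(self, now: float) -> List[str]:
        """Drop probed users who did not answer in time."""
        expired = [u for u, sent in self.virtual.items()
                   if sent is not None and now - sent > self.probe_timeout]
        for user_id in expired:
            del self.virtual[user_id]
        return expired
```

Passing the current time in as an argument, rather than reading a clock inside the class, keeps the timeout logic easy to test deterministically.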

Ticket Seat Scheduling – Phase 1

Ticket handling differs from live chat: a ticket is created first and then assigned to an agent. Allocation must be timely, accurate, fair, and priority‑aware. The same Redis Zset structure is used, with weight‑based ordering so that released tickets are prioritized over automatic assignments.

Phase 2 Improvements

Fairness issues were observed when agents with earlier login times received a disproportionate share of tickets. The solution introduced “instant saturation” limits (maximum concurrent tickets) alongside daily caps. Allocation now checks both limits before assigning a ticket, smoothing load across the day and preventing early‑login agents from monopolizing work.
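A minimal Python sketch of the two-limit check follows. The field names, the assign_ticket and close_ticket helpers, and the tie-breaking rule (fewest open tickets, then fewest tickets assigned today) are assumptions; the article only states that both limits are checked before a ticket is assigned.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TicketAgent:
    agent_id: str
    open_tickets: int       # tickets currently being handled
    assigned_today: int     # tickets assigned so far today
    instant_limit: int      # "instant saturation": max concurrent tickets
    daily_cap: int          # max tickets per day

    def can_take(self) -> bool:
        # Both limits must have headroom before a ticket is assigned.
        return (self.open_tickets < self.instant_limit
                and self.assigned_today < self.daily_cap)


def assign_ticket(agents: List[TicketAgent]) -> Optional[TicketAgent]:
    """Assign one ticket to an eligible agent, or return None to keep it queued."""
    eligible = [a for a in agents if a.can_take()]
    if not eligible:
        return None
    chosen = min(eligible, key=lambda a: (a.open_tickets, a.assigned_today))
    chosen.open_tickets += 1
    chosen.assigned_today += 1
    return chosen


def close_ticket(agent: TicketAgent) -> None:
    """Closing a ticket frees instant capacity but not the daily count."""
    agent.open_tickets = max(0, agent.open_tickets - 1)
```

With only a daily cap, an agent who logs in early can keep absorbing tickets until the cap is reached. The instant limit forces that agent to finish open work first, which leaves tickets available for agents who log in later.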

Phase 3 Outlook

Future enhancements include:

Adjusting virtual‑queue entry criteria based on actual waiting time rather than static rank.

Dynamic tuning of virtual‑queue ratios per skill group using recent session metrics.

Incorporating soft factors such as agent expertise, historical performance, and user feedback to further optimize ticket routing.

Conclusion

The described scheduling system demonstrates how a combination of balanced algorithms, Redis‑based priority queues, and adaptive virtual waiting mechanisms can handle high‑volume, bursty traffic while maintaining service quality and operational efficiency.
