How TencentOS “Ruyi” Achieves Network QoS for Mixed Online/Offline Workloads
This article explains TencentOS “Ruyi” network QoS, detailing its resource isolation concepts, tc+htb and cgroup configurations, performance testing, and the advantages of the Ruyi netqos scheme over traditional tc solutions for mixed online and offline container workloads.
What Is Network Resource Isolation
Network resource isolation controls bandwidth based on task priority, ensuring that high‑priority tasks receive sufficient bandwidth when resources are limited. The overall scheme includes both egress (sending) and ingress (receiving) bandwidth isolation.
Existing tc+htb+cgroup Netcls Solution
The traditional approach uses tc with htb and the cgroup net_cls.classid setting to differentiate online (high‑priority) and offline (low‑priority) containers. Two leaf classes are created: classid 0x10010 for high priority and 0x10020 for low priority.
The configuration assigns tc_offline to offline containers and tc_online to online containers, linking net_cls.classid with tc to give online tasks higher priority while allowing offline tasks to share remaining bandwidth when online demand is low.
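The two class IDs follow the net_cls encoding 0xAAAABBBB, where the upper 16 bits are the tc major handle and the lower 16 bits are the minor handle, so 0x10010 corresponds to tc class 1:10 and 0x10020 to 1:20. A minimal Python sketch of that mapping (the function names are illustrative, not part of any tool):

```python
def classid_to_handle(classid: int) -> str:
    """Convert a net_cls.classid value (0xAAAABBBB) to a tc 'major:minor' handle."""
    major = (classid >> 16) & 0xFFFF  # upper 16 bits: qdisc major number
    minor = classid & 0xFFFF          # lower 16 bits: class minor number
    return f"{major:x}:{minor:x}"     # tc handles are written in hexadecimal


def handle_to_classid(handle: str) -> int:
    """Convert a tc 'major:minor' handle back to a net_cls.classid integer."""
    major_text, minor_text = handle.split(":")
    return (int(major_text, 16) << 16) | int(minor_text, 16)


if __name__ == "__main__":
    for value in (0x10010, 0x10020):
        print(hex(value), "->", classid_to_handle(value))
```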
Configuration Steps
tc+htb example: create two leaf classes with priorities 0 (high) and 1 (low) and corresponding class IDs.
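The original commands are not reproduced in this article, so the following is only a sketch of what such a setup might look like. It builds the tc command lines as strings without executing them. The device name, the 10 Gbit/s link rate, and the split of guaranteed rates are assumptions chosen to resemble the test described later; only the class IDs 1:10 and 1:20 and the priorities 0 and 1 come from the text.

```python
def build_htb_commands(dev: str, link_mbit: int, offline_min_mbit: int) -> list[str]:
    """Return tc commands for one root htb qdisc with an online and an offline leaf."""
    online_mbit = link_mbit - offline_min_mbit  # assumed: online gets the rest as its guarantee
    return [
        # Root htb qdisc with handle 1:
        f"tc qdisc add dev {dev} root handle 1: htb",
        # Parent class capped at the link rate
        f"tc class add dev {dev} parent 1: classid 1:1 htb rate {link_mbit}mbit",
        # High-priority (online) leaf, matches net_cls.classid 0x10010
        f"tc class add dev {dev} parent 1:1 classid 1:10 htb "
        f"rate {online_mbit}mbit ceil {link_mbit}mbit prio 0",
        # Low-priority (offline) leaf, matches net_cls.classid 0x10020
        f"tc class add dev {dev} parent 1:1 classid 1:20 htb "
        f"rate {offline_min_mbit}mbit ceil {link_mbit}mbit prio 1",
        # cgroup filter: classify packets by the sender's net_cls.classid
        f"tc filter add dev {dev} parent 1: protocol ip prio 10 handle 1: cgroup",
    ]


if __name__ == "__main__":
    # "eth0", 10000 and 100 are placeholder values for illustration.
    for command in build_htb_commands("eth0", 10000, 100):
        print(command)
```

Because both leaves share the same ceil, the offline class can borrow idle bandwidth, while prio 0 lets the online class borrow first when both compete.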
cgroup classid configuration:
By setting net_cls.classid for online containers to a high‑priority ID and offline containers to a low‑priority ID, the system can pre‑empt offline bandwidth when online tasks require it.
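As a rough illustration, assigning the class IDs amounts to writing the value into each container's net_cls.classid file. The helper below is a sketch: the cgroup directory layout is an assumption (a cgroup v1 net_cls hierarchy), and the function name is hypothetical.

```python
from pathlib import Path

ONLINE_CLASSID = 0x10010   # high priority, tc class 1:10
OFFLINE_CLASSID = 0x10020  # low priority, tc class 1:20


def set_classid(cgroup_dir: Path, classid: int) -> None:
    """Write a class ID, formatted as hex, into the cgroup's net_cls.classid file."""
    (cgroup_dir / "net_cls.classid").write_text(f"{classid:#x}\n")


# Example usage (requires root; these paths are placeholders, not taken from the article):
#   set_classid(Path("/sys/fs/cgroup/net_cls/online"), ONLINE_CLASSID)
#   set_classid(Path("/sys/fs/cgroup/net_cls/offline"), OFFLINE_CLASSID)
```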
Test Results
During the first 5 seconds, online bandwidth is zero, allowing offline containers to use up to ~9 Gbit/s. At second 6, online tasks request bandwidth, pre‑empting offline traffic so that offline is limited to a guaranteed 100 Mbit/s.
The tc scheme satisfies basic bandwidth isolation requirements but incurs noticeable overhead.
Performance Overhead
The tc+htb implementation adds a global spin lock (root_lock) on each packet, leading to about 14 % overhead at 380 k packets per second.
In contrast, the Ruyi netqos token‑bucket approach incurs less than 1 % overhead, and only 0.06 % at 800 k packets per second.
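For readers unfamiliar with the technique, a token bucket can be sketched in a few lines. This is a generic textbook version in Python, not Ruyi's kernel implementation; the class name, the parameters, and the explicit `now` timestamps are choices made for the example, and nothing here reflects how Ruyi stores or synchronizes its buckets.

```python
class TokenBucket:
    """Generic token bucket: tokens refill at a fixed rate up to a burst limit."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float, now: float = 0.0):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = now            # timestamp of the last refill

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Refill for the elapsed time, then try to pay for one packet."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True
        return False  # not enough tokens: the packet is over the limit


if __name__ == "__main__":
    bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
    for t in (0.0, 0.5, 2.0):
        print(t, bucket.allow(1500, now=t))
```

The per-packet work is a handful of arithmetic operations on one small piece of state, which is what makes the approach cheap compared with queueing through a shared qdisc.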
Sending Path Design
Hook into netfilter LOCAL_OUT to locate the socket of each packet.
Check if the packet’s egress interface is subject to flow control; if not, allow it.
Associate the socket with its cgroup via net_cls to obtain priority and bandwidth settings.
If cgroup priority is 0 (high), bypass bandwidth limits.
If priority is 1‑7 (low), enforce a maximum bandwidth with a configurable minimum guarantee; idle bandwidth can be shared with higher‑priority containers.
If a low‑priority task exceeds its maximum, apply packet dropping and window reduction proportional to the excess.
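The decision order of the sending path can be sketched as a small Python function. This is only a model of the steps listed above: the names are hypothetical, the minimum guarantee is not modeled, and the article does not say how the drop rate is derived from the excess, so the fraction used here (excess divided by measured rate) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CgroupQos:
    priority: int   # 0 = high (online), 1-7 = low (offline)
    max_bps: float  # maximum bandwidth for low-priority cgroups


def egress_verdict(iface_controlled: bool, qos: CgroupQos, measured_bps: float):
    """Return (action, drop_fraction) for a packet from the given cgroup."""
    if not iface_controlled:
        return ("accept", 0.0)  # interface is not under flow control
    if qos.priority == 0:
        return ("accept", 0.0)  # high priority bypasses bandwidth limits
    if measured_bps <= qos.max_bps:
        return ("accept", 0.0)  # low priority, still within its maximum
    excess = measured_bps - qos.max_bps
    # Assumed rule: throttle in proportion to how far the rate exceeds the maximum.
    return ("throttle", excess / measured_bps)


if __name__ == "__main__":
    offline = CgroupQos(priority=1, max_bps=1000.0)
    print(egress_verdict(True, offline, 2000.0))
```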
Receiving Path Design
Hook into netfilter LOCAL_IN to locate the socket.
Perform QoS checks on the ingress interface similarly to the sending path.
Use cgroup net_cls to retrieve priority and bandwidth configuration.
High‑priority containers (priority 0) are not limited.
Low‑priority containers (priority 1‑7) have both maximum and minimum bandwidth guarantees; excess traffic triggers packet loss and window reduction, with a calculated attenuation factor α applied to the receive window.
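The article does not give the formula for α. Purely as an illustration of what applying an attenuation factor to the receive window could look like, the sketch below assumes α is the ratio of the configured maximum to the measured rate, capped at 1, and keeps the window at no less than one segment. Both the formula and the names are assumptions, not Ruyi's actual calculation.

```python
def attenuation(max_bps: float, measured_bps: float) -> float:
    """Assumed attenuation factor: 1.0 within the limit, max/measured above it."""
    if max_bps <= 0:
        raise ValueError("max_bps must be positive")
    if measured_bps <= max_bps:
        return 1.0
    return max_bps / measured_bps


def scaled_window(window: int, max_bps: float, measured_bps: float, mss: int = 1460) -> int:
    """Shrink the advertised receive window by alpha, never below one segment."""
    alpha = attenuation(max_bps, measured_bps)
    return max(mss, int(window * alpha))


if __name__ == "__main__":
    # A sender at twice the allowed rate sees its window roughly halved.
    print(scaled_window(65535, max_bps=1000.0, measured_bps=2000.0))
```

Shrinking the advertised window slows the remote sender through TCP flow control, which is gentler than relying on packet loss alone.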
Usage Steps
Set offline cgroup priority.
Configure offline minimum guaranteed bandwidth and maximum shared bandwidth.
Solution Effectiveness
Testing on a real business workload shows offline minimum guaranteed bandwidth of 100 Mbit/s and a maximum shared bandwidth of 1000 Mbit/s, with clear isolation results for both inbound and outbound traffic.
Comparison Summary
tc incurs about 13 % overhead, while Ruyi netqos stays below 1 %.
tc enforces a hard bandwidth cap that can cause packet loss for online tasks; Ruyi netqos is online‑friendly and avoids online packet loss.
tc configuration is complex, especially with many priority levels and bandwidth allocations.
tc only supports egress control; inbound control requires additional virtual interfaces or complex routing policies, adding overhead.
Tencent Architect
We share technical insights on storage, computing, and access, and explore industry-leading product technologies together.