How TencentOS “Ruyi” Achieves Network QoS for Mixed Online/Offline Workloads
This article explains network QoS in TencentOS “Ruyi”: the concept of network resource isolation, the traditional tc+htb and cgroup configuration, performance test results, and the advantages of the Ruyi netqos scheme over tc for mixed online and offline container workloads.
What Is Network Resource Isolation
Network resource isolation controls bandwidth based on task priority, ensuring that high‑priority tasks receive sufficient bandwidth when resources are limited. The overall scheme includes both egress (sending) and ingress (receiving) bandwidth isolation.
Existing tc+htb+cgroup net_cls Solution
The traditional approach combines tc, an htb qdisc, and the cgroup net_cls.classid setting to differentiate online (high-priority) and offline (low-priority) containers. Two leaf classes are created: classid 0x10010 for high priority and 0x10020 for low priority.
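The two class IDs above follow the net_cls.classid layout 0xAAAABBBB, where the upper 16 bits are the tc major number and the lower 16 bits the minor number. A minimal sketch of that mapping (the function names are illustrative, not part of any tool):

```python
def classid_to_tc_handle(classid: int) -> str:
    """Split a net_cls.classid value (0xAAAABBBB) into tc's hex major:minor form."""
    major = (classid >> 16) & 0xFFFF
    minor = classid & 0xFFFF
    return f"{major:x}:{minor:x}"


def tc_handle_to_classid(handle: str) -> int:
    """Inverse mapping: turn a tc handle such as '1:10' back into a classid integer."""
    major, minor = handle.split(":")
    return (int(major, 16) << 16) | int(minor, 16)


if __name__ == "__main__":
    for cid in (0x10010, 0x10020):
        print(hex(cid), "->", classid_to_tc_handle(cid))
```

So 0x10010 corresponds to tc class 1:10 and 0x10020 to tc class 1:20.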
The configuration assigns tc_offline to offline containers and tc_online to online containers, linking net_cls.classid with tc to give online tasks higher priority while allowing offline tasks to share remaining bandwidth when online demand is low.
Configuration Steps
tc+htb example: create two leaf classes with priorities 0 (high) and 1 (low) and the corresponding class IDs.
cgroup classid configuration: set net_cls.classid to the high-priority ID for online containers and to the low-priority ID for offline containers, so that online traffic can pre-empt offline bandwidth whenever it needs it.
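The two configuration steps can be sketched as the commands below. This is an illustration rather than the article's original listing: the device name eth0, the 10gbit link rate, the 9900mbit online rate, the "default 20" fallback, and the cgroup v1 mount point are all assumptions. Only the class IDs, the priorities, the tc_online/tc_offline group names, and the 100 Mbit/s offline guarantee come from the text. The script only prints the commands; it does not run them.

```python
def htb_commands(dev="eth0", link="10gbit",
                 online_rate="9900mbit", offline_rate="100mbit"):
    """Build tc commands for one htb root with a high- and a low-priority leaf."""
    return [
        # Root htb qdisc; unclassified traffic falls into 1:20 (assumed choice).
        f"tc qdisc add dev {dev} root handle 1: htb default 20",
        # Parent class that caps total traffic at the link rate.
        f"tc class add dev {dev} parent 1: classid 1:1 htb rate {link}",
        # Online leaf: priority 0, may borrow up to the full link.
        f"tc class add dev {dev} parent 1:1 classid 1:10 htb "
        f"rate {online_rate} ceil {link} prio 0",
        # Offline leaf: priority 1, small guarantee, may borrow idle bandwidth.
        f"tc class add dev {dev} parent 1:1 classid 1:20 htb "
        f"rate {offline_rate} ceil {link} prio 1",
        # cgroup filter: classify packets by the sender's net_cls.classid.
        f"tc filter add dev {dev} parent 1: protocol ip prio 10 handle 1: cgroup",
    ]


def classid_commands(root="/sys/fs/cgroup/net_cls"):
    """Build the writes that bind each cgroup to its tc class (assumed v1 paths)."""
    return [
        f"echo 0x10010 > {root}/tc_online/net_cls.classid",
        f"echo 0x10020 > {root}/tc_offline/net_cls.classid",
    ]


if __name__ == "__main__":
    for cmd in htb_commands() + classid_commands():
        print(cmd)
```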
Test Results
During the first 5 seconds, online bandwidth is zero, allowing offline containers to use up to ~9 Gbit/s. At second 6, online tasks request bandwidth, pre‑empting offline traffic so that offline is limited to a guaranteed 100 Mbit/s.
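The behaviour in this test can be approximated with a small allocation model. It is a simplification and not htb's actual borrowing algorithm: offline keeps its guaranteed floor, online takes what it asks for from the rest, and offline absorbs whatever is left. The 9000 Mbit/s usable capacity is an assumption taken from the ~9 Gbit/s observed above, and the demand figures are made up for illustration.

```python
def share_link(link_mbit, online_demand, offline_demand, offline_min=100.0):
    """Return (online, offline) bandwidth in Mbit/s under strict priority
    with a guaranteed minimum for the low-priority class."""
    # Offline never needs more than it asks for, even for its guarantee.
    offline_floor = min(offline_demand, offline_min)
    # Online is served first, but cannot eat into the offline floor.
    online = min(online_demand, link_mbit - offline_floor)
    # Offline borrows whatever online leaves idle.
    offline = min(offline_demand, link_mbit - online)
    return online, offline


if __name__ == "__main__":
    # Seconds 1-5: online idle, offline saturating the link.
    print(share_link(9000, 0, 20000))
    # From second 6: online demand exceeds the link, offline drops to its floor.
    print(share_link(9000, 20000, 20000))
```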
The tc scheme satisfies basic bandwidth isolation requirements but incurs noticeable overhead.
Performance Overhead
The tc+htb implementation takes a global spin lock (root_lock) for every packet, which leads to about 14 % overhead at 380 k packets per second.
In contrast, the Ruyi netqos token‑bucket approach incurs less than 1 % overhead, and only 0.06 % at 800 k packets per second.
Sending Path Design
Hook into netfilter LOCAL_OUT to locate the socket of each packet.
Check if the packet’s egress interface is subject to flow control; if not, allow it.
Associate the socket with its cgroup via net_cls to obtain priority and bandwidth settings.
If cgroup priority is 0 (high), bypass bandwidth limits.
If priority is 1-7 (low), enforce a maximum bandwidth together with a configurable minimum guarantee; bandwidth left idle by higher-priority containers can be borrowed up to that maximum.
If a low‑priority task exceeds its maximum, apply packet dropping and window reduction proportional to the excess.
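The sending-path steps above can be sketched as a per-packet decision with a token bucket for low-priority cgroups. This is a user-space illustration under stated assumptions, not Ruyi's kernel code: the class and function names are hypothetical, and the minimum guarantee, bandwidth sharing, and window reduction are left out so that only the limit check remains.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenBucket:
    rate: float          # refill rate in bytes per second
    burst: float         # bucket capacity in bytes
    tokens: float = 0.0  # bytes currently available
    last: float = 0.0    # timestamp of the last refill, in seconds

    def allow(self, size: int, now: float) -> bool:
        """Refill for the elapsed time, then try to spend `size` bytes."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


@dataclass
class CgroupQos:
    priority: int                         # 0 = online, 1-7 = offline
    bucket: Optional[TokenBucket] = None  # only needed for offline cgroups


def egress_verdict(dev_limited: bool, cg: CgroupQos, size: int, now: float) -> str:
    """Mirror the steps: interface check, priority check, then rate limit."""
    if not dev_limited:    # interface not under flow control
        return "ACCEPT"
    if cg.priority == 0:   # high priority bypasses all limits
        return "ACCEPT"
    return "ACCEPT" if cg.bucket.allow(size, now) else "DROP"


if __name__ == "__main__":
    offline = CgroupQos(1, TokenBucket(rate=1000.0, burst=1500.0, tokens=1500.0))
    print(egress_verdict(True, offline, 1500, 0.0))  # bucket is full
    print(egress_verdict(True, offline, 1500, 0.0))  # bucket is now empty
```

Because the bucket state belongs to one cgroup, the check does not need a device-wide lock, which fits the low overhead reported for the token-bucket approach.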
Receiving Path Design
Hook into netfilter LOCAL_IN to locate the socket.
Perform QoS checks on the ingress interface similarly to the sending path.
Use cgroup net_cls to retrieve priority and bandwidth configuration.
High-priority containers (priority 0) are not limited.
Low-priority containers (priority 1-7) have a maximum bandwidth and a minimum guarantee; traffic beyond the maximum triggers packet loss and window reduction, with a calculated attenuation factor α applied to the receive window.
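The text does not give the formula for α, so the following is only one plausible form, stated as an assumption: α is the ratio of the configured maximum to the measured receive rate, clamped so the window never collapses entirely. The function names, the 0.1 floor, and the 1460-byte MSS are all hypothetical.

```python
def attenuation(measured_mbit: float, max_mbit: float, floor: float = 0.1) -> float:
    """Hypothetical attenuation factor: 1.0 while under the cap,
    otherwise max/measured, never below `floor`."""
    if measured_mbit <= max_mbit:
        return 1.0
    return max(floor, max_mbit / measured_mbit)


def scaled_window(window_bytes: int, alpha: float, mss: int = 1460) -> int:
    """Shrink the advertised receive window by alpha, keeping at least one MSS."""
    return max(mss, int(window_bytes * alpha))


if __name__ == "__main__":
    alpha = attenuation(measured_mbit=2000, max_mbit=1000)
    print(alpha, scaled_window(65535, alpha))
```

Shrinking the advertised window slows the sender through TCP flow control, so fewer packets have to be dropped at the receiver to hold the rate under the cap.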
Usage Steps
Set offline cgroup priority.
Configure offline minimum guaranteed bandwidth and maximum shared bandwidth.
Solution Effectiveness
In a test on a real business workload, with the offline minimum guaranteed bandwidth set to 100 Mbit/s and the maximum shared bandwidth set to 1000 Mbit/s, the isolation is clearly visible for both inbound and outbound traffic.
Comparison Summary
tc incurs roughly 13-14 % overhead, while Ruyi netqos stays below 1 %.
tc enforces a hard bandwidth cap that can cause packet loss for online tasks; Ruyi netqos is online‑friendly and avoids online packet loss.
tc configuration is complex, especially with many priority levels and bandwidth allocations.
tc only supports egress control; inbound control requires additional virtual interfaces or complex routing policies, adding overhead.
Tencent Architect
We share technical insights on storage, computing, and access, and explore industry-leading product technologies together.
