How TencentOS “Ruyi” Achieves Network QoS for Mixed Online/Offline Workloads

This article explains TencentOS “Ruyi” network QoS, detailing its resource isolation concepts, tc+htb and cgroup configurations, performance testing, and the advantages of the Ruyi netqos scheme over traditional tc solutions for mixed online and offline container workloads.

Tencent Architect

What Is Network Resource Isolation

Network resource isolation controls bandwidth based on task priority, ensuring that high‑priority tasks receive sufficient bandwidth when resources are limited. The overall scheme includes both egress (sending) and ingress (receiving) bandwidth isolation.

Existing tc+htb+cgroup net_cls Solution

The traditional approach uses tc with htb and the cgroup net_cls.classid setting to differentiate online (high‑priority) and offline (low‑priority) containers. Two leaf classes are created: classid 0x10010 for high priority and 0x10020 for low priority.
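To make the two class IDs concrete: a net_cls.classid value has the form 0xAAAABBBB, where AAAA is the tc major number and BBBB is the minor number. The short sketch below (helper names are illustrative, not from the original article) converts between the cgroup value and the `major:minor` handle that tc uses.

```python
def classid_to_handle(classid: int) -> str:
    """Convert a net_cls.classid value (0xAAAABBBB) to a tc 'major:minor' handle."""
    major = (classid >> 16) & 0xFFFF
    minor = classid & 0xFFFF
    return f"{major:x}:{minor:x}"


def handle_to_classid(handle: str) -> int:
    """Convert a tc 'major:minor' handle (hex digits) back to a classid integer."""
    major, minor = handle.split(":")
    return (int(major, 16) << 16) | int(minor or "0", 16)


print(classid_to_handle(0x10010))  # 1:10 (high priority)
print(classid_to_handle(0x10020))  # 1:20 (low priority)
```

So the high‑priority class corresponds to tc class 1:10 and the low‑priority class to 1:20.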

The configuration assigns tc_offline to offline containers and tc_online to online containers, linking net_cls.classid with tc to give online tasks higher priority while allowing offline tasks to share remaining bandwidth when online demand is low.

Configuration Steps

tc+htb example: create two leaf classes with priorities 0 (high) and 1 (low) and the corresponding class IDs.
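The original commands are not reproduced in this summary, so the following is only a sketch of what such a setup might look like. It builds the tc command lines as strings rather than running them (running them needs root). The device name, the rates, and the choice of `default 20` for unclassified traffic are all assumptions; only the class IDs 1:10 and 1:20 and the priorities 0 and 1 come from the text.

```python
def htb_commands(dev: str, link_rate: str, online_rate: str, offline_rate: str) -> list[str]:
    """Build tc commands for one htb root with an online and an offline leaf class."""
    return [
        # Root htb qdisc; unclassified traffic goes to 1:20 (assumption).
        f"tc qdisc add dev {dev} root handle 1: htb default 20",
        # Parent class capped at the link rate.
        f"tc class add dev {dev} parent 1: classid 1:1 htb rate {link_rate}",
        # Online leaf: prio 0 gets spare bandwidth first.
        f"tc class add dev {dev} parent 1:1 classid 1:10 htb "
        f"rate {online_rate} ceil {link_rate} prio 0",
        # Offline leaf: small guaranteed rate, may borrow up to the link rate.
        f"tc class add dev {dev} parent 1:1 classid 1:20 htb "
        f"rate {offline_rate} ceil {link_rate} prio 1",
        # cgroup filter: classify packets by the sender's net_cls.classid.
        f"tc filter add dev {dev} parent 1: protocol ip prio 10 handle 1: cgroup",
    ]


# Assumed values: a 10 Gbit/s link with 100 Mbit/s guaranteed to offline traffic.
for cmd in htb_commands("eth0", "10gbit", "9900mbit", "100mbit"):
    print(cmd)
```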

cgroup classid configuration:

By setting net_cls.classid for online containers to a high‑priority ID and offline containers to a low‑priority ID, the system can pre‑empt offline bandwidth when online tasks require it.
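As a rough illustration of that step, the sketch below writes the two class IDs into `net_cls.classid` files. The cgroup names `online` and `offline` are hypothetical, and a temporary directory stands in for the real net_cls hierarchy (commonly mounted at /sys/fs/cgroup/net_cls) so the example can run without root.

```python
import os
import tempfile


def set_classid(cgroup_dir: str, classid: int) -> None:
    """Write a classid in hex form to the cgroup's net_cls.classid file."""
    with open(os.path.join(cgroup_dir, "net_cls.classid"), "w") as f:
        f.write(f"0x{classid:x}\n")


# Stand-in for the net_cls mount point; on a real host this would be the cgroup fs.
root = tempfile.mkdtemp()
for name, classid in (("online", 0x10010), ("offline", 0x10020)):
    path = os.path.join(root, name)
    os.makedirs(path)
    set_classid(path, classid)

with open(os.path.join(root, "offline", "net_cls.classid")) as f:
    print(f.read().strip())  # 0x10020
```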

Test Results

During the first 5 seconds, online bandwidth is zero, allowing offline containers to use up to ~9 Gbit/s. At second 6, online tasks request bandwidth, pre‑empting offline traffic so that offline is limited to a guaranteed 100 Mbit/s.

The tc scheme satisfies basic bandwidth isolation requirements but incurs noticeable overhead.

Performance Overhead

The tc+htb implementation takes a global spin lock (root_lock) for each packet, leading to about 14 % overhead at 380 k packets per second.

In contrast, the Ruyi netqos token‑bucket approach incurs less than 1 % overhead, and only 0.06 % at 800 k packets per second.
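For readers unfamiliar with the technique, here is a minimal sketch of a generic token bucket. It is not the Ruyi kernel implementation, which this summary does not show; the class name and parameters are illustrative. Tokens accumulate at the configured rate up to a burst size, and a packet passes only if enough tokens are available.

```python
class TokenBucket:
    """Generic token bucket; time is passed in explicitly to keep it deterministic."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # bucket capacity in bytes
        self.tokens = burst_bytes    # start with a full bucket
        self.last = 0.0              # timestamp of the previous check, in seconds

    def allow(self, size: int, now: float) -> bool:
        """Refill for the elapsed time, then try to spend `size` bytes of tokens."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False


bucket = TokenBucket(rate_bps=8000, burst_bytes=1500)  # 1000 bytes per second
print(bucket.allow(1500, now=0.0))  # True: the bucket starts full
print(bucket.allow(1500, now=0.5))  # False: only 500 bytes refilled so far
print(bucket.allow(1500, now=1.5))  # True: refilled back to 1500 bytes
```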

Sending Path Design

Hook into netfilter LOCAL_OUT to locate the socket of each packet.

Check if the packet’s egress interface is subject to flow control; if not, allow it.

Associate the socket with its cgroup via net_cls to obtain priority and bandwidth settings.

If cgroup priority is 0 (high), bypass bandwidth limits.

If priority is 1‑7 (low), enforce a maximum bandwidth with a configurable minimum guarantee; bandwidth left idle by higher‑priority containers can be shared, up to that maximum.

If a low‑priority task exceeds its maximum, apply packet dropping and window reduction proportional to the excess.
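The decision steps above can be sketched as a single function. This is a user-space illustration only: the real logic runs in a kernel netfilter hook, and the names `CgroupQos`, `egress_verdict`, and the way the excess ratio is computed are assumptions, since the source does not give the exact formula.

```python
from dataclasses import dataclass


@dataclass
class CgroupQos:
    priority: int      # 0 = high priority (online), 1-7 = low priority (offline)
    max_mbps: float    # maximum bandwidth for a low-priority cgroup


def egress_verdict(iface, controlled_ifaces, qos, measured_mbps):
    """Return (action, excess_ratio) for one outgoing packet."""
    # Interface not subject to flow control: let the packet through.
    if iface not in controlled_ifaces:
        return ("accept", 0.0)
    # High priority: bandwidth limits are bypassed.
    if qos.priority == 0:
        return ("accept", 0.0)
    # Low priority within its maximum: nothing to do.
    if measured_mbps <= qos.max_mbps:
        return ("accept", 0.0)
    # Low priority over its maximum: throttle in proportion to the excess
    # (hypothetical definition: the fraction of traffic above the limit).
    excess_ratio = (measured_mbps - qos.max_mbps) / measured_mbps
    return ("throttle", excess_ratio)


offline = CgroupQos(priority=1, max_mbps=1000)
print(egress_verdict("eth0", {"eth0"}, offline, measured_mbps=1250))
```

Here "throttle" stands for the packet dropping and window reduction described in the last step.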

Receiving Path Design

Hook into netfilter LOCAL_IN to locate the socket.

Perform QoS checks on the ingress interface similarly to the sending path.

Use cgroup net_cls to retrieve priority and bandwidth configuration.

High‑priority containers (priority 0) are not limited.

Low‑priority containers (priority 1‑7) have a maximum bandwidth and a minimum guaranteed bandwidth; excess traffic triggers packet loss and window reduction, with a calculated attenuation factor α applied to the receive window.
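The source does not say how α is calculated. Purely as an illustration, the sketch below assumes α is the ratio of the allowed maximum to the measured rate, clamped to a floor, and that the receive window is scaled by α without dropping below one segment. The formula, the floor of 0.1, and the 1460-byte MSS are all assumptions.

```python
def attenuation(max_mbps: float, measured_mbps: float, floor: float = 0.1) -> float:
    """Hypothetical attenuation factor: 1.0 within the limit, smaller as excess grows."""
    if measured_mbps <= max_mbps:
        return 1.0
    return max(floor, max_mbps / measured_mbps)


def attenuated_window(window_bytes: int, alpha: float, mss: int = 1460) -> int:
    """Scale the receive window by alpha, keeping at least one segment."""
    return max(mss, int(window_bytes * alpha))


alpha = attenuation(max_mbps=1000, measured_mbps=2000)
print(alpha)                            # 0.5
print(attenuated_window(65536, alpha))  # 32768
```

A smaller advertised window makes the remote sender slow down, which is why window reduction can limit inbound traffic without relying on packet loss alone.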

Usage Steps

Set offline cgroup priority.

Configure offline minimum guaranteed bandwidth and maximum shared bandwidth.
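The source does not show the configuration interface, so the sketch below only illustrates how the two settings might interact at runtime: the offline limit follows the bandwidth left idle by online traffic, but never drops below the minimum guarantee or rises above the maximum shared bandwidth. The function name, the clamping rule, and the 10 Gbit/s link rate are assumptions; the 100 Mbit/s and 1000 Mbit/s values are the ones used in this article's test.

```python
def offline_limit(link_mbps: float, online_mbps: float,
                  min_mbps: float, max_mbps: float) -> float:
    """Clamp the idle bandwidth between the offline minimum and maximum."""
    idle = max(0.0, link_mbps - online_mbps)
    return min(max_mbps, max(min_mbps, idle))


for online in (0, 9500, 10000):
    print(online, offline_limit(10000, online, min_mbps=100, max_mbps=1000))
```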

Solution Effectiveness

In a test on a real business workload, with an offline minimum guaranteed bandwidth of 100 Mbit/s and a maximum shared bandwidth of 1000 Mbit/s, the isolation results were clear for both inbound and outbound traffic.

Comparison Summary

tc incurs about 13 % overhead, while Ruyi netqos stays below 1 %.

tc enforces a hard bandwidth cap that can cause packet loss for online tasks; Ruyi netqos is online‑friendly and avoids online packet loss.

tc configuration is complex, especially with many priority levels and bandwidth allocations.

tc htb shaping only applies to egress traffic; inbound control requires additional virtual interfaces (such as ifb) or complex routing policies, adding overhead.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
