
How TencentOS “Ruyi” Achieves Network QoS for Mixed Online/Offline Workloads

This article explains TencentOS “Ruyi” network QoS, detailing its resource isolation concepts, tc+htb and cgroup configurations, performance testing, and the advantages of the Ruyi netqos scheme over traditional tc solutions for mixed online and offline container workloads.

Tencent Architect

What Is Network Resource Isolation

Network resource isolation controls bandwidth based on task priority, ensuring that high‑priority tasks receive sufficient bandwidth when resources are limited. The overall scheme includes both egress (sending) and ingress (receiving) bandwidth isolation.

Existing tc+htb+cgroup Netcls Solution

The traditional approach uses tc with htb qdiscs and the cgroup net_cls.classid to differentiate online (high‑priority) and offline (low‑priority) containers. Two leaf classes are created: classid 0x10010 for high priority and 0x10020 for low priority.

The configuration assigns tc_offline to offline containers and tc_online to online containers, linking net_cls.classid with tc so that online tasks receive higher priority while offline tasks may share the remaining bandwidth when online demand is low.

Configuration Steps

tc+htb example: Create two leaf classes with priorities 0 (high) and 1 (low) and the corresponding class IDs.
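The class layout described above can be sketched with standard tc commands. The interface name (eth0) and the specific link rates are assumptions for illustration; only the class IDs (0x10010 → 1:10, 0x10020 → 1:20), the priorities, and the 100 Mbit/s offline guarantee come from the article:

```shell
# Root htb qdisc; unclassified traffic defaults to the offline class.
tc qdisc add dev eth0 root handle 1: htb default 20
# Parent class caps total egress bandwidth (10 Gbit/s assumed).
tc class add dev eth0 parent 1: classid 1:1 htb rate 10gbit
# High-priority (online) leaf: classid 0x10010 maps to 1:10, prio 0.
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 9gbit ceil 10gbit prio 0
# Low-priority (offline) leaf: classid 0x10020 maps to 1:20, prio 1,
# guaranteed 100 Mbit/s, allowed to borrow up to the full link when idle.
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 100mbit ceil 10gbit prio 1
# Classify packets into these classes by the sender's net_cls.classid.
tc filter add dev eth0 parent 1: protocol ip handle 1: cgroup
```

With htb, a class may borrow bandwidth above its `rate` up to `ceil` when higher-priority classes leave the link idle, which is what lets offline traffic expand when online demand is low.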

cgroup classid configuration: By setting net_cls.classid for online containers to the high‑priority ID and for offline containers to the low‑priority ID, the system can pre‑empt offline bandwidth when online tasks require it.
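A minimal sketch of that cgroup setup, assuming a cgroup v1 net_cls hierarchy mounted at the conventional path (the mount point and the `$ONLINE_PID`/`$OFFLINE_PID` variables are illustrative assumptions):

```shell
# Create the two cgroups named in the article.
mkdir -p /sys/fs/cgroup/net_cls/tc_online /sys/fs/cgroup/net_cls/tc_offline
# 0x10010 maps to htb class 1:10 (online), 0x10020 to 1:20 (offline).
echo 0x10010 > /sys/fs/cgroup/net_cls/tc_online/net_cls.classid
echo 0x10020 > /sys/fs/cgroup/net_cls/tc_offline/net_cls.classid
# Attach each container's processes to the matching cgroup.
echo "$ONLINE_PID"  > /sys/fs/cgroup/net_cls/tc_online/cgroup.procs
echo "$OFFLINE_PID" > /sys/fs/cgroup/net_cls/tc_offline/cgroup.procs
```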

Test Results

During the first 5 seconds, online bandwidth is zero, allowing offline containers to use up to ~9 Gbit/s. At second 6, online tasks request bandwidth, pre‑empting offline traffic so that offline is limited to a guaranteed 100 Mbit/s.
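The pre-emption behaviour described above could be reproduced with a two-flow test along these lines; the tool choices (iperf3, cgexec from libcgroup), server address, and ports are assumptions, not details from the article:

```shell
# Offline flow alone for the first 5 seconds: should expand to ~9 Gbit/s.
cgexec -g net_cls:tc_offline iperf3 -c 192.168.1.10 -p 5201 -t 30 &
sleep 5
# Online flow starts at second 6 and pre-empts offline bandwidth;
# the offline flow should drop back to its guaranteed 100 Mbit/s.
cgexec -g net_cls:tc_online iperf3 -c 192.168.1.10 -p 5202 -t 25
wait
```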

The tc scheme satisfies basic bandwidth isolation requirements but incurs noticeable overhead.

Performance Overhead

The tc+htb implementation takes a global spin lock (root_lock) on every packet, leading to about 14 % overhead at 380 k packets per second.

In contrast, the Ruyi netqos token‑bucket approach incurs less than 1 % overhead, and only 0.06 % at 800 k packets per second.

Sending Path Design

Hook into netfilter LOCAL_OUT to locate the socket for each outgoing packet.

Check whether the packet's egress interface is subject to flow control; if not, let the packet pass.

Associate the socket with its cgroup via net_cls to obtain its priority and bandwidth settings.

If the cgroup priority is 0 (high), bypass bandwidth limits.

If the priority is 1–7 (low), enforce a maximum bandwidth with a configurable minimum guarantee; idle bandwidth can be shared with higher‑priority containers.

If a low‑priority task exceeds its maximum, apply packet dropping and window reduction proportional to the excess.

Receiving Path Design

Hook into netfilter LOCAL_IN to locate the socket for each incoming packet.

Perform QoS checks on the ingress interface, mirroring the sending path.

Use the cgroup net_cls configuration to retrieve priority and bandwidth settings.

High‑priority containers (priority 0) are not limited.

Low‑priority containers (priority 1–7) have both maximum and minimum bandwidth guarantees; excess traffic triggers packet loss and window reduction, with a calculated attenuation factor α applied to the receive window.
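The article does not define α. One plausible form, purely as an illustration and not taken from the Ruyi implementation, scales the advertised receive window by how far the measured receive rate $R$ exceeds the cap $B_{\max}$:

```latex
% Hypothetical attenuation (assumption, not from the article):
\alpha = \min\!\left(1, \frac{B_{\max}}{R}\right), \qquad
\mathrm{rwnd}' = \alpha \cdot \mathrm{rwnd}
```

Shrinking the advertised window this way slows the remote sender at the TCP level, so ingress traffic is throttled without relying solely on dropping already-received packets.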

Usage Steps

Set offline cgroup priority.

Configure offline minimum guaranteed bandwidth and maximum shared bandwidth.
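A hypothetical sketch of those two steps: the article does not name the actual Ruyi netqos control files, so every file path below is a placeholder, not the real interface. Only the priority range (1–7 for offline) and the 100/1000 Mbit/s values come from the article:

```shell
# 1) Mark the offline cgroup as low priority (1-7). File name is hypothetical.
echo 7 > /sys/fs/cgroup/net_cls/tc_offline/net_cls.prio
# 2) Set a 100 Mbit/s minimum guarantee and a 1000 Mbit/s shared maximum.
#    Both file names are hypothetical placeholders.
echo 100  > /sys/fs/cgroup/net_cls/tc_offline/net_cls.min_bw_mbit
echo 1000 > /sys/fs/cgroup/net_cls/tc_offline/net_cls.max_bw_mbit
```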

Solution Effectiveness

Testing on a real business workload shows offline minimum guaranteed bandwidth of 100 Mbit/s and a maximum shared bandwidth of 1000 Mbit/s, with clear isolation results for both inbound and outbound traffic.

Comparison Summary

tc incurs about 13 % overhead, while Ruyi netqos stays below 1 %.

tc enforces a hard bandwidth cap that can cause packet loss for online tasks; Ruyi netqos is online‑friendly and avoids online packet loss.

tc configuration is complex, especially with many priority levels and bandwidth allocations.

tc only supports egress control; inbound control requires additional virtual interfaces or complex routing policies, adding overhead.
