Implementing Hierarchical QoS IP‑Port Granular Offload Rate Limiting with OVS and DPDK
This article explains how to design and implement hardware‑offloaded, IP‑ and port‑granular traffic shaping using Open vSwitch, OpenFlow meters, Mellanox HQoS, and DPDK, covering software limits, meter hierarchy, flow‑meter association, deletion workflows, RCU handling, and same‑host traffic considerations.
Design Overview
We organize the offloaded flow tables with table jumps and flow priorities so that hardware‑offloaded traffic is captured, then apply hierarchical meters for IP‑ and port‑level rate limiting. The design has three parts: software rate limiting on vhost‑user (VHU) ports, IP‑granular meter offload, and a combined IP‑to‑port meter hierarchy.
Software Rate Limiting
OVS already supports BPS‑based policing on VHU ports. Currently the limit applies per VHU port; we plan to add IP‑granular limits by configuring QoS parameters and using DPDK’s single‑rate three‑color marker (srTCM) in both the ingress and egress directions.
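To make the srTCM behaviour concrete, here is a minimal color‑blind sketch of the marker described in RFC 2697. The class and field names are illustrative and are not DPDK's rte_meter API; rates are assumed to be in bytes per second and burst sizes in bytes.

```python
from dataclasses import dataclass

GREEN, YELLOW, RED = "green", "yellow", "red"


@dataclass
class SrTcm:
    """Color-blind single-rate three-color marker (illustrative model)."""
    cir: float  # committed information rate, bytes per second (assumed unit)
    cbs: float  # committed burst size, bytes
    ebs: float  # excess burst size, bytes

    def __post_init__(self):
        self.tc = self.cbs   # committed bucket starts full
        self.te = self.ebs   # excess bucket starts full
        self.last = 0.0      # timestamp of the previous refill, seconds

    def _refill(self, now):
        # New tokens fill the committed bucket first; overflow goes to the excess bucket.
        tokens = (now - self.last) * self.cir
        self.last = now
        to_committed = min(tokens, self.cbs - self.tc)
        self.tc += to_committed
        self.te = min(self.ebs, self.te + tokens - to_committed)

    def color(self, now, size):
        """Return the color of a packet of `size` bytes arriving at time `now`."""
        self._refill(now)
        if self.tc >= size:
            self.tc -= size
            return GREEN
        if self.te >= size:
            self.te -= size
            return YELLOW
        return RED


marker = SrTcm(cir=1000, cbs=1500, ebs=1500)
colors = [marker.color(0.0, 1500) for _ in range(3)]
```

A policer built on this marker decides per color whether to forward or drop. With ebs set to 0 the marker degenerates to a plain green/red token bucket.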
# ovs-appctl qos/show vhu81d8dba3-96
QoS: vhu81d8dba3-96 egress-policer
cbs: 51200000
cir: 51200000
# ovs-appctl qos/show-types vhu81d8dba3-96
QoS type: trtcm-policer
IP‑Granular Metering
In the Neutron control plane we issue OpenFlow rules with IP‑based meters. When offloading, we create a matching IP‑granular meter and associate it with the flow, ensuring the offload hierarchy IP → Port.
ovs-ofctl -O OpenFlow13 add-meter br-int "meter=1,kbps,bands=type=drop,rate=10000"
ovs-ofctl -O OpenFlow13 add-flow br-int "table=0, priority=6, ip,in_port=2,dl_vlan=3,nw_dst=172.16.12.245 actions=meter:1,set_field:fa:16:3e:63:47:12->eth_dst,resubmit(,60)"
ovs-ofctl -O OpenFlow13 mod-meter br-int "meter=1,kbps,burst,bands=type=drop,rate=2000"
HQoS Hierarchical QoS
Mellanox ConnectX‑6/7 NICs support Hierarchical QoS (HQoS) with meter hierarchy. HQoS enables multi‑level shaping (Port → Port‑Group → VM → Network). Green and yellow actions forward traffic, red drops it.
Using HQoS for IP+Port Limits
We first create a termination‑type port‑level meter (parent) and then an IP‑level child meter. The child’s green/yellow actions point to the parent, while red still drops. Finally, an IP‑matching flow references the child meter.
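The parent/child relationship can be modelled with chained token buckets. The sketch below is a simplification under stated assumptions: each meter is a single bucket, so "green/yellow" collapses into "conforming" and "red" into "non‑conforming", and the class name `Meter` and its methods are hypothetical rather than any NIC or OVS API.

```python
class Meter:
    """Single token bucket; conforming packets are handed to `parent` if one is set."""

    def __init__(self, rate, burst, parent=None):
        self.rate = rate        # bytes per second
        self.burst = burst      # bucket depth in bytes
        self.tokens = burst     # bucket starts full
        self.last = 0.0         # time of the previous refill, seconds
        self.parent = parent    # next meter in the hierarchy, or None

    def admit(self, now, size):
        """Return True if a packet of `size` bytes passes this meter and all parents."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < size:
            return False                         # red: drop at this level
        self.tokens -= size
        if self.parent is not None:
            return self.parent.admit(now, size)  # green/yellow: continue to the parent
        return True                              # termination meter: forward the packet


# One port-level parent shared by two IP-level children.
port = Meter(rate=1000, burst=2000)
ip_a = Meter(rate=1000, burst=1500, parent=port)
ip_b = Meter(rate=1000, burst=1500, parent=port)

first = ip_a.admit(0.0, 1500)    # passes both levels
second = ip_b.admit(0.0, 1500)   # within ip_b's budget, but the port bucket is exhausted
third = ip_a.admit(0.0, 100)     # ip_a's own bucket is already empty
```

Note that in this model a child still spends its tokens when the parent later drops the packet; whether hardware behaves the same way is not stated in the source.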
Implementation Details
Meter Hierarchy Design
For egress VM traffic the last action is jump TABLE_OUTPUT; for ingress VM traffic it is output to the VHU port. Port‑level meters are termination type with actions:
Green/Yellow : output (ingress) or jump (egress) to the appropriate table.
Red : drop.
Flow‑Meter Association
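We move the flow's original jump or output action into the port‑level meter. The flow’s final action then becomes a meter reference: traffic passes the IP‑granular meter first and, if it conforms, the port‑level meter performs the original jump or output.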
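A rough sketch of that rewrite follows. It assumes, as in the HQoS section, that the flow references the IP‑level child meter and that the port‑level meter terminates the chain. The tuple encoding of actions and the function `associate` are hypothetical and exist only for illustration.

```python
def associate(actions, ip_meter_id, port_meter_id):
    """Move the flow's final forwarding action into the port-level meter.

    Returns (new_actions, ip_meter, port_meter). Actions are tuples whose
    first element is the action name (an illustrative encoding).
    """
    if not actions or actions[-1][0] not in ("output", "jump"):
        raise ValueError("flow must end in an output or jump action")
    *head, final = actions

    # Port-level (termination) meter: conforming traffic runs the original action.
    port_meter = {"id": port_meter_id, "green": final, "yellow": final, "red": ("drop",)}

    # IP-level child meter: conforming traffic continues to the parent meter.
    to_parent = ("meter", port_meter_id)
    ip_meter = {"id": ip_meter_id, "green": to_parent, "yellow": to_parent, "red": ("drop",)}

    # The flow now ends in a reference to the child meter.
    return head + [("meter", ip_meter_id)], ip_meter, port_meter


flow_actions = [("set_field", "eth_dst", "fa:16:3e:63:47:12"), ("output", "vhuc0375f25-ce")]
new_actions, ip_meter, port_meter = associate(flow_actions, ip_meter_id=1, port_meter_id=100)
```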
Meter Addition Commands
ovs-vsctl set interface vhuc0375f25-ce \
ingress_policing_rate=1000000 \
ingress_policing_burst=1000000
ovs-vsctl set port vhue8e32d21-a6 qos=@newqos \
-- --id=@newqos create qos type=egress-policer \
other-config:cir=1000000 other-config:cbs=1000000
Meter Deletion and RCU Handling
When a limit is cleared, we set the rate to the maximum bandwidth rather than deleting the meter immediately, which avoids mismatches between asynchronously updated state. Deletion itself proceeds through a reference‑count (refcount) phase and an RCU (Read‑Copy‑Update) phase: once the refcount reaches zero, the meter waits out an RCU grace period before its final release. A new reference taken during that period prevents the deletion.
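The two‑phase deletion can be illustrated with a small single‑threaded model. This is only a sketch of the bookkeeping, not real RCU: `MeterTable` and `grace_period_elapsed` are hypothetical names, and the end of the grace period is triggered by hand.

```python
class MeterTable:
    """Tracks meter references and defers release until a grace period has passed."""

    def __init__(self):
        self.refs = {}        # meter id -> reference count
        self.pending = set()  # meters whose refcount hit zero, awaiting the grace period
        self.freed = []       # meters that were finally released, in release order

    def ref(self, meter_id):
        self.refs[meter_id] = self.refs.get(meter_id, 0) + 1
        self.pending.discard(meter_id)   # a new reference cancels a pending deletion

    def unref(self, meter_id):
        if self.refs.get(meter_id, 0) == 0:
            raise RuntimeError(f"meter {meter_id} has no references")
        self.refs[meter_id] -= 1
        if self.refs[meter_id] == 0:
            self.pending.add(meter_id)   # phase 1 done: wait for the grace period

    def grace_period_elapsed(self):
        """Phase 2: release every meter that stayed unreferenced for the whole period."""
        for meter_id in sorted(self.pending):
            del self.refs[meter_id]
            self.freed.append(meter_id)
        self.pending.clear()


table = MeterTable()
table.ref(1)
table.ref(2)
table.unref(1)
table.unref(2)
table.ref(2)                  # meter 2 is reused before the grace period ends
table.grace_period_elapsed()
```

After this sequence only meter 1 is released; meter 2 survives because it picked up a new reference while its deletion was pending.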
ovs-vsctl set interface vhuc0375f25-ce \
ingress_policing_rate=0 \
ingress_policing_burst=0
ovs-vsctl set port vhue8e32d21-a6 qos=@newqos \
-- --id=@newqos create qos type=egress-policer \
other-config:cir=0 other-config:cbs=0
Port and Flow Unref Logic
During VHU live migration or deletion, iface_destroy__ triggers an unref on both the port‑level and IP‑level HQoS objects. Deleting a bridge also calls del-meter and iface_destroy__, so we guard against a double unref by checking the refcount before performing the operation.
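The guard amounts to making unref safe to call from more than one teardown path. A minimal sketch, with the hypothetical class `HqosObject` standing in for either a port‑level or IP‑level meter object:

```python
class HqosObject:
    """Reference-counted object whose unref tolerates a second teardown path."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1
        self.released = False

    def unref(self):
        """Drop one reference; return True only for the call that releases the object."""
        if self.refcount == 0:
            return False          # already released by another path: do nothing
        self.refcount -= 1
        if self.refcount == 0:
            self.released = True  # stand-in for the real release work
            return True
        return False


port_meter = HqosObject("port-level")
released_by_interface_teardown = port_meter.unref()   # first path releases the object
released_by_meter_deletion = port_meter.unref()       # second path is a harmless no-op
```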
Same‑Host Traffic Handling
Because hardware offload cannot simultaneously apply ingress meters on both dpdk0 and multiple VHU ports, same‑host traffic is forced through the software path for rate limiting. The flow parser detects VHU‑to‑VHU destinations and skips offload, applying software policing instead.
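The path decision can be sketched as a simple check on the flow's input and output ports. The sketch assumes that VHU ports can be recognized by the "vhu" name prefix used in the examples above; the real parser may identify port types differently, and both function names are hypothetical.

```python
def is_vhu(port_name):
    """Assumption: vhost-user ports are named with a 'vhu' prefix, as in this article."""
    return port_name.startswith("vhu")


def choose_path(in_port, out_port):
    """Return 'software' for VHU-to-VHU (same-host) flows, 'offload' otherwise."""
    if is_vhu(in_port) and is_vhu(out_port):
        return "software"   # same-host traffic: skip offload, police in software
    return "offload"        # traffic crossing the physical port can be offloaded


same_host = choose_path("vhuc0375f25-ce", "vhue8e32d21-a6")
from_wire = choose_path("dpdk0", "vhuc0375f25-ce")
to_wire = choose_path("vhue8e32d21-a6", "dpdk0")
```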
Conclusion
By leveraging OpenFlow, OVS native port‑level policing, and Mellanox HQoS meter hierarchy, we achieve fine‑grained IP‑to‑Port hardware‑offloaded rate limiting for virtual machine traffic, while handling deletion, RCU synchronization, and same‑host traffic constraints.
360 Zhihui Cloud Developer
360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.
