Optimizing H.265 in NetEase Cloud RTC: Capability Negotiation & Platform Benchmarks
This article explains how NetEase Cloud RTC implements H.265 video encoding, covering capability negotiation design, cross‑platform performance tests on Android, iOS, macOS and Windows, and engineering strategies such as whitelist policies, CPU overuse handling, QP threshold tuning, and the resulting quality and efficiency benefits.
Introduction
H.265 (HEVC) is the successor to H.264, developed jointly by ITU‑T VCEG and ISO/IEC MPEG. It offers higher compression efficiency, which translates into better visual quality at the same bitrate. NetEase Cloud RTC has done extensive engineering work with H.265, and this article shares that experience.
Capability Negotiation
A client can send a stream only if both the sender and all receivers in the room support the corresponding codec; thus, the sender and receivers jointly determine the stream type. For H.265, a capability negotiation mechanism is required to ensure interoperability.
Capability Set Design
A capability set is a map of the form { uint32 key : [ uint8 value1, uint8 value2, ... ] }, using a 1‑bit mask. Keys are partitioned by range: SDK keys run from 0 to 2^8‑1, video keys from 2^8 to 2^16‑1, and audio keys from 2^16 to 2^24‑1.
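The layout above can be sketched in a few lines of Python. This is only an illustration of the key ranges and the uint8 value constraint; the specific key and codec ids (VIDEO_DECODER_KEY, CODEC_H264, CODEC_H265) are hypothetical and are not taken from the SDK.

```python
from typing import Dict, List

# uint32 key -> list of uint8 values, as described in the text.
CapabilitySet = Dict[int, List[int]]

SDK_KEY_MAX = 2**8 - 1      # SDK keys:   0 .. 2^8-1
VIDEO_KEY_MAX = 2**16 - 1   # video keys: 2^8 .. 2^16-1
AUDIO_KEY_MAX = 2**24 - 1   # audio keys: 2^16 .. 2^24-1


def key_category(key: int) -> str:
    """Return which key range ("sdk", "video" or "audio") a key falls in."""
    if not 0 <= key <= AUDIO_KEY_MAX:
        raise ValueError(f"key {key} is outside the documented ranges")
    if key <= SDK_KEY_MAX:
        return "sdk"
    if key <= VIDEO_KEY_MAX:
        return "video"
    return "audio"


def validate(caps: CapabilitySet) -> None:
    """Raise ValueError if a key is out of range or a value is not a uint8."""
    for key, values in caps.items():
        key_category(key)
        for value in values:
            if not 0 <= value <= 0xFF:
                raise ValueError(f"value {value} for key {key} is not a uint8")


# Hypothetical assignments, for illustration only.
VIDEO_DECODER_KEY = 2**8   # first key of the video range
CODEC_H264 = 1
CODEC_H265 = 2

local_caps: CapabilitySet = {VIDEO_DECODER_KEY: [CODEC_H264, CODEC_H265]}
validate(local_caps)
```

Partitioning keys by range lets a receiver tell which subsystem a capability belongs to without a separate type field.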
Capability Negotiation Flow (Client)
The client defines its local capability set.
The client reports its capability set to the server and receives configuration in return.
The server generates, aggregates, and distributes the room capability set.
Capability Negotiation Flow (Server)
When a room is created, a default capability set is generated by the engine.
The first edge_login request may override the default set if it contains a capability field.
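Each subsequent user's capability set is compared with the room set. If the user's set is larger, that is, it covers everything in the room set, the room set stays unchanged; otherwise the intersection of the two is computed and broadcast as the new room set.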
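The server-side steps can be sketched as follows. This is a minimal model, not the server's actual code: it reads "larger" as "is a superset of the room set", represents values as Python sets, and assumes that keys whose intersection is empty are dropped. The function names are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

CapabilitySet = Dict[int, Set[int]]


def apply_first_login(default_caps: CapabilitySet,
                      login_caps: Optional[CapabilitySet]) -> CapabilitySet:
    """The first login overrides the default set only if it carries capabilities."""
    source = default_caps if login_caps is None else login_caps
    return {key: set(values) for key, values in source.items()}


def covers(user_caps: CapabilitySet, room_caps: CapabilitySet) -> bool:
    """True if the user supports every value the room set currently lists."""
    return all(values <= user_caps.get(key, set())
               for key, values in room_caps.items())


def apply_later_login(room_caps: CapabilitySet,
                      user_caps: CapabilitySet) -> Tuple[CapabilitySet, bool]:
    """Return (room set, needs_broadcast) after a subsequent user joins."""
    if covers(user_caps, room_caps):
        return room_caps, False          # room set unchanged, nothing to send
    narrowed: CapabilitySet = {}
    for key, values in room_caps.items():
        common = values & user_caps.get(key, set())
        if common:                       # assumption: drop keys with no overlap
            narrowed[key] = common
    return narrowed, True                # narrowed set must be broadcast


room = apply_first_login({256: {1}}, {256: {1, 2}})
room, changed = apply_later_login(room, {256: {1}})
```

Because the room set can only shrink under this rule, a sender never has to assume a capability that some current member lacks.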
H.265 Codec Practice
Android
Tests on a Mi 10 (Qualcomm SM8250) at 720p 30 fps show that hardware H.265 decoding consumes about as much power as hardware H.264 decoding, and that the hardware H.265 encoder holds its bitrate more steadily. The software encoder (x265) performs poorly. For software decoding, ffmpeg uses ~15% CPU versus ~4.5% for libhevc.
Recommended strategy: prioritize hardware H.265 decoding; if hardware decoding fails, fall back to software decoding (prefer libhevc over ffmpeg). Prioritize hardware encoding, falling back to H.264 when compatibility issues arise.
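The decoder preference order above amounts to a simple fallback chain. The sketch below assumes each decoder exposes some way to attempt initialisation; the names ("h265-hardware", "h265-libhevc", "h265-ffmpeg") and the try_open callables are hypothetical stand-ins, not real MediaCodec or library calls.

```python
from typing import Callable, List, Optional, Tuple

# (name, try_open) pairs; try_open returns True when the decoder initialised.
Candidate = Tuple[str, Callable[[], bool]]


def pick_decoder(candidates: List[Candidate]) -> Optional[str]:
    """Return the first decoder that opens, trying candidates in order."""
    for name, try_open in candidates:
        try:
            if try_open():
                return name
        except Exception:
            pass  # treat an exception like a failed open and move on
    return None


def pick_encoder(hw_h265_usable: bool) -> str:
    """Hardware H.265 first; on compatibility problems fall back to H.264."""
    return "h265-hardware" if hw_h265_usable else "h264"


def _hw_unavailable() -> bool:
    # Stand-in for a hardware decoder that fails to initialise.
    raise RuntimeError("hardware decoder failed to initialise")


android_h265_decoders: List[Candidate] = [
    ("h265-hardware", _hw_unavailable),   # preferred
    ("h265-libhevc", lambda: True),       # cheaper software decoder
    ("h265-ffmpeg", lambda: True),        # last resort
]
chosen = pick_decoder(android_h265_decoders)
```

With the hardware stub failing, the chain settles on libhevc before ffmpeg, which matches the CPU figures reported above.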
iOS
Hardware H.265 encoding and decoding consume noticeably more power than H.264. On some devices (e.g., iPhone XR) the hardware encoder's output bitrate falls short of the target, which prompts a fallback to H.264. When the hardware encoder's bitrate is stable, H.265 yields clearer images.
Strategy: prioritize hardware encoding, avoid software encoding; prioritize hardware decoding, falling back to ffmpeg only after repeated failures; switch to H.264 on low battery or when bitrate monitoring detects severe insufficiency.
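The switch back to H.264 can be modelled as a small monitor fed with periodic samples. The sketch below is an assumption about how such a check might look: the window size, the 0.70 bitrate ratio, and the 20% battery threshold are made-up illustrative values, and BitrateMonitor is a hypothetical name.

```python
from collections import deque


class BitrateMonitor:
    """Decide when to leave H.265 based on battery level and bitrate shortfall."""

    def __init__(self, window: int = 5, min_ratio: float = 0.70,
                 low_battery: float = 0.20) -> None:
        self.ratios = deque(maxlen=window)  # recent actual/target ratios
        self.min_ratio = min_ratio          # hypothetical "severe" threshold
        self.low_battery = low_battery      # hypothetical battery threshold

    def should_use_h264(self, battery_level: float,
                        target_bps: float, actual_bps: float) -> bool:
        if target_bps <= 0:
            raise ValueError("target_bps must be positive")
        if battery_level <= self.low_battery:
            return True
        self.ratios.append(actual_bps / target_bps)
        if len(self.ratios) < self.ratios.maxlen:
            return False                    # not enough samples yet
        average = sum(self.ratios) / len(self.ratios)
        return average < self.min_ratio


monitor = BitrateMonitor(window=3)
decisions = [monitor.should_use_h264(0.9, 1_000_000, 500_000) for _ in range(3)]
```

Averaging over a window rather than reacting to a single sample avoids switching codecs on a momentary dip.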
Mac
On a 2016 MacBook Pro (i7‑6700HQ), hardware H.265 shows lower CPU usage than software H.265, with stable bitrate around the target. Hardware encoding replaces B‑frames with P‑frames, improving compression. Visual quality of hardware H.265 surpasses H.264, while software H.265 offers similar quality.
Strategy: prefer hardware decoding; if unsupported, use software decoding on powerful CPUs; prefer hardware encoding, falling back to software encoding on weaker machines.
Windows
Because hardware support is fragmented, the current approach relies on software encoding and decoding. On x86_64, software H.265 is CPU‑intensive and is enabled only on high‑performance devices; on 32‑bit x86, H.265 is not supported.
Engineering Strategies
Whitelist Strategy
Devices with good H.265 hardware support are added to a whitelist; high‑performance devices receive software H.265 permissions based on benchmark scores, while low‑performance devices are excluded.
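A whitelist check of this kind might look like the sketch below. The device names, the benchmark scale, and the SOFTWARE_MIN_SCORE cutoff are all hypothetical placeholders; the article does not give the real list or thresholds.

```python
from enum import Enum


class H265Mode(Enum):
    HARDWARE = "hardware"
    SOFTWARE = "software"
    DISABLED = "disabled"


# Hypothetical entries; a real list would be maintained and delivered remotely.
HARDWARE_WHITELIST = frozenset({"example-device-a", "example-device-b"})
SOFTWARE_MIN_SCORE = 5000   # hypothetical benchmark cutoff


def h265_mode(device_model: str, benchmark_score: int) -> H265Mode:
    """Map a device to the H.265 mode it is permitted to use."""
    if device_model in HARDWARE_WHITELIST:
        return H265Mode.HARDWARE
    if benchmark_score >= SOFTWARE_MIN_SCORE:
        return H265Mode.SOFTWARE
    return H265Mode.DISABLED
```

For example, `h265_mode("unknown-device", 1200)` returns `H265Mode.DISABLED` under these placeholder values, so a slow, unlisted device stays on H.264.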
H.265 Capability Negotiation
Whether H.265 encoding is enabled depends on the negotiated H.265 decoding capability of the room. The decision combines user settings, device support, and the capability set distributed by the server.
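Put as code, the decision is a conjunction of the three inputs. The sketch below assumes the room capability set maps keys to sets of codec ids; VIDEO_DECODER_KEY and CODEC_H265 are hypothetical ids chosen for illustration.

```python
from typing import Dict, Set

VIDEO_DECODER_KEY = 2**8   # hypothetical key in the video range
CODEC_H265 = 2             # hypothetical codec id


def should_encode_h265(user_enabled: bool,
                       device_can_encode: bool,
                       room_caps: Dict[int, Set[int]]) -> bool:
    """Enable H.265 encoding only if the user, the device and the room all allow it."""
    room_can_decode = CODEC_H265 in room_caps.get(VIDEO_DECODER_KEY, set())
    return user_enabled and device_can_encode and room_can_decode
```

If a member that cannot decode H.265 joins and the room set narrows, re-evaluating this function is what would move the sender back to H.264.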
CPU Overuse Strategy
For software H.265 encoding, continuous high encoding latency triggers a fallback to H.264; hardware encoding does not track per‑frame latency.
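One way to detect "continuous high encoding latency" is to count consecutive slow frames. The sketch below is an assumption about the mechanism, with made-up defaults: a 33 ms per-frame threshold (roughly the frame interval at 30 fps) and 30 consecutive slow frames. EncodeLatencyWatchdog is a hypothetical name.

```python
class EncodeLatencyWatchdog:
    """Signal a fallback to H.264 after a run of slow software-encoded frames."""

    def __init__(self, slow_frame_ms: float = 33.0,
                 max_consecutive_slow: int = 30) -> None:
        self.slow_frame_ms = slow_frame_ms                # hypothetical threshold
        self.max_consecutive_slow = max_consecutive_slow  # hypothetical run length
        self.consecutive_slow = 0

    def on_frame_encoded(self, encode_ms: float) -> bool:
        """Record one frame's encode time; return True when fallback is due."""
        if encode_ms > self.slow_frame_ms:
            self.consecutive_slow += 1
        else:
            self.consecutive_slow = 0   # a fast frame breaks the run
        return self.consecutive_slow >= self.max_consecutive_slow


watchdog = EncodeLatencyWatchdog(slow_frame_ms=33.0, max_consecutive_slow=3)
results = [watchdog.on_frame_encoded(ms) for ms in (40, 45, 10, 50, 60, 70)]
```

Resetting the counter on any fast frame means isolated spikes, such as a scene change, do not trigger the fallback.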
QP Threshold Adjustment
By aligning subjective quality between H.264 and H.265, QP curves are generated to derive upper and lower QP bounds. Experiments on a Mi 10 show that a QP range of [A‑1, B‑1] yields the best QoE.
Benefits
Compared with H.264, H.265 provides clear visual quality gains, especially at low bitrate or in challenging network conditions, while end‑to‑end latency, CPU usage, and smoothness remain comparable. Android hardware H.265 shows greater quality improvement than iOS.
Conclusion
Both hardware and software H.265 deliver noticeable quality improvements over H.264; future work will focus on further bitrate savings while maintaining visual quality.
