
Zero‑Second Startup: iQIYI Playback Kernel Performance Optimization and 5.0 Architecture

iQIYI’s new Playback Kernel 5.0 introduces a decoupled pre‑decode component that creates a single hardware (or software) decoder and supplies pre‑decoded frames to multiple player instances, cutting start‑up latency from roughly 400 ms to about 35 ms and enabling true “zero‑second” playback across a wide range of Android devices.

iQIYI Technical Product Team

Background : iQIYI’s playback kernel (the “large playback core”) runs on Android Mobile, Android TV, Apple TV, iPhone, iPad, GPad, macOS and Windows PC, and supports live streaming, VOD, ads, membership, VR/AR, interactive video and more. After nine years and four major versions, the kernel is stable for long‑video playback, but new “scroll‑to‑play” scenarios such as short‑video feeds and rapid switching between videos expose its performance limits.

Problem : To achieve “zero‑second” start‑up, the product has to keep 2‑3 player instances alive for pre‑loading and instance switching. This hides the start‑up latency, but it dramatically increases memory and thread usage, which causes noticeable stutter on mid‑ and low‑end devices.

Investigation : The issue is most severe on Android, where hardware capabilities vary widely between devices. Tests show that decoder creation and opening dominate the start‑up latency: ~20 ms on high‑end phones, >350 ms on low‑end phones, and >500 ms on some devices.
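
One way to find the dominant stage is to time each start‑up step separately. The following is a minimal sketch, not iQIYI’s tooling: `StartupProfiler`, `fake_clock` and the stage names are hypothetical, and the clock values in the usage example are made‑up placeholders rather than measurements.

```python
import time
from contextlib import contextmanager


class StartupProfiler:
    """Records how long each named start-up stage takes, in milliseconds."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock   # injectable so the profiler can be tested
        self.stages = {}      # stage name -> elapsed milliseconds

    @contextmanager
    def stage(self, name):
        start = self._clock()
        try:
            yield
        finally:
            self.stages[name] = (self._clock() - start) * 1000.0

    def dominant_stage(self):
        """Name of the stage that took the longest."""
        return max(self.stages, key=self.stages.get)


def fake_clock(ticks):
    """Hypothetical deterministic clock: returns the given seconds in order."""
    it = iter(ticks)
    return lambda: next(it)


# Placeholder timestamps (seconds); a real run would use the default clock
# and wrap the actual demux / decoder-open / render calls.
profiler = StartupProfiler(clock=fake_clock([0.0, 0.010, 0.010, 0.360, 0.360, 0.375]))
with profiler.stage("demux"):
    pass
with profiler.stage("decoder_create_open"):
    pass
with profiler.stage("first_frame_render"):
    pass
print(profiler.dominant_stage())
```

With real calls inside each `with` block, the same structure shows which stage to attack first on a given device.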

Proposed Solutions :

1. Multi‑decoder scheme: create several decoders inside the core and pre‑decode with them, which uses less memory and fewer threads than running multiple player instances. It works on mid‑ and high‑end devices but is still problematic on low‑end ones.

2. Software‑decode scheme: decoder creation takes <20 ms, but CPU usage and power consumption become high, especially for high‑bitrate streams.

3. Live‑playback architecture: open a decoder once and reuse it, leveraging Adaptive Playback on Android. This removes repeated decoder creation but complicates timeline management for VOD, seeking, ads, etc.

All three have advantages and drawbacks; the final approach merges them.
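
The “open once and reuse” idea behind the live‑playback scheme can be illustrated with a toy model. This is a sketch under assumptions, not the kernel’s code: `StreamFormat` and `ReusableDecoder` are hypothetical names, and the rule used here (keep the decoder when only the resolution changes, reopen when the codec changes) is a simplification of what adaptive playback allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamFormat:
    codec: str
    width: int
    height: int


class ReusableDecoder:
    """Toy model of a decoder that is opened once and reused across streams."""

    def __init__(self):
        self.open_count = 0   # how many times the expensive open ran
        self.codec = None
        self.format = None

    def prepare(self, fmt: StreamFormat) -> bool:
        """Make the decoder ready for fmt; return True if a (re)open was needed."""
        if self.codec == fmt.codec:
            # Same codec: keep the decoder open and only record the new format.
            self.format = fmt
            return False
        # First use or a different codec: pay the creation/open cost.
        self.open_count += 1
        self.codec = fmt.codec
        self.format = fmt
        return True


decoder = ReusableDecoder()
first = decoder.prepare(StreamFormat("h264", 1280, 720))    # opens the decoder
second = decoder.prepare(StreamFormat("h264", 1920, 1080))  # reuses it
third = decoder.prepare(StreamFormat("hevc", 1920, 1080))   # must reopen
print(first, second, third, decoder.open_count)
```

The model leaves out exactly the part the article calls hard: once one decoder serves several streams, seeking, ads and VOD timelines all have to be mapped onto it.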

Solution – Playback Kernel 5.0 : The new architecture adds a “pre‑decode” unit that is decoupled from the player instance. The unit creates a hardware decoder once (falling back to software decoding if needed) and supplies pre‑decoded frames to any player instance, so a new playback no longer waits for decoder creation and opening, the most time‑consuming stage of start‑up. This enables “zero‑second” start‑up while keeping API compatibility.

Architecture diagram (originally shown in the source) is omitted here, but the key change is the independent pre‑decode component.
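
To make the decoupling concrete, here is a minimal sketch of a pre‑decode unit shared by several players. All names (`PreDecodeUnit`, `Player`, `DecoderUnavailable`, the factory parameters) are hypothetical, decoders are modelled as plain callables, and only the first frame of each stream is cached; the real component is certainly more involved.

```python
class DecoderUnavailable(Exception):
    """Raised by a decoder factory that cannot create its decoder."""


class PreDecodeUnit:
    """Owns one decoder and hands pre-decoded first frames to any player."""

    def __init__(self, hardware_factory, software_factory):
        try:
            self.decoder = hardware_factory()   # preferred path
            self.kind = "hardware"
        except DecoderUnavailable:
            self.decoder = software_factory()   # fallback path
            self.kind = "software"
        self._first_frames = {}                 # stream id -> decoded frame

    def predecode(self, stream_id, first_packet):
        """Decode the first packet of a stream ahead of playback."""
        self._first_frames[stream_id] = self.decoder(first_packet)

    def take_first_frame(self, stream_id):
        """Return the cached frame, or None if the stream was not pre-decoded."""
        return self._first_frames.pop(stream_id, None)


class Player:
    """Player instance that holds no decoder of its own in this sketch."""

    def __init__(self, unit):
        self.unit = unit

    def start(self, stream_id):
        frame = self.unit.take_first_frame(stream_id)
        return frame if frame is not None else "cold-start"


def failing_hardware():
    raise DecoderUnavailable()   # simulate a device without a usable HW decoder


def software():
    return lambda packet: "frame(" + packet + ")"


unit = PreDecodeUnit(failing_hardware, software)
unit.predecode("video-1", "pkt-1")
player_a, player_b = Player(unit), Player(unit)
print(unit.kind, player_a.start("video-1"), player_b.start("video-2"))
```

Because the players only ask the unit for frames, adding a second or third player in this model adds no decoder, which is the memory and thread saving the architecture is after.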

Performance Test (Qualcomm Snapdragon 450) :

Comparison of total, business‑logic and decode‑render times (ms):

| Version | Total Time | Business Time | Decode/Render Time |
| 4.0 | 395.85 | 72.30 | 323.55 |
| 5.0 | 35.14 | 24.21 | 10.93 |

The 5.0 version reduces total latency from ~396 ms to ~35 ms, meeting the “zero‑second” goal (sub‑50 ms on mid‑low devices, sub‑20 ms on high‑end).
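
The ratios behind these figures can be checked directly from the measured values. The short script below only restates the Snapdragon 450 numbers; the helper names `speedup` and `decode_share` are introduced here for illustration.

```python
# Timings in ms, copied from the Snapdragon 450 comparison.
timings = {
    "4.0": {"total": 395.85, "business": 72.30, "decode_render": 323.55},
    "5.0": {"total": 35.14, "business": 24.21, "decode_render": 10.93},
}


def speedup(column):
    """How many times faster 5.0 is than 4.0 for the given column."""
    return timings["4.0"][column] / timings["5.0"][column]


def decode_share(version):
    """Fraction of total start-up time spent in decode/render."""
    row = timings[version]
    return row["decode_render"] / row["total"]


for column in ("total", "business", "decode_render"):
    print(f"{column}: {speedup(column):.1f}x faster")
for version in timings:
    print(f"{version}: decode/render is {decode_share(version):.0%} of total")
```

Total start‑up is about 11x faster and decode/render about 30x faster, while business logic improves about 3x. Decode/render falls from roughly 82% of the total in 4.0 to roughly 31% in 5.0, so business logic is now the larger share of the remaining latency.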

Conclusion : By integrating multi‑decoder, software‑decode fallback, and live‑playback concepts into a unified pre‑decode architecture, iQIYI achieved a dramatic performance boost across a wide range of devices. The next steps will build on version 5.0 to deliver further innovations.
