
Improving VR Video Clarity: PPD, Tile Encoding, and Future Directions

VR video looks blurry because the pixels‑per‑degree the eye demands far exceed what 4K or even 8K spherical streams can deliver. Tile‑based encoding that decodes only the viewport, combined with low motion‑to‑photon latency, distortion control, advanced codecs, and AI‑driven projection, promises sharper, lower‑bitrate experiences on the path to 6DoF.

iQIYI Technical Product Team

VR content clarity has long been a focus for enhancing user immersion. Many users complain that even 4K or 8K VR videos look worse than a 1080p phone screen, leaving them to wonder whether the headset or the source file is really what it claims to be.

Why is VR video clarity insufficient? The key metric is Pixels Per Degree (PPD): how many pixels fall within one degree of visual angle. A typical smartphone with a retina display held 30 cm away reaches a PPD of about 60, meeting the retina threshold. In VR, however, the user sees only a small portion of a spherical video at any moment, so the same pixel budget is stretched across a much wider field and PPD drops dramatically.
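The retina figure above follows from a little trigonometry. A minimal sketch, where the 326 ppi density is an illustrative assumption (a common "retina" phone value) rather than a figure from the article:

```python
import math

def pixels_per_degree(ppi: float, distance_cm: float) -> float:
    """Pixels that fall within one degree of visual angle on a flat
    screen of pixel density `ppi` viewed from `distance_cm` away."""
    distance_in = distance_cm / 2.54
    # One degree of visual angle spans 2 * d * tan(0.5 deg) on the screen.
    inches_per_degree = 2 * distance_in * math.tan(math.radians(0.5))
    return ppi * inches_per_degree

# A ~326 ppi phone held at 30 cm lands in the ~60-70 PPD range,
# at or above the retina threshold.
print(round(pixels_per_degree(326, 30)))
```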

In traditional 2D screens, a 4K resolution often provides a satisfactory experience because the entire screen is visible. In contrast, VR users see only a viewport (a small “window” on the sphere), so a 4K video distributed over the whole sphere may deliver only ~1K×1K resolution within the viewport, leading to noticeable blur.
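The ~1K figure follows directly from equirectangular geometry: 360 degrees of longitude share the frame's width. A quick check, where the 100° field of view is an assumed, typical headset value:

```python
def viewport_width_px(frame_width_px: int, fov_deg: float) -> float:
    # An equirectangular frame maps 360 degrees of longitude onto
    # frame_width_px pixels, so horizontal PPD = frame_width_px / 360.
    ppd = frame_width_px / 360.0
    return ppd * fov_deg

# A 3840-pixel-wide ("4K") sphere seen through a 100-degree FOV:
print(int(viewport_width_px(3840, 100)))  # about 1K pixels across the view
```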

Tile‑based encoding: Instead of decoding the whole 8K spherical frame, the video is divided into many rectangular tiles (e.g., an 8×8 or 4×4 grid). Only the tiles intersecting the current viewport are decoded and rendered. This reduces decoding load, allows parallel CPU‑GPU scheduling, and can achieve the required PPD with far fewer decoding resources.
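A minimal sketch of the viewport-to-tile mapping, assuming an 8×8 equirectangular tiling and an axis-aligned yaw/pitch approximation of the viewport (a real player would test tiles against the actual view frustum):

```python
import math

def visible_tiles(yaw_deg: float, pitch_deg: float,
                  fov_h: float = 100.0, fov_v: float = 100.0,
                  grid: int = 8) -> set:
    """(row, col) tiles of a grid x grid equirectangular tiling that
    overlap an axis-aligned approximation of the viewport."""
    tile_w = 360.0 / grid            # longitude degrees per tile column
    tile_h = 180.0 / grid            # latitude degrees per tile row
    c0 = math.floor((yaw_deg - fov_h / 2) / tile_w)
    c1 = math.floor((yaw_deg + fov_h / 2) / tile_w)
    lat0 = max(0.0, pitch_deg - fov_v / 2 + 90.0)      # clamp at poles
    lat1 = min(179.999, pitch_deg + fov_v / 2 + 90.0)
    r0, r1 = int(lat0 // tile_h), int(lat1 // tile_h)
    return {(r, c % grid)            # wrap longitude across the 180 seam
            for c in range(c0, c1 + 1)
            for r in range(r0, r1 + 1)}

# Looking straight ahead, only 24 of the 64 tiles need decoding:
print(len(visible_tiles(0.0, 0.0)))
```

The modulo on the column index handles a viewport straddling the longitudinal seam, which is where naive range checks typically break.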

When the user rotates their head, newly visible tiles are decoded on‑the‑fly, while a low‑resolution fallback stream (e.g., 1080p or 2K) is continuously decoded to avoid black‑screen artifacts during rapid motion.
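That fallback behavior reduces to a per-tile texture lookup at render time. A hypothetical sketch, with `decoded_tiles` standing in for whatever high-resolution tiles the decoder has finished and `fallback_frame` for the always-running low-resolution stream:

```python
def pick_texture(tile_id, decoded_tiles, fallback_frame):
    # Use the high-resolution tile if decoding has caught up with the
    # head motion; otherwise sample the low-resolution full-sphere
    # stream so the user never sees a black region.
    return decoded_tiles.get(tile_id, fallback_frame)

textures = {(1, 2): "hires_tile_1_2"}
print(pick_texture((1, 2), textures, "lowres_sphere"))  # hires_tile_1_2
print(pick_texture((3, 5), textures, "lowres_sphere"))  # lowres_sphere
```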

Additional quality factors include Motion‑to‑Photon (MTP) latency, which must stay below ~20 ms to prevent motion sickness, and lens distortion (radial and tangential) that should be limited to about 1 % error. The number of degrees of freedom (DoF) also impacts immersion: 0DoF (static video), 3DoF (360° head rotation only), and 6DoF (full positional tracking) each require different streaming strategies.

Standards such as the Chinese “5G High‑Definition Video – VR Video Technology White Paper (2020)” recommend H.265/AVS2 for 8K VR, with future adoption of H.266/AVS3 to halve bitrate. Projection methods (e.g., cube map, pyramid) can further reduce redundant pixels compared to equirectangular projection.

Future directions include: lowering bitrate via smarter projections and AI‑driven viewport prediction, optimizing decoding pipelines with mixed CPU‑GPU workloads, adaptive tile sizing (larger tiles for the polar regions, which equirectangular projection oversamples), and expanding from 3DoF to 6DoF experiences.

Tags: Streaming, Latency, Video Quality, VR, 8K, PPD, Tile Encoding