Mastering Android Video Encoding: Choosing Encoders and Optimizing YUV Processing

This article examines Android video recording challenges, compares hardware (MediaCodec) and software (FFmpeg + x264/openh264) encoders, highlights device‑specific pitfalls such as color‑format support and alignment, and presents fast NEON‑based algorithms for scaling, rotation, and mirroring of YUV frames.

21CTO

Video is one of the most fragmented and compatibility‑sensitive areas of Android development. Recording a 540p MP4 typically follows this flow: camera outputs YUV frames → preprocessing (scaling, rotation, mirroring) → encoder → H.264 stream. Audio is recorded separately, and the two streams are finally multiplexed into a single file.

Choosing an Encoder

Two main options are used:

MediaCodec – hardware‑accelerated codec API introduced in API 16 (Android 4.1). You create and configure an encoder, feed raw YUV buffers through its input queue, and retrieve encoded H.264 buffers from its output queue. It supports synchronous operation and, from API 21, an asynchronous callback mode.

FFmpeg + x264/openh264 – software encoding. FFmpeg handles frame preprocessing, while x264 (or Cisco’s openh264) performs H.264 encoding.

MediaCodec pitfalls

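Color‑format support varies by device. The encoder must be configured with a MediaFormat that specifies width, height, frame rate, bitrate, I‑frame interval, and the YUV color format. Many devices accept only one or two input layouts, typically a semi‑planar one (NV12, which differs from the camera's default NV21 output only in the order of the U and V bytes) or planar YUV420P. Feeding frames in a layout the encoder does not expect causes color distortion.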
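When the camera delivers NV21 and the encoder only accepts planar YUV420P, the chroma samples have to be rearranged before each frame is queued. The following is a minimal scalar sketch of that conversion, not the article's implementation; the function name nv21_to_i420 is hypothetical, and it assumes tightly packed buffers (stride equal to width) and even dimensions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one NV21 frame (Y plane, then interleaved V,U pairs) into
 * I420 / YUV420P (Y plane, then U plane, then V plane).
 * Assumption: buffers are tightly packed and dimensions are even. */
static void nv21_to_i420(const uint8_t *src, uint8_t *dst, int width, int height)
{
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);

    const size_t y_size = (size_t)width * (size_t)height;
    const size_t chroma_size = y_size / 4;      /* samples per chroma plane */

    const uint8_t *src_vu = src + y_size;
    uint8_t *dst_u = dst + y_size;
    uint8_t *dst_v = dst_u + chroma_size;

    memcpy(dst, src, y_size);                   /* luma is identical in both layouts */

    for (size_t i = 0; i < chroma_size; i++) {
        dst_v[i] = src_vu[2 * i];               /* NV21 stores V first */
        dst_u[i] = src_vu[2 * i + 1];           /* then U */
    }
}
```

Both buffers hold width × height × 3 / 2 bytes. Converting NV21 to NV12 instead only requires swapping each V,U byte pair in place.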

Feature support (profile, level, bitrate mode) is limited on most phones, especially below Android 7.0 where profiles are hard‑coded to Baseline, reducing compression efficiency.

Input dimensions must be 16‑pixel aligned; otherwise some SoCs produce corrupted video.
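One way to respect the alignment requirement is to round the requested width and height to a multiple of 16 before configuring the encoder and sizing the scaler output. The helpers below are an illustrative sketch (the names align_up and align_down are hypothetical); whether to pad up or crop down depends on the rest of the pipeline.

```c
#include <assert.h>

/* Round v up to the next multiple of align; align must be a power of two. */
static int align_up(int v, int align)
{
    assert(v >= 0 && align > 0 && (align & (align - 1)) == 0);
    return (v + align - 1) & ~(align - 1);
}

/* Round v down instead, for callers that prefer cropping to padding. */
static int align_down(int v, int align)
{
    assert(v >= 0 && align > 0 && (align & (align - 1)) == 0);
    return v & ~(align - 1);
}
```

For a 960×540 target, 960 is already a multiple of 16, while 540 becomes 544 when rounded up or 528 when rounded down.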

Software encoder characteristics

x264 offers the best performance and feature set among software encoders. openh264 is free to use (Cisco covers the H.264 patent licensing for its prebuilt binaries) but supports only the Baseline profile and has limited multithreading.

YUV Frame Pre‑processing

Before encoding, YUV frames often need scaling, rotation, and mirroring.

Scaling – using FFmpeg’s sws_scale with SWS_FAST_BILINEAR is slow (≈40 ms per frame on Nexus 6P). A custom “local‑mean” algorithm implemented with NEON reduces scaling time to <5 ms and yields PSNR ≈ 38‑40 dB.
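The source does not spell out the "local‑mean" algorithm. Read as area averaging, each output pixel is the mean of the block of source pixels it covers. The sketch below shows that idea in scalar C for an integer downscale factor on a single 8‑bit plane; the function name and the integer‑factor restriction are assumptions made for illustration, and a NEON version would compute several block sums per instruction rather than one pixel at a time.

```c
#include <assert.h>
#include <stdint.h>

/* Downscale one 8-bit plane by an integer factor: each destination pixel is
 * the rounded mean of the factor x factor source block it covers.
 * Assumption: source width and height are exact multiples of factor. */
static void downscale_mean(const uint8_t *src, int src_w, int src_h, int src_stride,
                           uint8_t *dst, int dst_stride, int factor)
{
    assert(factor > 0 && src_w % factor == 0 && src_h % factor == 0);

    const int dst_w = src_w / factor;
    const int dst_h = src_h / factor;
    const unsigned area = (unsigned)factor * (unsigned)factor;

    for (int y = 0; y < dst_h; y++) {
        for (int x = 0; x < dst_w; x++) {
            unsigned sum = 0;
            for (int dy = 0; dy < factor; dy++) {
                const uint8_t *row = src + (y * factor + dy) * src_stride + x * factor;
                for (int dx = 0; dx < factor; dx++)
                    sum += row[dx];
            }
            /* Add half the divisor so the division rounds to nearest. */
            dst[y * dst_stride + x] = (uint8_t)((sum + area / 2) / area);
        }
    }
}
```

For a planar frame the function is called once per plane, with the chroma planes at half the luma width and height.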

Rotation – rotating 960×540 YUV frames in pure C costs >30 ms per frame. Instead, the rotation matrix can be stored in the MP4 moov.trak.tkhd box, letting the player handle orientation.
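In the ISO base media file format, the tkhd box carries a nine‑entry transformation matrix {a, b, u, c, d, v, x, y, w}, where a, b, c, d, x, y are 16.16 fixed‑point values and u, v, w are 2.30 fixed‑point values. The sketch below fills that matrix for rotations in 90° steps. The function name is hypothetical, the translation terms x and y are left at zero as a simplifying assumption, and serializing the values into the box (big‑endian) is left to the muxer.

```c
#include <assert.h>
#include <stdint.h>

/* Fill the tkhd matrix {a, b, u, c, d, v, x, y, w} for a clockwise display
 * rotation of 0, 90, 180 or 270 degrees.
 * a, b, c, d, x, y are 16.16 fixed point; u, v, w are 2.30 fixed point. */
static void tkhd_rotation_matrix(int degrees, int32_t m[9])
{
    const int32_t one = 0x00010000;             /* 1.0 in 16.16 */
    int32_t a, b, c, d;

    assert(degrees == 0 || degrees == 90 || degrees == 180 || degrees == 270);

    switch (degrees) {
    case 90:  a = 0;    b = one;  c = -one; d = 0;    break;
    case 180: a = -one; b = 0;    c = 0;    d = -one; break;
    case 270: a = 0;    b = -one; c = one;  d = 0;    break;
    default:  a = one;  b = 0;    c = 0;    d = one;  break;  /* identity */
    }

    m[0] = a; m[1] = b; m[2] = 0;
    m[3] = c; m[4] = d; m[5] = 0;
    m[6] = 0; m[7] = 0; m[8] = 0x40000000;      /* w = 1.0 in 2.30 */
}
```

When muxing with MediaMuxer on the Java side, setOrientationHint records the rotation without touching the matrix by hand.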

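Mirroring – front‑camera frames come out horizontally flipped and have to be mirrored back. When rotation is deferred to the player, the stored frame is typically still in the sensor's landscape orientation, so a horizontal flip on screen corresponds to reversing the row order of the stored buffer. A simple NEON‑based mirror that swaps rows in the Y and UV planes processes a 1080×1920 frame in <5 ms.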
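A row‑swapping mirror can be sketched in scalar C as follows, with memcpy standing in for the NEON loads and stores the article refers to. The function names are hypothetical, and the code assumes a tightly packed semi‑planar frame (NV21 or NV12) with even height and rows of at most 4096 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the order of the rows of one plane in place. */
static void flip_rows(uint8_t *plane, int row_bytes, int rows)
{
    uint8_t tmp[4096];                          /* scratch row; caps row_bytes */
    assert(row_bytes > 0 && row_bytes <= (int)sizeof(tmp));

    for (int top = 0, bottom = rows - 1; top < bottom; top++, bottom--) {
        uint8_t *a = plane + (size_t)top * row_bytes;
        uint8_t *b = plane + (size_t)bottom * row_bytes;
        memcpy(tmp, a, (size_t)row_bytes);
        memcpy(a, b, (size_t)row_bytes);
        memcpy(b, tmp, (size_t)row_bytes);
    }
}

/* Flip a tightly packed NV21/NV12 frame: the Y plane has `height` rows and
 * the interleaved chroma plane has height / 2 rows, each `width` bytes wide. */
static void flip_semi_planar(uint8_t *frame, int width, int height)
{
    assert(height > 0 && height % 2 == 0);
    flip_rows(frame, width, height);
    flip_rows(frame + (size_t)width * height, width, height / 2);
}
```

Because whole rows move as units, the interleaved chroma pairs stay intact and no per‑pixel shuffling is needed.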

After encoding the H.264 stream, audio and video are multiplexed into an MP4 file using MediaMuxer, mp4v2, or FFmpeg.
