Mastering Android Video Encoding: Choosing Codecs and Optimizing YUV Processing
This article examines Android video encoding challenges, compares hardware (MediaCodec) and software (FFmpeg+x264/openh264) codecs, and presents fast YUV frame preprocessing techniques—including scaling, rotation, and mirroring—using NEON optimizations to achieve high‑quality, low‑latency recordings.
Android video development is among the most fragmented and compatibility-prone areas of the Android ecosystem; the camera and video-encoding APIs in particular are widely considered some of the hardest to use.
Video Encoder Selection
Most apps that record video need to process each frame individually, so they rarely use MediaRecorder. The two common choices are:
MediaCodec
FFmpeg + x264/openh264
MediaCodec
MediaCodec, introduced in API 16, provides low‑level access to hardware‑accelerated video encoding. After initializing MediaCodec as an encoder, raw YUV frames are fed into the input queue and encoded H.264 streams are retrieved from the output queue.
The API uses two queues: an input queue for raw YUV data and an output queue for the encoded H.264 stream. Both synchronous and asynchronous (callback) modes are available, with callbacks introduced in API 21.
Typical usage (synchronous) involves:
Calling getInputBuffers to obtain the input buffers.
Using dequeueInputBuffer to get a free buffer index.
Submitting raw YUV data with queueInputBuffer.
Retrieving encoded data via getOutputBuffers and dequeueOutputBuffer.
Releasing the output buffer with releaseOutputBuffer.
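The synchronous flow above can be sketched as follows (simplified pseudocode; configuration details, timestamps, error handling, and end-of-stream signaling are omitted):

```
encoder = MediaCodec.createEncoderByType("video/avc")
encoder.configure(format, null, null, CONFIGURE_FLAG_ENCODE)
encoder.start()

loop:
    inIndex = encoder.dequeueInputBuffer(timeoutUs)
    if inIndex >= 0:
        copy one raw YUV frame into inputBuffers[inIndex]
        encoder.queueInputBuffer(inIndex, 0, frameSize, presentationTimeUs, 0)

    outIndex = encoder.dequeueOutputBuffer(bufferInfo, timeoutUs)
    if outIndex >= 0:
        read encoded H.264 data from outputBuffers[outIndex]
        encoder.releaseOutputBuffer(outIndex, false)
```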
More complex examples can be found in the Android CTS test EncodeDecodeTest.java.
Common issues with MediaCodec:
Color format compatibility: The encoder must be configured with a YUV color format it supports. Many devices only support planar YUV420P, while the camera typically outputs semi-planar NV21. Developers must query supported formats via codecInfo.getCapabilitiesForType and convert NV21 to YUV420P when necessary.
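As a reference for the conversion itself, the following is a minimal scalar sketch (the class and method names are illustrative; production code would typically use libyuv or a NEON routine for speed):

```java
// Convert NV21 (Y plane followed by interleaved V,U bytes) to
// I420/YUV420P (Y plane, then U plane, then V plane).
// Scalar reference version for clarity; not optimized.
public class Nv21ToI420 {
    public static byte[] convert(byte[] nv21, int width, int height) {
        int ySize = width * height;
        int uvCount = ySize / 4;                 // number of U (and V) samples
        byte[] i420 = new byte[ySize + 2 * uvCount];
        // The Y plane is identical in both layouts.
        System.arraycopy(nv21, 0, i420, 0, ySize);
        // De-interleave: NV21 stores V,U,V,U,... after the Y plane.
        for (int i = 0; i < uvCount; i++) {
            i420[ySize + i] = nv21[ySize + 2 * i + 1];        // U plane
            i420[ySize + uvCount + i] = nv21[ySize + 2 * i];  // V plane
        }
        return i420;
    }
}
```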
Limited encoder feature support: Settings such as Profile (baseline, main, high), Level, and Bitrate mode (CBR, CQ, VBR) are often ignored on most devices, especially below Android 7.0, leading to lower video quality.
16‑pixel alignment requirement: H.264 macroblocks are 16×16. Encoding video dimensions that are not multiples of 16 (e.g., 960×540) can cause visual artifacts on some SoCs. Aligning width and height to 16‑pixel boundaries solves the problem.
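Rounding up to the next multiple of 16 is a one-liner; a small sketch (helper name is illustrative, and some pipelines prefer to crop down instead of padding up):

```java
// Round a dimension up to the next multiple of 16 so every H.264
// macroblock is fully covered, e.g. 540 -> 544 while 960 stays 960.
public class Align16 {
    public static int align16(int dim) {
        return (dim + 15) & ~15;
    }
}
```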
FFmpeg + x264/openh264
Software encoding using FFmpeg for preprocessing and x264/openh264 for encoding is another popular approach.
x264 is widely regarded as the fastest production-grade H.264 software encoder, with full feature support. openh264, released by Cisco, is free to use but supports fewer advanced features (only the baseline profile, up to level 5.2, and slice-based multithreading).
Hardware vs. Software Encoding
Hardware encoding (MediaCodec) offers speed and no external dependencies but has limited feature support and lower compression efficiency. Software encoding (FFmpeg + x264) is slower but provides higher quality and richer H.264 features. On Android 4.4+, MediaCodec is generally reliable, but capabilities vary across devices, so choosing the encoder based on device performance is recommended.
YUV Frame Pre‑Processing
Before feeding frames to the encoder, YUV data often requires scaling, rotation, and mirroring.
Scaling
When the camera preview is 1080p but the target video is 960×540, real-time scaling is needed. The common method uses FFmpeg's sws_scale with the SWS_FAST_BILINEAR algorithm, but on devices like the Nexus 6P this can take more than 40 ms per frame, limiting the frame rate.
To achieve sub‑5 ms scaling, a custom "local mean" algorithm implemented with NEON instructions was adopted. This method processes four neighboring pixels to compute each output pixel, dramatically reducing processing time while maintaining a PSNR of 38‑40 dB.
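The article does not include the NEON source, but for an exact 2× downscale (1920×1080 → 960×540 per plane) the "local mean" idea reduces to averaging each 2×2 input block. A scalar sketch, with illustrative names (the production version replaces the inner loop with NEON vector loads and pairwise additions):

```java
// Scalar sketch of "local mean" 2x downscaling for one 8-bit plane:
// each output pixel is the rounded average of a 2x2 input block.
public class HalfScale {
    public static byte[] halve(byte[] src, int srcW, int srcH) {
        int dstW = srcW / 2, dstH = srcH / 2;
        byte[] dst = new byte[dstW * dstH];
        for (int y = 0; y < dstH; y++) {
            for (int x = 0; x < dstW; x++) {
                int i = (2 * y) * srcW + 2 * x;   // top-left of the 2x2 block
                int sum = (src[i] & 0xFF) + (src[i + 1] & 0xFF)
                        + (src[i + srcW] & 0xFF) + (src[i + srcW + 1] & 0xFF);
                dst[y * dstW + x] = (byte) ((sum + 2) >> 2); // rounded mean
            }
        }
        return dst;
    }
}
```

For YUV420, the same routine is applied to the Y plane and to each chroma plane at their respective dimensions.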
Rotation
Camera frames may be rotated 90° or 270° depending on device orientation. Instead of rotating YUV data (which can cost more than 30 ms per frame), the rotation matrix can be written into the MP4 moov.trak.tkhd box using FFmpeg, allowing the player to handle rotation without extra CPU work.
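The same metadata-only trick is available from the FFmpeg command line (shown here as an illustrative command, not the article's exact invocation; newer FFmpeg builds expose the same idea via the -display_rotation option):

```shell
# Tag the video stream with a 90-degree rotation instead of rotating pixels.
# -c copy leaves the encoded frames untouched; the player applies the
# rotation matrix stored in the tkhd box.
ffmpeg -i input.mp4 -c copy -metadata:s:v:0 rotate=90 output.mp4
```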
Mirroring
Front-camera footage is typically expected to look mirrored, matching the preview. Mirroring can be performed in place by swapping symmetric pixels around the frame's center line, handling the Y and UV planes separately. A NEON implementation processes a 1080×1920 frame in under 5 ms.
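A scalar sketch of horizontal mirroring for a single 8-bit plane (illustrative names; the NEON version does the same swaps with reversed vector loads and stores):

```java
// Mirror one 8-bit plane in place: within each row, swap pixels
// around the vertical center line.
public class MirrorPlane {
    public static void mirror(byte[] plane, int width, int height) {
        for (int y = 0; y < height; y++) {
            int row = y * width;
            for (int l = 0, r = width - 1; l < r; l++, r--) {
                byte t = plane[row + l];
                plane[row + l] = plane[row + r];
                plane[row + r] = t;
            }
        }
    }
}
```

For NV21's interleaved VU plane, the swap must operate on 2-byte pairs so each chroma sample keeps its V and U bytes together.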
Final Muxing
After encoding the H.264 video stream, the audio and video streams are multiplexed into an MP4 container using MediaMuxer, mp4v2, or FFmpeg.
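With MediaMuxer, the muxing pass follows a simple pattern (simplified pseudocode; track formats come from the encoders' output, and timestamp and end-of-stream handling are omitted):

```
muxer = new MediaMuxer(outputPath, MUXER_OUTPUT_MPEG_4)
videoTrack = muxer.addTrack(videoFormat)   // encoder's actual output format
audioTrack = muxer.addTrack(audioFormat)
muxer.start()
for each encoded sample:
    muxer.writeSampleData(track, encodedBuffer, bufferInfo)
muxer.stop()
muxer.release()
```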
References
Leixiaohua's blog: http://blog.csdn.net/leixiaohua1020
Android MediaCodec resources: http://bigflake.com/mediacodec/
Coding for NEON tutorial: https://community.arm.com/processors/b/blog/posts/coding-for-neon---part-1-load-and-stores
libyuv library: https://chromium.googlesource.com/libyuv/libyuv/
WeChat Client Technology Team
Official account of the WeChat mobile client development team, sharing development experience, cutting‑edge tech, and little‑known stories across Android, iOS, macOS, Windows Phone, and Windows.