
Understanding and Practicing Android MediaCodec for Audio/Video Encoding and Decoding

This article explains the fundamentals of Android MediaCodec, its advantages over software codecs, usage patterns, supported MIME types, practical implementations for video editing and live streaming, and presents a performance comparison showing lower CPU and memory consumption.

58 Tech

Raw audio and video data are extremely large, which makes them hard to store and transmit. Encoding compresses the data as far as possible with minimal loss of quality, and the matching decoding step restores it for playback.
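To put a number on "extremely large", the sketch below computes raw bitrates for the stream parameters quoted in the performance test later in this article (544×960, 15 fps, 44.1 kHz, 16-bit, stereo). It assumes YUV 4:2:0 frames (12 bits per pixel) and assumes the 1500 kbps figure refers to the video stream; the class and method names are illustrative only.

```java
/** Back-of-the-envelope sizes for raw media; names are illustrative, not from any library. */
final class RawMediaSize {
    /** Bits per second of raw video, assuming YUV 4:2:0 (12 bits per pixel). */
    static long rawVideoBitsPerSecond(int width, int height, int fps) {
        // Full-resolution Y plane plus quarter-resolution U and V planes.
        long bytesPerFrame = (long) width * height * 3 / 2;
        return bytesPerFrame * 8 * fps;
    }

    /** Bits per second of raw PCM audio. */
    static long rawAudioBitsPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return (long) sampleRateHz * bitsPerSample * channels;
    }

    /** How many times smaller the encoded stream is than the raw one. */
    static double compressionRatio(long rawBitsPerSecond, long encodedBitsPerSecond) {
        return (double) rawBitsPerSecond / encodedBitsPerSecond;
    }
}
```

Under these assumptions the raw video comes to about 94 Mbps against a 1.5 Mbps encoded target, a reduction of roughly 63 times, and the raw audio alone is about 1.4 Mbps.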

Many of our projects (live streaming, short video, interactive live, video conferencing, one-to-one calls, recording, and playback) rely heavily on codec technology. However, supporting numerous formats (H.264, H.265, VP8, VP9, AAC, Opus, etc.) through external libraries inflates package size and degrades performance.

MediaCodec, provided by Google for Android developers and chipset manufacturers, offers a unified hardware‑accelerated interface that is fast, efficient, low‑CPU, low‑memory, and reduces app size, thereby solving the bloat and performance issues of software codecs.

MediaCodec Usage Tips

MediaCodec processes input data asynchronously using a set of input and output buffers. A typical workflow is: request an empty input buffer, fill it with data, queue it to the codec, wait for the codec to produce a filled output buffer, consume the output, and release the buffer back to the codec.
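The buffer cycle described above can be modeled in plain Java. This is a sketch only: ToyCodec is a hypothetical stand-in that copies bytes instead of encoding them, and its method names merely mirror MediaCodec's. The real calls take extra arguments that the toy omits, for example queueInputBuffer(index, offset, size, presentationTimeUs, flags) and a timeout on the dequeue methods.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for a codec: it copies bytes instead of encoding them. */
final class ToyCodec {
    private final ByteBuffer[] inputs;
    private final ByteBuffer[] outputs;
    private final Deque<Integer> freeInputs = new ArrayDeque<>();
    private final Deque<Integer> freeOutputs = new ArrayDeque<>();
    private final Deque<Integer> readyOutputs = new ArrayDeque<>();

    ToyCodec(int bufferCount, int capacity) {
        inputs = new ByteBuffer[bufferCount];
        outputs = new ByteBuffer[bufferCount];
        for (int i = 0; i < bufferCount; i++) {
            inputs[i] = ByteBuffer.allocate(capacity);
            outputs[i] = ByteBuffer.allocate(capacity);
            freeInputs.add(i);
            freeOutputs.add(i);
        }
    }

    /** Returns the index of an empty input buffer, or -1 if none is free. */
    int dequeueInputBuffer() {
        Integer index = freeInputs.poll();
        if (index == null) return -1;
        inputs[index].clear();
        return index;
    }

    ByteBuffer getInputBuffer(int index) { return inputs[index]; }

    /** Hands a filled input buffer back; the toy processes it immediately. */
    void queueInputBuffer(int index) {
        Integer out = freeOutputs.poll();
        if (out == null) throw new IllegalStateException("no free output buffer; release one first");
        ByteBuffer src = inputs[index];
        src.flip();
        outputs[out].clear();
        outputs[out].put(src);
        outputs[out].flip();
        readyOutputs.add(out);
        freeInputs.add(index); // the input buffer is owned by the codec again
    }

    /** Returns the index of a filled output buffer, or -1 if none is ready. */
    int dequeueOutputBuffer() {
        Integer index = readyOutputs.poll();
        return index == null ? -1 : index;
    }

    ByteBuffer getOutputBuffer(int index) { return outputs[index]; }

    void releaseOutputBuffer(int index) { freeOutputs.add(index); }
}

final class BufferCycleDemo {
    /** Pushes each chunk through the toy codec and collects what comes out. */
    static List<String> run(List<String> chunks) {
        ToyCodec codec = new ToyCodec(2, 64);
        List<String> results = new ArrayList<>();
        for (String chunk : chunks) {
            int inIndex = codec.dequeueInputBuffer();              // 1. request an empty input buffer
            if (inIndex < 0) throw new IllegalStateException("no input buffer available");
            codec.getInputBuffer(inIndex)
                 .put(chunk.getBytes(StandardCharsets.UTF_8));     // 2. fill it with data
            codec.queueInputBuffer(inIndex);                       // 3. queue it to the codec
            int outIndex;
            while ((outIndex = codec.dequeueOutputBuffer()) >= 0) { // 4. take a filled output buffer
                ByteBuffer data = codec.getOutputBuffer(outIndex);
                byte[] bytes = new byte[data.remaining()];
                data.get(bytes);                                   // 5. consume the output
                results.add(new String(bytes, StandardCharsets.UTF_8));
                codec.releaseOutputBuffer(outIndex);               // 6. release it back to the codec
            }
        }
        return results;
    }
}
```

The point of the model is ownership: a buffer index belongs to the client only between a dequeue call and the matching queue or release call. Forgetting step 6 eventually starves the codec of output buffers, which the toy reports as an IllegalStateException.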

MediaCodec can handle three kinds of data: encoded data, raw audio, and raw video. While all three can be managed with ByteBuffer, using a Surface for raw video is strongly recommended because it operates on native video buffers without copying, yielding much higher performance.

Compressed buffers (decoder input and encoder output) contain encoded data, normally a single compressed video frame or a single audio access unit. Raw audio buffers hold whole frames of PCM samples, each either a 16-bit signed integer or a 32-bit float. Raw video buffers in ByteBuffer mode are laid out according to KEY_COLOR_FORMAT, which may be a native raw format, a flexible YUV format, or another specific format.
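Because PCM output may arrive as 16-bit integers while processing code often prefers floats, a conversion step is common. The helper below is a minimal sketch with a hypothetical name (PcmConvert); it assumes the caller has already set the buffer's byte order to match the data, which on a device would typically be ByteOrder.nativeOrder().

```java
import java.nio.ByteBuffer;

/** Illustrative helper, not part of any Android API. */
final class PcmConvert {
    /** Converts 16-bit signed PCM samples to floats in the range [-1.0, 1.0). */
    static float[] pcm16ToFloat(ByteBuffer pcm16) {
        // duplicate() leaves the caller's position untouched but resets the
        // byte order to big-endian, so the original order is copied over.
        ByteBuffer view = pcm16.duplicate().order(pcm16.order());
        float[] samples = new float[view.remaining() / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = view.getShort() / 32768f;
        }
        return samples;
    }
}
```

For interleaved stereo data the returned array keeps the same interleaving (left, right, left, right, and so on).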

MediaCodec has three main states: stopped, executing, and released. The stopped state includes the sub-states uninitialized, configured, and error; the executing state includes flushed, running, and end-of-stream.
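The state groups can be written down as a small state machine. This is a plain-Java model of the synchronous-mode transitions as the Android documentation describes them, not MediaCodec itself; CodecState and CodecStateModel are hypothetical names, and fail() stands in for a codec error.

```java
/** Sub-states grouped as in the text; RELEASED is terminal. */
enum CodecState {
    UNINITIALIZED, CONFIGURED, ERROR,   // sub-states of "stopped"
    FLUSHED, RUNNING, END_OF_STREAM,    // sub-states of "executing"
    RELEASED;

    boolean isStopped() {
        return this == UNINITIALIZED || this == CONFIGURED || this == ERROR;
    }

    boolean isExecuting() {
        return this == FLUSHED || this == RUNNING || this == END_OF_STREAM;
    }
}

/** Illustrative model of the lifecycle; it tracks state only and does no codec work. */
final class CodecStateModel {
    private CodecState state = CodecState.UNINITIALIZED;

    CodecState state() { return state; }

    void configure() { move(state == CodecState.UNINITIALIZED, CodecState.CONFIGURED, "configure"); }

    // Right after start() the codec holds all buffers, hence FLUSHED.
    void start() { move(state == CodecState.CONFIGURED, CodecState.FLUSHED, "start"); }

    // Dequeuing the first input buffer moves the codec to RUNNING.
    void dequeueFirstInputBuffer() { move(state == CodecState.FLUSHED, CodecState.RUNNING, "dequeueFirstInputBuffer"); }

    // Queuing an input buffer carrying the end-of-stream flag.
    void queueEndOfStream() { move(state == CodecState.RUNNING, CodecState.END_OF_STREAM, "queueEndOfStream"); }

    void flush() { move(state.isExecuting(), CodecState.FLUSHED, "flush"); }

    void stop() { move(state.isExecuting(), CodecState.UNINITIALIZED, "stop"); }

    // Stand-in for the codec hitting an error.
    void fail() { move(state != CodecState.RELEASED, CodecState.ERROR, "fail"); }

    // reset() is allowed from any live state and makes the codec configurable again.
    void reset() { move(state != CodecState.RELEASED, CodecState.UNINITIALIZED, "reset"); }

    void release() { state = CodecState.RELEASED; }

    private void move(boolean allowed, CodecState next, String call) {
        if (!allowed) {
            throw new IllegalStateException(call + "() is not valid in state " + state);
        }
        state = next;
    }
}
```

Calling a method in the wrong state throws IllegalStateException, which is also how the real API reports most lifecycle mistakes.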

Three static factory methods are provided:

createDecoderByType(String type)

createEncoderByType(String type)

createByCodecName(String name)

A partial list of supported MIME types:

video/x-vnd.on2.vp8 – VP8 video (WebM)

video/x-vnd.on2.vp9 – VP9 video (WebM)

video/avc – H.264/AVC video

video/hevc – H.265/HEVC video

video/mp4v-es – MPEG‑4 video

video/3gpp – H.263 video

audio/3gpp – AMR narrowband audio

audio/amr-wb – AMR wideband audio

audio/mpeg – MPEG‑1/2 Layer III

audio/mp4a-latm – AAC audio

audio/vorbis – Vorbis audio

audio/g711-alaw – G.711 A‑law audio

audio/g711-mlaw – G.711 µ‑law audio
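All of these MIME types start with either "video/" or "audio/", and that prefix is what code usually checks when picking a track out of a container. The sketch below shows the selection logic on a plain list of strings; TrackSelector is a hypothetical helper. On a device the strings would typically come from each track's MediaFormat via MediaFormat.KEY_MIME.

```java
import java.util.List;

/** Illustrative helper for choosing a track by MIME prefix. */
final class TrackSelector {
    /** Returns the index of the first MIME type starting with the prefix, or -1 if none does. */
    static int firstTrackWithPrefix(List<String> trackMimes, String prefix) {
        for (int i = 0; i < trackMimes.size(); i++) {
            String mime = trackMimes.get(i);
            if (mime != null && mime.startsWith(prefix)) {
                return i;
            }
        }
        return -1;
    }
}
```

For a file whose tracks are audio/mp4a-latm followed by video/avc, the prefix "video/" selects index 1 and "audio/" selects index 0.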

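Since Android Lollipop (API 21), the preferred approach is asynchronous processing: set a callback with setCallback before calling configure. This changes the state transitions slightly: after start the codec moves directly to the running sub-state, and after flush you must call start again before the codec resumes delivering input buffers.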

Technical Practice with MediaCodec

Video Editing Workflow

MediaCodec is often combined with MediaExtractor, MediaMuxer, MediaSync, MediaCrypto, MediaDrm, Image, Surface, and AudioTrack to implement comprehensive audio-video features. A typical editing pipeline consists of:

Initialization

Extract encoded audio/video streams with MediaExtractor

Decode streams using a MediaCodec decoder

Process audio/video as needed

Encode processed streams with a MediaCodec encoder

Package the encoded streams using MediaMuxer (MP4, WebM, or 3GP output)

Release resources

Live Streaming (Push) Workflow

The live‑push pipeline includes image capture, audio capture, processing, image encoding, audio encoding, FLV packaging, and RTMP transmission. Hardware encoding for image and audio is performed via MediaCodec.

Capture audio and video frames

Process captured data

Encode video (H.264) and audio (AAC)

Package into FLV and transmit via RTMP
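For the packaging step, each encoded frame is wrapped in an FLV tag before it goes out over RTMP. The sketch below builds the generic tag framing described in the FLV file format specification: an 11-byte header, the payload, and a trailing 4-byte PreviousTagSize. FlvTag is a hypothetical helper, and it assumes the payload already starts with the audio or video tag header bytes that FLV requires (for example the AVC packet type), which are omitted here.

```java
import java.nio.ByteBuffer;

/** Illustrative FLV tag framing; payload-level headers are the caller's job. */
final class FlvTag {
    static final int TYPE_AUDIO = 8;
    static final int TYPE_VIDEO = 9;
    static final int HEADER_SIZE = 11;

    /** Wraps a payload as: 11-byte tag header, payload, 4-byte PreviousTagSize. */
    static byte[] wrap(int tagType, int timestampMs, byte[] payload) {
        int dataSize = payload.length;
        if (dataSize > 0xFFFFFF) {
            throw new IllegalArgumentException("payload exceeds the 24-bit size field");
        }
        // ByteBuffer is big-endian by default, which is what FLV uses.
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE + dataSize + 4);
        buf.put((byte) tagType);                 // tag type: 8 = audio, 9 = video
        putUint24(buf, dataSize);                // payload size
        putUint24(buf, timestampMs & 0xFFFFFF);  // lower 24 bits of the timestamp
        buf.put((byte) (timestampMs >>> 24));    // extended timestamp: upper 8 bits
        putUint24(buf, 0);                       // stream ID, always 0
        buf.put(payload);
        buf.putInt(HEADER_SIZE + dataSize);      // PreviousTagSize
        return buf.array();
    }

    private static void putUint24(ByteBuffer buf, int value) {
        buf.put((byte) (value >>> 16));
        buf.put((byte) (value >>> 8));
        buf.put((byte) value);
    }
}
```

A 2-byte payload therefore produces a 17-byte tag. Timestamps are in milliseconds and are split across a 24-bit field and an 8-bit extension, which is easy to get wrong when a stream runs longer than about 4.6 hours.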

Encoding Performance Comparison

Scenario: 58 Video app live‑push using MediaCodec vs. OpenH264. Device: Samsung SM‑N9100, Android 6.0. Parameters: 544×960, 1500 kbps, 15 fps, 44.1 kHz, 16‑bit, stereo.

Results (average): MediaCodec – CPU 61.78 %, memory 944 MB; OpenH264 (software encoder) – CPU 66.71 %, memory 1082 MB. MediaCodec averaged 4.93 percentage points less CPU and 138 MB (about 13 %) less memory. Its CPU usage was also more stable, and the advantage was most pronounced when using Surface input.

Conclusion

MediaCodec is a crucial Android multimedia component; proper use enables playback, live streaming, video editing, recording, video calls, and conferencing with clear performance advantages over software codecs. However, it suffers from compatibility and stability issues across devices and OS versions, which can usually be mitigated through careful adaptation.

