Understanding and Practicing Android MediaCodec for Audio/Video Encoding and Decoding
This article explains the fundamentals of Android MediaCodec, its advantages over software codecs, usage patterns, supported MIME types, practical implementations for video editing and live streaming, and presents a performance comparison showing lower CPU and memory consumption.
Raw audio and video data are extremely large, which makes storage and transmission difficult. Encoding compresses the data as much as possible with minimal quality loss, and the corresponding decoding restores it for playback, so that it can be transferred and stored efficiently.
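To give a sense of scale, the sketch below estimates the raw data rate for the test parameters quoted later in this article (544×960 at 15 fps, 44.1 kHz 16-bit stereo). It assumes YUV 4:2:0 video at 12 bits per pixel, which the article does not state; the class and method names are illustrative only.

```java
class RawMediaSize {
    /** Raw video rate, assuming YUV 4:2:0 (1.5 bytes per pixel). */
    static long yuv420BytesPerSecond(int width, int height, int fps) {
        long bytesPerFrame = (long) width * height * 3 / 2;
        return bytesPerFrame * fps;
    }

    /** Raw PCM audio rate for interleaved integer samples. */
    static long pcmBytesPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return (long) sampleRateHz * (bitsPerSample / 8) * channels;
    }

    public static void main(String[] args) {
        long video = yuv420BytesPerSecond(544, 960, 15);
        long audio = pcmBytesPerSecond(44_100, 16, 2);
        System.out.println("raw video: " + video * 8 / 1000 + " kbps"); // 94003 kbps
        System.out.println("raw audio: " + audio * 8 / 1000 + " kbps"); // 1411 kbps
    }
}
```

Under that assumption the raw video alone is roughly 94 Mbps, about 60 times the 1500 kbps encoded bitrate used in the comparison.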
Many of our projects (live streaming, short video, interactive live streaming, video conferencing, one-to-one calls, recording, and playback) rely heavily on codec technology. However, supporting numerous formats (H.264, H.265, VP8, VP9, AAC, Opus, etc.) via external libraries inflates package size and degrades performance.
MediaCodec, provided by Google for Android developers and chipset manufacturers, offers a unified hardware‑accelerated interface that is fast, efficient, low‑CPU, low‑memory, and reduces app size, thereby solving the bloat and performance issues of software codecs.
MediaCodec Usage Tips
MediaCodec processes input data asynchronously using a set of input and output buffers. A typical workflow is: request an empty input buffer, fill it with data, queue it to the codec, wait for the codec to produce a filled output buffer, consume the output, and release the buffer back to the codec.
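The buffer hand-off described above can be modelled in plain Java. ToyCodec below is a hypothetical stand-in, not the Android API: it borrows the names dequeueInputBuffer, queueInputBuffer, dequeueOutputBuffer, and releaseOutputBuffer, but the real MediaCodec methods take additional arguments (timeouts, timestamps, flags), and a real codec transforms the data rather than copying it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of a codec's buffer pools; it only copies input to output. */
class ToyCodec {
    private final ByteBuffer[] inputBuffers;
    private final ByteBuffer[] outputBuffers;
    private final Deque<Integer> freeInputs = new ArrayDeque<>();
    private final Deque<Integer> freeOutputs = new ArrayDeque<>();
    private final Deque<Integer> filledOutputs = new ArrayDeque<>();

    ToyCodec(int count, int capacity) {
        inputBuffers = new ByteBuffer[count];
        outputBuffers = new ByteBuffer[count];
        for (int i = 0; i < count; i++) {
            inputBuffers[i] = ByteBuffer.allocate(capacity);
            outputBuffers[i] = ByteBuffer.allocate(capacity);
            freeInputs.add(i);
            freeOutputs.add(i);
        }
    }

    /** Index of an empty input buffer, or -1 if none is free. */
    int dequeueInputBuffer() { return freeInputs.isEmpty() ? -1 : freeInputs.poll(); }

    ByteBuffer getInputBuffer(int index) {
        inputBuffers[index].clear();
        return inputBuffers[index];
    }

    /** Hands a filled input buffer to the codec, which "processes" it into an output buffer. */
    void queueInputBuffer(int index, int size) {
        Integer out = freeOutputs.poll();
        if (out == null) throw new IllegalStateException("no free output buffer");
        ByteBuffer dst = outputBuffers[out];
        dst.clear();
        for (int i = 0; i < size; i++) dst.put(inputBuffers[index].get(i));
        dst.flip();
        filledOutputs.add(out);
        freeInputs.add(index); // the consumed input buffer returns to the pool
    }

    /** Index of a filled output buffer, or -1 if none is ready. */
    int dequeueOutputBuffer() { return filledOutputs.isEmpty() ? -1 : filledOutputs.poll(); }

    ByteBuffer getOutputBuffer(int index) { return outputBuffers[index]; }

    void releaseOutputBuffer(int index) { freeOutputs.add(index); }
}

class BufferLoopDemo {
    static String roundTrip(String[] chunks) {
        ToyCodec codec = new ToyCodec(2, 64);
        StringBuilder result = new StringBuilder();
        for (String chunk : chunks) {
            byte[] data = chunk.getBytes(StandardCharsets.US_ASCII);
            int in = codec.dequeueInputBuffer();       // 1. request an empty input buffer
            codec.getInputBuffer(in).put(data);        // 2. fill it with data
            codec.queueInputBuffer(in, data.length);   // 3. queue it to the codec
            int out = codec.dequeueOutputBuffer();     // 4. obtain a filled output buffer
            ByteBuffer produced = codec.getOutputBuffer(out);
            byte[] got = new byte[produced.remaining()];
            produced.get(got);                         // 5. consume the output
            result.append(new String(got, StandardCharsets.US_ASCII));
            codec.releaseOutputBuffer(out);            // 6. release the buffer back to the codec
        }
        return result.toString();
    }
}
```

Only two buffers per pool are allocated, yet any number of chunks can pass through, because every buffer is returned to the codec after use. Forgetting step 6 is what typically stalls a real pipeline.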
It can handle three kinds of data: encoded data, raw audio, and raw video. While all can be managed with ByteBuffer, using a Surface for raw video is strongly recommended because it operates on native video buffers without copying, yielding much higher performance.
For decoders, input buffers contain encoded data (e.g., a video frame or an audio segment). Audio buffers hold PCM samples (either 16‑bit integers or 32‑bit floats). Video buffers are defined by KEY_COLOR_FORMAT and may be native raw format, flexible YUV buffers, or other specific formats.
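As an illustration of the two PCM representations, the following sketch converts 16-bit integer samples into floats. It assumes the samples are in native byte order and uses the common convention of dividing by 32768; PcmConvert is an illustrative name, not part of the Android API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

class PcmConvert {
    /** Converts 16-bit signed PCM (native byte order) to floats in [-1.0, 1.0). */
    static float[] pcm16ToFloat(ByteBuffer pcm16) {
        // duplicate() leaves the caller's position and limit untouched
        ShortBuffer samples = pcm16.duplicate().order(ByteOrder.nativeOrder()).asShortBuffer();
        float[] out = new float[samples.remaining()];
        for (int i = 0; i < out.length; i++) {
            out[i] = samples.get(i) / 32768f;
        }
        return out;
    }
}
```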
MediaCodec has three main states: stopped, executing, and released. The stopped state includes the sub-states uninitialized, configured, and error; the executing state includes flushed, running, and end-of-stream.
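These states and sub-states can be written down as a small transition table. The sketch below is a simplified plain-Java model of the synchronous-mode lifecycle as described in the MediaCodec documentation; the enum and method names are illustrative, and transitions into the error sub-state are left out.

```java
import java.util.EnumSet;

class CodecStateModel {
    enum State { UNINITIALIZED, CONFIGURED, ERROR, FLUSHED, RUNNING, END_OF_STREAM, RELEASED }
    enum Action { CONFIGURE, START, DEQUEUE_INPUT, QUEUE_EOS, FLUSH, STOP, RESET, RELEASE }

    /** The three sub-states that make up the executing state. */
    static final EnumSet<State> EXECUTING =
            EnumSet.of(State.FLUSHED, State.RUNNING, State.END_OF_STREAM);

    /** Returns the state reached by applying an action, or throws if it is not allowed. */
    static State next(State s, Action a) {
        if (s == State.RELEASED) throw new IllegalStateException("codec already released");
        switch (a) {
            case RELEASE: return State.RELEASED;       // allowed from any live state
            case RESET:   return State.UNINITIALIZED;  // also recovers from ERROR
            case CONFIGURE:
                if (s == State.UNINITIALIZED) return State.CONFIGURED;
                break;
            case START:
                if (s == State.CONFIGURED) return State.FLUSHED;
                break;
            case DEQUEUE_INPUT:
                if (s == State.FLUSHED || s == State.RUNNING) return State.RUNNING;
                break;
            case QUEUE_EOS:
                if (s == State.RUNNING) return State.END_OF_STREAM;
                break;
            case FLUSH:
                if (EXECUTING.contains(s)) return State.FLUSHED;
                break;
            case STOP:
                if (EXECUTING.contains(s)) return State.UNINITIALIZED;
                break;
        }
        throw new IllegalStateException(a + " is not valid in state " + s);
    }
}
```

For example, applying CONFIGURE, START, DEQUEUE_INPUT, and QUEUE_EOS in order walks from UNINITIALIZED to END_OF_STREAM, while calling START on an unconfigured codec is rejected.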
MediaCodec provides three static factory methods; the first two take a MIME type and the third takes the name of a specific codec component:
createDecoderByType(String type)
createEncoderByType(String type)
createByCodecName(String name)
Supported MIME types include (partial list):
video/x-vnd.on2.vp8 – VP8 video (WebM)
video/x-vnd.on2.vp9 – VP9 video (WebM)
video/avc – H.264/AVC video
video/hevc – H.265/HEVC video
video/mp4v-es – MPEG‑4 video
video/3gpp – H.263 video
audio/3gpp – AMR narrowband audio
audio/amr-wb – AMR wideband audio
audio/mpeg – MPEG‑1/2 Layer III
audio/mp4a-latm – AAC audio
audio/vorbis – Vorbis audio
audio/g711-alaw – G.711 A‑law audio
audio/g711-mlaw – G.711 µ‑law audio
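When iterating over the tracks of a file, a common first step is to tell audio from video by the MIME prefix. The following plain-Java sketch mirrors the list above; MimeTypes is an illustrative helper, not an Android class (on Android the same strings are available as constants on MediaFormat).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MimeTypes {
    /** MIME type to description, in the order listed in this article. */
    static final Map<String, String> KNOWN = new LinkedHashMap<>();
    static {
        KNOWN.put("video/x-vnd.on2.vp8", "VP8 video (WebM)");
        KNOWN.put("video/x-vnd.on2.vp9", "VP9 video (WebM)");
        KNOWN.put("video/avc", "H.264/AVC video");
        KNOWN.put("video/hevc", "H.265/HEVC video");
        KNOWN.put("video/mp4v-es", "MPEG-4 video");
        KNOWN.put("video/3gpp", "H.263 video");
        KNOWN.put("audio/3gpp", "AMR narrowband audio");
        KNOWN.put("audio/amr-wb", "AMR wideband audio");
        KNOWN.put("audio/mpeg", "MPEG-1/2 Layer III");
        KNOWN.put("audio/mp4a-latm", "AAC audio");
        KNOWN.put("audio/vorbis", "Vorbis audio");
        KNOWN.put("audio/g711-alaw", "G.711 A-law audio");
        KNOWN.put("audio/g711-mlaw", "G.711 mu-law audio");
    }

    static boolean isVideo(String mime) { return mime.startsWith("video/"); }

    static boolean isAudio(String mime) { return mime.startsWith("audio/"); }

    /** Returns the description for a listed type, or "unknown" otherwise. */
    static String describe(String mime) { return KNOWN.getOrDefault(mime, "unknown"); }
}
```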
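Since Android 5.0 (Lollipop), the preferred approach is asynchronous processing: set a callback with setCallback before calling configure. This changes the state transitions slightly: after start the codec moves directly to the running sub-state, and after flush you must call start again before it resumes accepting input.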
Technical Practice with MediaCodec
Video Editing Workflow
MediaCodec is often combined with MediaExtractor, MediaMuxer, MediaSync, MediaCrypto, MediaDrm, Image, Surface, and AudioTrack to implement comprehensive audio-video features. A typical editing pipeline consists of:
Initialization
Extract encoded audio/video streams with MediaExtractor
Decode streams using a MediaCodec decoder
Process audio/video as needed
Encode processed streams with a MediaCodec encoder
Package the encoded streams using MediaMuxer (MP4, WebM, or 3GP output)
Release resources
Live Streaming (Push) Workflow
The live‑push pipeline includes image capture, audio capture, processing, image encoding, audio encoding, FLV packaging, and RTMP transmission. Hardware encoding for image and audio is performed via MediaCodec.
Capture audio and video frames
Process captured data
Encode video (H.264) and audio (AAC)
Package into FLV and transmit via RTMP
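For the packaging step, the sketch below lays out a single FLV tag: an 11-byte tag header, the payload, and the trailing 4-byte PreviousTagSize field. It follows the FLV tag layout rather than any code from this project; FlvTagWriter is an illustrative name, and the payload is treated as opaque, so the codec-specific audio and video data headers that FLV requires inside the payload are not built here.

```java
import java.nio.ByteBuffer;

class FlvTagWriter {
    static final int TAG_AUDIO = 8;
    static final int TAG_VIDEO = 9;
    private static final int HEADER_SIZE = 11;

    /** Builds one FLV tag: header, payload, then PreviousTagSize (header + payload length). */
    static byte[] buildTag(int tagType, int timestampMs, byte[] payload) {
        int dataSize = payload.length;
        // ByteBuffer is big-endian by default, which matches FLV's byte order
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE + dataSize + 4);
        buf.put((byte) tagType);                        // TagType
        putUi24(buf, dataSize);                         // DataSize
        putUi24(buf, timestampMs & 0xFFFFFF);           // Timestamp, lower 24 bits
        buf.put((byte) ((timestampMs >>> 24) & 0xFF));  // TimestampExtended, upper 8 bits
        putUi24(buf, 0);                                // StreamID, always 0
        buf.put(payload);
        buf.putInt(HEADER_SIZE + dataSize);             // PreviousTagSize
        return buf.array();
    }

    /** Writes the low 24 bits of v as three big-endian bytes. */
    private static void putUi24(ByteBuffer buf, int v) {
        buf.put((byte) (v >>> 16));
        buf.put((byte) (v >>> 8));
        buf.put((byte) v);
    }
}
```

Note that when streaming over RTMP, the timestamp, type, and length are usually carried in the RTMP message header instead, so this exact byte layout applies mainly when writing an FLV file or stream.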
Encoding Performance Comparison
Scenario: 58 Video app live‑push using MediaCodec vs. OpenH264. Device: Samsung SM‑N9100, Android 6.0. Parameters: 544×960, 1500 kbps, 15 fps, 44.1 kHz, 16‑bit, stereo.
Results (average): MediaCodec – CPU 61.78 %, memory 944 MB; software encoder (OpenH264) – CPU 66.71 %, memory 1082 MB. MediaCodec shows lower and more stable CPU usage (about 4.9 percentage points lower on average) and uses 138 MB (roughly 13 %) less memory, especially when using Surface input.
Conclusion
MediaCodec is a crucial Android multimedia component; proper use enables playback, live streaming, video editing, recording, video calls, and conferencing with clear performance advantages over software codecs. However, it suffers from compatibility and stability issues across devices and OS versions, which can usually be mitigated through careful adaptation.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.