Understanding Hardware Acceleration in Android Applications
Hardware acceleration in Android shifts floating‑point‑intensive UI work from the CPU to the GPU: the CPU builds DisplayLists and the GPU rasterizes them. This allows parallel processing, lets the framework skip rebuilding elements that have not changed, and yields significantly higher frame rates for animations and complex graphics.
In Android mobile app development, “hardware acceleration” is often mentioned, but many developers lack a clear understanding of its low‑level principles and how it affects UI rendering.
The article introduces hardware acceleration from the hardware level up to the Android 6.0 implementation, explaining why leveraging it can improve page performance.
Background of page rendering
Elements are ultimately converted to pixel matrices (e.g., Bitmaps) before being displayed.
Typical UI elements include circles, rounded rectangles, lines, text, vector graphics, Bitmaps, etc.
During drawing, especially animations, operations such as interpolation, scaling, rotation, alpha blending, blur, 3D transforms, physics, and media decoding involve large‑scale floating‑point calculations.
These calculations are logically simple but data‑intensive.
CPU vs GPU structure
CPU: complex control logic, few ALUs, good at serial logic, poor at heavy floating‑point math.
GPU: simple controller, many parallel ALUs, optimized for massive floating‑point operations.
Hardware acceleration works by handing graphics‑heavy computation to the GPU: the work is translated into GPU‑specific instructions instead of being executed on the CPU.
Parallel example – Cascaded adder
A cascaded (tree) adder circuit can sum eight integers in three clock cycles by adding pairs in parallel at each level, whereas a serial CPU loop performs seven additions one after another, each waiting on the previous result.
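The difference can be sketched in plain Java. This is only an illustration of the counting argument, not a model of real hardware: the class name CascadedAdder and both method names are made up for this example, the input is assumed to be non-empty, and the "levels" counter stands in for clock cycles under the assumption that all additions within one level run simultaneously.

```java
import java.util.Arrays;

final class CascadedAdder {
    /** Serial sum. Returns {sum, number of dependent addition steps}. */
    static int[] serialSum(int[] values) {
        int sum = values[0];
        int steps = 0;
        for (int i = 1; i < values.length; i++) {
            sum += values[i]; // each addition waits for the previous one
            steps++;
        }
        return new int[] {sum, steps};
    }

    /** Tree sum. Returns {sum, number of levels}; adds within a level are independent. */
    static int[] treeSum(int[] values) {
        int[] current = Arrays.copyOf(values, values.length);
        int n = current.length;
        int levels = 0;
        while (n > 1) {
            int pairs = n / 2;
            // In hardware, every pair in this loop would be added at the same time.
            for (int i = 0; i < pairs; i++) {
                current[i] = current[2 * i] + current[2 * i + 1];
            }
            if (n % 2 == 1) {
                current[pairs] = current[n - 1]; // odd element passes through unchanged
            }
            n = (n + 1) / 2;
            levels++;
        }
        return new int[] {current[0], levels};
    }
}
```

For eight inputs, serialSum reports seven steps and treeSum reports three levels, matching the figures above. The total number of additions is the same in both cases; the gain comes from performing independent additions at the same time.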
GPU parallel computing example
Adding 1 to every pixel of an image can be done by launching one GPU thread per pixel, showing the scalability of parallel execution.
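As a rough CPU-side analogy (not actual GPU code), the same per-pixel pattern can be written with a parallel stream in Java, where each index plays the role of one GPU thread. The class name PixelIncrement is hypothetical, pixels are assumed to be plain int values, and clamping or overflow handling is left out to keep the sketch short.

```java
import java.util.stream.IntStream;

final class PixelIncrement {
    /** Adds 1 to each pixel independently; index i stands in for a GPU thread id. */
    static int[] addOne(int[] pixels) {
        int[] out = new int[pixels.length];
        IntStream.range(0, pixels.length)
                .parallel()
                // Each index is written exactly once and no element depends on
                // another, so the work can be split across any number of workers.
                .forEach(i -> out[i] = pixels[i] + 1);
        return out;
    }
}
```

Because no pixel depends on any other, the same code works unchanged whether the image has ten pixels or ten million; only the number of workers limits throughput.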
Android hardware acceleration (Android 6.0)
Android UI is built from DisplayList objects that store element properties (position, size, rotation, alpha, etc.). The rendering pipeline is:
Canvas (Java API) → OpenGL (C/C++ library) → driver → GPU.
When hardware acceleration is enabled, the system creates a DisplayListCanvas instead of a regular Canvas. The method isHardwareAccelerated() determines which path is taken.
The drawing recursion (draw → onDraw → dispatchDraw → drawChild) invokes Canvas.drawXxx(). In software mode this performs actual drawing; in hardware‑accelerated mode it builds a DisplayList.
The DisplayList update path (updateDisplayListIfDirty → dispatchGetDisplayList → recreateChildDisplayList) runs only when acceleration is on, allowing the framework to skip rebuilding unchanged parts.
After the DisplayList is ready, ThreadedRenderer.nSyncAndDrawFrame() sends it to the GPU for final rasterization.
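The record-then-replay idea can be illustrated with a toy model in plain Java. Nothing here is Android framework code: ToyDisplayList, Node, updateIfDirty and drawFrame are hypothetical names, draw commands are represented as strings, and the "GPU" is just a list that collects the commands for one frame. The point is only that a node whose content has not changed reuses its cached commands.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyDisplayList {
    /** A toy node that caches recorded draw commands and re-records only when dirty. */
    static final class Node {
        private final String name;
        private final List<String> ops = new ArrayList<>();
        private boolean dirty = true; // nothing recorded yet
        int recordCount = 0;          // how many times the commands were rebuilt

        Node(String name) {
            this.name = name;
        }

        void invalidate() {
            dirty = true;
        }

        /** Loose stand-in for updateDisplayListIfDirty(): rebuild only if needed. */
        List<String> updateIfDirty() {
            if (dirty) {
                ops.clear();
                ops.add("drawRect(" + name + ")");
                ops.add("drawText(" + name + ")");
                recordCount++;
                dirty = false;
            }
            return ops;
        }
    }

    /** Stand-in for one frame: gather every node's commands for replay. */
    static List<String> drawFrame(List<Node> nodes) {
        List<String> frame = new ArrayList<>();
        for (Node node : nodes) {
            frame.addAll(node.updateIfDirty());
        }
        return frame;
    }
}
```

If two nodes are drawn, one of them is invalidated, and a second frame is drawn, only the invalidated node records its commands again; the other contributes its cached list unchanged.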
Code example (excerpt from View class) shows how setAlpha() updates the RenderNode or falls back to invalidation when the view handles alpha itself:
public class View {
    // ...
    public void setAlpha(@FloatRange(from=0.0, to=1.0) float alpha) {
        ensureTransformationInfo();
        if (mTransformationInfo.mAlpha != alpha) {
            mTransformationInfo.mAlpha = alpha;
            if (onSetAlpha((int) (alpha * 255))) {
                // The view handles alpha itself: fall back to invalidation.
                // ...
                invalidate(true);
            } else {
                // Default path: update the alpha property on the RenderNode.
                // ...
                mRenderNode.setAlpha(getFinalAlpha());
                // ...
            }
        }
    }

    // Subclasses return true to apply alpha themselves.
    protected boolean onSetAlpha(int alpha) { return false; }
    // ...
}

Software rendering refresh logic
clipChildren=true (default) limits redraw to the parent’s bounds; setting it false expands the dirty region.
When a view calls invalidate() without animation or layout, the framework marks dirty flags (PFLAG_DIRTY, PFLAG_DIRTY_OPAQUE) and decides whether parent or child needs repainting.
Opaque views only cause their own redraw; translucent views may propagate dirty flags up the hierarchy.
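The flag propagation can be sketched with a simplified toy model. This is not the framework's implementation: ToyDirtyFlags, ToyView and mustRepaintOwnContent are hypothetical names, the integer constants only stand in for PFLAG_DIRTY and PFLAG_DIRTY_OPAQUE, and animation, layout, layers and clipping are ignored. It assumes that an ancestor repaints its own content only when it carries the full dirty flag.

```java
final class ToyDirtyFlags {
    static final int CLEAN = 0;
    static final int DIRTY = 1;        // stand-in for PFLAG_DIRTY
    static final int DIRTY_OPAQUE = 2; // stand-in for PFLAG_DIRTY_OPAQUE

    static final class ToyView {
        final boolean opaque;
        final ToyView parent; // null for the root
        int flag = CLEAN;

        ToyView(boolean opaque, ToyView parent) {
            this.opaque = opaque;
            this.parent = parent;
        }

        /** Marks this view dirty and tells each ancestor what kind of change occurred. */
        void invalidate() {
            flag = DIRTY;
            int propagated = opaque ? DIRTY_OPAQUE : DIRTY;
            for (ToyView p = parent; p != null; p = p.parent) {
                // A full DIRTY flag is never downgraded to DIRTY_OPAQUE.
                if (p.flag != DIRTY) {
                    p.flag = propagated;
                }
            }
        }

        /** In this toy model, only a full DIRTY flag forces a view to repaint itself. */
        boolean mustRepaintOwnContent() {
            return flag == DIRTY;
        }
    }
}
```

With an opaque child, the parent ends up flagged DIRTY_OPAQUE and can skip painting its own content, since the child covers it. With a translucent child, the parent is flagged DIRTY and must repaint what lies underneath.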
Summary
CPU excels at complex control flow; GPU excels at parallel floating‑point math.
UI rendering consists of many DisplayList elements that require heavy floating‑point work.
With hardware acceleration, the CPU builds/updates DisplayLists while the GPU performs the actual graphics computation.
During animations, only necessary DisplayLists are refreshed, greatly improving frame rates.
Choosing simpler DisplayList primitives (e.g., shapes instead of Bitmaps) yields better performance.
References and further reading are listed at the end of the original article.
Meituan Technology Team
