Why Does Android’s RenderThread Crash on TextureView.getBitmap? A Deep Dive and Fix

This article investigates a native RenderThread crash on Android 5‑6 that occurs when TextureView.getBitmap is called before ThreadedRenderer has been initialized, so no EGL surface exists yet. It traces the root cause through source inspection and production logs, and presents a bytecode‑instrumentation fix that sharply reduced the crash rate.

Watermelon Video Tech Team

Background

Earlier versions of the Xigua video app experienced frequent RenderThread crashes on Android 5‑6, appearing as immediate crashes when opening a live room. The issue ranked among the top native crashes in 2022 and became the number‑one crash in 2023, prompting a thorough source‑code investigation.

Basic Information

The crash stack consists solely of calls inside system .so libraries and ends in an abort on RenderThread. The abort originates from CanvasContext::requireSurface, which terminates the process because mEglSurface is EGL_NO_SURFACE.

RenderThread Overview

RenderThread is a singleton thread that continuously processes RenderTask objects from a task queue. The Java side communicates with it through ThreadedRenderer, which creates a native RenderProxy. RenderProxy wraps each bridged call in a RenderTask and posts it to RenderThread.
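
The queueing model described above can be sketched in plain Java. This is only an illustration of the idea: the real RenderThread and RenderProxy are native classes, and the names used here (RenderThreadModel, post, postAndWait) are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model of a RenderThread-style singleton: one worker thread
// drains a FIFO queue, so posted tasks run one at a time, in posting order.
final class RenderThreadModel {
    private static final RenderThreadModel INSTANCE = new RenderThreadModel();

    // A single-thread executor owns exactly one worker and one task queue.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "RenderThreadModel");
        thread.setDaemon(true); // do not keep the JVM alive for this sketch
        return thread;
    });

    private RenderThreadModel() {}

    static RenderThreadModel getInstance() {
        return INSTANCE;
    }

    // Plays the role of the proxy: enqueue a task and return immediately.
    void post(Runnable task) {
        worker.execute(task);
    }

    // Enqueue a task and block until it, and everything queued before it, has run.
    void postAndWait(Runnable task) {
        try {
            worker.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Because every task goes through the same queue, a task can observe state left behind by an earlier task, but it can never interleave with one that is still running.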

Initial Hypothesis

Similar crashes in other apps were caused by a side‑slide framework releasing a TextureView surface, but Xigua’s side‑slide code does not exhibit that behavior, so the cause must be different.

Direct Analysis of the Crash Condition

The abort is triggered when mEglSurface == EGL_NO_SURFACE. mEglSurface is assigned in two places, both inside setSurface, and surface creation is delegated to EglManager::createSurface:

void CanvasContext::setSurface(ANativeWindow* window) {
    if (mEglSurface != EGL_NO_SURFACE) {
        mEglSurface = EGL_NO_SURFACE;
    }
    if (window) {
        // createSurface aborts on failure, so it never returns EGL_NO_SURFACE
        mEglSurface = mEglManager.createSurface(window);
    }
}

EGLSurface EglManager::createSurface(EGLNativeWindowType window) {
    EGLSurface surface = eglCreateWindowSurface(mEglDisplay, mEglConfig, window, nullptr);
    LOG_ALWAYS_FATAL_IF(surface == EGL_NO_SURFACE, "Failed to create EGLSurface for window %p, eglErr = %s", (void*)window, egl_error_str());
    return surface;
}

Two scenarios lead to EGL_NO_SURFACE:

Calling setSurface(nullptr) sets mEglSurface to EGL_NO_SURFACE, after which requireSurface aborts.

Calling setSurface(window) first resets mEglSurface to EGL_NO_SURFACE; if another thread invokes requireSurface before createSurface returns, the abort occurs.

Additionally, the initial value of mEglSurface is EGL_NO_SURFACE, so invoking requireSurface before any setSurface also crashes.
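
The state transitions above can be modelled in a few lines of plain Java. This is a hypothetical sketch, not the native implementation: CanvasContextModel and its members are illustrative names, the exception stands in for the process abort, and its message is a placeholder rather than the real log line.

```java
// Hypothetical plain-Java model of the mEglSurface state machine.
final class CanvasContextModel {
    // Stands in for EGL_NO_SURFACE.
    private static final Object NO_SURFACE = new Object();

    // Like the native field, the surface starts out unset.
    private Object eglSurface = NO_SURFACE;

    // Mirrors setSurface: always reset first, then create if a window is given.
    void setSurface(Object window) {
        eglSurface = NO_SURFACE;
        if (window != null) {
            eglSurface = new Object(); // stands in for createSurface(window)
        }
    }

    // Mirrors requireSurface: the native code aborts the process here;
    // the model throws instead so the failure can be observed.
    void requireSurface() {
        if (eglSurface == NO_SURFACE) {
            throw new IllegalStateException("no surface set (placeholder message)");
        }
    }

    boolean hasSurface() {
        return eglSurface != NO_SURFACE;
    }
}
```

In this model, requireSurface fails on a fresh instance and again after setSurface(null), and succeeds only between a setSurface call with a non-null window and the next reset.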

Why Android 7+ Avoids the Issue

In Android 7.0 the requireSurface method was removed (commit 8afcc769). The new code calls EglManager::initialize() instead, which does not abort when the surface is missing.

Attempted Hooking of requireSurface

Hooking the native requireSurface on Android 6 proved difficult. The function is inlined into createTextureLayer, which leaves no standalone symbol to hook, and the Android 7 change that removed it is entangled with many other modifications.

Systematic Analysis of Possible Causes

Multithreading?

RenderThread is a singleton; all rendering tasks are serialized, so true concurrent access is ruled out.

setSurface(null) Path

ThreadedRenderer’s lifecycle methods (initialize, draw, createTextureLayer, updateSurface, destroy) were examined. initialize guarantees a non‑null surface. draw aborts on EGL errors, but no such errors appeared in the logs. destroy does lead to setSurface(nullptr), but a subsequent requireSurface would then raise a SIGSEGV, not the observed “surface not set” abort.

Premature requireSurface Call

Tracing the Java call chain shows that TextureView.getBitmap ultimately invokes CanvasContext::requireSurface. If TextureView.getBitmap is called before ThreadedRenderer.initialize, mEglSurface is still EGL_NO_SURFACE, which causes the abort.

Online Log Evidence

Production logs captured a stack where LivePlayerWidget.loadSharedPlayer called TextureView.getBitmap during onMeasure, and the corresponding RenderProxy instance had never been initialized. This confirms the premature call hypothesis.

Root Cause

During the first performTraversals (when mFirst=true), onMeasure runs before ThreadedRenderer.initialize. Business code that calls TextureView.getBitmap in onMeasure therefore triggers requireSurface while mEglSurface is still EGL_NO_SURFACE, which leads to the crash on Android 5‑6.
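
The ordering problem can be made concrete with a small plain-Java model. Everything in it is hypothetical (FirstTraversalModel, Phase, the log strings). It also assumes that the layout pass runs after renderer initialization, which the text above does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the first traversal: measure runs before the
// renderer is initialized, layout is assumed to run after it.
final class FirstTraversalModel {
    enum Phase { MEASURE, LAYOUT }

    private boolean rendererInitialized = false;
    private final List<String> log = new ArrayList<>();

    // Stands in for TextureView.getBitmap on Android 5-6.
    private void getBitmap(Phase phase) {
        log.add(phase + ":" + (rendererInitialized ? "ok" : "abort"));
    }

    // Runs one first traversal with a getBitmap call placed in callPhase.
    List<String> run(Phase callPhase) {
        log.add("measure");
        if (callPhase == Phase.MEASURE) {
            getBitmap(Phase.MEASURE);
        }
        log.add("initialize");
        rendererInitialized = true; // stands in for ThreadedRenderer.initialize
        log.add("layout");
        if (callPhase == Phase.LAYOUT) {
            getBitmap(Phase.LAYOUT);
        }
        return log;
    }
}
```

Running the model with the call placed in MEASURE records an abort before "initialize" appears in the log, while the same call placed in LAYOUT succeeds.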

Fix Implementation

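The fix replaces every call to TextureView.getBitmap across the app with a safe wrapper, inserted through bytecode instrumentation. On Android 6 and below, when the crash‑prevention switch is enabled and the view has not been laid out yet (isLaidOut() is false), the wrapper skips the real call and returns null, or the caller's own bitmap for the getBitmap(Bitmap) overload. In every other case, including when the switch is disabled, it forwards to the original method: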

// True when the real getBitmap may be called; SDK_INT > 23 means Android 7.0+.
public static boolean isGetBitmapSafe(TextureView textureView) {
    return Build.VERSION.SDK_INT > 23
            || textureView.isLaidOut()
            || !AppSettings.inst().mFerretSettings.autoFixRequireSurface.enable();
}

@ReplaceMethodInvoke(targetClass = TextureView.class, methodName = "getBitmap", includeOverride = true)
public static Bitmap getBitmapHook(TextureView textureView) {
    return isGetBitmapSafe(textureView) ? textureView.getBitmap() : null;
}

@ReplaceMethodInvoke(targetClass = TextureView.class, methodName = "getBitmap", includeOverride = true)
public static Bitmap getBitmapHook(TextureView textureView, int width, int height) {
    return isGetBitmapSafe(textureView) ? textureView.getBitmap(width, height) : null;
}

@ReplaceMethodInvoke(targetClass = TextureView.class, methodName = "getBitmap", includeOverride = true)
public static Bitmap getBitmapHook(TextureView textureView, Bitmap bitmap) {
    return isGetBitmapSafe(textureView) ? textureView.getBitmap(bitmap) : bitmap;
}

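This approach uses the public isLaidOut() flag instead of reading mFirst through reflection, which would be expensive. For this purpose the two serve the same role: isLaidOut() stays false until the view has completed its first layout pass, which comes after the onMeasure call that triggers the crash.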
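
The guard's decision logic can be checked in isolation with a plain-Java version that takes its three inputs as parameters. This is a sketch for reasoning about the truth table, not the shipped code: GetBitmapGuard and its parameter names are hypothetical, and the Android-specific lookups are replaced by arguments.

```java
// Hypothetical, Android-free restatement of the guard condition.
final class GetBitmapGuard {
    // Android 6.0 is API level 23, the last release the article reports as affected.
    static final int LAST_AFFECTED_SDK = 23;

    // Returns true when the real getBitmap call should be allowed through.
    static boolean isGetBitmapSafe(int sdkInt, boolean isLaidOut, boolean fixEnabled) {
        return sdkInt > LAST_AFFECTED_SDK // newer releases are not affected
                || isLaidOut              // the first layout pass has completed
                || !fixEnabled;           // switch off: behave like the original call
    }
}
```

Only one combination blocks the call: an affected API level, a view that has not been laid out, and the switch turned on. Turning the switch off restores the original behavior, including the crash.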

Fix Results

After a full rollout, crashes related to requireSurface dropped dramatically, with no noticeable impact on performance or user experience. Business metrics such as live‑stream view time increased, confirming the stability and revenue benefits.

Future Considerations

When RenderThread crashes, capturing the main‑thread Java stack alongside the native stack would pinpoint the business code responsible, reducing the need for extensive manual tracing.
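
As a rough illustration of the idea, the Java stack of any live thread can be read with standard APIs. The sketch below uses only java.lang.Thread, and the class and method names (ThreadStackDump, dump, findByName) are hypothetical. On Android the main thread could instead be obtained from Looper.getMainLooper().getThread(). How such a dump would be triggered from a native crash handler is not covered by the source and is left out here.

```java
// Hypothetical helper that formats the Java stack of a given thread.
final class ThreadStackDump {
    // Formats the thread's current stack, one frame per line.
    static String dump(Thread thread) {
        StringBuilder sb = new StringBuilder();
        sb.append("Thread \"").append(thread.getName()).append("\"\n");
        for (StackTraceElement frame : thread.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    // Looks up a live thread by name; returns null if none matches.
    static Thread findByName(String name) {
        for (Thread candidate : Thread.getAllStackTraces().keySet()) {
            if (candidate.getName().equals(name)) {
                return candidate;
            }
        }
        return null;
    }
}
```

Attaching such a dump to the native crash report would show which business code was running on the main thread when RenderThread aborted.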
