Why Does Hybrid Composition Crash with ‘No Surface’ Errors on Android Mini‑Programs?

This article analyzes a recurring crash in Android mini‑programs caused by a canvas synchronization issue in hybrid composition, detailing the stack traces, EGL surface creation failures, cross‑process SurfaceTexture lifecycle problems, and the concrete fix that dramatically reduced “no surface” crashes.

AntTech

Problem Background

In Alipay's Android mini‑program same‑layer rendering, we observed crashes with the abort message “drawRenderNode called on a context with no surface!”. The issue was reproduced in a transportation mini‑program and traced to three “no surface” related issues.

Problem Description

Crash Stack and Reproduction

The abort appears on both Vulkan and non‑Vulkan devices. On Vulkan devices, for example, the stack contains the frame

android::uirenderer::skiapipeline::SkiaVulkanPipeline::getFrame()

and the abort message is either “getFrame() called on a context with no surface!” or “drawRenderNode called on a context with no surface!”. The header of the native crash log is shown below.

pid: 31384  tid: 31499  name: RenderThread  >>> com.eg.android.AlipayGphone <<<
signal 6 (SIGABRT)  code -1 (SI_QUEUE)  fault addr --------
Abort message: 'drawRenderNode called on a context with no surface!'
...

Understanding Hybrid Composition

Hybrid composition (HC) merges content from different rendering pipelines into a single surface. It evolved from “hole‑punch” (1.0/2.0) to the current 3.0 stage, which truly composites native views into the host pipeline’s surface.

What Is Hybrid Composition?

It is a technique that unifies rendering from different contexts (WebView, Flutter, Weex, etc.) onto a common display surface. Any embedded WebView can be considered a hybrid composition scenario.

What Is Same‑Layer Rendering?

Same‑layer rendering is the mini‑program implementation of hybrid composition. Native components are drawn onto a surface that is then composited with Web content, giving the appearance of a single layer.

Root Cause Analysis

The abort originates from an EGLSurface creation failure. The Android Surface.isValid() check only verifies that the native object is non‑null, not that the underlying EGLSurface exists.

When eglCreateWindowSurface() is called, the system attempts to bind an ANativeWindow (derived from the android.view.Surface) to an EGLSurface. The log shows errors such as:

E libEGL : eglCreateWindowSurface: native_window_api_connect (win=0xb400006e71328010) failed (0xffffffed) (already connected to another API?)
E libEGL : eglCreateWindowSurfaceTmpl:835 error 3003 (EGL_BAD_ALLOC)
D HWUI   : SkiaOpenGLPipeline::setSurface failed: this=0xb400006ed01e7540  surface=0xb400006e71328010  error=12291
E HWUI   : window->query failed: No such device (-19) value=0

The failure code 0xffffffed corresponds to ENODEV = "No such device", indicating that the consumer of the buffer (a SurfaceTexture) has already been destroyed.
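
The two numeric codes in the log can be checked by hand. The following is a minimal sketch, not part of the original analysis: the class and method names are illustrative, while the constants are the standard Linux ENODEV value and the EGL_BAD_ALLOC value from the EGL specification. It reads 0xffffffed as a signed 32‑bit integer and prints the HWUI error 12291 in hexadecimal.

```java
public class ErrorCodeDecoder {
    // Raw values copied from the log excerpt above.
    public static final int CONNECT_STATUS = 0xffffffed; // status printed by libEGL
    public static final int HWUI_ERROR = 12291;          // decimal error printed by HWUI

    // Well-known constants: ENODEV from Linux errno.h, EGL_BAD_ALLOC from the EGL spec.
    public static final int ENODEV = 19;
    public static final int EGL_BAD_ALLOC = 0x3003;

    /** Many Android native status codes are negated errno values, so negate to compare. */
    public static int toErrno(int status) {
        return -status;
    }

    public static void main(String[] args) {
        // 0xffffffed interpreted as a signed 32-bit int is -19.
        System.out.println("connect status (signed): " + CONNECT_STATUS);
        System.out.println("matches -ENODEV: " + (toErrno(CONNECT_STATUS) == ENODEV));
        // 12291 decimal is 0x3003, the same code libEGL reports as "error 3003".
        System.out.println("HWUI error (hex): 0x" + Integer.toHexString(HWUI_ERROR));
        System.out.println("matches EGL_BAD_ALLOC: " + (HWUI_ERROR == EGL_BAD_ALLOC));
    }
}
```

Both codes therefore agree with the text messages in the log: the connect call failed with -19 (“No such device”), and HWUI reported the resulting EGL_BAD_ALLOC.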

Thus, the “no surface” message actually means that the render context has no usable EGLSurface: creation failed because the SurfaceTexture consumer behind the window is already gone.

Why Does the Surface Appear Valid?

Surface.isValid() only checks for a non‑null native pointer, not the lifecycle state of the consumer. In this scenario the consumer is destroyed while the producer still attempts to draw, leading to the abort.
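
The gap between “the handle is valid” and “the consumer is alive” can be illustrated with a small plain‑Java model. This is only a sketch under stated assumptions: ConsumerModel and SurfaceModel are hypothetical stand‑ins for the SurfaceTexture and the android.view.Surface, not Android classes, and connect() merely imitates the -19 status seen in the log.

```java
public class SurfaceValidityModel {
    public static final int OK = 0;
    public static final int NO_DEVICE = -19; // -ENODEV, the status seen in the log

    /** Hypothetical stand-in for the consumer side (the SurfaceTexture). */
    public static final class ConsumerModel {
        private boolean released;

        public void release() { released = true; }
        public boolean isReleased() { return released; }
    }

    /** Hypothetical stand-in for the producer-side handle (android.view.Surface). */
    public static final class SurfaceModel {
        private final long nativeHandle = 1L; // stays non-zero until the handle itself is released
        private final ConsumerModel consumer;

        public SurfaceModel(ConsumerModel consumer) { this.consumer = consumer; }

        /** Handle-only check: it never looks at the consumer. */
        public boolean isValid() { return nativeHandle != 0L; }

        /** Imitates connecting a renderer to the window: fails once the consumer is gone. */
        public int connect() { return consumer.isReleased() ? NO_DEVICE : OK; }
    }

    public static void main(String[] args) {
        ConsumerModel consumer = new ConsumerModel();
        SurfaceModel surface = new SurfaceModel(consumer);

        consumer.release(); // the consumer is destroyed first

        // The handle still looks valid, yet connecting to it fails.
        System.out.println("isValid = " + surface.isValid());
        System.out.println("connect = " + surface.connect());
    }
}
```

In this model, a guard such as `if (surface.isValid())` placed before drawing would not prevent the failure, which matches the behavior described above.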

Cross‑Process Surface Lifecycle

The native Surface is created in the GPU process (WebView kernel) and passed to the app process. If the app process is busy and does not process the destroy event from the render process in time, the GPU process may already have released the SurfaceTexture, causing the subsequent draw to fail.

Conclusion

The root cause is EGLSurface creation failure.

The direct cause is that the underlying SurfaceTexture (the consumer of the android.view.Surface) has been destroyed.

When the consumer is gone, every subsequent frame fails, so the system aborts instead of silently dropping frames.

Fix and Verification

Synchronizing the texture lifecycle across processes (for example, with locks or asynchronous waits), so that the consumer is not released while the producer may still draw, eliminates the crash. After a gray release (staged rollout), “no surface” crashes fell by roughly 20,000 incidents per day, and overall stability improved by one point.
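
The idea behind the lock‑based variant can be sketched as follows. This is not the actual Alipay fix, whose code the article does not show: TextureLifecycleGuard is a hypothetical helper, and the sketch collapses the cross‑process handshake into a single process, where a plain lock is enough. A real implementation would need the equivalent guarantee across the IPC boundary. The invariant is that a draw and a release can never overlap, and that no draw starts after the release.

```java
import java.util.concurrent.locks.ReentrantLock;

public class TextureLifecycleGuard {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean released; // guarded by lock

    /**
     * Runs drawFrame only if the texture has not been released.
     * Returns false when the frame is skipped.
     */
    public boolean drawIfAlive(Runnable drawFrame) {
        lock.lock();
        try {
            if (released) {
                return false; // consumer is gone: drop the frame instead of drawing
            }
            drawFrame.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    /**
     * Waits for any in-flight draw on another thread to finish,
     * then runs releaseTexture once. Later calls are no-ops.
     */
    public void release(Runnable releaseTexture) {
        lock.lock();
        try {
            if (released) {
                return;
            }
            released = true;
            releaseTexture.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        TextureLifecycleGuard guard = new TextureLifecycleGuard();
        System.out.println("drawn before release: " + guard.drawIfAlive(() -> { }));
        guard.release(() -> System.out.println("texture released"));
        System.out.println("drawn after release: " + guard.drawIfAlive(() -> { }));
    }
}
```

Holding the lock for the whole draw keeps the release from running in the middle of a frame, at the cost of making the releasing side wait for at most one frame. An asynchronous variant would instead have the releasing side wait for an acknowledgement from the drawing side before freeing the texture.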

Extended Thoughts

The same issue appears in Flutter’s Platform View (Texture Layer Hybrid Composition). Understanding this Android case helps diagnose similar problems in other hybrid composition frameworks.

Tags: crash analysis, Surface, mini-programs, Hybrid Composition, EGL, Android Rendering