Why Does Hybrid Composition Crash with ‘No Surface’ Errors on Android Mini‑Programs?
This article analyzes a recurring crash in Android mini‑programs caused by a canvas synchronization issue in hybrid composition, detailing the stack traces, EGL surface creation failures, cross‑process SurfaceTexture lifecycle problems, and the concrete fix that dramatically reduced “no surface” crashes.
Problem Background
In Alipay's same-layer rendering for Android mini-programs, crashes with the abort message “drawRenderNode called on a context with no surface!” were observed. The crash was reproduced in a transportation mini-program, and the investigation traced it to three related “no surface” issues.
Problem Description
Crash Stack and Reproduction
The abort message appears on both Vulkan and non-Vulkan devices, e.g., android::uirenderer::skiapipeline::SkiaVulkanPipeline::getFrame() followed by “getFrame() called on a context with no surface!” or “drawRenderNode called on a context with no surface!”. The head of the native crash log is shown below.
pid: 31384 tid: 31499 name: RenderThread >>> com.eg.android.AlipayGphone <<<
signal 6 (SIGABRT) code -1 (SI_QUEUE) fault addr --------
Abort message: 'drawRenderNode called on a context with no surface!'
...
Understanding Hybrid Composition
Hybrid composition (HC) merges content from different rendering pipelines into a single surface. It evolved from “hole‑punch” (1.0/2.0) to the current 3.0 stage, which truly composites native views into the host pipeline’s surface.
What Is Hybrid Composition?
It is a technique that unifies rendering from different contexts (WebView, Flutter, Weex, etc.) onto a common display surface. Any embedded WebView can be considered a hybrid composition scenario.
What Is Same‑Layer Rendering?
Same‑layer rendering is the mini‑program implementation of hybrid composition. Native components are drawn onto a surface that is then composited with Web content, giving the appearance of a single layer.
Root Cause Analysis
The abort originates from an EGLSurface creation failure. The Android Surface.isValid() check only verifies that the native object is non‑null, not that the underlying EGLSurface exists.
When eglCreateWindowSurface() is called, the system attempts to bind an ANativeWindow (derived from the android.view.Surface) to an EGLSurface. The log shows errors such as:
E libEGL : eglCreateWindowSurface: native_window_api_connect (win=0xb400006e71328010) failed (0xffffffed) (already connected to another API?)
E libEGL : eglCreateWindowSurfaceTmpl:835 error 3003 (EGL_BAD_ALLOC)
D HWUI : SkiaOpenGLPipeline::setSurface failed: this=0xb400006ed01e7540 surface=0xb400006e71328010 error=12291
E HWUI : window->query failed: No such device (-19) value=0
The failure code 0xffffffed is -19 when read as a signed 32-bit integer, i.e. -ENODEV (“No such device”), which matches the window->query failure on the last log line. It indicates that the consumer of the buffer (a SurfaceTexture) has already been destroyed.
Thus, the “no surface” message actually means the EGLSurface is null because its underlying SurfaceTexture consumer is gone.
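The codes in the log lines above can be cross-checked with a little arithmetic. The sketch below is only a reading aid: the class and method names are hypothetical, and the two constants are the standard Linux errno value for ENODEV and the EGL-defined value of EGL_BAD_ALLOC.

```java
// Hypothetical helper for decoding the numeric codes seen in the log excerpt.
final class EglLogCodes {
    // Linux errno for "No such device".
    static final int ENODEV = 19;
    // EGL_BAD_ALLOC as defined by the EGL specification.
    static final int EGL_BAD_ALLOC = 0x3003;

    /** Reinterprets a 32-bit hex string such as "0xffffffed" as a signed int. */
    static int toSignedStatus(String hex) {
        String digits = hex.startsWith("0x") ? hex.substring(2) : hex;
        // parseUnsignedInt keeps the bit pattern, so 0xffffffed becomes -19.
        return Integer.parseUnsignedInt(digits, 16);
    }

    /** Formats a decimal error code the way EGL headers write it. */
    static String toHex(int code) {
        return "0x" + Integer.toHexString(code);
    }

    public static void main(String[] args) {
        int status = toSignedStatus("0xffffffed");
        System.out.println(status);                       // -19
        System.out.println(status == -ENODEV);            // true
        System.out.println(toHex(12291));                 // 0x3003
        System.out.println(12291 == EGL_BAD_ALLOC);       // true
    }
}
```

Read this way, the three log lines tell one story: the connect call fails with -ENODEV, libEGL reports EGL_BAD_ALLOC (0x3003), and HWUI prints the same EGL error in decimal as error=12291.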
Why Does the Surface Appear Valid?
Surface.isValid() only checks for a non-null native pointer, not the lifecycle state of the consumer. In this scenario the consumer is destroyed while the producer still attempts to draw, leading to the abort.
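The gap between “valid” and “usable” can be illustrated with a toy model. This is not the Android implementation: ModelSurface and ModelConsumer are hypothetical stand-ins for android.view.Surface and its SurfaceTexture consumer, and the -19 return value simply mirrors the status seen in the log.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy stand-in for the consumer side (e.g. a SurfaceTexture). Hypothetical.
final class ModelConsumer {
    private final AtomicBoolean released = new AtomicBoolean(false);

    void release() {
        released.set(true);
    }

    boolean isReleased() {
        return released.get();
    }
}

// Toy stand-in for the producer-side handle (e.g. android.view.Surface). Hypothetical.
final class ModelSurface {
    private final long nativeObject;
    private final ModelConsumer consumer;

    ModelSurface(long nativeObject, ModelConsumer consumer) {
        this.nativeObject = nativeObject;
        this.consumer = consumer;
    }

    /** Mirrors the check described above: only the local handle is inspected. */
    boolean isValid() {
        return nativeObject != 0;
    }

    /** Models connecting to the buffer queue: 0 on success, -19 if the consumer is gone. */
    int connect() {
        return consumer.isReleased() ? -19 : 0;
    }
}
```

In this model, calling consumer.release() leaves surface.isValid() returning true while surface.connect() starts returning -19, which is the combination the render thread runs into before it aborts.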
Cross‑Process Surface Lifecycle
The native Surface is created in the GPU process (WebView kernel) and passed to the app process. If the app process is busy and does not process the destroy event from the render process in time, the GPU process may already have released the SurfaceTexture, causing the subsequent draw to fail.
Conclusion
The root cause is EGLSurface creation failure.
The direct cause is that the underlying SurfaceTexture (the consumer of the android.view.Surface) has been destroyed.
When the consumer is gone, every subsequent frame fails, so the system aborts instead of silently dropping frames.
Fix and Verification
Properly synchronizing the texture lifecycle across processes (e.g., with locks or asynchronous waits) eliminates the crash. After a gray release (staged rollout), “no surface” crashes dropped by about 20,000 incidents per day, and overall stability improved by one point.
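The article does not show the actual patch, so the following is only a minimal in-process sketch of the lock-based idea, with hypothetical names throughout (TextureLifecycleGate, tryDraw, release). A real fix would also need the cross-process handshake, which is not modeled here. The point is that drawing and teardown take the same lock, so a release can never interleave with a frame, and frames after release are skipped instead of reaching a dead consumer.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical gate that serializes drawing against texture teardown.
final class TextureLifecycleGate {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean released = false;
    private int framesDrawn = 0;

    /** Producer side: runs drawFrame only while the texture is alive. */
    boolean tryDraw(Runnable drawFrame) {
        lock.lock();
        try {
            if (released) {
                return false; // skip the frame instead of drawing to a dead consumer
            }
            drawFrame.run();
            framesDrawn++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Consumer side: waits for any in-flight draw, then tears down exactly once. */
    void release(Runnable destroyTexture) {
        lock.lock();
        try {
            if (released) {
                return; // already torn down; ignore duplicate destroy events
            }
            released = true;
            destroyTexture.run();
        } finally {
            lock.unlock();
        }
    }

    int framesDrawn() {
        lock.lock();
        try {
            return framesDrawn;
        } finally {
            lock.unlock();
        }
    }
}
```

A caller would wrap each frame in gate.tryDraw(...) and route the destroy event through gate.release(...). Holding the lock for the whole draw is the simplest choice; it trades some teardown latency for the guarantee that no frame straddles the release.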
Extended Thoughts
The same issue appears in Flutter’s Platform View (Texture Layer Hybrid Composition). Understanding this Android case helps diagnose similar problems in other hybrid composition frameworks.
