
How Alipay’s AR ‘Air Writing’ Brings Fortune Characters to Life in Mini‑Games

This article details how Alipay's 2022 'Air Writing' AR feature was built for the Five‑Fortune project: the technical challenges, the front‑end stack (Vite, Oasis Engine, ARSession, and WebGL shaders), the implementation steps for drawing, vectorizing, and extruding characters, playback handling, and memory and performance optimizations.

Alipay Experience Technology

Background

AR technology has become popular in gaming, medicine, transportation, fashion, and education, but its use in interactive marketing is still limited. The Alipay Five‑Fortune project introduced an AR "air writing" feature to let users create and place fortune characters in real space.

Technical Constraints

Hardware limits: H5‑based interactive marketing would ideally use WebXR, but the standard is still immature, so only native ARKit/ARCore capabilities are available.

High development cost: requires coordination of front‑end, client, and algorithm teams.

Performance bottlenecks: AR and AI workloads heavily load CPU and GPU.

Technology Stack

Alipay Mini‑Game Container: provides rendering via a wrapped OpenGL API, exposing native capabilities through JSAPI. It does not implement a DOM, so the GUI is drawn with WebGL.

ARSession: abstracts AR capabilities from ARCore/ARKit and device sensors.

Oasis Engine: a game engine that simplifies 3D development and bridges H5 APIs to the mini‑game container.

Development Process

1. Create Front‑End Scaffold

Vite is used for fast transpiling and bundling. A custom mini‑game scaffold supports TypeScript and NPM packages (see create‑oasis‑app).
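A minimal Vite configuration for a scaffold of this kind might look like the sketch below. This is illustrative only: the actual settings used by create‑oasis‑app are internal to that tool, and every option here is an assumption about what a mini‑game build would reasonably need.

```javascript
// vite.config.js — illustrative sketch, not the actual scaffold config.
import { defineConfig } from "vite";

export default defineConfig({
  build: {
    target: "es2015",      // mini-game runtimes may not support newer syntax
    sourcemap: true,
    rollupOptions: {
      output: {
        format: "iife",    // a single self-executing bundle is easiest to load in the container
      },
    },
  },
});
```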

2. Adapt to Mini‑Game Container

The adapter normalizes API differences, allowing code such as new Image() to work in both web and mini‑program environments.
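As a rough sketch of what such an adapter does, the shim below backs an Image‑like class with a container‑native loader. The loader is injected as a parameter here because the real container JSAPI is not public; the actual adapter covers far more API surface than this.

```javascript
// Sketch of an environment-adapter shim for `new Image()`.
// `containerLoadImage` stands in for the mini-game container's native
// image-loading JSAPI (a hypothetical name, injected for illustration).
function createImageShim(containerLoadImage) {
  return class AdaptedImage {
    constructor() {
      this.onload = null;
      this.onerror = null;
      this.width = 0;
      this.height = 0;
      this._src = "";
    }
    get src() {
      return this._src;
    }
    set src(url) {
      this._src = url;
      // Delegate to the container's loader instead of the (missing) DOM.
      containerLoadImage(url)
        .then(({ width, height }) => {
          this.width = width;
          this.height = height;
          if (this.onload) this.onload();
        })
        .catch((err) => {
          if (this.onerror) this.onerror(err);
        });
    }
  };
}
```

Application code can then keep writing `new Image()` unchanged; only the adapter knows which environment it is running in.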

3. Render Camera Background

ARSession supplies per‑frame camera data. A full‑screen plane is created and a YUV‑to‑RGB shader converts the camera texture to RGB for display.

const yuv_vs = `
attribute vec3 POSITION;
attribute vec2 TEXCOORD_0;

uniform mat4 u_uvMatrix;
varying vec2 v_uv;

void main() {
  vec2 flipUV = TEXCOORD_0;
  flipUV.y = 1.0 - flipUV.y;
  v_uv = (u_uvMatrix * vec4(flipUV, 1.0, 1.0)).xy;
  gl_Position = vec4( POSITION.xy, 1.0, 1.0);
}
`;

const yuv_fs_Android = `
uniform sampler2D u_frameY;
uniform sampler2D u_frameUV;
varying vec2 v_uv;

void main() {
  float y = texture2D(u_frameY, v_uv).a;
  vec4 uvColor = texture2D(u_frameUV, v_uv);
  float u = uvColor.a - 0.5;
  float v = uvColor.r - 0.5;

  float r = y + 1.13983 * v;
  float g = y - 0.39465 * u - 0.58060 * v;
  float b = y + 2.03211 * u;

  gl_FragColor = vec4(r, g, b, 1.0);
}
`;

const yuv_fs_iOS = `
uniform sampler2D u_frameY;
uniform sampler2D u_frameUV;
varying vec2 v_uv;

void main() {
  float y = texture2D(u_frameY, v_uv).a;
  vec4 uvColor = texture2D(u_frameUV, v_uv);
  float u = uvColor.r - 0.5;
  float v = uvColor.a - 0.5;

  float r = y + 1.04 * v;
  float g = y - 0.343 * u - 0.711 * v;
  float b = y + 1.765 * u;

  gl_FragColor = vec4(r, g, b, 1.0);
}
`;
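The shader math above can be sanity-checked on the CPU. The helper below applies the same BT.601-style conversion as the Android fragment shader, with normalized inputs in [0, 1] and u/v re-centered around zero; it is a reference sketch, not part of the production pipeline.

```javascript
// CPU reference for the Android shader's YUV -> RGB conversion
// (same coefficients as yuv_fs_Android). Inputs in [0, 1].
function yuvToRgb(y, u, v) {
  const uc = u - 0.5; // re-center chroma around zero, as in the shader
  const vc = v - 0.5;
  const clamp01 = (x) => Math.min(1, Math.max(0, x));
  return [
    clamp01(y + 1.13983 * vc),
    clamp01(y - 0.39465 * uc - 0.5806 * vc),
    clamp01(y + 2.03211 * uc),
  ];
}
```

Neutral chroma (u = v = 0.5) maps any luma value to a gray pixel, which is a quick way to verify the coefficients are wired up correctly.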

Each frame updates the u_uvMatrix, u_frameY, and u_frameUV uniforms via the onARFrame callback.

myARSession.onARFrame(arframe => {
  updateBackgroundScene(arframe);
});
function updateBackgroundScene(arframe) {
  const w = arframe.width;
  const h = arframe.height;
  const len = w * h;
  if (len <= 0) return;

  // Update u_uvMatrix
  if (arframe.capturedImageMatrix) {
    const matrix = bgMaterial.shaderData.getMatrix("u_uvMatrix");
    matrix.setValueByArray(arframe.capturedImageMatrix);
  }

  // Update textures
  if (arframe.capturedImage) {
    const cameraFrame = arframe.capturedImage;
    let textureFrameY = bgMaterial.shaderData.getTexture("u_frameY");
    let textureFrameUV = bgMaterial.shaderData.getTexture("u_frameUV");

    if (!textureFrameY) {
      textureFrameY = new Texture2D(engine, w, h, TextureFormat.Alpha8, false);
      textureFrameY.wrapModeU = textureFrameY.wrapModeV = TextureWrapMode.Clamp;
      bgMaterial.shaderData.setTexture("u_frameY", textureFrameY);
    }
    if (!textureFrameUV) {
      textureFrameUV = new Texture2D(engine, w / 2, h / 2, TextureFormat.LuminanceAlpha, false);
      textureFrameUV.wrapModeU = textureFrameUV.wrapModeV = TextureWrapMode.Clamp;
      bgMaterial.shaderData.setTexture("u_frameUV", textureFrameUV);
    }
    textureFrameY.setPixelBuffer(new Uint8Array(cameraFrame, 0, len));
    textureFrameUV.setPixelBuffer(new Uint8Array(cameraFrame, len));
  }
}

4. Implement Air Writing

The workflow consists of two chains: writing and playback. The writing chain uses a 2D vectorized brush, extrudes the geometry, and adds stickers (glTF models), with a custom DragComponent handling pointer events.

Brush to Vector

We ported the shodo library to the mini‑game container for brush rendering, then used potrace (the C++ version, called via JSAPI) to convert bitmap strokes to vector paths.

Stickers

Stickers are glTF models loaded at runtime; each sticker receives a DragComponent that implements onPointerDown and onPointerDrag for interactive placement.
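An engine-agnostic sketch of that drag logic is shown below. The real component extends Oasis Engine's Script class and uses the camera for coordinate conversion; here the screen-to-world mapping is reduced to a caller-supplied function so the core idea (grab offset on pointer down, follow the pointer on drag) stands alone.

```javascript
// Minimal sketch of DragComponent. `screenToWorld` is a stand-in for
// the camera's screen-to-world-point conversion in the real engine.
class DragComponent {
  constructor(entityPosition, screenToWorld) {
    this.position = entityPosition; // { x, y, z } of the sticker entity
    this.screenToWorld = screenToWorld;
    this._grabOffset = null;
  }
  onPointerDown(pointer) {
    const p = this.screenToWorld(pointer.x, pointer.y);
    // Remember where on the sticker the user grabbed it, so it
    // doesn't snap its center to the finger.
    this._grabOffset = { x: this.position.x - p.x, y: this.position.y - p.y };
  }
  onPointerDrag(pointer) {
    if (!this._grabOffset) return;
    const p = this.screenToWorld(pointer.x, pointer.y);
    this.position.x = p.x + this._grabOffset.x;
    this.position.y = p.y + this._grabOffset.y;
  }
}
```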

Transition Effects

After 2D drawing, the character transitions to 3D using an orthographic‑to‑perspective interpolation, keeping size consistent while applying a Lottie animation for visual flair.
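One way to keep the character's on-screen size constant during that switch is to place the perspective camera at the distance where its frustum matches the orthographic view. This is a sketch of the underlying math only, assuming a vertical field of view fovY and an orthographic half-height orthoSize; the actual transition in the project may blend projection matrices differently.

```javascript
// For a perspective camera to show the same vertical extent as an
// orthographic camera of half-height `orthoSize`, it must sit at
// distance d where orthoSize = d * tan(fovY / 2).
function matchingDistance(orthoSize, fovYDegrees) {
  const halfFov = ((fovYDegrees * Math.PI) / 180) / 2;
  return orthoSize / Math.tan(halfFov);
}
```

Interpolating toward that distance while fading between the two projections keeps the drawn character visually stable through the transition.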

Playback Implementation

Stroke data is recorded as a three‑level array (character → stroke → point) together with brush metadata and canvas size. Example structure:

{
  char: [
    [[x, y, timestamp], ...],
    ...
  ],
  brush: { icon, extInfo },
  canvas2D: { width, height }
}
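Playback can then replay the recorded points against their timestamps. The helper below is a minimal sketch assuming the [x, y, timestamp] point layout above, with timestamps relative to the start of the character.

```javascript
// Given the recorded strokes and the elapsed playback time in ms,
// return every point whose timestamp has been reached. Each frame,
// the renderer redraws the visible prefix of each stroke.
function pointsToDraw(strokes, elapsedMs) {
  const visible = [];
  for (const stroke of strokes) {
    const shown = stroke.filter(([, , t]) => t <= elapsedMs);
    if (shown.length > 0) visible.push(shown);
  }
  return visible;
}
```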

Sticker data is stored as an array of [url, x, y] entries.

Data Thinning

Points within 10 pixels of the previously kept point are dropped (except a stroke's final point) to keep the payload under ~10 KB.

// Euclidean distance between two points.
function length(x1, y1, x2, y2) {
  return Math.hypot(x2 - x1, y2 - y1);
}

for (let i = 0; i < charData.length; i++) {
  const originalStroke = charData[i];
  let last = null;
  charData[i] = [];
  for (let j = 0; j < originalStroke.length; j++) {
    const point = originalStroke[j];
    if (last) {
      const l = length(point[0], point[1], last[0], last[1]);
      // Drop points closer than 10px, but always keep a stroke's last point.
      if (l < 10 && j !== originalStroke.length - 1) continue;
    }
    charData[i].push(point);
    last = point;
  }
}

Memory & Performance Optimizations

Peak memory should stay below 200 MB. Optimizations include:

Reducing main canvas resolution based on device capability (scale factors 0.8, 0.6, 0.5).

Lowering AR camera resolution to match the canvas.

Optimizing request handling: bypass base64 conversion by creating the ArrayBuffer directly in C++ and exposing it via JSBinding.

Canvas‑to‑GPU upload: use dirty‑flag checks to avoid unnecessary texture uploads.

Video recording resolution: limit to 720p or 540p depending on device.
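The dirty-flag idea from the canvas-to-GPU item can be sketched as a thin wrapper around the upload call. The names here are hypothetical; in the real pipeline the check lives inside the container's canvas-to-texture path.

```javascript
// Sketch of dirty-flag gating for canvas -> GPU texture uploads.
// `uploadFn` stands in for the actual texture upload; it runs only
// when the canvas content has changed since the last upload.
function createDirtyUploader(uploadFn) {
  let dirty = true;
  return {
    markDirty() {
      dirty = true; // call whenever the canvas is redrawn
    },
    uploadIfNeeded(canvas) {
      if (!dirty) return false; // skip the expensive GPU upload
      uploadFn(canvas);
      dirty = false;
      return true;
    },
  };
}
```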

Conclusion

The AR air‑writing feature attracted many users and demonstrated a complete pipeline from brush input to 3D rendering, playback, and efficient resource usage. Ongoing work will continue to refine the AR toolchain and explore new interactive experiences.
