How Alipay’s AR ‘Air Writing’ Brings Fortune Characters to Life in Mini‑Games

This article describes how Alipay built the 2022 'Air Writing' AR feature for the Five-Fortune project. It covers the technical constraints, the front-end stack (Vite, Oasis Engine, ARSession, and WebGL shaders), the implementation of drawing, vectorizing, and extruding characters, playback handling, and memory and performance optimizations.

Alipay Experience Technology

Background

AR is now common in gaming, medicine, transportation, fashion, and education, but it is still rarely used for interactive marketing. The Alipay Five-Fortune project introduced an AR "air writing" feature that lets users write fortune characters and place them in real space.

Technical Constraints

Hardware limits: H5-based interactive marketing would normally build on WebXR, but WebXR is still immature, so AR capability has to come from the native ARKit and ARCore SDKs.

High development cost: requires coordination of front‑end, client, and algorithm teams.

Performance bottlenecks: AR and AI workloads heavily load CPU and GPU.

Technology Stack

Alipay Mini-Game Container: provides rendering via a wrapped OpenGL API and exposes native capabilities through JSAPI. It does not implement a DOM, so the GUI is drawn with WebGL.

ARSession: abstracts AR capabilities from ARCore/ARKit and device sensors.

Oasis Engine: a game engine that simplifies 3D development and bridges H5 APIs to the mini-game container.

Development Process

1. Create Front‑End Scaffold

Vite is used for fast transpiling and bundling. A custom mini‑game scaffold supports TypeScript and NPM packages (see create‑oasis‑app).

2. Adapt to Mini‑Game Container

The adapter normalizes API differences, so that web-style code such as new Image() runs unchanged in both the browser and the mini-game container.
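
The idea can be illustrated with a minimal sketch. This is not the actual adapter: ImageLike, HostEnv, and installImageAdapter are hypothetical names, and the container's image factory is modelled as a plain createImage function supplied by the caller, which is an assumption about how the host exposes it.

```typescript
// Hypothetical minimal shape of an image object shared by both environments.
interface ImageLike {
  src: string;
  onload: (() => void) | null;
}

// Hypothetical description of a host: a global scope plus an optional
// container-provided image factory (assumed name: createImage).
interface HostEnv {
  globalScope: { Image?: new () => ImageLike };
  createImage?: () => ImageLike;
}

// Installs an Image constructor on the global scope when the host lacks one,
// so that `new Image()` resolves in both environments.
function installImageAdapter(env: HostEnv): void {
  if (env.globalScope.Image) return; // browser-like host: nothing to patch
  const factory = env.createImage;
  if (!factory) throw new Error("no image factory available");
  // A constructor function that returns an object makes `new` yield that object.
  env.globalScope.Image = (function () {
    return factory();
  }) as unknown as new () => ImageLike;
}
```

Application code keeps calling new Image(); only the adapter knows which environment it is running in.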

3. Render Camera Background

ARSession supplies per‑frame camera data. A full‑screen plane is created and a YUV‑to‑RGB shader converts the camera texture to RGB for display.

const yuv_vs = `
attribute vec3 POSITION;
attribute vec2 TEXCOORD_0;

uniform mat4 u_uvMatrix;
varying vec2 v_uv;

void main() {
  vec2 flipUV = TEXCOORD_0;
  flipUV.y = 1.0 - flipUV.y;
  v_uv = (u_uvMatrix * vec4(flipUV, 1.0, 1.0)).xy;
  gl_Position = vec4( POSITION.xy, 1.0, 1.0);
}
`;

const yuv_fs_Android = `
uniform sampler2D u_frameY;
uniform sampler2D u_frameUV;
varying vec2 v_uv;

void main() {
  float y = texture2D(u_frameY, v_uv).a;
  vec4 uvColor = texture2D(u_frameUV, v_uv);
  float u = uvColor.a - 0.5;
  float v = uvColor.r - 0.5;

  float r = y + 1.13983 * v;
  float g = y - 0.39465 * u - 0.58060 * v;
  float b = y + 2.03211 * u;

  gl_FragColor = vec4(r, g, b, 1.0);
}
`;

const yuv_fs_iOS = `
uniform sampler2D u_frameY;
uniform sampler2D u_frameUV;
varying vec2 v_uv;

void main() {
  float y = texture2D(u_frameY, v_uv).a;
  vec4 uvColor = texture2D(u_frameUV, v_uv);
  float u = uvColor.r - 0.5;
  float v = uvColor.a - 0.5;

  float r = y + 1.04 * v;
  float g = y - 0.343 * u - 0.711 * v;
  float b = y + 1.765 * u;

  gl_FragColor = vec4(r, g, b, 1.0);
}
`;
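
To check the coefficients off-device, the Android shader's arithmetic can be mirrored on the CPU. The sketch below is only a reference for unit tests, and yuvToRgbAndroid is a hypothetical name. It clamps the result explicitly, on the assumption that the shader output is clamped when written to a normalized framebuffer.

```typescript
type Rgb = [number, number, number];

// Mirrors the Android fragment shader above: inputs are normalized to [0, 1],
// and chroma is re-centred around zero before the matrix is applied.
function yuvToRgbAndroid(y: number, u: number, v: number): Rgb {
  const cu = u - 0.5;
  const cv = v - 0.5;
  const clamp = (x: number): number => Math.min(1, Math.max(0, x));
  return [
    clamp(y + 1.13983 * cv),
    clamp(y - 0.39465 * cu - 0.5806 * cv),
    clamp(y + 2.03211 * cu),
  ];
}
```

Neutral chroma (u = v = 0.5) leaves the luma value unchanged in all three channels, which makes a convenient sanity check. The iOS variant differs in both its coefficients and the channel from which u and v are read.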

Each frame updates the u_uvMatrix, u_frameY, and u_frameUV uniforms via the onARFrame callback.

myARSession.onARFrame(arframe => {
  updateBackgroundScene(arframe);
});
function updateBackgroundScene(arframe) {
  const w = arframe.width;
  const h = arframe.height;
  const len = w * h;
  if (len <= 0) return;

  // Update u_uvMatrix
  if (arframe.capturedImageMatrix) {
    const matrix = bgMaterial.shaderData.getMatrix("u_uvMatrix");
    matrix.setValueByArray(arframe.capturedImageMatrix);
  }

  // Update textures
  if (arframe.capturedImage) {
    const cameraFrame = arframe.capturedImage;
    let textureFrameY = bgMaterial.shaderData.getTexture("u_frameY");
    let textureFrameUV = bgMaterial.shaderData.getTexture("u_frameUV");

    if (!textureFrameY) {
      textureFrameY = new Texture2D(engine, w, h, TextureFormat.Alpha8, false);
      textureFrameY.wrapModeU = textureFrameY.wrapModeV = TextureWrapMode.Clamp;
      bgMaterial.shaderData.setTexture("u_frameY", textureFrameY);
    }
    if (!textureFrameUV) {
      textureFrameUV = new Texture2D(engine, w / 2, h / 2, TextureFormat.LuminanceAlpha, false);
      textureFrameUV.wrapModeU = textureFrameUV.wrapModeV = TextureWrapMode.Clamp;
      bgMaterial.shaderData.setTexture("u_frameUV", textureFrameUV);
    }
    textureFrameY.setPixelBuffer(new Uint8Array(cameraFrame, 0, len));
    textureFrameUV.setPixelBuffer(new Uint8Array(cameraFrame, len));
  }
}
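
The two setPixelBuffer calls above rely on the frame being a single buffer with a full-resolution Y plane followed by a half-resolution interleaved chroma plane. The following sketch makes that layout explicit and adds bounds checks. The function name splitYuv420sp and the even-dimension requirement are assumptions made for illustration.

```typescript
interface YuvPlanes {
  y: Uint8Array;  // width * height bytes, one luma sample per pixel
  uv: Uint8Array; // width * height / 2 bytes, interleaved chroma pairs
}

// Creates two views over the same buffer; no pixel data is copied.
function splitYuv420sp(frame: ArrayBuffer, width: number, height: number): YuvPlanes {
  if (width % 2 !== 0 || height % 2 !== 0) {
    throw new Error("dimensions must be even for 4:2:0 subsampling");
  }
  const lumaSize = width * height;
  const chromaSize = lumaSize / 2; // (w/2 * h/2) pixels, 2 bytes each
  if (frame.byteLength < lumaSize + chromaSize) {
    throw new Error("frame buffer too small");
  }
  return {
    y: new Uint8Array(frame, 0, lumaSize),
    uv: new Uint8Array(frame, lumaSize, chromaSize),
  };
}
```

The chroma size matches the w / 2 by h / 2 LuminanceAlpha texture created above, which stores two bytes per texel.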

4. Implement Air Writing

The workflow consists of two chains: writing and playback. The writing chain uses a 2D vectorized brush, extrudes the geometry, and adds stickers (glTF models) with a custom DragComponent handling pointer events.

Brush to Vector

We ported the shodo library to the mini‑game container for brush rendering, then used potrace (C++ version called via JSAPI) to convert bitmap strokes to vector paths.
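
The closed vector paths are what the extrusion step consumes. As a rough sketch of that step, and not the project's actual mesh builder, the code below builds only the side walls for one closed outline. Front and back caps need polygon triangulation and are left out, and extrudeOutline and Mesh are hypothetical names.

```typescript
type Point2 = [number, number];

interface Mesh {
  positions: number[]; // flat x, y, z triples
  indices: number[];   // triangle list
}

// Extrudes a closed 2D outline along -z and returns the side-wall geometry.
function extrudeOutline(outline: Point2[], depth: number): Mesh {
  const n = outline.length;
  if (n < 3) throw new Error("outline needs at least 3 points");
  const positions: number[] = [];
  const indices: number[] = [];
  outline.forEach(([x, y]) => {
    positions.push(x, y, 0);      // front ring vertex, index 2k
    positions.push(x, y, -depth); // back ring vertex, index 2k + 1
  });
  for (let k = 0; k < n; k++) {
    const next = (k + 1) % n; // wrap around to close the outline
    const f0 = 2 * k;
    const b0 = 2 * k + 1;
    const f1 = 2 * next;
    const b1 = 2 * next + 1;
    indices.push(f0, b0, f1); // first triangle of the wall quad
    indices.push(f1, b0, b1); // second triangle of the wall quad
  }
  return { positions, indices };
}
```

Each outline edge becomes one quad, so an outline with n points yields 2n vertices and 2n triangles.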

Stickers

Stickers are glTF models loaded at runtime; each sticker receives a DragComponent that implements onPointerDown and onPointerDrag for interactive placement.
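
The drag logic itself is small. The sketch below shows it without any engine dependency: the method names mirror the hooks mentioned above, but the signatures, the Vec2 type, and the unitsPerPixel factor are assumptions made for illustration, not the engine's API.

```typescript
interface Vec2 {
  x: number;
  y: number;
}

// Engine-independent stand-in for the drag behaviour of a sticker.
class DragState {
  position: Vec2;
  private unitsPerPixel: number;
  private last: Vec2 | null = null;

  constructor(position: Vec2, unitsPerPixel: number) {
    this.position = position;
    this.unitsPerPixel = unitsPerPixel;
  }

  onPointerDown(pointer: Vec2): void {
    this.last = { x: pointer.x, y: pointer.y };
  }

  onPointerDrag(pointer: Vec2): void {
    if (!this.last) return; // drag without a preceding down: ignore
    this.position.x += (pointer.x - this.last.x) * this.unitsPerPixel;
    // Screen y grows downward while world y grows upward, hence the sign flip.
    this.position.y -= (pointer.y - this.last.y) * this.unitsPerPixel;
    this.last = { x: pointer.x, y: pointer.y };
  }

  onPointerUp(): void {
    this.last = null;
  }
}
```

Applying per-event deltas, rather than snapping the sticker to the pointer, keeps the sticker from jumping when the user grabs it off-centre.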

Transition Effects

After 2D drawing, the character transitions to 3D using an orthographic‑to‑perspective interpolation, keeping size consistent while applying a Lottie animation for visual flair.
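
One common way to get such a size-preserving blend is to choose the orthographic view height so that it matches what the perspective camera sees at the character's distance, then interpolate the two projection matrices element by element. The sketch below assumes that approach, which the source does not confirm; all function names are hypothetical and matrices are column-major as in WebGL.

```typescript
type Mat4 = number[]; // 16 entries, column-major

function perspective(fovY: number, aspect: number, near: number, far: number): Mat4 {
  const f = 1 / Math.tan(fovY / 2);
  const nf = 1 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (far + near) * nf, -1,
    0, 0, 2 * far * near * nf, 0,
  ];
}

function orthographic(halfHeight: number, aspect: number, near: number, far: number): Mat4 {
  const halfWidth = halfHeight * aspect;
  const nf = 1 / (near - far);
  return [
    1 / halfWidth, 0, 0, 0,
    0, 1 / halfHeight, 0, 0,
    0, 0, 2 * nf, 0,
    0, 0, (far + near) * nf, 1,
  ];
}

// Half-height of the orthographic view that matches what the perspective
// camera sees on a plane `distance` units in front of it.
function matchedHalfHeight(fovY: number, distance: number): number {
  return distance * Math.tan(fovY / 2);
}

// t = 0 gives the orthographic matrix, t = 1 the perspective matrix.
function blendProjection(ortho: Mat4, persp: Mat4, t: number): Mat4 {
  return ortho.map((o, i) => o + (persp[i] - o) * t);
}
```

With the half-height matched this way, a point on the plane at that distance projects to the same screen position for every t in this sketch, so content on that plane keeps its size throughout the transition while nearer and farther geometry gradually gains perspective.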

Playback Implementation

Stroke data is recorded as a three‑level array (character → stroke → point) together with brush metadata and canvas size. Example structure:

{
  char: [
    [[x, y, timestamp], ...],
    ...
  ],
  brush: { icon, extInfo },
  canvas2D: { width, height }
}

Sticker data is stored as an array of [url, x, y] entries.
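
To show how this record can drive playback, here is a small sketch that returns the part of the character visible after a given elapsed time, rescaled to the playback canvas. It assumes timestamps are in milliseconds and increase across strokes, and that icon is a string. Neither is stated in the source, and the type and function names are hypothetical.

```typescript
type StrokePoint = [number, number, number]; // x, y, timestamp

interface WritingRecord {
  char: StrokePoint[][];
  brush: { icon: string; extInfo?: unknown };
  canvas2D: { width: number; height: number };
}

// Returns the strokes (possibly truncated) drawn within `elapsedMs` of the
// first recorded point, scaled from the recorded canvas to `targetWidth`.
function strokesAt(record: WritingRecord, elapsedMs: number, targetWidth: number): StrokePoint[][] {
  const first = record.char[0]?.[0];
  if (!first) return [];
  const cutoff = first[2] + elapsedMs;
  const scale = targetWidth / record.canvas2D.width;
  return record.char
    .map((stroke) =>
      stroke
        .filter((p) => p[2] <= cutoff)
        .map((p): StrokePoint => [p[0] * scale, p[1] * scale, p[2]])
    )
    .filter((stroke) => stroke.length > 0);
}
```

Calling this once per frame with the time since playback started replays the writing at its original pace. Storing the canvas size alongside the points is what allows the same record to be replayed at a different resolution.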

Data Thinning

Points closer than 10 pixels to the previously kept point are dropped, except for the last point of each stroke, which is always kept. This keeps the payload under ~10 KB.

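// Assumed helper (not shown in the original): Euclidean distance in pixels.
function length(x1, y1, x2, y2) {
  return Math.hypot(x1 - x2, y1 - y2);
}

// Thin every stroke in place.
for (let i = 0; i < charData.length; i++) {
  const originalStroke = charData[i];
  let last = null; // last point that was kept
  charData[i] = [];
  for (let j = 0; j < originalStroke.length; j++) {
    const point = originalStroke[j];
    if (last) {
      const l = length(point[0], point[1], last[0], last[1]);
      // Skip near-duplicates, but always keep the final point of the stroke.
      if (l < 10 && j !== originalStroke.length - 1) continue;
    }
    charData[i].push(point);
    last = point;
  }
}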

Memory & Performance Optimizations

Peak memory should stay below 200 MB. Optimizations include:

Reducing main canvas resolution based on device capability (scale factors 0.8, 0.6, 0.5).

Lowering AR camera resolution to match the canvas.

Optimizing request handling: skip the base64 round trip by creating the ArrayBuffer directly in C++ and exposing it to JavaScript via JSBinding.

Canvas‑to‑GPU upload: use dirty‑flag checks to avoid unnecessary texture uploads.

Video recording resolution: limit to 720p or 540p depending on device.
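
Of these, the dirty-flag check is the easiest to show in isolation. The sketch below tracks whether the CPU-side pixels changed since the last upload and skips the upload otherwise. UploadTarget stands in for whatever wraps the GPU texture, and all names are hypothetical.

```typescript
// Hypothetical upload target; in practice this would wrap a GPU texture.
interface UploadTarget {
  upload(pixels: Uint8Array): void;
}

class DirtyCanvasTexture {
  private pixels: Uint8Array;
  private target: UploadTarget;
  private dirty = false;

  constructor(byteLength: number, target: UploadTarget) {
    this.pixels = new Uint8Array(byteLength);
    this.target = target;
  }

  // Writes one byte and marks the texture dirty only if the value changed.
  write(offset: number, value: number): void {
    if (this.pixels[offset] === value) return;
    this.pixels[offset] = value;
    this.dirty = true;
  }

  // Call once per frame. Returns true when an upload actually happened.
  flush(): boolean {
    if (!this.dirty) return false;
    this.target.upload(this.pixels);
    this.dirty = false;
    return true;
  }
}
```

While the user is not drawing, flush() becomes a cheap no-op, so idle frames incur no canvas-to-GPU transfer.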

Conclusion

The AR air‑writing feature attracted many users and demonstrated a complete pipeline from brush input to 3D rendering, playback, and efficient resource usage. Ongoing work will continue to refine the AR toolchain and explore new interactive experiences.
