Turning 3D Avatars into Video: Puppeteer, H5 Frames & FFmpeg Workflow
This article explains how to overcome the performance and integration challenges of rendering 3D avatars across multiple scenarios: instead of rendering in real time, avatars are exported as video or GIF assets through a Puppeteer-driven H5 frame-capture pipeline combined with FFmpeg video synthesis. It covers the evaluation of alternative approaches and the final implementation steps.
Background
In the "My Avatar" feature, users configure a 3D avatar that needs to be displayed in many contexts such as social‑media profile pictures, animated stickers, lock‑screen animations, and personalized wallpapers. Real‑time 3D rendering on each device causes high CPU/GPU load, excessive power consumption, and raises the integration barrier for third‑party platforms.
Problem Statement
The core issue is the dependency on a full 3D runtime for every display, which is impractical for mobile and web scenarios. The goal is to pre‑export the configured avatar into lightweight animation assets (video or GIF) that can be directly consumed without a 3D engine, while preserving visual fidelity and keeping the integration cost low.
Solution Options Evaluated
1. Generate animation frames in H5, then combine them into a video on H5 or the client – suitable for short clips, but client-side FFmpeg runs at roughly 1/20 of native speed and WebCodecs suffers from compatibility issues.
2. Blender API server-side rendering – produces high-quality output but introduces heavy maintenance overhead (blend file management, asset versioning) and inconsistencies between server-rendered and client-rendered results.
3. Puppeteer-driven H5 frame capture + FFmpeg video synthesis – reuses the existing H5 rendering logic, ensures a consistent appearance, and allows batch processing on the server.
Chosen Approach
The team selected the third option: use Puppeteer (or Playwright) to launch the H5 avatar page in a headless browser, capture each animation frame, and then stitch the frames into a video with FFmpeg on the server.
Implementation Details
1. Frame‑output page
The H5 page is modularized so that the same model‑loading code works both in the user‑facing page and the server‑side rendering page. When Puppeteer opens the page, it injects the user configuration and a unique task ID into window. A hidden "Export Video" button triggers frame capture; for local debugging the button can also be clicked manually.
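The page-side wiring can be sketched as follows. This is a minimal sketch, not the original page code: the `window.__INIT_DATA__` / `__TASKID__` / `__ANIMATE__` globals mirror the injection described above, while `getExportContext`, `onExportClick`, and the default clip name are hypothetical, and the capture loop is only outlined in comments.

```javascript
// Read the export context that Puppeteer injects into window.
// Returns null when the globals are absent (e.g., during local
// debugging), in which case the page falls back to defaults and
// the "Export Video" button can be clicked manually.
function getExportContext(win) {
  if (!win.__INIT_DATA__ || !win.__TASKID__) return null;
  return {
    config: win.__INIT_DATA__,
    taskId: win.__TASKID__,
    animate: win.__ANIMATE__ || 'idle' // hypothetical default clip name
  };
}

// Sketch of the flow triggered by the hidden export button:
// 1. apply the config to the avatar model (shared with the user-facing page)
// 2. step the animation, capturing each frame from the canvas
// 3. zip the frames (e.g., with JSZip), trigger a download named after
//    taskId, then add an #exported element to signal completion
async function onExportClick(win, doc) {
  const ctx = getExportContext(win);
  if (!ctx) return;
  // ...capture frames and download `${ctx.taskId}.zip`...
  const done = doc.createElement('div');
  done.id = 'exported';
  doc.body.appendChild(done);
}
```

Keeping the context-reading logic in one helper lets the same page serve both the headless export path and manual debugging.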
2. Puppeteer workflow
A single Chromium instance is launched once and kept alive. Each export task creates a new tab, navigates to the target URL, injects data, and waits for the export button to appear. The script enables request interception to cache static resources locally, reducing bandwidth and speeding up page load.
// Launch a single shared Chromium instance without the sandbox
const puppeteer = require('puppeteer');

const browser = await puppeteer.launch({
  headless: 'new',
  args: ['--no-sandbox', '--disable-setuid-sandbox']
});

async function exportAnimate() {
  const page = await browser.newPage();
  // ...set up interception, navigate, click export, wait for completion
  await page.close();
}

Resource caching is handled by listening to request and response events; cached files are stored under a designated directory and served directly when available.
// Request interception example: serve cacheable static assets from disk
const fs = require('fs');

await page.setRequestInterception(true);
page.on('request', async (request) => {
  const url = request.url();
  const cachePath = toCachePath(url); // maps the URL to a file under the cache directory
  if (isCacheableFile(url) && fs.existsSync(cachePath)) {
    const data = fs.readFileSync(cachePath);
    await request.respond({
      status: 200,
      body: data,
      headers: {'Access-Control-Allow-Origin': '*'}
    });
    return;
  }
  request.continue();
});

A unique task ID is generated with nanoid() and used to set a dedicated download folder for the ZIP of captured frames.
const path = require('path');
const { nanoid } = require('nanoid');

const taskId = nanoid();
const client = await page.createCDPSession();
// Use a per-task download folder so concurrent exports do not collide
await client.send('Page.setDownloadBehavior', {
  behavior: 'allow',
  downloadPath: path.resolve('./temp', taskId)
});
await page.goto(targetUrl, {waitUntil: 'domcontentloaded', timeout: 0});
await page.evaluate((data, id, animate) => {
  window.__INIT_DATA__ = data;
  window.__TASKID__ = id;
  window.__ANIMATE__ = animate;
}, config, taskId, animate);

After the export button is clicked, the script waits for a DOM element (e.g., #exported) that signals the ZIP is ready, then pauses briefly to allow the download to finish.
const btn = await page.waitForSelector('#export-btn', {timeout: 10000});
await btn.click();
await page.waitForSelector('#exported', {timeout: 30000});
await new Promise(r => setTimeout(r, 2000));

3. Video synthesis with FFmpeg
Once the frame ZIP is downloaded and extracted, FFmpeg is invoked to assemble the frames into an MP4 video. The command uses a numeric pattern for input frames, sets the desired framerate, scaling, codec, and output format.
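Because the %d input pattern assumes a contiguous, zero-based numeric sequence, it can be worth validating the extracted filenames before spawning FFmpeg. The following is a sketch; the helper name and the frame-naming scheme are assumptions, not part of the original pipeline:

```javascript
// Check that frame filenames contain a contiguous numeric sequence
// starting at 0, as expected by `-start_number 0` and the %d pattern.
function framesAreContiguous(fileNames) {
  const indices = fileNames
    .map(name => {
      const m = name.match(/(\d+)/); // first number in the filename
      return m ? parseInt(m[1], 10) : NaN;
    })
    .sort((a, b) => a - b);
  return indices.length > 0 && indices.every((n, i) => n === i);
}
```

A gap in the sequence would otherwise make FFmpeg stop early and silently produce a truncated video.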
const path = require('path');
const { spawn } = require('child_process');

// Derive the input pattern from the first frame (e.g., frame_0.png -> frame_%d.png)
const inputPattern = path.join(framesDir, frames[0].replace(/\d+/, '%d'));
const ffmpegArgs = [
  '-framerate', fps.toString(),
  '-start_number', '0',
  '-i', inputPattern,
  '-vf', `scale=${width}:${height}`,
  '-c:v', 'libx264',
  '-preset', 'medium',
  '-crf', '23',
  '-pix_fmt', 'yuv420p',
  '-f', 'mp4',
  '-y', outputPath
];
const ffmpegProcess = spawn('ffmpeg', ffmpegArgs);
ffmpegProcess.on('close', (code) => {
  if (code === 0) {
    // video synthesis completed
  }
});

Conclusion
By comparing multiple animation‑generation paths, the team settled on a Puppeteer‑driven H5 frame capture combined with FFmpeg video synthesis. This solution maintains visual consistency, enables asynchronous server‑side processing, reduces integration effort for downstream scenarios, and provides a scalable foundation for large‑scale avatar generation and distribution.
vivo Internet Technology