
Run Transformers.js in the Browser with Google’s window.ai – Live Demo

This article shows how to use the transformers.js JavaScript library directly in the browser on top of Google's built-in window.ai models: it outlines the supported AI tasks, walks through a live demo, and presents the core code for model loading, worker communication, and text generation.

A previous article showed how to use window.ai from the browser console, which was not ideal; this article presents a better approach built on transformers.js.

transformers.js is a JavaScript library that runs directly in the browser without a server. It supports the following capabilities:

📝 Natural Language Processing: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple‑choice, and text generation.

🖼️ Computer Vision: image classification, object detection, and segmentation.

🗣️ Audio: automatic speech recognition and audio classification.

🐙 Multimodal: zero‑shot image classification.

It also supports Google’s built‑in models; see the GitHub repository for details.

Below is a live demo that combines transformers.js with Google’s built‑in models.

Demo Example

The integration loads quickly, since the Gemini Nano model ships with the browser and no separate download is required.

Demo URL: https://windowai.miniwa.site/

Main features:

Detect whether window.ai is supported.

After loading the model, enable conversational chat.
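Before loading anything, the demo first checks whether the current browser exposes the built-in model API at all. A minimal feature-detection sketch (the exact shape of `window.ai` has changed across Chrome versions, so this only checks that the object is present; the function name is illustrative):

```javascript
// Returns true if the built-in AI object is exposed on the given global.
// Takes the global as a parameter so it can be exercised outside a browser.
function isWindowAiSupported(globalObj = globalThis) {
    return globalObj != null && 'ai' in globalObj && globalObj.ai != null;
}

// Illustration with mock globals:
console.log(isWindowAiSupported({ ai: {} })); // true
console.log(isWindowAiSupported({}));         // false
```

In the real page you would call it as `isWindowAiSupported(window)` and show a fallback message when it returns `false`.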

To enable window.ai support in the browser, refer to the linked article.

Model Implementation Details

First, load the model using transformers.js’s simple API:

<code>const generator = await pipeline('text-generation', 'Xenova/gemini-nano');</code>

The author uses a singleton pattern:

<code>class TextGenerationPipeline {
    static model_id = 'Xenova/gemini-nano';
    static instance = null;

    static async getInstance(progress_callback = null) {
        // Create the pipeline only once; later calls reuse the cached promise
        this.instance ??= pipeline('text-generation', this.model_id, { progress_callback });
        return this.instance;
    }
}</code>
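The `??=` assignment caches the first pipeline promise, so every caller shares one model load, even callers that arrive while the load is still in flight. A self-contained sketch of the same pattern with a stubbed loader (`fakeLoad` stands in for the expensive `pipeline()` call; all names here are illustrative):

```javascript
// Stand-in for the expensive pipeline() call; counts how often it runs.
let loadCount = 0;
const fakeLoad = async () => { loadCount += 1; return { name: 'generator' }; };

class FakePipeline {
    static instance = null;

    static async getInstance() {
        // ??= stores the promise itself, so callers that arrive before the
        // load finishes all await the same in-flight promise.
        this.instance ??= fakeLoad();
        return this.instance;
    }
}

(async () => {
    const [a, b] = await Promise.all([
        FakePipeline.getInstance(),
        FakePipeline.getInstance(),
    ]);
    console.log(a === b, loadCount); // true 1
})();
```

Caching the promise rather than the resolved value is what makes concurrent calls safe: there is no window in which two loads can start.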

The main workflow involves loading and communication, with a Web Worker handling time‑consuming AI interactions. The worker code is:

<code>import {
    pipeline,
    InterruptableStoppingCriteria,
    RawTextStreamer,
} from '@xenova/transformers';

// Shared stopping criteria so 'interrupt' and 'reset' messages can control generation
const stopping_criteria = new InterruptableStoppingCriteria();

async function generate(messages) {
    const generator = await TextGenerationPipeline.getInstance();

    const cb = (output) => {
        self.postMessage({
            status: 'update',
            output,
        });
    };

    const streamer = new RawTextStreamer(cb);
    self.postMessage({ status: 'start' });

    const output = await generator(messages, {
        streamer,
        stopping_criteria,

        // Greedy search
        top_k: 1,
        temperature: 0,
    });

    if (output[0].generated_text.length === 0) {
        // No response was generated
        self.postMessage({
            status: 'update',
            output: ' ',
            tps: null,
            numTokens: 0,
        });
    }

    // Send the output back to the main thread
    self.postMessage({
        status: 'complete',
        output: output[0].generated_text,
    });
}

async function load() {
    self.postMessage({
        status: 'loading',
        data: 'Loading model...',
    });

    // Get model instance (forwards progress events to the main thread)
    const generator = await TextGenerationPipeline.getInstance(x => {
        self.postMessage(x);
    });

    // Run a trivial prompt to confirm the model is ready
    await generator('1+1=');
    self.postMessage({ status: 'ready' });
}

// Listen for messages
self.addEventListener('message', async (e) => {
    const { type, data } = e.data;

    switch (type) {
        case 'load':
            load().catch((e) => {
                self.postMessage({
                    status: 'error',
                    data: e,
                });
            });
            break;

        case 'generate':
            stopping_criteria.reset();
            generate(data);
            break;

        case 'interrupt':
            stopping_criteria.interrupt();
            break;

        case 'reset':
            stopping_criteria.reset();
            break;
    }
});
</code>

The worker listens for messages from the main thread and interacts with the model accordingly.

self acts like the window object inside the worker.

On a “load” command, the model is loaded and internally tested.

On a “generate” command, the generator produces text and posts the result back.
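On the main thread, these status messages drive the UI. A minimal sketch of the receiving side as a plain reducer, with no framework (the message shapes match the worker code above; the function and state names are illustrative):

```javascript
// Folds worker status messages into a simple chat state.
function handleWorkerMessage(state, msg) {
    switch (msg.status) {
        case 'ready':
            return { ...state, ready: true };
        case 'start':
            return { ...state, busy: true, output: '' };
        case 'update':
            // Streaming: append each raw text chunk as it arrives
            return { ...state, output: state.output + msg.output };
        case 'complete':
            return { ...state, busy: false, output: msg.output };
        default:
            return state;
    }
}

// Replaying a typical message sequence:
let state = { ready: false, busy: false, output: '' };
for (const msg of [
    { status: 'ready' },
    { status: 'start' },
    { status: 'update', output: 'Hel' },
    { status: 'update', output: 'lo' },
    { status: 'complete', output: 'Hello' },
]) {
    state = handleWorkerMessage(state, msg);
}
console.log(state); // { ready: true, busy: false, output: 'Hello' }
```

In the real page the loop is replaced by `worker.addEventListener('message', (e) => { state = handleWorkerMessage(state, e.data); render(state); })`, and sending a prompt is `worker.postMessage({ type: 'generate', data: messages })`.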

Tags: JavaScript, web workers, text generation, browser AI, transformers.js, window.ai
Written by

Code Mala Tang

Read source code together, write articles together, and enjoy spicy hot pot together.
