Run Full AI Models Directly in the Browser with Transformers.js v4
Transformers.js v4 rewrites its WebGPU runtime in C++ and compiles it to WASM. The release brings builds that are ten times faster, bundles that are 10% smaller, and speedups of up to four times for BERT-style models. It also adds support for more than 20 new architectures, including Qwen3.5, and enables offline, privacy-preserving AI inference directly in the browser.
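
To make in-browser inference concrete, here is a minimal sketch of computing sentence embeddings with a BERT-style model. It assumes the `pipeline` API published in the `@huggingface/transformers` package carries over unchanged to v4. The model ID `Xenova/all-MiniLM-L6-v2` is an illustrative choice, and `pickDevice`, `cosineSimilarity`, and `embedInBrowser` are hypothetical helper names, not part of the library.

```typescript
// Sketch only: assumes the `pipeline` API of "@huggingface/transformers"
// is unchanged in v4. The model ID below is an illustrative choice.

type Device = "webgpu" | "wasm";

// Hypothetical helper: prefer WebGPU when available, otherwise fall back to WASM.
function pickDevice(hasWebGPU: boolean): Device {
  return hasWebGPU ? "webgpu" : "wasm";
}

// Cosine similarity of two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical wrapper: embeds each text into a mean-pooled, normalized vector.
async function embedInBrowser(texts: string[]): Promise<number[][]> {
  // The specifier is held in a variable so this sketch type-checks without the
  // package installed; an application would normally use a static
  // `import { pipeline } from "@huggingface/transformers"` instead.
  const specifier: string = "@huggingface/transformers";
  const { pipeline } = await import(specifier);

  // WebGPU is exposed as `navigator.gpu` in browsers that support it.
  const nav = (globalThis as { navigator?: object }).navigator;
  const hasWebGPU = nav !== undefined && "gpu" in nav;

  const extractor = await pipeline(
    "feature-extraction",
    "Xenova/all-MiniLM-L6-v2",
    { device: pickDevice(hasWebGPU) },
  );
  const output = await extractor(texts, { pooling: "mean", normalize: true });
  return output.tolist();
}
```

Calling `embedInBrowser(["first sentence", "second sentence"])` and passing the two resulting vectors to `cosineSimilarity` would give a similarity score computed on the user's device. The model weights are downloaded on first use, so offline operation depends on them already being cached by the browser.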
