Tagged articles

autoregressive generation

6 articles

May 23, 2026 · Artificial Intelligence

How FlashAR Achieves 22.9× Speedup with Only 0.05% of Training Data

FlashAR transforms pretrained autoregressive image models into highly parallel generators, delivering up to 22.9× end-to-end speedup while using just 0.05% of the original training data and preserving generation quality, thanks to intermediate branching, a learnable fusion gate, and a two-stage adaptation process.

FlashAR · Parallel Decoding · autoregressive generation

0 likes · 10 min read

SuanNi

Feb 26, 2026 · Artificial Intelligence

How BitDance’s 2.6B‑Parameter Model Beats 14B Counterparts with 8.7× Speedup

BitDance’s new multimodal AI model achieves an 8.7‑fold inference acceleration using only 2.6 billion parameters, surpasses 14‑billion‑parameter state‑of‑the‑art architectures in image generation quality, and introduces binary visual tokens, a binary diffusion head, and next‑block diffusion for efficient parallel autoregressive prediction.

AI · Binary Tokenization · Vision Transformers

0 likes · 11 min read

AIWalker

Apr 28, 2025 · Artificial Intelligence

SimpleAR: Autoregressive Visual Generation at 1024×1024 Using Only 0.5B Parameters

SimpleAR is a minimalist autoregressive visual generation framework that, with only 0.5B parameters, achieves competitive 1024×1024 image synthesis through a three‑stage pipeline of large‑scale pretraining, supervised fine‑tuning, and GRPO‑based reinforcement learning. It also demonstrates significant inference speedups using KV caching, vLLM, and speculative decoding.

Inference Acceleration · Pretraining · autoregressive generation

0 likes · 14 min read

Tencent Cloud Developer

Apr 10, 2025 · Artificial Intelligence

The Magic of GPT‑4o: Technical Overview and Speculated Architecture

GPT‑4o combines extremely long‑form text generation, high‑quality image creation, and interactive editing, likely by using an autoregressive multimodal transformer that tokenizes visuals via VQ‑VAE/GAN pipelines. The model is trained on massive data and refined through fine‑tuning and RLHF, yielding a unified model for generation, editing, and understanding.

GPT-4o · VQ-VAE · autoregressive generation

0 likes · 17 min read

AIWalker

Feb 28, 2025 · Artificial Intelligence

FlexTok: Reconstruct Images with as Few as 8 Tokens – Variable‑Length Tokenizer Beats TiTok

FlexTok is a flexible‑length 1‑D image tokenizer that resamples images into anywhere from 1 to 256 discrete tokens, achieving better reconstruction (FID) and autoregressive generation quality than TiTok. It relies on nested random dropout, causal masks, and a flow‑based decoder, and is evaluated on ImageNet and DFN.

FlexTok · Vision Transformer · autoregressive generation

0 likes · 21 min read

AIWalker

Feb 22, 2025 · Artificial Intelligence

FlexTok Achieves High‑Quality Visual Reconstruction with as Few as 8 Tokens, Outperforming TiTok

FlexTok introduces a variable‑length 1‑D image tokenizer that can reconstruct images with as few as eight tokens, surpasses TiTok in FID and MAE across multiple token budgets, and serves as a hierarchical visual vocabulary for autoregressive image generation.

AI research · FlexTok · autoregressive generation

0 likes · 23 min read
