Tagged articles
6 articles
Machine Heart
May 23, 2026 · Artificial Intelligence

How FlashAR Achieves 22.9× Speedup with Only 0.05% of Training Data

FlashAR transforms pretrained autoregressive image models into highly parallel generators, delivering up to 22.9× end-to-end speedup while using just 0.05% of the original training data and preserving generation quality, thanks to intermediate branching, a learnable fusion gate, and a two-stage adaptation process.

FlashAR · autoregressive generation · image synthesis
0 likes · 10 min read
SuanNi
Feb 26, 2026 · Artificial Intelligence

How BitDance’s 2.6B‑Parameter Model Beats 14B Counterparts with 8.7× Speedup

BitDance’s new multimodal AI model achieves an 8.7‑fold inference acceleration using only 2.6 billion parameters, surpasses 14‑billion‑parameter state‑of‑the‑art architectures in image generation quality, and introduces binary visual tokens, a binary diffusion head, and next‑block diffusion for efficient parallel autoregressive prediction.

AI · Binary Tokenization · Vision Transformers
0 likes · 11 min read
AIWalker
Apr 28, 2025 · Artificial Intelligence

SimpleAR: Autoregressive Visual Generation at 1024×1024 Using Only 0.5B Parameters

SimpleAR is a minimalist autoregressive visual generation framework that, with only 0.5B parameters, achieves competitive 1024×1024 image synthesis through a three‑stage pipeline of large‑scale pretraining, supervised fine‑tuning, and GRPO‑based reinforcement learning, and demonstrates significant inference speedups using KV‑cache, vLLM, and speculative decoding.

Inference Acceleration · autoregressive generation · benchmark
0 likes · 14 min read
Tencent Cloud Developer
Apr 10, 2025 · Artificial Intelligence

The Magic of GPT‑4o: Technical Overview and Speculated Architecture

GPT‑4o combines extremely long‑form text generation, high‑quality image creation and interactive editing by likely using an autoregressive multimodal transformer that tokenizes visuals via VQ‑VAE/GAN pipelines, trained on massive data and refined through fine‑tuning and RLHF, offering a unified model for generation, editing, and understanding.

GPT-4o · VQ-VAE · autoregressive generation
0 likes · 17 min read
AIWalker
Feb 28, 2025 · Artificial Intelligence

FlexTok: Reconstruct Images with as Few as 8 Tokens – Variable‑Length Tokenizer Beats TiTok

FlexTok is a flexible‑length 1‑D image tokenizer that resamples an image into a variable number of discrete tokens, from 1 to 256, and achieves better reconstruction FID and autoregressive generation quality than TiTok. It relies on nested random dropout, causal masks, and a flow‑based decoder, and is evaluated on ImageNet and DFN.

FlexTok · Vision Transformer · autoregressive generation
0 likes · 21 min read