NeurIPS 2024 Best Paper Introduces Visual Autoregressive Modeling (VAR) for Image Generation

The NeurIPS 2024 Best Paper award went to Visual Autoregressive Modeling (VAR), an approach that generates images by predicting token maps scale by scale, from coarse to fine. The source article also carries a free book giveaway and reports a legal dispute involving one of the paper's authors.

Architecture Digest

The article begins with a promotional notice: users can reply with the keyword "5000" to receive a free copy of the book "Programmer Book Resources" from the backend menu.

It then reports that Tian, an intern who worked in ByteDance's commercialization technology department, co‑authored a paper that received the NeurIPS 2024 Best Paper award with one of the highest sets of reviewer scores (7, 8, 8, 8).

The paper, titled "Visual Autoregressive Modeling: Scalable Image Generation via Next‑Scale Prediction", proposes a new paradigm for image generation. Instead of predicting one token at a time in raster‑scan order, VAR predicts the entire token map at the next, higher resolution, moving from coarse to fine.
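To make the contrast concrete, the sketch below counts how many sequential prediction steps each ordering needs for a 16x16 token map. It is illustrative only: the function names and the scale schedule (1, 2, 4, 8, 16) are assumptions chosen for readability, not values taken from the paper.

```python
# Illustrative comparison of raster-scan vs. next-scale ordering.
# The scale schedule below is a hypothetical example, not the paper's setting.

def raster_scan_steps(height, width):
    """Raster-scan AR: one sequential step per token."""
    return height * width

def next_scale_steps(scales):
    """Next-scale AR: one sequential step per scale."""
    return len(scales)

def tokens_per_step(scales):
    """Number of tokens emitted together at each scale (side * side)."""
    return [side * side for side in scales]

scales = (1, 2, 4, 8, 16)  # side length of each token map, coarse to fine

print(raster_scan_steps(16, 16))     # 256
print(next_scale_steps(scales))      # 5
print(tokens_per_step(scales))       # [1, 4, 16, 64, 256]
print(sum(tokens_per_step(scales)))  # 341
```

Under these assumptions the next-scale ordering produces more tokens in total (341 versus 256) but needs far fewer sequential steps, because every token within a scale is predicted in parallel.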

VAR is trained in two stages: (1) a multi‑scale VQ‑VAE encodes each image into K token maps of increasing resolution, and (2) a VAR Transformer learns to predict each token map from all lower‑resolution maps, using a block‑wise causal attention mask and a cross‑entropy loss.
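The second stage can be sketched in a few lines. The code below builds a block-wise causal mask over the flattened multi-scale sequence, so that a position may attend to every position at its own scale or a coarser one, and adds a plain cross-entropy helper for a single token. It is a minimal sketch in pure Python: the function names, the tiny scale schedule (1, 2, 3), and the omission of the model itself are all simplifying assumptions, not the authors' implementation.

```python
import math

def scale_ids(scales):
    """Label each position of the flattened sequence with its scale index."""
    ids = []
    for k, side in enumerate(scales):
        ids.extend([k] * (side * side))
    return ids

def block_causal_mask(scales):
    """mask[q][k] is True when query position q may attend to key position k,
    i.e. when the key belongs to the same scale or a coarser one."""
    ids = scale_ids(scales)
    return [[key_scale <= query_scale for key_scale in ids] for query_scale in ids]

def cross_entropy(logits, target):
    """Negative log-probability of codebook index `target` under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]

scales = (1, 2, 3)  # hypothetical schedule: 1 + 4 + 9 = 14 positions
mask = block_causal_mask(scales)
for row in mask:
    print("".join("1" if allowed else "." for allowed in row))
```

The printed pattern is block lower-triangular: the single coarsest token sees only itself, the four tokens of the 2x2 map see the first five positions, and the nine tokens of the 3x3 map see the whole sequence. In training, a loss such as `cross_entropy` would be averaged over all positions of all scales.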

According to the paper, coarse‑to‑fine prediction lets autoregressive Transformers learn visual distributions more efficiently and generalize better, to the point that AR models surpass diffusion Transformers in image generation.

Separately, the article notes that ByteDance has filed a lawsuit against Tian over alleged code tampering, seeking 8 million CNY in damages and a public apology.

Tags: artificial intelligence, computer vision, deep learning, NeurIPS, VAR, Visual Autoregressive Modeling
Written by

Architecture Digest
