NeurIPS 2024 Best Paper Introduces Visual Autoregressive Modeling (VAR) for Image Generation
A NeurIPS 2024 Best Paper award highlights Visual Autoregressive Modeling (VAR), a novel approach that replaces next-token prediction with multi-scale, next-scale token-map prediction for image generation; the surrounding article also mentions a free book giveaway and a legal dispute involving the paper's author.
The article reports that Tian, an intern in ByteDance's commercialization technology department, co-authored a paper that received the NeurIPS 2024 Best Paper award and earned some of the highest reviewer scores of the conference (7, 8, 8, 8).
The paper, "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction," proposes a new paradigm for image generation that departs from traditional raster-scan, next-token prediction: instead of generating one token at a time, the model predicts the next scale (resolution) in a coarse-to-fine manner, emitting an entire token map at each step.
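To make the contrast concrete, here is a minimal sketch (not the paper's code) of the two prediction orders. The 16×16 latent grid size and the four-scale schedule are illustrative assumptions, not values taken from the paper.

```python
# Toy illustration of the two autoregressive prediction orders (assumed sizes).

def raster_scan_order(h, w):
    """Classic AR: one single-token step per position, in row-major order."""
    return [(i, j) for i in range(h) for j in range(w)]

def next_scale_order(scales):
    """VAR-style AR: one step per scale; each step emits a whole token map."""
    return [f"{h}x{w} token map" for (h, w) in scales]

# A 16x16 latent grid needs 256 sequential steps under raster-scan AR...
print(len(raster_scan_order(16, 16)))  # 256
# ...but only K=4 coarse-to-fine steps under a hypothetical VAR schedule.
print(next_scale_order([(1, 1), (2, 2), (4, 4), (16, 16)]))
```

The point of the sketch is that sequential depth drops from the number of tokens to the number of scales, which is where the efficiency claim comes from.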
VAR is trained in two stages: (1) a multi-scale VQ-VAE encodes each image into K token maps of increasing resolution, and (2) a VAR Transformer predicts each higher-resolution token map conditioned on all lower-resolution ones, using block-wise causal attention and a cross-entropy loss over the token vocabulary.
The approach lets autoregressive Transformers learn visual distributions more efficiently and generalize better, and the paper reports that it allows AR models to surpass diffusion Transformers such as DiT on image-generation benchmarks.
In parallel, the article notes that ByteDance has filed a lawsuit against Tian for code tampering, seeking 8 million CNY in damages and a public apology.