GigaWorld-1 Tops WorldArena Benchmark, Surpassing Google and Nvidia

GigaWorld-1, the latest embodied world model from Jiji Vision, took the global #1 spot on the WorldArena benchmark, ahead of entries from Google, Nvidia, and Alibaba, with a comprehensive score above 60. It leads on physics adherence (16% ahead of the second-place model), 3D accuracy (near-perfect), and visual quality. The model combines explicit action modeling, a differentiable physics engine, and large-scale real-robot video data, and its open-source release has already drawn more than 16,000 downloads.

Machine Learning Algorithms & Natural Language Processing

Jiji Vision recently released GigaWorld-1, its newest embodied world model, which ranked first on the WorldArena benchmark, ahead of teams from Google, Nvidia, Alibaba, and other leading academic and industry groups. It is the only model whose comprehensive score exceeds 60 points.

Performance Highlights

Physics Adherence: improves by 16% over the second-place model.

3D Accuracy: reaches near-perfect scores.

Visual Quality: leads the benchmark across visual metrics.
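
The article does not say whether the 16% physics-adherence gain is a relative improvement or a gap in percentage points, and the two readings differ noticeably. The sketch below works through both using hypothetical scores (the value 50.0 and the function names are illustrative assumptions, not WorldArena figures).

```python
def relative_gain(leader: float, runner_up: float) -> float:
    """Improvement as a percentage of the runner-up's score."""
    return (leader - runner_up) / runner_up * 100.0


def point_gain(leader: float, runner_up: float) -> float:
    """Improvement as a plain difference in score points."""
    return leader - runner_up


# Hypothetical runner-up score, chosen only to make the arithmetic visible.
runner_up_score = 50.0

# Reading 1: "16%" is relative, so the leader scores 16% more than 50.
leader_if_relative = runner_up_score * 1.16

# Reading 2: "16%" means 16 points on a 100-point scale.
leader_if_points = runner_up_score + 16.0

print(leader_if_relative, relative_gain(leader_if_relative, runner_up_score))
print(leader_if_points, point_gain(leader_if_points, runner_up_score))
```

Under the first reading the leader would sit near 58 in this toy case, under the second at 66, so the published leaderboard is the place to check which one applies.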

WorldArena Benchmark

WorldArena is a rigorous evaluation suite jointly created by experts from eight top universities and research institutes: Tsinghua University, Princeton University, the National University of Singapore, Peking University, the University of Hong Kong, the Chinese Academy of Sciences, Shanghai Jiao Tong University, and the University of Science and Technology of China. It assesses models on 16 core indicators and three real-world application tasks, testing perception precision, understanding of physical laws, 3D spatial cognition, and action prediction.

Technical Architecture

GigaWorld-1 is an Action-Conditioned World Model (AC-WM) that extends the EmbodieDreamer architecture released by Jiji Vision in July 2025. The design introduces explicit action modeling to keep generated video geometrically consistent, and it incorporates a differentiable physics engine that extracts precise mechanical parameters so that physical interactions are simulated realistically. Training used more than ten thousand hours of high-quality real-robot operation video, markedly improving generalization and action fidelity in open-world scenarios.
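
To make the two ideas above concrete, here is a minimal toy sketch, not GigaWorld-1's actual implementation. It assumes a 1-D block pushed by a force (the action) and slowed by viscous damping. The `step` function is action-conditioned because the next state depends on both the current state and the action. The damping coefficient is then recovered from observed positions by differentiating through the simulator. All names, the dynamics, and the hyperparameters are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class State:
    pos: float
    vel: float


def step(state, action, damping, mass=1.0, dt=0.05):
    """Action-conditioned transition: next state from current state and applied force."""
    acc = (action - damping * state.vel) / mass
    vel = state.vel + dt * acc
    pos = state.pos + dt * vel
    return State(pos, vel)


def rollout_with_sensitivity(actions, damping, mass=1.0, dt=0.05):
    """Simulate positions and d(position)/d(damping) by differentiating each step."""
    state = State(0.0, 0.0)
    dpos, dvel = 0.0, 0.0  # derivatives of pos and vel with respect to damping
    positions, sensitivities = [], []
    for action in actions:
        # Chain rule applied to the velocity update (uses the pre-update velocity).
        dvel = dvel + dt * (-state.vel - damping * dvel) / mass
        dpos = dpos + dt * dvel
        state = step(state, action, damping, mass, dt)
        positions.append(state.pos)
        sensitivities.append(dpos)
    return positions, sensitivities


def fit_damping(actions, observed, init=0.2, lr=0.005, iters=1000):
    """Gradient descent on mean squared position error with respect to damping."""
    damping = init
    for _ in range(iters):
        pred, sens = rollout_with_sensitivity(actions, damping)
        grad = 2.0 * sum((p - o) * s for p, o, s in zip(pred, observed, sens)) / len(pred)
        damping -= lr * grad
    return damping


# Synthetic "observations" generated with a known damping value (assumed for the demo).
actions = [1.0] * 100
true_damping = 0.8
observed, _ = rollout_with_sensitivity(actions, true_damping)

estimated = fit_damping(actions, observed)
print(f"estimated damping: {estimated:.4f}")
```

A production system would rely on automatic differentiation and far richer dynamics. The hand-written derivative here only shows why a differentiable simulator lets physical parameters be fitted directly to observed motion.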

Open‑Source Release and Community Impact

The core code and part of the dataset have been open-sourced and serve as the official baseline for the upcoming GigaBrain Challenge at CVPR 2026. Within half a month of release, the Hugging Face repository recorded more than 16,000 downloads, reflecting strong interest from both academia and industry.

Series Evolution and Related Models

Earlier, GigaWorld-0 demonstrated the first verified case in which data generated by a world model significantly boosted the real-robot performance of vision-language-action (VLA) models, and it has earned more than 1.5k GitHub stars. GigaWorld-Policy, another branch of the series, achieves a tenfold increase in inference speed and training efficiency and raises task success rates by 30%, underscoring the rapid progress of embodied intelligence.

Vision for Physical‑World AGI

Jiji Vision positions GigaWorld‑1 as a foundational "visual‑real, geometry‑precise, physics‑accurate" embodied world model, aiming to provide the data and architectural bedrock for general‑purpose AI to operate reliably in the physical world.

Figure: WorldArena benchmark illustration