Manifold AI Tops WorldScore, Outperforming Li Fei‑Fei’s Team with an Action Model
Manifold AI’s WorldScape model achieved the highest WorldScore, surpassing teams from Li Fei‑Fei, MIT, Alibaba and others. The result is credited to its unified generation‑control architecture, real‑time performance of 6–16 FPS, and high spatial‑intelligence density despite a smaller parameter count. The company has also released WorldScape v0.1 and WorldScape Policy, and is backed by substantial funding and a veteran research team.
WorldScore, the internationally recognized benchmark for general world models, evaluates models across controllability, generation quality and thousands of diverse scenarios, providing a stringent test of a model’s capabilities.
Manifold AI’s self‑developed model WorldScape broke through this high barrier and secured the top overall score, outperforming competitors such as Li Fei‑Fei’s team, MIT, Alibaba, Runway, Zhipu, MiniMax and Tencent Hunyuan.
The key to WorldScape’s advantage lies in the deep integration of generation and control. It adopts a unified action‑world‑state modeling framework in which spatial displacement and object interaction are modeled within the same generation process. This eliminates the inconsistencies caused by multi‑module pipelines and lets a single model support both navigation and manipulation.
During training, explicit 3‑D geometric perception and constraints are introduced, which preserves consistent spatial structure across long‑term interactions. This design mitigates the common geometric drift and structural collapse observed in prolonged generation.
WorldScape also delivers high visual quality in real‑time generation without relying on model compression or resolution reduction. On a single GPU it runs at approximately 6–16 FPS, achieving top rankings in visual fidelity and motion smoothness.
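To put the reported 6–16 FPS range in concrete terms, the short sketch below converts a frame rate into the time available to produce each frame. The function name `frame_budget_ms` is an illustrative assumption and not part of any WorldScape API; the arithmetic is simply 1000 ms divided by the frame rate.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the per-frame time budget in milliseconds for a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# The two ends of the range quoted in the article.
for fps in (6, 16):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.1f} ms per frame")
```

At the low end of the range the model has roughly 167 ms per frame, and at the high end about 62.5 ms.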
Despite having a parameter scale an order of magnitude smaller than many leading models, WorldScape exhibits the highest spatial‑intelligence density worldwide, demonstrating that compact models can achieve superior performance when architecture and training are carefully designed.
WorldScape incorporates a world‑state memory mechanism that shares and updates spatial information across time steps. This gives the model long‑term consistency, a crucial distinction between video‑generation models and true world models.
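The article does not describe how the memory mechanism is implemented, so the following is only a toy sketch of the general idea: a persistent state is carried across steps, and content that has already been generated for a location is read back rather than regenerated. Every name here (`WorldState`, `MOVES`, `step`) is hypothetical, and the "generator" is a stand-in string rather than an image model.

```python
from dataclasses import dataclass, field

# Hypothetical action vocabulary for a grid world (an assumption for illustration).
MOVES = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


@dataclass
class WorldState:
    """Toy persistent state: the agent's position plus content already generated."""
    position: tuple = (0, 0)
    seen: dict = field(default_factory=dict)  # cell -> content generated for it


def step(state: WorldState, action: str) -> str:
    """Apply one action, then return the content for the new cell.

    Content is generated only the first time a cell is visited; later visits
    read it back from memory, so revisited places stay consistent.
    """
    dx, dy = MOVES[action]
    cell = (state.position[0] + dx, state.position[1] + dy)
    if cell not in state.seen:
        # Stand-in for a generative model: the output depends on when the cell
        # was first generated, so regenerating without memory would differ.
        state.seen[cell] = f"scene-{len(state.seen)}"
    state.position = cell
    return state.seen[cell]


state = WorldState()
frames = [step(state, a) for a in ["east", "west", "east"]]
print(frames)
```

In this toy loop the first and third frames are identical because the second visit to the same cell is served from the stored state. Without the `seen` dictionary, the stand-in generator would produce different content on the return trip, which is the kind of drift a world-state memory is meant to prevent.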
In early 2024 Manifold AI released WorldScape v0.1, the first real‑time world model that simultaneously supports navigation and manipulation interactions and serves as a pre‑training foundation for robots. Later the same year it introduced WorldScape Policy, a model that combines world‑model‑based spatio‑temporal prediction with visual input to execute actions, surpassing existing VLA (vision‑language‑action) models in few‑shot and zero‑shot execution.
The company’s rapid progress is supported by a robust data pipeline that integrates ego‑centric, UMI and RL data‑collection devices and generates over 10,000 clips per day. This pipeline continuously produces high‑quality data at scale for model improvement.
Manifold AI, founded in 2025, has secured five financing rounds within ten months, including a Pre‑A+ round led by Shunxi Fund with participation from Ginkgo Valley Capital, Jin Yu Mao Wu, and Tongchuang Weiye. The founding team comprises former senior engineers from SenseTime, Momenta, Xiaopeng, YuanRong, and other leading autonomous‑driving and AI firms, along with a former securities executive described as the youngest department head in that sector. Together they bring expertise in both model architecture and operational scaling.
Overall, WorldScape’s unified generation‑control design, efficient real‑time performance, compact yet powerful architecture, and extensive data infrastructure have propelled Manifold AI to the forefront of embodied world‑model research, establishing a new benchmark for action‑oriented AI systems.
Machine Learning Algorithms & Natural Language Processing
Focused on frontier AI technologies, empowering AI researchers' progress.
