Data Party THU
Nov 16, 2025 · Artificial Intelligence
How X‑VLA Enables 120‑Minute Unassisted Robot Clothing Folding with a 0.9B Model
The X-VLA paper introduces a 0.9-billion-parameter, fully open-source embodied model that uses learnable soft prompts and divide-and-conquer encoding to handle heterogeneous robot vision inputs. It completes a record 120 minutes of autonomous clothing folding and surpasses prior results on benchmarks across five simulation environments.
Embodied AI · Multimodal Learning · Robotics
7 min read
