Can Robots Navigate Unseen Spaces with Only Language? EvoNav’s Zero‑Shot Vision‑Language Breakthrough
The EvoNav framework from Nanjing University of Science and Technology tackles the last‑hundred‑meter challenge of embodied navigation by combining a Future Chain‑of‑Thought with a Historical Experience Chain. It reports significant zero‑shot performance gains on VLN‑CE benchmarks and in real‑world robot tests, and its code is released on GitHub.
Task Background
Vision‑Language Navigation in continuous environments (VLN‑CE) requires an embodied agent to understand natural‑language instructions and move freely in a physical space to reach a target. Zero‑shot approaches based on large language models (LLMs) often suffer from a lack of feedback and decision hallucinations, leading to cascading failures.
Core Contribution: EvoNav Evolutionary Paradigm
EvoNav mimics the human decision process History → Now → Future and introduces two complementary modules:
Future Chain‑of‑Thought (F‑CoT): predicts future actions and landmarks, converting complex instructions into spatio‑temporal sub‑tasks so the agent can continuously anticipate the optimal direction.
Historical Experience Chain (H‑CoE): maintains a dynamic experience repository that aggregates successful and failed trajectories. It provides:
Text trajectory experience: global navigation logic derived from past language‑action sequences.
Visual scene experience: uses CLIP to retrieve visually similar historical images, improving local perception reliability.
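As a rough illustration of the visual scene experience lookup, the sketch below ranks stored image embeddings by cosine similarity to the current view. It is not EvoNav's implementation: the `VisualExperience` record, the `retrieve_similar` helper, and the toy 2-D vectors are hypothetical, and the embeddings are assumed to have been computed beforehand by an image encoder such as CLIP.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass
class VisualExperience:
    """One stored observation (hypothetical record layout, for illustration)."""
    embedding: List[float]  # image embedding, assumed precomputed (e.g. by CLIP)
    note: str               # short text summary of what happened at this scene
    success: bool           # whether the originating trajectory succeeded


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(
    query: List[float], repository: List[VisualExperience], k: int = 2
) -> List[Tuple[float, VisualExperience]]:
    """Return the k stored experiences most similar to the query embedding."""
    scored = [(cosine_similarity(query, e.embedding), e) for e in repository]
    # Sort on the score only, highest first, so entries are never compared directly.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy repository with made-up 2-D embeddings (real CLIP vectors are much longer).
repository = [
    VisualExperience([1.0, 0.0], "corridor", True),
    VisualExperience([0.0, 1.0], "elevator hall", False),
    VisualExperience([0.9, 0.1], "lab door", True),
]
top = retrieve_similar([1.0, 0.0], repository, k=2)
for score, entry in top:
    print(f"{entry.note}: similarity={score:.3f}, success={entry.success}")
```

Keeping the success flag alongside each embedding means a retrieved failure can serve as a warning as well as a retrieved success serving as a hint, which matches the description of a repository that aggregates both kinds of trajectories.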
Experimental Results
EvoNav was evaluated on simulated benchmarks (R2R‑CE, NavRAG‑CE) and real‑world indoor scenes.
On R2R‑CE, success rate (SR) increased by 20% and oracle success rate (OSR) by 21% compared with the Open‑Nav baseline.
On the more challenging NavRAG‑CE dataset, SR gained an additional 6%.
Real‑robot deployment on an omnidirectional wheeled platform demonstrated robust zero‑shot navigation in labs, corridors, and elevator halls.
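For readers unfamiliar with the two metrics, here is a minimal sketch of how SR and OSR are commonly computed in VLN evaluation: an episode counts as a success if the agent stops within a distance threshold of the goal, and as an oracle success if any point on its path comes within that threshold. The 3 m threshold, the 2-D coordinates, and the function names are assumptions for illustration (straight-line distance is used here, whereas benchmarks typically use geodesic distance); none of this is taken from the EvoNav code.

```python
from math import dist
from typing import List, Tuple

Point = Tuple[float, float]
Episode = Tuple[List[Point], Point]  # (agent path, goal position)


def success_rate(episodes: List[Episode], threshold: float = 3.0) -> float:
    """Fraction of episodes whose final position is within `threshold` of the goal."""
    hits = sum(1 for path, goal in episodes if dist(path[-1], goal) < threshold)
    return hits / len(episodes)


def oracle_success_rate(episodes: List[Episode], threshold: float = 3.0) -> float:
    """Fraction of episodes where any visited point is within `threshold` of the goal."""
    hits = sum(
        1 for path, goal in episodes
        if any(dist(p, goal) < threshold for p in path)
    )
    return hits / len(episodes)


# Three made-up episodes, all with the goal at (10, 0).
episodes = [
    ([(0.0, 0.0), (5.0, 0.0), (9.0, 0.0)], (10.0, 0.0)),   # stops near the goal
    ([(0.0, 0.0), (9.0, 0.0), (20.0, 0.0)], (10.0, 0.0)),  # passes the goal, overshoots
    ([(0.0, 0.0), (1.0, 0.0)], (10.0, 0.0)),               # never gets close
]
sr = success_rate(episodes)
osr = oracle_success_rate(episodes)
print(f"SR={sr:.2f}, OSR={osr:.2f}")
```

In this toy set, SR is 1/3 and OSR is 2/3: the second episode passes within range of the goal but does not stop there, so it counts only toward OSR. OSR is therefore always at least as high as SR.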
Implementation
Code and model checkpoints are publicly available at:
https://github.com/daiguangzhao/EvoNav.git
