End-to-End Navigation Model Training with Isaac Sim, MobilityGen, and Cosmos Augmentation
This tutorial walks through a complete workflow for building a navigation model using Isaac Sim and MobilityGen to generate synthetic data, applying Cosmos‑Transfer1‑7B for visual data augmentation, training the X‑Mobility model via imitation learning, converting it for ROS2 deployment, and performing software‑in‑the‑loop validation.
Overview
This guide shows an end‑to‑end workflow for robot navigation on Alibaba Cloud PAI using Isaac Sim for data synthesis, MobilityGen for dataset generation, optional visual augmentation with Cosmos‑Transfer1‑7B, imitation learning with the open‑source X‑Mobility model, model conversion to ONNX and TensorRT, ROS2 integration, and software‑in‑the‑loop validation.
1. Load a 3D scene from the Isaac Asset public dataset
Start a Data Science Workspace (DSW) livestream session and launch Isaac Sim headlessly:
export ACCEPT_EULA=Y
PUBLIC_IP=$(curl -s ifconfig.me) && \
/isaac-sim/runheadless.sh \
--/persistent/isaac/asset_root/default="/mnt/data/isaac_tmp/isaac_asset/Assets/Isaac/5.0" \
--/app/livestream/publicEndpointAddress=$PUBLIC_IP \
--/app/livestream/port=49100

Load the warehouse scene:
/mnt/isaac_assets/5.0/Isaac/Environments/Simple_Warehouse/warehouse_multiple_shelves.usd

2. Create an occupancy map
Open Tools → Robotics → Occupancy Map in Isaac Sim and set:
Origin: X=2.0, Y=0.0, Z=0.0
Upper Bound: X=10.0, Y=20.0, Z=2.0
Lower Bound: X=-14.0, Y=-18.0, Z=0.1
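The bounds above fix the extent of the generated map image. Assuming a 0.05 m cell size (an illustrative value; use whatever cell size you set in the dialog), the expected image dimensions can be sanity-checked with a quick calculation:

```python
# Sanity-check the occupancy map image size implied by the bounds above.
# cell_size is an assumed value; substitute the one used in the dialog.
def map_size_px(upper, lower, cell_size=0.05):
    """Return (width, height) in pixels for the X/Y extent of the map."""
    width = round((upper[0] - lower[0]) / cell_size)   # X extent in cells
    height = round((upper[1] - lower[1]) / cell_size)  # Y extent in cells
    return width, height

# Upper Bound (10.0, 20.0) and Lower Bound (-14.0, -18.0) from the dialog
print(map_size_px((10.0, 20.0), (-14.0, -18.0)))  # → (480, 760)
```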
Click Calculate then Visualize Image. Save the generated PNG and edit the YAML file to reference it:
~/MobilityGenData/maps/warehouse_multiple_shelves/map.yaml

Replace the image entry with image: map.png and store the PNG as ~/MobilityGenData/maps/warehouse_multiple_shelves/map.png.
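The YAML follows the standard ROS map_server format; a minimal map.yaml might look like the sketch below (resolution, origin, and thresholds are illustrative values, not taken from this setup):

```yaml
image: map.png
resolution: 0.05              # meters per pixel (illustrative)
origin: [-14.0, -18.0, 0.0]   # lower-left corner pose (illustrative)
negate: 0
occupied_thresh: 0.65
free_thresh: 0.196
```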
3. Launch the MobilityGen plugin
Enable the plugin via Window → Extensions → MobilityGen UI. In the UI, set:
Stage: /mnt/isaac_assets/5.0/Isaac/Environments/Simple_Warehouse/warehouse_multiple_shelves.usd
Occupancy Map: ~/MobilityGenData/maps/warehouse_multiple_shelves/map.yaml
Robot: CarterRobot
Scenario: KeyboardTeleoperationScenario (or RandomPathFollowingScenario for automatic data generation)
Press Build to construct the scenario in the stage; the dataset itself is produced during recording (next step).
4. Record trajectories
In the MobilityGen UI click Start recording, drive the robot manually (or let the random scenario run), then click Stop recording. Recordings are saved to:
~/MobilityGenData/recordings

5. Replay and render video
Replay the recordings and render sensor images with the provided script:
cd /isaac-sim && \
/isaac-sim/python.sh \
standalone_examples/replicator/mobility_gen/replay_directory.py \
--render_interval 10 \
--enable isaacsim.replicator.mobility_gen.examples

Rendered frames and sensor data are stored in:
~/MobilityGenData/replays

6. Visual data augmentation with Cosmos‑Transfer1‑7B
After deploying the Cosmos model via PAI‑ModelGallery, run the following Python script to convert image sequences to video, upload them to the Cosmos service, retrieve the enhanced video, and split it back into frames:
import cv2, json, pathlib, shutil, requests, gradio_client
# ... (functions convert_sequence_to_video, split_video_to_frames, cosmos_sync_with_upload, create_cosmos_request, process_and_augment_replays) ...
if __name__ == "__main__":
    import os
    os.makedirs('/root/MobilityGenData/cosmos_augmented_videos', exist_ok=True)
    process_and_augment_replays(output_dir='/root/MobilityGenData/cosmos_augmented_videos')

The script produces augmented videos and corresponding frame directories alongside the original data.
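As an illustration of the elided create_cosmos_request helper, a request payload for the augmentation service might be assembled as below. The field names (prompt, input_video, the edge control branch) are assumptions for illustration only; check the schema of your deployed PAI‑ModelGallery endpoint and adjust accordingly:

```python
import json

def create_cosmos_request(video_path, prompt, control_weight=0.5):
    """Build a request payload for the Cosmos-Transfer1 service.

    The field names here are illustrative assumptions; the real schema
    depends on the deployed endpoint.
    """
    return {
        "prompt": prompt,
        "input_video": video_path,
        "edge": {"control_weight": control_weight},
    }

req = create_cosmos_request(
    "/root/MobilityGenData/cosmos_augmented_videos/replay_000.mp4",
    "a photorealistic warehouse with varied lighting",
)
print(json.dumps(req, indent=2))
```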
7. Imitation learning with X‑Mobility
Use the open‑source X‑Mobility model (https://github.com/NVlabs/X-Mobility) as the backbone. Submit a distributed training job on PAI‑DLC (environment variables dsw_region and PAI_WORKSPACE_ID must be set):
import os, time

# Build a timestamped job name
now = time.localtime()
day, hour, minute = now.tm_mday, now.tm_hour, now.tm_min
display_name = f"train_xmobility_for_isaac_{day}_{hour}-{minute}"
region_id = os.getenv("dsw_region")
workspace_id = os.getenv("PAI_WORKSPACE_ID")
image_uri = f"dsw-registry.{region_id}.cr.aliyuncs.com/pai-training-algorithm/isaac-sim:x-mobility-v10"
# Create DLC client and submit a PyTorch job (datasets, commands omitted for brevity)

8. Model conversion
After training, convert the checkpoint to ONNX and then to a TensorRT engine:
%cd /X-MOBILITY
python3 onnx_conversion.py -p /mnt/data/notebook3/nav2_output/checkpoints/last.ckpt -o /tmp/x_mobility.onnx
python3 trt_conversion.py -o /tmp/x_mobility.onnx -t /tmp/x_mobility.engine

9. Deploy ROS2 package
Create a ROS2 workspace, link the X‑Mobility navigator package, and build:
mkdir -p ~/ros2_ws/src
ln -s /X-MOBILITY/ros2_deployment/x_mobility_navigator ~/ros2_ws/src/x_mobility_navigator
cd ~/ros2_ws && colcon build --symlink-install

10. Software‑in‑the‑Loop validation
Start a VNC server to provide a graphical UI inside DSW:
/opt/TurboVNC/bin/vncserver :0 -geometry 4000x3000

Launch Isaac Sim with ROS2 support and load the Carter navigation example:
source /opt/ros/humble/setup.bash
cd ~/ros2_ws && source install/setup.bash
source ~/.bashrc
ACCEPT_EULA=Y /isaac-sim/runapp.sh \
--/persistent/isaac/asset_root/default="/mnt/isaac_assets/5.0"
# In the UI: Robotics Examples → ROS2 → Navigation → Carter Navigation → Load Sample Scene → Play

Verify that ROS2 topics such as /cmd_vel and /front_stereo_camera/left/image_raw are active.
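A small pure-Python helper (hypothetical, not part of the tutorial's scripts) can check that the expected topics appear in the captured output of `ros2 topic list`:

```python
def missing_topics(topic_list_output,
                   required=("/cmd_vel", "/front_stereo_camera/left/image_raw")):
    """Return the required topics absent from `ros2 topic list` output.

    topic_list_output: raw stdout of `ros2 topic list` as a string,
    one topic name per line.
    """
    active = {line.strip() for line in topic_list_output.splitlines() if line.strip()}
    return [t for t in required if t not in active]

# Example: feed in the captured output of `ros2 topic list`
sample = "/cmd_vel\n/front_stereo_camera/left/image_raw\n/tf\n"
print(missing_topics(sample))  # → []
```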
11. Run the X‑Mobility navigator
Launch the navigation node:
source ~/ros2_ws/install/setup.bash
ros2 launch x_mobility_navigator x_mobility_navigator.launch.py

Set a 2D goal pose in the map UI; the robot should move toward the target in simulation.
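The UI publishes the goal as a pose whose orientation is a quaternion. If you later want to publish a 2D goal programmatically, the planar heading (yaw) converts to a quaternion as follows (a generic math sketch, not tied to the tutorial's code):

```python
import math

def yaw_to_quaternion(yaw):
    """Convert a planar heading (radians) to an (x, y, z, w) quaternion.

    For a rotation about the Z axis only, the x and y components are zero.
    """
    return (0.0, 0.0, math.sin(yaw / 2.0), math.cos(yaw / 2.0))

print(yaw_to_quaternion(0.0))          # → (0.0, 0.0, 0.0, 1.0)
print(yaw_to_quaternion(math.pi / 2))  # 90° left turn
```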
Summary
This notebook demonstrates a complete PAI‑powered workflow: synthetic navigation data generation with Isaac Sim, optional visual augmentation via Cosmos‑Transfer1‑7B, large‑scale imitation learning using X‑Mobility, conversion to ONNX/TensorRT, ROS2 integration, and end‑to‑end software‑in‑the‑loop validation. The augmented‑data‑trained model shows improved generalization and robustness, providing a practical solution for Sim2Real robot navigation.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.