Build a noVNC-Powered Isaac Sim Robot Demo with PAI-DSW
This guide walks through setting up a PAI‑DSW environment, downloading Isaac Sim assets, configuring noVNC, launching a software‑in‑the‑loop robot simulation, and running a perception pipeline that combines FastSAM detection with FoundationPose pose estimation and ICP refinement.
Introduction
The final installment of the PAI Physical AI Notebook series demonstrates how to use the new noVNC feature in DSW together with Cortex, Isaac Sim's framework for collaborative robots, to build a software-in-the-loop (SIL) verification setup.
Environment Preparation
Create a DSW instance using the following image and instance type:
Image: dsw-registry-vpc.cn-beijing.cr.aliyuncs.com/pai-training-algorithm/isaac-sim:isaacsim500-nb5-v2-20250902
Instance type: ecs.ebmgn9t.48xlarge
No special dataset configuration is required for this best practice.
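Before continuing, it can save time to confirm the instance actually exposes the tools the later steps rely on. The helper below is a hypothetical preflight check, not part of the demo; the listed paths are the ones this guide assumes the DSW image provides.

```python
from pathlib import Path

# Paths the rest of this guide assumes exist inside the DSW image.
# (Hypothetical checklist -- adjust to your actual image layout.)
REQUIRED_PATHS = [
    "/isaac-sim/python.sh",            # Isaac Sim's bundled Python launcher
    "/opt/TurboVNC/bin/vncserver",     # TurboVNC server used in the VNC setup step
    "/etc/dsw/runtime/bin/pai-dsw",    # DSW runtime CLI for the noVNC plugin
]

def missing_paths(paths):
    """Return the subset of `paths` that do not exist on this machine."""
    return [p for p in paths if not Path(p).exists()]

if __name__ == "__main__":
    missing = missing_paths(REQUIRED_PATHS)
    if missing:
        print("Missing (wrong image or instance type?):", ", ".join(missing))
    else:
        print("Environment looks good.")
```

If any path is reported missing, double-check that the instance was created from the image listed above.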
Data and Model Download
import os
from pathlib import Path
local_dir = Path("/mnt/data/notebook5/") # cache directory
print(f"Downloading data and code to: {local_dir}")
print("Starting download...")
package = "bin_picking_demo.tar"
# download_from_oss: download helper assumed to be provided by the DSW notebook environment
download_from_oss('aigc-data/isaac/nb5/', package, str(local_dir))
print("Download complete")
print("Extracting data...")
zip_file = os.path.join(local_dir, package)
!tar -xvf {zip_file} -C {local_dir}
!rm {zip_file}
print("Extraction complete")

import os
from pathlib import Path
local_dir = Path("/root/FoundationPose") # cache directory
print(f"Downloading pretrained model to: {local_dir}")
print("Starting download...")
package = "weights.tar"
download_from_oss('aigc-data/isaac/nb5/', package, str(local_dir))
print("Download complete")
print("Extracting model...")
zip_file = os.path.join(local_dir, package)
!tar -xvf {zip_file} -C {local_dir}
!rm {zip_file}
print("Extraction complete")

VNC Setup
# Install Python environment
apt update
apt install python3-venv
# Start VNC server
/opt/TurboVNC/bin/vncserver :1 -geometry 3840x2160
# Install noVNC plugin
/etc/dsw/runtime/bin/pai-dsw runtime plugin install novnc
# Start noVNC daemon
/etc/dsw/runtime/bin/pai-dsw runtime plugin start-daemon novncAfter the plugins start, open the DSW gateway URL and append vnc.html to access the noVNC interface.
Running Validation
In the noVNC terminal, execute the following commands to run the simulation:
cd /root/bin_picking_demo
/isaac-sim/python.sh sim_main.py --component "mustard_bottle"
# Try changing the component, e.g., "cracker_box"

The Isaac Sim window shows the robot arm executing pick-and-place actions based on the FoundationPose model output, while the terminal displays FastSAM detecting object positions.
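Under the hood, the demo shuttles camera frames and detection results between the simulator and a perception subprocess through multiprocessing queues. Before walking through the actual code, here is a minimal, self-contained sketch of that spawn-context producer/consumer pattern; the names (`worker`, `frame_id`, the fake pose) are illustrative and not taken from the demo.

```python
import multiprocessing as mp

def worker(data_queue, result_queue):
    """Consume frames until a None sentinel arrives, emitting a fake 'pose'."""
    while True:
        frame = data_queue.get()
        if frame is None:          # sentinel: shut down cleanly
            break
        result_queue.put({"frame_id": frame, "pose": [0.0, 0.0, 0.0]})

if __name__ == "__main__":
    ctx = mp.get_context("spawn")   # spawn, as sim_main.py uses; safest with GPU libraries
    data_queue, result_queue = ctx.Queue(), ctx.Queue()
    proc = ctx.Process(target=worker, args=(data_queue, result_queue))
    proc.start()
    for frame_id in range(3):       # stand-in for RGB-D frames from the simulator
        data_queue.put(frame_id)
    data_queue.put(None)            # ask the worker to exit
    results = [result_queue.get() for _ in range(3)]
    proc.join()
    print([r["frame_id"] for r in results])   # -> [0, 1, 2]
```

The real entry point follows the same shape, with four queues (head/hand camera data and detections) instead of two.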
Core Code Overview
System Entry Point
The main script /mnt/workspace/notebook5/bin_picking_demo/sim_main.py parses configuration, sets up the simulation world, adds the robot, cameras, and tasks, and launches a decider network that coordinates perception and control.
if __name__ == "__main__":
    opt = tyro.cli(SimulationConfig)
    opt, camera_params, weights_path, standard_mask_path = setup_configuration(opt)

    # Spawn a separate perception process that communicates via queues
    ctx = mp.get_context('spawn')
    data_queue = ctx.Queue()
    hand_data_queue = ctx.Queue()
    detect_queue = ctx.Queue()
    hand_detect_queue = ctx.Queue()

    debug = True
    process = ctx.Process(
        target=inference,
        args=(data_queue, detect_queue, hand_data_queue, hand_detect_queue,
              camera_params, opt.mesh_file, standard_mask_path, weights_path, debug),
    )
    process.start()

    # Run the simulation loop in the main process
    main(opt, data_queue, detect_queue, hand_data_queue, hand_detect_queue)

    process.terminate()
    process.join()
    print('## Sub process is terminated.')

Perception Subprocess
Located at
/mnt/workspace/notebook5/bin_picking_demo/foundationpose/multiprocess_foundationpose_infer_sim.py, this process runs FastSAM detection and FoundationPose pose estimation, feeding results back via queues.
def inference(data_queue, detect_queue, hand_data_queue, hand_detect_queue,
              camera_params, mesh_file, standard_mask_path, weights_path, debug):
    pose_estimator = FoundationPoseInfer(camera_params, mesh_file, standard_mask_path, weights_path, debug)
    pose_tuner = ICPByHandCamera(camera_params, mesh_file, weights_path)
    while True:
        if not data_queue.empty():
            rgb, depth, failed, reset = data_queue.get()
            if reset:
                # Re-initialize the estimator when the simulation resets
                pose_estimator = FoundationPoseInfer(camera_params, mesh_file, standard_mask_path, weights_path, debug)
                continue
            pose, mask = pose_estimator.detect(rgb, depth, failed)
            detect_queue.put([pose, mask, pose_estimator.extent_bbox])
        # similar handling for hand_data_queue ...
        time.sleep(0.01)

Simulation Module
The main function creates the Cortex world, adds a UR10 robot, cameras, and a bin‑picking task, then runs the simulation while consuming perception results.
def main(opt, data_queue, detect_queue, hand_data_queue, hand_detect_queue):
    world = CortexWorld()
    robot = world.add_robot(CortexUr10LongSuction(
        name="robot",
        prim_path="{}/ur10_long_suction".format(env_path),
        robot_type=opt.robot_type,
    ))
    camera_prim1 = world.stage.DefinePrim("/World/Camera1", "Camera")
    world.add_task(BinPickingStackedTask(opt, mechanical_part_usd, usd_scale, opt.num_object, rp_head))
    world.reset()
    world.add_decider_network(behavior.make_decider_network(
        data_queue, detect_queue, hand_data_queue, hand_detect_queue,
        robot, target_pose, opt, rp_head, rp_hand, camera_prim1, camera_prim_hand))
    world.run(simulation_app, render=True, loop_fast=False, play_on_entry=True)
    print('## Simulation_app is closed. ##')

Conclusion
By leveraging PAI‑DSW’s noVNC visual environment and Isaac Sim’s toolchain, developers can rapidly prototype and validate complex robot perception and interaction pipelines, then transfer them zero‑shot to real hardware, greatly improving development efficiency and quality for physical AI systems.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.