Build a No‑VNC Powered Isaac Sim Robot Demo with PAI‑DSW

This guide walks through setting up a PAI‑DSW environment, downloading Isaac Sim assets, configuring noVNC, launching a software‑in‑the‑loop robot simulation, and running a perception pipeline that combines FastSAM detection with FoundationPose pose estimation and ICP refinement.


Introduction

The final installment of the PAI Physical AI Notebook series demonstrates how to use the new noVNC feature in DSW together with Cortex, Isaac Sim's collaborative-robot framework, to build a software-in-the-loop (SIL) verification setup.

Environment Preparation

Create a DSW instance using the following image and instance type:

Image:

dsw-registry-vpc.cn-beijing.cr.aliyuncs.com/pai-training-algorithm/isaac-sim:isaacsim500-nb5-v2-20250902

Instance type: ecs.ebmgn9t.48xlarge

No special dataset configuration is required for this best practice.

Data and Model Download

import os
from pathlib import Path

# Download and extract the demo code and assets.
local_dir = Path("/mnt/data/notebook5/")  # cache directory
print(f"Downloading data and code to: {local_dir}")
print("Starting download...")
package = "bin_picking_demo.tar"
download_from_oss('aigc-data/isaac/nb5/', package, str(local_dir))  # helper provided by the DSW notebook environment
print("Download complete")
print("Extracting data...")
tar_file = os.path.join(local_dir, package)
!tar -xvf {tar_file} -C {local_dir}
!rm {tar_file}
print("Extraction complete")

import os
from pathlib import Path

# Download and extract the pretrained FoundationPose weights.
local_dir = Path("/root/FoundationPose")  # cache directory
print(f"Downloading pretrained model to: {local_dir}")
print("Starting download...")
package = "weights.tar"
download_from_oss('aigc-data/isaac/nb5/', package, str(local_dir))
print("Download complete")
print("Extracting model...")
tar_file = os.path.join(local_dir, package)
!tar -xvf {tar_file} -C {local_dir}
!rm {tar_file}
print("Extraction complete")
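The cells above call `download_from_oss`, a helper shipped with the DSW image. Its implementation is not shown in this article; the sketch below is a hypothetical reconstruction of its call shape, assuming the Alibaba Cloud `oss2` SDK underneath (the `bucket` wiring and signature are assumptions, not the real helper):

```python
import os

def download_from_oss(prefix, package, local_dir, bucket=None):
    """Hypothetical sketch of the notebook's download helper.

    Joins `prefix` and `package` into an OSS object key and, when an
    oss2.Bucket is supplied, streams the object into `local_dir`. The
    real helper ships with the DSW image; this only illustrates the
    call shape used in the cells above.
    """
    key = prefix.rstrip("/") + "/" + package
    dest = os.path.join(local_dir, package)
    os.makedirs(local_dir, exist_ok=True)
    if bucket is not None:
        # oss2's download-to-disk call: Bucket.get_object_to_file(key, filename)
        bucket.get_object_to_file(key, dest)
    return key, dest
```

With `prefix='aigc-data/isaac/nb5/'` and `package='weights.tar'`, the object key resolves to `aigc-data/isaac/nb5/weights.tar`, matching the calls in the cells above.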

VNC Setup

# Install Python environment
apt update
apt install python3-venv

# Start VNC server
/opt/TurboVNC/bin/vncserver :1 -geometry 3840x2160

# Install noVNC plugin
/etc/dsw/runtime/bin/pai-dsw runtime plugin install novnc

# Start noVNC daemon
/etc/dsw/runtime/bin/pai-dsw runtime plugin start-daemon novnc

After the plugin starts, open the DSW gateway URL and append vnc.html to its path to access the noVNC interface.
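Concretely, the noVNC page lives at the gateway URL plus `vnc.html`. A tiny shell sketch of composing that address (the gateway host below is a placeholder, not a real endpoint; substitute your instance's actual DSW gateway URL):

```shell
# Placeholder gateway URL -- replace with your instance's DSW gateway URL.
GATEWAY_URL="https://dsw-gateway.example.com/dsw-12345/"
# Strip any trailing slash, then append vnc.html.
NOVNC_URL="${GATEWAY_URL%/}/vnc.html"
echo "$NOVNC_URL"
```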

Running Validation

In the noVNC terminal, execute the following commands to run the simulation:

cd /root/bin_picking_demo
/isaac-sim/python.sh sim_main.py --component "mustard_bottle"
# Try changing the component, e.g., "cracker_box"

The Isaac Sim window shows the robot arm executing pick-and-place actions driven by the FoundationPose model's pose estimates, while the terminal logs the object positions detected by FastSAM.
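The `--component` flag maps onto a field of the script's `SimulationConfig`, which tyro parses from the command line. A minimal stdlib-only stand-in for that config is sketched below; the field names are inferred from how `opt` is used in the code excerpts later in this article (`opt.mesh_file`, `opt.num_object`, `opt.robot_type`), and the default values are assumptions:

```python
from dataclasses import dataclass

@dataclass
class SimulationConfig:
    # Fields inferred from opt.* usages in sim_main.py; defaults are assumed.
    component: str = "mustard_bottle"  # which object to pick
    num_object: int = 5                # objects spawned in the bin (assumed default)
    robot_type: str = "ur10"           # robot variant (assumed default)
    mesh_file: str = ""                # object mesh handed to FoundationPose

# Equivalent of passing --component "cracker_box" on the command line:
cfg = SimulationConfig(component="cracker_box")
print(cfg.component)
```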

Core Code Overview

System Entry Point

The main script /mnt/workspace/notebook5/bin_picking_demo/sim_main.py parses configuration, sets up the simulation world, adds the robot, cameras, and tasks, and launches a decider network that coordinates perception and control.

import multiprocessing as mp
import tyro

if __name__ == "__main__":
    # Parse CLI options into a SimulationConfig, then resolve camera
    # parameters, model weights, and mask paths.
    opt = tyro.cli(SimulationConfig)
    opt, camera_params, weights_path, standard_mask_path = setup_configuration(opt)

    # Queues carry RGB-D frames out to the perception subprocess and
    # detection results back; one pair per camera (head and hand).
    ctx = mp.get_context('spawn')
    data_queue = ctx.Queue()
    hand_data_queue = ctx.Queue()
    detect_queue = ctx.Queue()
    hand_detect_queue = ctx.Queue()

    debug = True
    process = ctx.Process(target=inference, args=(data_queue, detect_queue, hand_data_queue, hand_detect_queue, camera_params, opt.mesh_file, standard_mask_path, weights_path, debug))
    process.start()

    main(opt, data_queue, detect_queue, hand_data_queue, hand_detect_queue)

    # Simulation has exited; shut down the perception subprocess.
    process.terminate()
    process.join()
    print('## Sub process is terminated.')
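The entry point's queue wiring follows a simple handshake: the simulator pushes `(rgb, depth, failed, reset)` tuples and the perception side answers with `[pose, mask, extent_bbox]` lists. The standalone sketch below mimics one round trip of the head-camera pair with dummy string payloads in place of real images and poses:

```python
import multiprocessing as mp

# Recreate the two head-camera queues from the entry point.
ctx = mp.get_context("spawn")
data_queue = ctx.Queue()
detect_queue = ctx.Queue()

# Simulator side: publish one frame as (rgb, depth, failed, reset).
data_queue.put(("rgb_frame", "depth_frame", False, False))

# Perception side: consume the frame and answer with [pose, mask, bbox].
rgb, depth, failed, reset = data_queue.get()
if not reset:
    detect_queue.put(["pose_matrix", "mask", "extent_bbox"])

# Simulator side: read the detection result back.
result = detect_queue.get()
print(result[0])
```

In the real system the two sides run in separate processes (see the `inference` subprocess below); running both ends in one process here only demonstrates the message protocol.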

Perception Subprocess

This subprocess lives at /mnt/workspace/notebook5/bin_picking_demo/foundationpose/multiprocess_foundationpose_infer_sim.py. It runs FastSAM detection and FoundationPose pose estimation, feeding results back to the simulation through the queues.

def inference(data_queue, detect_queue, hand_data_queue, hand_detect_queue, camera_params, mesh_file, standard_mask_path, weights_path, debug):
    # FoundationPose estimator for the head camera; ICP refiner for the hand camera.
    pose_estimator = FoundationPoseInfer(camera_params, mesh_file, standard_mask_path, weights_path, debug)
    pose_tuner = ICPByHandCamera(camera_params, mesh_file, weights_path)
    while True:
        if not data_queue.empty():
            rgb, depth, failed, reset = data_queue.get()
            if reset:
                # Task was reset: rebuild the estimator and drop this frame.
                pose_estimator = FoundationPoseInfer(camera_params, mesh_file, standard_mask_path, weights_path, debug)
                continue
            pose, mask = pose_estimator.detect(rgb, depth, failed)
            detect_queue.put([pose, mask, pose_estimator.extent_bbox])
        # similar handling for hand_data_queue ...
        time.sleep(0.01)  # avoid busy-waiting when no frame is pending
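The `ICPByHandCamera` refiner's internals are not shown here, but ICP refinement ultimately rests on repeatedly solving one rigid least-squares alignment between matched 3D points (the Kabsch solve). The NumPy sketch below shows that core step under the assumption of already-matched correspondences; it is an illustration of the underlying math, not the demo's actual refiner:

```python
import numpy as np

def rigid_align(src, dst):
    """One least-squares rigid alignment (Kabsch) between matched points.

    Returns R, t such that R @ src[i] + t best fits dst[i]. Inside an
    ICP loop this solve alternates with nearest-neighbor matching.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    # Sign correction guards against reflections.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

Given noiseless correspondences this recovers the exact transform in one step; with real depth data, ICP iterates it against re-estimated correspondences until the alignment error converges.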

Simulation Module

The main function creates the Cortex world, adds a UR10 robot, cameras, and a bin‑picking task, then runs the simulation while consuming perception results.

def main(opt, data_queue, detect_queue, hand_data_queue, hand_detect_queue):
    # The Cortex world hosts the robot, the task, and the decider network.
    world = CortexWorld()
    robot = world.add_robot(CortexUr10LongSuction(name="robot", prim_path="{}/ur10_long_suction".format(env_path), robot_type=opt.robot_type))
    camera_prim1 = world.stage.DefinePrim("/World/Camera1", "Camera")
    world.add_task(BinPickingStackedTask(opt, mechanical_part_usd, usd_scale, opt.num_object, rp_head))
    world.reset()
    # The decider network consumes perception results from the queues and
    # issues pick-and-place commands to the robot.
    world.add_decider_network(behavior.make_decider_network(data_queue, detect_queue, hand_data_queue, hand_detect_queue, robot, target_pose, opt, rp_head, rp_hand, camera_prim1, camera_prim_hand))
    world.run(simulation_app, render=True, loop_fast=False, play_on_entry=True)
    print('## Simulation_app is closed. ##')
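At its heart, the decider network is a state machine that advances the pick-and-place sequence as perception results arrive. The toy, framework-free sketch below illustrates that decision loop; the state names and transitions are illustrative and are not the Cortex API or the demo's actual behavior graph:

```python
class ToyDecider:
    """Toy decision loop: wait for a pose estimate, then pick, then place."""

    def __init__(self):
        self.state = "wait_for_pose"

    def step(self, detection):
        # Advance one tick: a non-None detection unblocks the pick.
        if self.state == "wait_for_pose" and detection is not None:
            self.state = "pick"
        elif self.state == "pick":
            self.state = "place"
        elif self.state == "place":
            self.state = "wait_for_pose"
        return self.state
```

Each simulation tick calls `step` with the latest queue result; the real decider network additionally handles failures and resets (the `failed` and `reset` flags seen in the queue protocol above).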

Conclusion

By leveraging PAI‑DSW’s noVNC visual environment and Isaac Sim’s toolchain, developers can rapidly prototype and validate complex robot perception and interaction pipelines, then transfer them zero‑shot to real hardware, greatly improving development efficiency and quality for physical AI systems.

Written by

Alibaba Cloud Big Data AI Platform

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
