Fine‑Tuning GR00T‑N1.5: From Human Demonstrations to Distributed Imitation Learning

This tutorial walks through fine‑tuning the GR00T‑N1.5 vision‑language‑action (VLA) model: collecting human demonstrations, annotating and massively augmenting the data with DLC, running distributed imitation learning, and validating the model through a server‑client DSW setup, complete with code snippets, resource specs, and visual examples.

Alibaba Cloud Big Data AI Platform

Overview

This guide demonstrates an end‑to‑end pipeline for fine‑tuning the VLA model GR00T‑N1.5‑3B using the RobotLearningLab public dataset. The workflow includes human demonstration collection, dataset annotation, large‑scale synthetic data generation with DLC, distributed augmentation via Ray, data merging and conversion to the LeRobot format, distributed imitation learning, and a server‑client closed‑loop evaluation.

Environment Setup

Launch a DSW instance with a Docker image that contains Isaac Lab 2.2 and its dependencies:

dsw-registry-vpc.cn-beijing.cr.aliyuncs.com/pai-training-algorithm/isaac-sim:isaaclab220-nb4-v7-20250916

Mount the public dataset at /mnt/RobotLearningLab_Dataset to avoid repeated downloads. Recommended instance types (Beijing region) are:

ecs.ebmgn8is.32xlarge

ecs.gn8is-8x.32xlarge

ecs.ebmgn8te.32xlarge

ecs.ebmgn9t.48xlarge

Human Demonstration

Start a VNC server inside the DSW container:

/opt/TurboVNC/bin/vncserver :0 -geometry 3840x2160

From a local terminal, create an SSH tunnel to the DSW instance (replace <DSW_IP> and <DSW_PORT> with the instance's public IP and SSH port), then point a local VNC viewer at localhost:5900:

ssh -L 5900:127.0.0.1:5900 root@<DSW_IP> -p <DSW_PORT>

Record demonstrations with the Isaac Lab script (teleoperation device can be spacemouse or keyboard):

mkdir -p /mnt/data/isaac_tmp/nb4/datasets
cd /workspace/RobotLearningLab && ./isaaclab.sh -p usecase/scripts/record_demos.py \
  --task Isaac-Stack-Cube-Galbot-Left-Arm-RmpFlow-Rel-v0 \
  --teleop_device keyboard \
  --dataset_file /mnt/data/isaac_tmp/nb4/datasets/dataset.hdf5 \
  --num_demos 10

Key bindings during demonstration:

Reset: R

Toggle gripper: K

Move X axis: W/S

Move Y axis: A/D

Move Z axis: Q/E

Rotate X axis: Z/X

Rotate Y axis: T/G

Rotate Z axis: C/V

Data Annotation

Annotate the collected dataset.hdf5 to add sub‑task labels. The snippets in this and the following sections use the lab's notebook convention: $ROBOT_LEARNING_LAB_PATH points to the RobotLearningLab checkout, $EXTERNAL_STORAGE_PATH to the mounted storage, and the leading ! runs the command string in a shell.

output_path_str=$EXTERNAL_STORAGE_PATH/datasets
annotate_command="cd $ROBOT_LEARNING_LAB_PATH && \
./isaaclab.sh -p usecase/scripts/annotate_demos.py \
--task Isaac-Stack-Cube-Galbot-Left-Arm-RmpFlow-Abs-Mimic-v0 \
--device cuda \
--auto \
--input_file $output_path_str/dataset.hdf5 \
--output_file $output_path_str/dataset_annotate.hdf5 \
--headless"
!$annotate_command

Large‑Scale Data Augmentation

Generate synthetic trajectories with DLC:

generate_command="cd $ROBOT_LEARNING_LAB_PATH && \
./isaaclab.sh -p usecase/scripts/generate_dataset.py \
--task Isaac-Stack-Cube-Galbot-Left-Arm-RmpFlow-Abs-Mimic-v0 \
--device cuda \
--num_envs 10 \
--generation_num_trials 10000 \
--input_file $output_path_str/dataset_annotate.hdf5 \
--output_file $output_path_str/dataset_generate.hdf5 \
--headless"
!$generate_command

Important flags:

--num_envs 10: run 10 parallel environments.

--generation_num_trials 10000: target 10,000 successful trajectories.

--device cuda: use GPU acceleration.

--headless: run without a GUI.

Distributed Augmentation with Ray

Create a Ray task that distributes the generation across multiple nodes:

/workspace/RobotLearningLab/isaaclab.sh -p /mnt/data/isaac_tmp/nb4/datasets/ray_isaac_new.py \
  --command "cd /workspace/RobotLearningLab && \
  ./isaaclab.sh -p /mnt/data/isaac_tmp/nb4/datasets/generate_dataset_ray.py \
  --task Isaac-Stack-Cube-Galbot-Left-Arm-RmpFlow-Abs-Mimic-v0 \
  --device cuda \
  --num_envs 10 \
  --generation_num_trials 625 \
  --input_file /mnt/data/isaac_tmp/nb4/datasets/dataset_annotate.hdf5 \
  --output_file /mnt/data/isaac_tmp/nb4/datasets/dataset_generate.hdf5 \
  --headless" \
  --gpu 1 --cpu 10 --memory 80 --num_per_worker 8

Resource layout per worker:

CPU: 10 cores

Memory: 80 GB

GPU: 1

Tasks per worker: 8
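With this sharding, the per‑task trial count multiplies out to the 10,000‑trajectory target. A quick sanity check of the arithmetic (the worker‑node count is an assumption; the article only fixes trials per task and tasks per worker):

```python
# Sanity-check the Ray sharding arithmetic. num_workers is an assumed
# value; the command above only fixes the other two quantities.
num_workers = 2            # assumed number of Ray worker nodes
tasks_per_worker = 8       # --num_per_worker 8
trials_per_task = 625      # --generation_num_trials 625

total_tasks = num_workers * tasks_per_worker
total_trials = total_tasks * trials_per_task
print(total_tasks, total_trials)  # 16 tasks, 10000 trajectories
```

If you change the number of workers or tasks per worker, scale --generation_num_trials so the product still hits your target trajectory count.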

Data Processing

Merge the HDF5 shards produced by DLC, replay the trajectories to render videos, and convert the dataset to the LeRobot joint‑space format.

# Merge successful shards
python merge_hdf5_datasets.py --input_files $(ls $EXTERNAL_STORAGE_PATH/datasets/*_*.hdf5) --output_file merged_dataset.hdf5
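merge_hdf5_datasets.py ships with the lab image; if you need to adapt it, the core logic amounts to copying every demo group from each shard into one file while renumbering the keys. A minimal sketch with h5py (the /data/demo_N group layout is an assumption, based on the standard Isaac Lab / robomimic HDF5 convention):

```python
import h5py

def merge_hdf5_shards(input_files, output_file):
    """Copy each shard's demo groups into one HDF5 file, renumbering
    demos so keys stay unique. Assumed layout: /data/demo_N per shard."""
    demo_idx = 0
    with h5py.File(output_file, "w") as out:
        data = out.create_group("data")
        for path in input_files:
            with h5py.File(path, "r") as shard:
                for key in shard["data"]:
                    # Group.copy performs a deep copy across files.
                    shard.copy(f"data/{key}", data, name=f"demo_{demo_idx}")
                    demo_idx += 1
    return demo_idx  # total number of merged demos
```

Note that this sketch does not carry over file‑level attributes (e.g. env metadata); the bundled script handles those, so prefer it unless you have a reason not to.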

# Replay for video generation
replay_command="cd $ROBOT_LEARNING_LAB_PATH && \
./isaaclab.sh -p usecase/scripts/replay_demos_with_camera.py \
--task Isaac-Stack-Cube-Galbot-Left-Arm-Image-Based-v0 \
--dataset_file $output_path_str/dataset_generate.hdf5 \
--num_envs 10 --video --video_path $output_path_str \
--camera_view_list ego left_wrist right_wrist --headless"
!$replay_command

# Convert to Lerobot format
convert_cmd="cd $ROBOT_LEARNING_LAB_PATH && \
./isaaclab.sh -p benchmarks/gr00t/convert_hdf5_to_lerobot_joint_space.py \
--data_root $output_path_str \
--hdf5_filename dataset_generate.hdf5 \
--hdf5_file_path $output_path_str/dataset_generate.hdf5 \
--lerobot_data_dir $output_path_str/lerobot_joint_space"
!$convert_cmd

Distributed Imitation Learning

Fine‑tune the GR00T‑N1.5‑3B checkpoint on the augmented Lerobot dataset using two GPUs:

cd /mnt/data/isaac_tmp/nb4/Isaac-GR00T && \
export WANDB_MODE=offline && NCCL_P2P_DISABLE=1 NCCL_IB_DISABLE=1 \
python scripts/gr00t_finetune.py \
  --base_model_path /mnt/data/isaac_tmp/nb4/GR00T-N1.5-3B \
  --dataset-path /mnt/data/isaac_tmp/nb4/datasets/lerobot_joint_space \
  --num-gpus 2 \
  --batch-size 2 \
  --output-dir /mnt/data/isaac_tmp/nb4/datasets/joint_space_2_2 \
  --max-steps 40000 \
  --data-config galbot_joint_space \
  --video-backend decord \
  --no-tune-visual

Key flags:

--base_model_path: path to the pre‑trained GR00T‑N1.5‑3B checkpoint.

--dataset-path: directory containing the augmented LeRobot data.

--num-gpus 2: number of GPUs for training (adjustable).

--no-tune-visual: keep the visual encoder frozen.
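These settings determine how much data the run actually consumes. A rough estimate, assuming --batch-size is per GPU (the usual Hugging Face trainer convention) and no gradient accumulation:

```python
# Rough training-volume estimate for the run above. Assumes --batch-size
# is per GPU and gradient accumulation is off (both are assumptions).
num_gpus = 2        # --num-gpus 2
per_gpu_batch = 2   # --batch-size 2
max_steps = 40_000  # --max-steps 40000

effective_batch = num_gpus * per_gpu_batch
samples_seen = effective_batch * max_steps
print(effective_batch, samples_seen)  # 4 samples/step, 160000 samples total
```

If you scale up --num-gpus or --batch-size, consider lowering --max-steps proportionally to keep the total sample budget comparable.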

Closed‑Loop Evaluation (Server‑Client DSW)

Server side – launch an inference server that hosts the fine‑tuned model:

cd /mnt/data/isaac_tmp/nb4/Isaac-GR00T && \
python gr00t_inference_server.py --port 5555 \
  --model_path /mnt/data/isaac_tmp/nb4/checkpoint-40000 \
  --data_config galbot_joint_space

Obtain the private IP of the server instance (e.g. 10.0.0.207) so the client can connect:

PRI_IP=$(ifconfig eth1 | grep 'inet ' | awk '{print $2}') && echo "My private IP is: $PRI_IP"

Client side – run an inference client that connects to the server and executes the stacked‑cube task:

cd /workspace/RobotLearningLab && ./isaaclab.sh -p benchmarks/gr00t/gr00t_inference_client.py \
  --server_port 5555 --server_host 10.0.0.207 \
  --num_total_experiments 100 --num_success_steps 8 \
  --policy_type joint_space \
  --task Isaac-Stack-Cube-Galbot-Left-Arm-Joint-Position-Image-Based-v0

The client visualizes the robot performing the task under the guidance of the fine‑tuned policy.
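The server‑client split follows a simple request‑reply pattern: the client sends the current observation, the server runs the policy and returns an action. The GR00T scripts implement this over their own messaging layer on port 5555; the sketch below is a generic illustration of the same loop using stdlib sockets and newline‑delimited JSON, not the actual GR00T protocol.

```python
import json
import socket

def serve_policy(policy_fn, host="127.0.0.1", port=5555, max_requests=1):
    """Illustrative inference server: answer each JSON observation
    with a JSON action computed by policy_fn."""
    srv = socket.create_server((host, port))
    for _ in range(max_requests):
        conn, _ = srv.accept()
        with conn:
            obs = json.loads(conn.makefile("r").readline())
            reply = json.dumps({"action": policy_fn(obs)}) + "\n"
            conn.sendall(reply.encode())
    srv.close()

def query_policy(obs, host="127.0.0.1", port=5555):
    """Illustrative inference client: send one observation, block
    until the server returns the corresponding action."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall((json.dumps(obs) + "\n").encode())
        return json.loads(conn.makefile("r").readline())["action"]
```

In the real setup the server side would wrap the fine‑tuned checkpoint (one forward pass per request), and the client side sits inside the Isaac Lab control loop, stepping the simulation with each returned action.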

Tags: data augmentation, Physical AI, DSW, Distributed Imitation Learning, GR00T
Written by

Alibaba Cloud Big Data AI Platform

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
