Toru Lin,
Yu Zhang*,
Qiyang Li*,
Haozhi Qi*,
Brent Yi,
Sergey Levine,
Jitendra Malik
This repo contains code, datasets, and instructions to support the following use cases:
- Collecting Demonstration Data
- Training and Evaluating Diffusion Policies
- Deploying Policies on Hardware
```bash
conda create -n hato python=3.9
conda activate hato
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia -y
pip install -r ./requirements.txt
```
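As a quick sanity check (not part of the official setup instructions), you can confirm that PyTorch was installed with CUDA support before moving on:

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```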
This repo supports the Meta Quest 2 as the teleoperation device. To start, install oculus_reader by following the instructions in the linked repository.
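Once oculus_reader is installed, a minimal read loop looks roughly like the sketch below. The class and method names follow the oculus_reader README at the time of writing and may differ in newer versions; treat this as an illustration rather than part of this repo's code.

```python
import time

from oculus_reader.reader import OculusReader

# Connects to the Quest headset over ADB (USB or Wi-Fi).
reader = OculusReader()

while True:
    # transforms: dict mapping 'l' / 'r' to 4x4 controller pose matrices
    # buttons: dict of button and trigger states
    transforms, buttons = reader.get_transformations_and_buttons()
    if transforms:
        print(transforms.get("r"), buttons)
    time.sleep(0.1)
```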
We use ZMQ to handle communication between hardware components. Our data collection code is structured in the following way (credit to gello_software for the clean and modular template):
| Directory / File | Detail |
| --- | --- |
| `agents/` | contains `Agent` classes that generate low-level hardware control commands from the teleoperation device or a diffusion policy |
| `cameras/` | contains `Camera` classes that provide utilities to obtain real-time camera data |
| `robots/` | contains `Robot` classes that interface between Python code and physical hardware to read observations or send low-level control commands |
| `*_node.py` | contains ZMQ node classes for camera / robot / policy |
| `env.py` | contains environment classes that organize the teleoperation and data collection process |
| `launch_nodes.py` | script to launch robot hardware and sensor ZMQ nodes |
| `run_env.py` | script to start the teleoperation and data collection process |
(Code files not mentioned contain utilities for training and evaluating diffusion policies, and deploying policies on hardware; please see the next two sections for more details.)
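To illustrate the node pattern, each hardware node can be thought of as a ZMQ server that answers observation requests from the environment process. The sketch below is a simplified stand-in for the `*_node.py` classes, not the repo's actual implementation; the request strings and payload layout are made up for illustration.

```python
import zmq


def serve_observations(port: int = 5555):
    """Toy stand-in for a *_node.py server: reply to observation requests."""
    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind(f"tcp://*:{port}")
    while True:
        request = socket.recv_string()             # e.g. "get_obs"
        obs = {"request": request, "data": None}   # real nodes return camera / robot state here
        socket.send_pyobj(obs)


def request_observation(port: int = 5555):
    """Toy stand-in for the env side: ask a node for its latest observation."""
    context = zmq.Context()
    socket = context.socket(zmq.REQ)
    socket.connect(f"tcp://localhost:{port}")
    socket.send_string("get_obs")
    return socket.recv_pyobj()
```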
Currently available classes:
- `agents/quest_agent.py`, `agents/quest_agent_eef.py`: support using Meta Quest 2 for hardware teleoperation, in either joint-space or end-effector-space control mode
- `agents/dp_agent.py`, `agents/dp_agent_zmq.py`: support using learned diffusion policies to provide control commands, either synchronously or asynchronously (see the "Deploying Policies on Hardware" section for more details)
- `cameras/realsense_camera.py`: supports reading and preprocessing RGB-D data from RealSense cameras
- `robots/ur.py`, `robots/ability_gripper.py`, `robots/robotiq_gripper.py`: support hardware setups with a single UR5e arm or two UR5e arms, using Ability Hands or Robotiq grippers as the end-effector(s)
These classes can be flexibly modified or extended to support other teleoperation devices / learning pipelines / cameras / robot hardware.
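For example, supporting a new teleoperation device typically amounts to writing a new agent class. The sketch below assumes a gello-style interface where an agent maps an observation dict to an action array; the actual base-class signature in `agents/` may differ, and the device API shown is hypothetical.

```python
import numpy as np


class MyDeviceAgent:
    """Hypothetical agent mapping readings from a custom device to robot commands."""

    def __init__(self, device):
        self.device = device

    def act(self, obs: dict) -> np.ndarray:
        # Read the device state and map it into the robot's command space.
        target_pose = self.device.read_pose()   # hypothetical device API
        return np.asarray(target_pose, dtype=np.float32)
```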
Example usage (note that `launch_nodes.py` and `run_env.py` should be run simultaneously in two separate windows):
```bash
# to collect data with two UR5e arms and Ability Hands at 10 Hz
python launch_nodes.py --robot bimanual_ur --hand_type ability
python run_env.py --agent quest_hand --no-show-camera-view --hz 10 --save_data

# to collect data with two UR5e arms and Ability Hands at 10 Hz, showing the camera view during data collection
python launch_nodes.py --robot bimanual_ur --hand_type ability
python run_env.py --agent quest_hand --hz 10 --save_data
```
Node processes can be cleaned up by running `pkill -9 -f launch_nodes.py`.
The linked data folder contains datasets for the four tasks featured in the HATO paper: `banana`, `stack`, `pour`, and `steak`.
Full dataset files can be unzipped using the `unzip` command.
Note that the `pour` and `steak` datasets are split into part files because of their large file size. Before unzipping, the part files should first be concatenated back into single files using the following commands:

```bash
cat data_pour_part_0* > data_pour.zip
cat data_steak_part_0* > data_steak.zip
```
Scripts to download and concatenate the datasets can be found in `workflow/download_dataset.sh`.
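For reference, a full reassemble-and-extract pass for one of the split datasets looks like the following; the part-file names match the commands above, while the target directory is an arbitrary choice.

```bash
# reassemble one of the split datasets, then extract it
cat data_pour_part_0* > data_pour.zip
unzip data_pour.zip -d ./data/
```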
1. Run `python workflow/split_data.py --base_path Traj_Folder_Path --output_path Output_Folder_Path --data_name Data_Name --num_trajs N1 N2 N3` to split the data into train and validation sets. The number of trajectories used can be specified via the `num_trajs` argument.
2. Run `python ./learning/dp/pipeline.py --data_path Split_Folder_Path/Data_Name --model_save_path Model_Path` to train the model (see the example invocation after this list), where
   - `--data_path` is the split trajectory folder, i.e. the output_path + data_name from step 1 (data_name should not include a suffix like `_train` or `_train_10`)
   - `--model_save_path` is the path where the model will be saved
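As a concrete example, the two steps above might look like the following; the paths, data name, and trajectory counts are placeholders, not recommended settings.

```bash
# 1. split raw trajectories into train / validation sets
python workflow/split_data.py \
    --base_path ./data/data_banana \
    --output_path ./data/split \
    --data_name banana \
    --num_trajs 10 50 100

# 2. train a diffusion policy on the split data
python ./learning/dp/pipeline.py \
    --data_path ./data/split/banana \
    --model_save_path ./models/banana_dp
```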
Important Training Arguments
- `--batch_size`: the batch size for training.
- `--num_epochs`: the number of epochs for training.
- `--representation_type`: the data representation type for the model. Format: `repr1--repr2--...`. Repr can be `eef`, `img`, `depth`, `touch`, `pos`, `hand_pos`.
- `--camera_indices`: the camera indices to use for the image data modality. Format: `01`, `012`, `02`, etc.
- `--train_suffix`: the suffix for the training folder. This is useful when you want to train the model on different data splits, and should be used with the `--num_trajs` arg of `split_data.py`. Format: `_10`, `_50`, etc.
- `--load_img`: whether to load all the images into memory. If set to `True`, training will be faster but will consume more memory.
- `--use_memmap_cache`: whether to use a memmap cache for the images. If set to `True`, a memmap file will be created in the training folder to accelerate data loading.
- `--use_wandb`: whether to use wandb for logging.
- `--wandb_project_name`: the wandb project name.
- `--wandb_entity_name`: the wandb entity name.
- `--load_path`: the path to load the model from. If set, the model will be loaded from this path and training will continue. This should be the path of the non-EMA model.
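Putting several of these arguments together, a training command might look like the sketch below. The paths and argument values are placeholders, not recommended settings; boolean options such as `--load_img` or `--use_wandb` can be appended following the descriptions above (their exact syntax depends on the argument parser in `pipeline.py`).

```bash
python ./learning/dp/pipeline.py \
    --data_path ./data/split/banana \
    --model_save_path ./models/banana_dp \
    --batch_size 64 \
    --num_epochs 500 \
    --representation_type eef--img--touch \
    --camera_indices 01
```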
Run `python ./eval_dir.py --eval_dir Traj_Folder_Path --ckpt_path Model_Path_1 Model_Path_2` to evaluate multiple models on all trajectories in the folder.
A set of useful bash scripts can be generated using the following command:

```bash
python workflow/gen_deploy_scripts.py -c [ckpt_folder]
```

where `ckpt_folder` is a path that contains one or more checkpoints resulting from the training above. The generated bash scripts provide the following functionalities.
- To run policy deployment with the asynchronous setup (see Section IV-D in the paper for more details), first run `*_inference.sh` to launch the inference server, then run `*_node.sh` and `*_env_jit.sh` in separate terminal windows to deploy the robot with the inference server.
- To run policy deployment without the asynchronous setup, run `*_node.sh` and `*_env.sh` in separate terminal windows.
- To run policy evaluation for individual checkpoints, run `*_test.sh`. Note that the path to a folder containing trajectories for evaluation needs to be specified in addition to the model checkpoint path.
- To run an open-loop policy test, run `*_openloop.sh`. Note that the path to a demonstration data trajectory needs to be specified in addition to the model checkpoint path.
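As an end-to-end illustration of the asynchronous setup, a deployment session could proceed as below. The checkpoint folder and the generated script names are placeholders: the actual prefix and output location of the generated scripts depend on your checkpoints and on `gen_deploy_scripts.py`.

```bash
# generate the deployment scripts for a trained checkpoint folder
python workflow/gen_deploy_scripts.py -c ./models/banana_dp

# terminal 1: launch the policy inference server
bash banana_dp_inference.sh

# terminal 2: launch robot and sensor nodes
bash banana_dp_node.sh

# terminal 3: start the asynchronous deployment loop
bash banana_dp_env_jit.sh
```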
This project was developed with help from the following codebases.
If you find HATO or this codebase helpful in your research, please consider citing:
```bibtex
@article{lin2024learning,
  author  = {Lin, Toru and Zhang, Yu and Li, Qiyang and Qi, Haozhi and Yi, Brent and Levine, Sergey and Malik, Jitendra},
  title   = {Learning Visuotactile Skills with Two Multifingered Hands},
  journal = {arXiv:2404.16823},
  year    = {2024}
}
```