RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation
CoRL 2024 (Oral Presentation)
This is the official code release of RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation.
- Create conda environment and install PyTorch
This code is tested on Python 3.8.19 on Ubuntu 20.04, with PyTorch 2.0.1+cu118:
```bash
conda create -n ram python=3.8
conda activate ram
# pytorch 2.0.1 with cuda 11.8
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
```
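Optionally, verify that the CUDA build of PyTorch is picked up before moving on (a quick sanity check, not part of the original setup):

```bash
# should print "2.0.1+cu118 True" if the CUDA build is installed and a GPU is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```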
- Grounded-SAM
Install dependencies and download the checkpoints:
```bash
pip install -e vision/GroundedSAM/GroundingDINO
pip install -e vision/GroundedSAM/segment_anything
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth -P assets/ckpts/
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth -P assets/ckpts/
```
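As a quick sanity check (assuming the editable installs expose the upstream `groundingdino` and `segment_anything` module names), confirm both packages import cleanly:

```bash
# both imports should succeed without errors
python -c "import groundingdino, segment_anything; print('Grounded-SAM OK')"
```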
- GSNet
First, download the pretrained checkpoints and put the `.tar` file into `assets/ckpts/`. We use `minkuresunet_kinect.tar` by default.
```bash
# MinkowskiEngine, this may take a while
git clone git@github.com:NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
conda install openblas-devel -c anaconda
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas

## pointnet2 & graspnetAPI
cd graspness_implementation
pip install -r requirements.txt
cd pointnet2
python setup.py install
cd ..
cd graspnetAPI
pip install .
pip install "numpy<1.24"
pip install pytorch-utils
```
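Since MinkowskiEngine compiles a CUDA extension, it is worth confirming the import works before continuing (a sanity check, not from the original instructions):

```bash
# should print the installed MinkowskiEngine version (e.g. 0.5.4)
python -c "import MinkowskiEngine as ME; print(ME.__version__)"
```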
If you want to use the closed-source AnyGrasp as an alternative, please follow anygrasp_sdk to set up the SDK and put the `checkpoint_detection.tar` checkpoint into `assets/ckpts/`. In addition, `gsnet.so`, `lib_cxx.so`, and `license/` should be placed in the project root directory.
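If you take the AnyGrasp route, a simple listing can confirm the expected artifacts are in place (an optional check, not part of the SDK instructions):

```bash
# all of these should exist before running with AnyGrasp
ls gsnet.so lib_cxx.so license/ assets/ckpts/checkpoint_detection.tar
```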
- pointnet2_ops
```bash
# this may take a while
git clone git@github.com:erikwijmans/Pointnet2_PyTorch.git
cd Pointnet2_PyTorch/pointnet2_ops_lib
pip install -e .
```
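A quick import check (assuming the library installs under the upstream module name `pointnet2_ops`):

```bash
# should succeed without errors
python -c "import pointnet2_ops; print('pointnet2_ops OK')"
```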
- Other requirements
```bash
pip install -r requirements.txt
```
- (Optional) Retrieval data
If you want to use the retrieval pipeline, please download the retrieval data from Google Drive and unzip it to `assets/data/`.
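After unzipping, a simple listing should show the retrieval data under `assets/data/` (the exact file layout depends on the downloaded archive):

```bash
ls assets/data/
```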
Run the commands below to launch the demo:
```bash
export PYTHONPATH=$PWD
python run_realworld/run.py --config configs/drawer_open.yaml # add --retrieve to enable retrieval
```
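For example, to run the same demo with the retrieval pipeline enabled (this simply adds the `--retrieve` flag noted above and requires the optional retrieval data):

```bash
export PYTHONPATH=$PWD
python run_realworld/run.py --config configs/drawer_open.yaml --retrieve
```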
After it finishes, you should see the predicted 3D affordance results with grasps printed to the console, and visualizations saved to `run_realworld/gym_outputs/drawer_open/`, like below:
- Release the method code and demo.
- Release the retrieval pipeline and data.
- More to come... (Feel free to open issues and PRs!)
Please stay tuned for updates to the dataset and code!
We thank the authors of dift, GeoAware-SC, graspness_implementation, and Grounded-Segment-Anything for their great work and open-source spirit!
If you find this work helpful, please consider citing:
```bibtex
@article{kuang2024ram,
  title={RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation},
  author={Kuang, Yuxuan and Ye, Junjie and Geng, Haoran and Mao, Jiageng and Deng, Congyue and Guibas, Leonidas and Wang, He and Wang, Yue},
  journal={arXiv preprint arXiv:2407.04689},
  year={2024}
}
```