This is a fork that reorganizes the official ipynb implementation of SayCan (Do As I Can, Not As I Say: Grounding Language in Robotic Affordances) to make further research easier.
This fork adds a CLI-like layer and support for open-source LLMs such as Llama 3.1 via the transformers library.
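For reference, loading one of these models through transformers looks roughly like the sketch below; the checkpoint name is only an example (the Llama 3.1 weights are gated and require Hugging Face access), and the fork's CLI layer handles this wiring for you.
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "meta-llama/Llama-3.1-8B-Instruct"  # example checkpoint; replace with any causal-LM you have access to
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)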
This repository has been tested on Ubuntu 22.04.2 with pip3==20.2.3 inside a conda environment (instructions below).
Clone this repo, then create and activate a new conda environment with Python 3.9 as follows.
conda create -n saycan python=3.9.1
conda activate saycan
pip3 install pip==20.2.3
pip3 install -r requirements.txt
Ensure that you set up vLLM and register your Hugging Face token.
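One way to register the token is through the huggingface_hub library, as in the minimal sketch below (you can equally use the huggingface-cli login command); the token string is a placeholder you must replace with your own.
from huggingface_hub import login
login(token="hf_xxx")  # placeholder: replace with your own Hugging Face access token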
The instructions below download from the official sources. If you run into any problems there, I also host the assets/ directory via this shared link.
Simply download it and unzip it into the project root directory (saycan.ROOT_DIR).
If you still have issues (e.g., broken links), you can find my email on my personal webpage, rushangkaria.github.io.
mkdir assets/
gdown -O assets/ 1Cc_fDSBL6QiDvNT4dpfAEbhbALSVoWcc
gdown -O assets/ 1yOMEm-Zp_DL3nItG9RozPeJAmeOldekX
gdown -O assets/ 1GsqNLhEl9dd4Mc3BM0dX3MibOI1FVWNM
unzip assets/ur5e.zip -d assets/
unzip assets/robotiq_2f_85.zip -d assets/
unzip assets/bowl.zip -d assets/
gsutil cp -r gs://cloud-tpu-checkpoints/detection/projects/vild/colab/image_path_v2 assets/
Download the pregenerated dataset by running the commands below. You can skip this step if you prefer to generate the data yourself with gen_data.py.
gdown -O assets/ 1yCz6C-6eLWb4SFYKdkM-wz5tlMjbG2h8
gdown -O assets/ 1Nq0q1KbqHOA5O7aRSu4u7-u27EMMXqgP
Don't forget to add your OpenAI API key in llm.py.
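If you prefer not to hard-code the key, a minimal sketch of reading it from the environment is shown below; this assumes llm.py configures the key through the openai module, so adjust it to match the actual code.
import os
import openai
openai.api_key = os.environ.get("OPENAI_API_KEY")  # assumption: llm.py sets the key on the openai module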
If you downloaded the pretrained policy in 2.4, you can now run demo.py to visualize the evaluation process.
If you want to train a model from scratch, run train.py.