The focus of this project is to apply Deep Reinforcement Learning to acquire a robust policy that allows robots to grasp diverse objects from compact 3D observations in the form of octrees. It is part of my Master's Thesis conducted at Aalborg University, Denmark.
Below are some animations of the learned policies being employed on novel scenes with Panda and UR5 robots.
An example of Sim2Real transfer with UR5 can be seen below (trained solely in simulation, with no re-training in the real world).
Local Installation (click to expand)
If you just want to try this project without lengthy installation, consider using Docker instead.
- OS: Ubuntu 20.04 (Focal)
- Others might work, but they were not tested.
- GPU: CUDA is required to process octree observations on GPU.
- Everything else should function normally on CPU, i.e. environments with other observation types can be used without a dedicated GPU.
These are the primary dependencies required to use this project.
- Python 3.8
- ROS 2 Foxy OR Rolling (recommended)
- Ignition Dome OR Fortress (recommended)
- MoveIt 2
- Install/build a version based on the selected ROS 2 release
- ros_ign
- Install/build a version based on the selected combination of ROS 2 release and Ignition version
- gym-ignition
- AndrejOrsula/gym-ignition fork is currently required
- O-CNN
- AndrejOrsula/O-CNN fork is currently required
- PyTorch (last tested on 1.9.1)
- Stable-Baselines3 (last tested on 1.2.0) and sb3-contrib
Python dependencies are listed under python_requirements.txt. All of these (including PyTorch and Stable-Baselines3) can be installed via pip.
pip3 install -r python_requirements.txt
Dependencies for robot models (e.g. panda_ign/panda_moveit2_config) and the interaction between MoveIt 2 and Ignition (ign_moveit2) are pulled from git and built together with this repository; see drl_grasping.repos for more details.
In case you run into any problems with dependencies along the way, please check the Dockerfile, which includes the full installation instructions.
Clone this repository and import VCS dependencies. Then build with colcon.
# Create workspace for the project
mkdir -p drl_grasping/src && cd drl_grasping/src
# Clone this repository
git clone https://github.com/AndrejOrsula/drl_grasping.git
# Import and install dependencies
vcs import < drl_grasping/drl_grasping.repos && cd ..
rosdep install -r --from-paths src -i --rosdistro ${ROS_DISTRO}
# Build with colcon
colcon build --merge-install --symlink-install --cmake-args "-DCMAKE_BUILD_TYPE=Release"
If you wish to use one of the pre-trained agents, clone the repository with its submodules instead:
git clone --recursive https://github.com/AndrejOrsula/drl_grasping.git
Docker (click to expand)
- OS: Any system that supports Docker should work (Linux, Windows, macOS).
- Only Ubuntu 20.04 was tested.
- GPU: CUDA is required to process octree observations on GPU. Therefore, only Docker images with CUDA support are currently available. However, it should be possible to use the pre-built image even on systems without a dedicated GPU.
Before starting, make sure your system is set up to use Nvidia Docker, e.g.:
# Docker
curl https://get.docker.com | sh \
&& sudo systemctl --now enable docker
# Nvidia Docker
distribution=$(. /etc/os-release; echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
The easiest way to try out this project is with a pre-built Docker image that can be pulled from Docker Hub. Currently, only a development image is available; it also contains the default testing datasets (huge, but easy to use, and it allows editing and recompiling). You can pull the latest tag with the following command. Alternatively, each release also has its associated tag, e.g. 1.0.0.
docker pull andrejorsula/drl_grasping:latest
To run the container, please use the docker/run.bash script that is included with this repo. It significantly simplifies the setup of volumes and allows the use of graphical interfaces for the Ignition Gazebo GUI client and RViz.
<drl_grasping dir>/docker/run.bash andrejorsula/drl_grasping:latest /bin/bash
If desired, you can also run examples and scripts directly with this setup, e.g. enjoying the pre-trained agents discussed below.
<drl_grasping dir>/docker/run.bash andrejorsula/drl_grasping:latest ros2 run drl_grasping ex_enjoy_pretrained_agent.bash
If you are struggling to get CUDA working on a system with an Nvidia GPU (no nvidia-smi output), you might need a CUDA base image that supports your driver version. In that case, you will need to build a new Docker image yourself.
A Dockerfile is included with this repo, but all source code is pulled from GitHub when building an image. There is nothing special about it, so just build it as any other Dockerfile (docker build . -t ...) and adjust the arguments or the recipe itself if needed.
Sourcing of the Workspace Overlay (click to expand)
Before running any commands, remember to source the ROS 2 workspace overlay. You can skip this step when using Docker, as it is done automatically inside the entrypoint.
source <drl_grasping dir>/install/local_setup.bash
This enables:
- Use of the drl_grasping Python module
- Execution of scripts and examples via ros2 run drl_grasping <executable>
- Launching of setup scripts via ros2 launch drl_grasping <launch_script>
The DRL_GRASPING_DEBUG_LEVEL environment variable can be set to DEBUG, INFO, WARN, ERROR or DISABLED to adjust the level of logging for the environments.
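For example, to enable verbose logging of the environments:
export DRL_GRASPING_DEBUG_LEVEL=DEBUG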
Using Pre-trained Agents (click to expand)
The pretrained_agents submodule contains a selection of agents that are already trained and ready to be enjoyed (remember to run git clone --recursive or git submodule update --init if you wish to use them). To try them out, run ex_enjoy_pretrained_agent.bash. You should see RViz 2 and the Ignition Gazebo GUI client with an agent trying to grasp one of four objects in a fully randomised novel environment, while the performance of the agent is logged in your terminal.
ros2 run drl_grasping ex_enjoy_pretrained_agent.bash
The default agent is for the Grasp-OctreeWithColor-Gazebo-v0 environment with the Panda robot and TQC. You can switch to any of the other pre-trained agents directly in the example script, according to the support matrix from AndrejOrsula/drl_grasping_pretrained_agents.
Under the hood, all examples launch a ROS 2 setup script for interfacing MoveIt 2 and Ignition, together with a corresponding Python script for enjoying or training. All examples print these commands, in case you are interested in running them separately.
Training New Agents (click to expand)
To train your own agent, you can start with the ex_train.bash example. You can customise this example script, the configuration of the environment and all hyperparameters to your needs (see below). By default, headless mode is used during training to reduce the computational load. If you want to see what is going on, use ign gazebo -g or ROS_DOMAIN_ID=69 rviz2 (ROS_DOMAIN_ID=69 is the default for the Docker image).
ros2 run drl_grasping ex_train.bash
Depending on your hardware and hyperparameter configuration, the training can be a very lengthy process. It takes nearly three days to train an agent for 500k steps on a 130W laptop with a dedicated GPU.
To enjoy an agent that you have trained yourself, look into the ex_enjoy.bash example. Similar to training, change the environment ID, algorithm and robot model as needed. Furthermore, select the specific checkpoint that you want to run. RViz 2 and the Ignition Gazebo GUI client are enabled by default.
ros2 run drl_grasping ex_enjoy.bash
This repository contains environments for robotic manipulation that are compatible with OpenAI Gym. All of them make use of the Ignition Gazebo robotic simulator, which is interfaced via Gym-Ignition.
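As a minimal sketch, an environment can be instantiated through the standard Gym API. The import that registers the environment IDs is an assumption here; adjust it to the actual registration module.
# Minimal interaction loop with random actions (illustrative only)
import gym
import drl_grasping  # assumed to register the environment IDs on import

env = gym.make("Grasp-OctreeWithColor-Gazebo-v0")
observation = env.reset()
for _ in range(100):
    observation, reward, done, info = env.step(env.action_space.sample())
    if done:
        observation = env.reset()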
Currently, the following environments are included in this repository. Take a look at their Gym environment registration and source code if you are interested in configuring them. There are a lot of parameters for trying different RL approaches and techniques, so it is currently a bit messy (it might get cleaned up if I find some free time).
- Grasp task (the focus of this project)
- Observation variants
- GraspOctree, with and without color features
- GraspColorImage (RGB image) and GraspRgbdImage (RGB-D image) are implemented on image_obs branch. However, their implementation is currently only for testing and comparative purposes.
- Curriculum Learning: The task includes GraspCurriculum, which can be used to progressively increase the difficulty of the task by automatically adjusting the following environment parameters based on the current success rate.
- Workspace size
- Number of objects
- Termination state (the task is divided into hierarchical sub-tasks with the aim of further guiding the agent)
- Based on experimental results, this part brings no improvement, so do not bother using it.
- Demonstrations: The task contains a simple scripted policy that can be used to collect demonstrations, which can then pre-load a replay buffer for training with off-policy RL algorithms (see the sketch after this list).
- It provides a slight boost to early learning; however, experiments indicate that it degrades the final success rate (probably due to the bias it introduces early on). Therefore, do not use demonstrations if possible, at least not with this environment.
- Reach task (a simplistic environment for testing stuff)
- Observation variants
- Reach - simulation states
- ReachColorImage
- ReachDepthImage
- ReachOctree, with and without color features
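To illustrate the demonstration mechanism mentioned above, the sketch below pre-loads a replay buffer using the standard Stable-Baselines3 API. Here, scripted_policy is a hypothetical stand-in for the task's scripted policy, and env is an already-instantiated environment.
# Hedged sketch: filling a replay buffer with scripted demonstrations
from stable_baselines3 import SAC

model = SAC("MlpPolicy", env)  # any off-policy algorithm with a replay buffer
obs = env.reset()
for _ in range(1000):
    action = scripted_policy(obs)  # hypothetical scripted demonstration policy
    next_obs, reward, done, info = env.step(action)
    # Store the demonstration transition directly in the replay buffer
    model.replay_buffer.add(obs, next_obs, action, reward, done, [info])
    obs = env.reset() if done else next_obs
model.learn(total_timesteps=50_000)  # training now starts from a warm buffer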
These environments can be wrapped by a randomizer in order to introduce domain randomization and improve generalization of the trained policies, which is especially beneficial for Sim2Real transfer.
The included ManipulationGazeboEnvRandomizer allows randomization of the following properties at each reset of the environment (a usage sketch follows the list).
- Object model - primitive geometry
- Random type (box, sphere and cylinder are currently supported)
- Random color, scale, mass, friction
- Object model - mesh geometry
- Random type (see Object Model Database)
- Random scale, mass, friction
- Object pose
- Ground plane texture
- Initial robot configuration
- Camera pose
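A rough sketch of wrapping an environment with the randomizer is shown below. The import path and constructor signature of ManipulationGazeboEnvRandomizer are assumptions, so consult the source code for the actual usage.
# Hedged sketch: domain randomization applied at every reset
import gym
from drl_grasping.envs.randomizers import ManipulationGazeboEnvRandomizer  # assumed path

def make_env(**kwargs):
    # Hypothetical environment constructor expected by the randomizer
    return gym.make("Grasp-OctreeWithColor-Gazebo-v0", **kwargs)

env = ManipulationGazeboEnvRandomizer(env=make_env)
observation = env.reset()  # each reset produces a freshly randomized scene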
For a dataset of objects with mesh geometry and material textures, this project utilizes the Google Scanned Objects collection from Ignition Fuel. You can also try to use a different Fuel collection or just a couple of models stored locally (although some tweaks might be required to support certain models).
All models are automatically configured in several ways before their insertion into the world:
- Inertial properties are automatically estimated (uniform density is assumed; see the sketch after this list)
- Collision geometry is decimated in order to improve performance
- Models can be filtered and automatically blacklisted based on several aspects, e.g. too much geometry or disconnected components
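As a standalone illustration of the uniform-density assumption (using the trimesh library, not this project's actual pipeline), inertial properties can be estimated from geometry alone:
# Illustrative sketch: uniform-density inertial estimation with trimesh
import trimesh

mesh = trimesh.load("model.obj")  # hypothetical mesh file
mesh.density = 1000.0  # assumed uniform density in kg/m^3
# Mass, center of mass and inertia tensor follow from the geometry
print(mesh.mass, mesh.center_mass, mesh.moment_inertia)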
This repository includes a few scripts that simplify interaction with the dataset and its splitting into training/testing subsets. By default, the subsets include 80 training and 20 testing models.
- dataset_download_train / dataset_download_test - Download models from Fuel
- dataset_unset_train / dataset_unset_test - Unset the current train/test dataset
- dataset_set_train / dataset_set_test - Set the dataset to use the train/test subset
- process_collection - Process the collection with the steps mentioned above
The DRL_GRASPING_PBR_TEXTURES_DIR environment variable can be exported if the ground plane texture should be randomized. It should point to a directory with the following structure.
├── ./ # Directory pointed to by `DRL_GRASPING_PBR_TEXTURES_DIR`
├── texture_0
│   ├── *albedo*.png || *basecolor*.png
│   ├── *normal*.png
│   ├── *roughness*.png
│   └── *specular*.png || *metalness*.png
├── ...
└── texture_n
There are several databases with free PBR textures that you can use. Alternatively, you can clone AndrejOrsula/pbr_textures with 80 training and 20 testing textures.
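For example (the clone destination below is illustrative):
git clone https://github.com/AndrejOrsula/pbr_textures.git ~/pbr_textures
export DRL_GRASPING_PBR_TEXTURES_DIR=~/pbr_textures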
Only the Franka Emika Panda, UR5 with RG2 gripper, and Kinova Gen2 (j2s7s300) are supported. This project currently lacks a more generic solution that would allow arbitrary models to be utilized easily, e.g. a full MoveIt 2 setup with a ros2_control implementation. Adding new models is not complicated, though, just time-consuming.
This project makes direct use of stable-baselines3 as well as sb3_contrib. Furthermore, scripts for training and evaluation are largely inspired by rl-baselines3-zoo.
The OctreeCnnFeaturesExtractor makes use of the O-CNN implementation to enable training on GPU. This feature extractor is part of the OctreeCnnPolicy policy that is currently implemented for the TD3, SAC and TQC algorithms. The network architecture of this feature extractor is illustrated below.
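A minimal training sketch might look as follows. Whether the policy is referenced by the string alias below or by its class depends on how drl_grasping registers it, so treat the exact spelling as an assumption; env is an already-instantiated octree environment.
# Hedged sketch: training TQC with the octree-based policy
from sb3_contrib import TQC

model = TQC("OctreeCnnPolicy", env, verbose=1)
model.learn(total_timesteps=500_000)  # ~3 days on a 130W laptop (see above)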
Hyperparameters for training the RL agents can be found in the hyperparams directory. Optuna was used to autotune some of them, but certain algorithm/environment combinations require far more tuning (especially TD3). If needed, you can try running Optuna yourself; see the ex_optimize example.
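For reference, a generic Optuna loop looks like the sketch below. It is a simplified stand-in for the ex_optimize example; evaluate_mean_reward is a hypothetical evaluation helper and env is an already-instantiated environment.
# Hedged sketch: autotuning the learning rate with Optuna
import optuna
from sb3_contrib import TQC

def objective(trial):
    # Sample a candidate hyperparameter, train briefly and score the result
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    model = TQC("OctreeCnnPolicy", env, learning_rate=learning_rate)
    model.learn(total_timesteps=10_000)
    return evaluate_mean_reward(model)  # hypothetical evaluation helper

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)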
├── drl_grasping        # Primary Python module of this project
│   ├── algorithms      # Definitions of policies and slight modifications to RL algorithms
│   ├── envs            # Environments for grasping (compatible with OpenAI Gym)
│   │   ├── tasks       # Tasks for the agent that are identical for simulation
│   │   ├── randomizers # Domain randomization of the tasks, which also populates the world
│   │   └── models      # Functional models for the environment (Ignition Gazebo)
│   ├── control         # Control for the agent
│   ├── perception      # Perception for the agent
│   └── utils           # Other utilities, used across the module
├── examples            # Examples for training and enjoying RL agents
├── hyperparams         # Hyperparameters for training RL agents
├── scripts             # Helpful scripts for training, evaluating, ...
├── launch              # ROS 2 launch scripts that can be used to help with setup
├── docker              # Dockerfile for this project
└── drl_grasping.repos  # List of other dependencies created for `drl_grasping`
In case you have any problems or questions, feel free to open an Issue or a Discussion.