
# Imitation Learning via Differentiable Physics

Existing imitation learning (IL) methods, such as inverse reinforcement learning (IRL), usually have a double-loop training process that alternates between learning a reward function and learning a policy, and they tend to suffer from long training times and high variance. In this work, we identify the benefits of differentiable physics simulators and propose a new IL method, Imitation Learning via Differentiable Physics (ILD), which gets rid of the double-loop design and achieves significant improvements in final performance, convergence speed, and stability.

[paper] [code]
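To give intuition for the single-loop design: because each simulator step is differentiable, an imitation loss on the rolled-out states can be minimized by gradient descent on the policy parameters directly, with no inner reward-learning loop. The JAX sketch below is illustrative only; `sim_step`, the linear policy, and the squared state-matching loss are placeholder assumptions, not the objective or implementation used in this repo.

```python
import jax
import jax.numpy as jnp

def policy(params, obs):
    # Hypothetical linear policy, just for illustration.
    return jnp.tanh(obs @ params["w"] + params["b"])

def rollout_loss(params, sim_step, s0, expert_states):
    # Roll the policy through the differentiable simulator and
    # accumulate a state-matching loss against the expert trajectory.
    def step(state, expert_state):
        action = policy(params, state)
        next_state = sim_step(state, action)  # differentiable physics step
        return next_state, jnp.sum((next_state - expert_state) ** 2)

    _, per_step = jax.lax.scan(step, s0, expert_states)
    return per_step.mean()

# Single-loop update: gradients flow through the simulator dynamics,
# so no reward function ever needs to be learned.
loss_grad = jax.jit(jax.grad(rollout_loss), static_argnums=1)
```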

## Brax MuJoCo Tasks

Our ILD agent learns from a single expert demonstration, with much lower variance and higher final performance.

*Figure: results on the Brax MuJoCo tasks.*

*Videos: expert demo vs. learned policy.*

## Cloth Manipulation Task

We collect a single expert demonstration in a noise-free environment. Despite the presence of severe control noise in the test environment, our method completes the task and recovers the expert behavior.

*Videos: expert demo (noise-free) vs. learned policy (heavy control noise).*
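As a concrete illustration of what test-time control noise means here, one common choice is to corrupt each action with additive Gaussian noise before it reaches the simulator. This is a hypothetical sketch; the noise model and scale used in the actual experiments may differ.

```python
import jax
import jax.numpy as jnp

def noisy_action(key, action, scale=0.3):
    # Hypothetical test-time corruption: additive Gaussian noise on the
    # control signal before it is applied in the environment.
    return action + scale * jax.random.normal(key, action.shape)
```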

## Installation

```bash
# Create and activate a conda environment
conda create -n ILD python=3.8
conda activate ILD

pip install --upgrade pip
# JAX with CUDA support
pip install --upgrade "jax[cuda]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
pip install brax        # differentiable physics engine
pip install streamlit   # web app framework used for visualization
pip install tensorflow
pip install open3d      # 3D data processing
```
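Optionally, you can sanity-check that the CUDA build of JAX was picked up by listing the devices JAX can see:

```bash
python -c "import jax; print(jax.devices())"
```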

## Start training

```bash
# Train with a single demonstration
cd policy/brax_task
CUDA_VISIBLE_DEVICES=0 python train_on_policy.py --env="ant" --seed=1

# Train with multiple demonstrations
cd policy/brax_task
CUDA_VISIBLE_DEVICES=0 python train_multi_traj.py --env="ant" --seed=1

# Train on the cloth manipulation task
cd policy/cloth_task
CUDA_VISIBLE_DEVICES=0 python train_on_policy.py --seed=1
```
