Bayesian Inverse Reinforcement Learning

The environment is the one shown in Figure 1 of the BIRL paper.

Tested on

python==3.7.0
numpy==1.15.1
scipy==1.1.0
tqdm==4.26.0
matplotlib==2.2.3

python src/birl.py

Rewards were sampled for each state.
The optimal policy for the mean of the sampled rewards exactly matched the expert's policy.
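
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/birl.py: the three-state chain MDP, the function name greedy_policy, the transition tensor P, and the reward samples are all hypothetical toy values chosen for the example. It averages the sampled rewards, solves for a greedy policy with value iteration, and compares the result with the expert's policy.

```python
import numpy as np


def greedy_policy(P, r, gamma=0.9, tol=1e-8):
    """Run value iteration for state rewards r and return the greedy action per state.

    P has shape (n_actions, n_states, n_states), with P[a, s, s2] = Pr(s2 | s, a).
    """
    v = np.zeros(P.shape[1])
    while True:
        # Q[a, s] = r[s] + gamma * sum over s2 of P[a, s, s2] * v[s2]
        q = r[None, :] + gamma * (P @ v)
        v_new = q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            return q.argmax(axis=0)
        v = v_new


# Hypothetical 3-state chain: action 0 moves left, action 1 moves right,
# and moving into a wall leaves the agent in place.
P = np.zeros((2, 3, 3))
for s in range(3):
    P[0, s, max(s - 1, 0)] = 1.0
    P[1, s, min(s + 1, 2)] = 1.0

# Made-up reward samples standing in for the BIRL sampler output,
# one row per sample and one column per state.
sampled_rewards = np.array([
    [0.1, -0.2, 0.9],
    [-0.1, 0.2, 1.1],
    [0.0, 0.0, 1.0],
])
mean_reward = sampled_rewards.mean(axis=0)

# Assumed expert policy for the toy chain: always move right.
expert_policy = np.array([1, 1, 1])

policy = greedy_policy(P, mean_reward)
matches = bool(np.array_equal(policy, expert_policy))
print("policy:", policy, "matches expert:", matches)
```

Comparing greedy actions state by state is a strict test. If several actions tie in some state, np.argmax picks the lowest index, so a comparison of this kind may report a mismatch even when both policies are optimal.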
