Implement Generative Adversarial Imitation Learning (GAIL) #550

nuance1979 · 2022-03-02T22:29:01Z

I have marked all applicable categories:
- exception-raising fix
- algorithm implementation fix
- documentation modification
- new feature
I have reformatted the code using make format (required)
I have checked the code using make commit-checks (required)
If applicable, I have mentioned the relevant/related issue(s)
If applicable, I have listed every item in this Pull Request below

Implement GAIL based on PPO, and provide an example script and sample (i.e., most likely not the best) results on MuJoCo tasks. (#531, #173)
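For readers unfamiliar with GAIL, here is a minimal sketch of one common formulation of the discriminator objective and the surrogate reward that a PPO-style learner would then optimize. It is not the code in `tianshou/policy/imitation/gail.py`. It uses plain Python instead of PyTorch, and the function names (`log_sigmoid`, `discriminator_loss`, `surrogate_reward`) are hypothetical. It assumes the discriminator emits a logit that is high for expert (state, action) pairs and low for policy-generated ones.

```python
import math
from typing import Sequence


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def discriminator_loss(
    logits_policy: Sequence[float], logits_expert: Sequence[float]
) -> float:
    """Binary cross-entropy: expert pairs are labeled 1, policy pairs 0.

    Assumed formulation for illustration only.
    """
    # Policy samples should be classified as "not expert": -log(1 - D).
    loss_policy = -sum(log_sigmoid(-z) for z in logits_policy) / len(logits_policy)
    # Expert samples should be classified as "expert": -log(D).
    loss_expert = -sum(log_sigmoid(z) for z in logits_expert) / len(logits_expert)
    return loss_policy + loss_expert


def surrogate_reward(logit: float) -> float:
    """Reward -log(1 - D(s, a)): larger when the pair looks expert-like."""
    return -log_sigmoid(-logit)


if __name__ == "__main__":
    # An undecided discriminator (all logits 0) gives a loss of 2 * log(2).
    print(discriminator_loss([0.0, 0.0], [0.0, 0.0]))
    # A pair that looks expert-like earns more reward than one that does not.
    print(surrogate_reward(3.0), surrogate_reward(-3.0))
```

In this formulation the environment reward is replaced by `surrogate_reward` before the policy update, so the learner is pushed toward (state, action) pairs that the discriminator cannot distinguish from the expert data.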

codecov-commenter · 2022-03-02T23:43:52Z

Codecov Report

Merging #550 (77ae3c9) into master (d976a5a) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #550      +/-   ##
==========================================
- Coverage   93.82%   93.81%   -0.01%     
==========================================
  Files          63       64       +1     
  Lines        4322     4368      +46     
==========================================
+ Hits         4055     4098      +43     
- Misses        267      270       +3
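The headline figure (a decrease of 0.00%) and the table (-0.01%) look inconsistent at first glance. A small sketch of the arithmetic shows how both can follow from the same hit and line counts, assuming the per-revision percentages are truncated to two decimals. That truncation is an assumption made here to match the reported numbers, not something the report states, and `truncated_percent` is a hypothetical helper.

```python
import math


def truncated_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated (not rounded) to two decimals."""
    return math.floor(hits / lines * 10_000) / 100


before_hits, before_lines = 4055, 4322  # master (d976a5a)
after_hits, after_lines = 4098, 4368    # this PR (77ae3c9)

before = truncated_percent(before_hits, before_lines)
after = truncated_percent(after_hits, after_lines)

# Untruncated change in coverage, in percentage points.
exact_delta = (after_hits / after_lines - before_hits / before_lines) * 100

print(f"before={before:.2f}% after={after:.2f}%")
print(f"exact delta={exact_delta:.4f} percentage points")
```

Under that assumption, the difference of the two truncated percentages gives the table's -0.01%, while the untruncated change is only a few thousandths of a percentage point, which is what the headline displays as 0.00%.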

Flag	Coverage Δ
unittests	`93.81% <100.00%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
tianshou/policy/__init__.py	`100.00% <100.00%> (ø)`
tianshou/policy/imitation/gail.py	`100.00% <100.00%> (ø)`
tianshou/policy/modelfree/trpo.py	`88.52% <0.00%> (-4.92%)`	⬇️

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d976a5a...77ae3c9.

Yi Su added 11 commits March 3, 2022 05:46

- 70ef40a implement GAIL policy
- 00614ec fix code format
- bf0b33b update atari_gail.py to latest version
- 427f18f update Discriminator net arch
- 173e40b fix a bug about when to compute rewards
- 1025a58 fix a bug about tensor shape
- ff15c38 add gail example for Mujoco tasks
- db69ddc add gail mujoco results
- 2acd76b make linter happy
- 12f880b make mypy happy
- 8938903 make spell checker happy
- 9faa20d add unit test

This was linked to issues Mar 3, 2022

Inverse reinforcement learning #173

Closed

Is there any bc or gail in the tianshou #531

Closed

Trinkle23897 reviewed Mar 3, 2022

tianshou/policy/imitation/gail.py (outdated, resolved)

tianshou/utils/net/continuous.py (outdated, resolved)

examples/atari/atari_wrapper.py (outdated, resolved)

- 47f0e72 Merge branch 'master' into gail

Trinkle23897 reviewed Mar 5, 2022

examples/atari/README.md (outdated, resolved)

examples/atari/atari_wrapper.py (outdated, resolved)

tianshou/policy/imitation/gail.py (outdated, resolved)

tianshou/utils/net/continuous.py (outdated, resolved)

Yi Su and others added 6 commits March 6, 2022 13:29

- 91fcf5a refactor GAIL implementation based on feedback
- 4a3a5ea revert changes
- 20eb2e7 update unit test
- 9df147f make mypy happy
- 4313d1b fix ci
- 77ae3c9 make ci faster

Trinkle23897 approved these changes Mar 6, 2022

Trinkle23897 merged commit 2377f2f into thu-ml:master Mar 6, 2022

nuance1979 deleted the gail branch April 24, 2022 21:00
