Commit

Add SQIL description in docs, try to add it to the right places
RedTachyon committed Jul 6, 2023
1 parent c303af1 commit bf81940
Showing 2 changed files with 11 additions and 3 deletions.
12 changes: 9 additions & 3 deletions docs/algorithms/sqil.rst
@@ -1,10 +1,16 @@
.. _soft q imitation learning docs:

-=======================
+================================
 Soft Q Imitation Learning (SQIL)
-=======================
+================================

-<add description of SQIL>
+Soft Q Imitation Learning (SQIL) learns to imitate a policy from demonstrations
+by using the DQN algorithm with modified rewards. During each policy update,
+half of the batch is sampled from the demonstrations and half is sampled from
+the environment. Transitions from expert demonstrations are assigned a reward
+of 1, and transitions collected in the environment are assigned a reward of 0.
+This encourages the policy to imitate the demonstrations, and to simultaneously
+avoid states not seen in the demonstrations.
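
Note (not part of the commit): the batch construction described in this paragraph can be sketched roughly as follows. This is a minimal illustration in plain Python, not the ``imitation`` library's implementation. The helper name ``sample_sqil_batch`` and the toy transition tuples are hypothetical, and sampling with replacement is an assumption.

```python
import random


def sample_sqil_batch(demo_transitions, env_transitions, batch_size, rng):
    """Hypothetical helper: pair each sampled transition with its SQIL reward."""
    half = batch_size // 2
    # Half of the batch comes from the expert demonstrations ...
    demo_part = rng.choices(demo_transitions, k=half)
    # ... and the rest comes from transitions collected in the environment.
    env_part = rng.choices(env_transitions, k=batch_size - half)
    # Demonstration transitions get reward 1, environment transitions reward 0.
    return [(t, 1.0) for t in demo_part] + [(t, 0.0) for t in env_part]


rng = random.Random(0)
# Toy stand-ins for real (obs, action, next_obs, done) transitions.
demo_transitions = [("demo", i) for i in range(100)]
env_transitions = [("env", i) for i in range(500)]
batch = sample_sqil_batch(demo_transitions, env_transitions, 32, rng)
```

A DQN-style learner would then run its usual update on ``batch``, using these constant rewards in place of the environment's own reward signal.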

Example
=======
2 changes: 2 additions & 0 deletions docs/index.rst
@@ -60,6 +60,7 @@ If you use ``imitation`` in your research project, please cite our paper to help
algorithms/density
algorithms/mce_irl
algorithms/preference_comparisons
+algorithms/sqil

.. toctree::
:maxdepth: 2
@@ -76,6 +77,7 @@ If you use ``imitation`` in your research project, please cite our paper to help
tutorials/7_train_density
tutorials/8_train_custom_env
tutorials/9_compare_baselines
+tutorials/10_train_sqil
tutorials/trajectories

.. toctree::
