Add SQIL description in docs, try to add it to the right places

HumanCompatibleAI · Jul 6, 2023 · bf81940
1 parent c303af1
commit bf81940

Showing 2 changed files with 11 additions and 3 deletions.
diff --git a/docs/algorithms/sqil.rst b/docs/algorithms/sqil.rst
@@ -1,10 +1,16 @@
 .. _soft q imitation learning docs:
 
-=======================
+================================
 Soft Q Imitation Learning (SQIL)
-=======================
+================================
 
-<add description of SQIL>
+Soft Q Imitation Learning (SQIL) learns to imitate a policy from demonstrations
+by using the DQN algorithm with modified rewards. During each policy update,
+half of the batch is sampled from the demonstrations and half is sampled from
+the environment. Transitions from the expert demonstrations are assigned a
+reward of 1, and transitions from the environment are assigned a reward of 0.
+This encourages the policy to imitate the demonstrations and, at the same
+time, to avoid states not seen in the demonstrations.
 
 Example
 =======
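
The batch construction described in the new SQIL paragraph can be sketched in plain Python. This is only an illustration of the half-and-half sampling and the 1/0 reward relabeling, not the code in ``imitation``: the function name ``make_sqil_batch``, the ``(obs, action, reward, next_obs)`` tuple layout, and the toy data are all assumptions made for this example.

```python
import random


def make_sqil_batch(demo_transitions, env_transitions, batch_size, rng):
    """Build one training batch with SQIL-style relabeled rewards.

    Hypothetical helper: transitions are (obs, action, reward, next_obs) tuples.
    """
    half = batch_size // 2
    # Half of the batch comes from the expert demonstrations ...
    demo_sample = rng.choices(demo_transitions, k=half)
    # ... and the remainder from transitions collected in the environment.
    env_sample = rng.choices(env_transitions, k=batch_size - half)
    # Overwrite the stored rewards: 1 for demonstrations, 0 for environment data.
    batch = [(obs, act, 1.0, nxt) for obs, act, _, nxt in demo_sample]
    batch += [(obs, act, 0.0, nxt) for obs, act, _, nxt in env_sample]
    return batch


rng = random.Random(0)
# Toy data (assumed): action 0 marks demonstrations, action 1 marks environment data.
demo_transitions = [(i, 0, None, i + 1) for i in range(10)]
env_transitions = [(i, 1, 5.0, i + 1) for i in range(50)]
batch = make_sqil_batch(demo_transitions, env_transitions, batch_size=32, rng=rng)
```

The relabeled batch would then be passed to an ordinary Q-learning update; the original environment reward (``5.0`` in the toy data) is discarded, so the learner needs no access to the true reward function.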

diff --git a/docs/index.rst b/docs/index.rst
@@ -60,6 +60,7 @@ If you use ``imitation`` in your research project, please cite our paper to help
  algorithms/density
  algorithms/mce_irl
  algorithms/preference_comparisons
+ algorithms/sqil
 
 .. toctree::
  :maxdepth: 2
@@ -76,6 +77,7 @@ If you use ``imitation`` in your research project, please cite our paper to help
  tutorials/7_train_density
  tutorials/8_train_custom_env
  tutorials/9_compare_baselines
+ tutorials/10_train_sqil
  tutorials/trajectories
 
 .. toctree::