Tutorial on comparing algorithm performance #747

AdamGleave · 2023-07-05T02:11:11Z

Resubmit of #739 to work around branch permission issues. Credit to @RedTachyon for this PR.

codecov · 2023-07-05T02:25:43Z

Codecov Report

Merging #747 (6677525) into master (90b6aa3) will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #747      +/-   ##
==========================================
- Coverage   96.42%   96.41%   -0.02%     
==========================================
  Files          93       93              
  Lines        8782     8782              
==========================================
- Hits         8468     8467       -1     
- Misses        314      315       +1

see 1 file with indirect coverage changes


* Pin SB3 version to 1.7.0 (#738)
* Update conftest.py (#742)
* Custom environment tutorial (#746)
* Custom environment tutorial draft
* Update the docs website
* Clean notebook
* Text clarification and new environment
* Decrease training duration to hopefully make CI happy
* Clarify that BC itself does not learn rewards
---------
Co-authored-by: Ariel Kwiatkowski <ariel.j.kwiatkowski@gmail.com>
* Tutorial on comparing algorithm performance (#747)
* Add a new tutorial
* Update index.rst
* Improvements to the tutorial
* Some more caution words
* Fix typos
---------
Co-authored-by: Ariel Kwiatkowski <ariel.j.kwiatkowski@gmail.com>
---------
Co-authored-by: Adam Gleave <adam@gleave.me>

* Initial version of the SQIL implementation
* Pin SB3 version to 1.7.0 (#738) (#745)
---------
Co-authored-by: Ariel Kwiatkowski <ariel.j.kwiatkowski@gmail.com>
* Tutorial on comparing algorithm performance (#747)
* Add a new tutorial
* Update index.rst
* Improvements to the tutorial
* Some more caution words
* Fix typos
---------
Co-authored-by: Ariel Kwiatkowski <ariel.j.kwiatkowski@gmail.com>
---------
Co-authored-by: Adam Gleave <adam@gleave.me>
* Some documentation updates (not complete)
* Add a SQIL tutorial
* Reduce tutorial runtime
* Add SQIL description in docs, try to add it to the right places
* Fix docs
* Blacken a tutorial
* Reorder things in docs
* Change the SQIL structure to instead subclass the replay buffer, new test
* Add an empty line
* Simplify the arguments
* Cover another edge case, another test, fixes
* Fix a circular import issue
* Add a performance test - might be slow?
* Fix coverage
* Improve input validation
* Bugfix: have set_demonstrations set rather than return
* Move TransitionMapping from algorithms.base to data.types
* Fix typo: expert_buffer->self.expert_buffer
* Bugfix: use safe_to_numpy rather than assuming th.Tensor
* Fix lint
* Fix unused imports
* Refactor tests
* Bump # of rollouts to try to fix MacOS flakiness
* Simplify SQIL example and tutorial by 1. downloading expert trajectories instead of training an expert and sampling from the expert and 2. passing trajectories instead of transitions to SQIL.
* Improve docstring of SQILReplayBuffer.
* Set the expert_buffer in the constructor.
* Consistently set expert transition reward to 1 and learner transition reward to 0 when adding them to the SQILReplayBuffer instead of modifying them on-the-fly when sampling.
* Fix docstring of SQILReplayBuffer.sample()
* Switch back to the CartPole-v1 environment in the SQIL examples
* Only train for 1k steps in the SQIL example so the doctests don't run for too long.
* Fix cell metadata for tutorial notebook.
* Notebook formatting fixes.
* Fix typing error in SQIL implementation.
* Fix isort issue.
* Clarify that our variant of the SQIL implementation is not really "soft".
* Fix link in experts documentation.
* Remove support for transition mappings.
* Remove data_loader from SQIL test cases.
* Bump number of demonstrations in SQIL performance test to reduce flakiness.
* Adapt hyperparameters in test_sqil_performance to reduce flakiness
* Fix seeds for flaky test_sqil_performance
* Increase coverage in test_sqil.py
* Pass kwargs to SQIL.train to DQN.learn - also set default tb_log_name to "SQIL"
* Pass parameters as kwargs for multi-ary methods in sqil.py
* Make test for exceptions raised by SQIL constructor more specific - also: adjust imports to conform with style guide
---------
Co-authored-by: Adam Gleave <adam@gleave.me>
Co-authored-by: Maximilian Ernestus <maximilian@ernestus.de>
Co-authored-by: Jason Hoelscher-Obermaier <jason.hoelscherobermaier@gmail.com>

RedTachyon and others added 6 commits July 4, 2023 01:13

Add a new tutorial

641b51d

Update index.rst

d5aa57e

Merge branch 'HumanCompatibleAI:master' into baselines-tutorial

4fec09f

Improvements to the tutorial

bb994b3

Some more caution words

6d4eea0

Fix typos

e5a3f7e

AdamGleave mentioned this pull request Jul 5, 2023

Tutorial on comparing algorithm performance #739

Closed

Merge branch 'master' into baselines-tutorial

6677525

AdamGleave merged commit 688e163 into master Jul 5, 2023

AdamGleave deleted the baselines-tutorial branch July 5, 2023 16:26
