Tutorial on comparing algorithm performance #747
Merged
Codecov Report
@@            Coverage Diff             @@
##           master     #747      +/-   ##
==========================================
- Coverage   96.42%   96.41%   -0.02%
==========================================
  Files          93       93
  Lines        8782     8782
==========================================
- Hits         8468     8467       -1
- Misses        314      315       +1
see 1 file with indirect coverage changes
RedTachyon added a commit that referenced this pull request on Jul 6, 2023
* Pin SB3 version to 1.7.0 (#738)
* Update conftest.py (#742)
* Custom environment tutorial (#746)
* Custom environment tutorial draft
* Update the docs website
* Clean notebook
* Text clarification and new environment
* Decrease training duration to hopefully make CI happy
* Clarify that BC itself does not learn rewards
---------
Co-authored-by: Ariel Kwiatkowski <ariel.j.kwiatkowski@gmail.com>
* Tutorial on comparing algorithm performance (#747)
* Add a new tutorial
* Update index.rst
* Improvements to the tutorial
* Some more caution words
* Fix typos
---------
Co-authored-by: Ariel Kwiatkowski <ariel.j.kwiatkowski@gmail.com>
---------
Co-authored-by: Adam Gleave <adam@gleave.me>
ernestum added a commit that referenced this pull request on Aug 10, 2023
* Initial version of the SQIL implementation
* Pin SB3 version to 1.7.0 (#738) (#745)
---------
Co-authored-by: Ariel Kwiatkowski <ariel.j.kwiatkowski@gmail.com>
* Tutorial on comparing algorithm performance (#747)
* Add a new tutorial
* Update index.rst
* Improvements to the tutorial
* Some more caution words
* Fix typos
---------
Co-authored-by: Ariel Kwiatkowski <ariel.j.kwiatkowski@gmail.com>
---------
Co-authored-by: Adam Gleave <adam@gleave.me>
* Some documentation updates (not complete)
* Add a SQIL tutorial
* Reduce tutorial runtime
* Add SQIL description in docs, try to add it to the right places
* Fix docs
* Blacken a tutorial
* Reorder things in docs
* Change the SQIL structure to instead subclass the replay buffer, new test
* Add an empty line
* Simplify the arguments
* Cover another edge case, another test, fixes
* Fix a circular import issue
* Add a performance test - might be slow?
* Fix coverage
* Improve input validation
* Bugfix: have set_demonstrations set rather than return
* Move TransitionMapping from algorithms.base to data.types
* Fix typo: expert_buffer->self.expert_buffer
* Bugfix: use safe_to_numpy rather than assuming th.Tensor
* Fix lint
* Fix unused imports
* Refactor tests
* Bump # of rollouts to try to fix MacOS flakiness
* Simplify SQIL example and tutorial by 1. downloading expert trajectories instead of training an expert and sampling from the expert and 2. passing trajectories instead of transitions to SQIL.
* Improve docstring of SQILReplayBuffer.
* Set the expert_buffer in the constructor.
* Consistently set expert transition reward to 1 and learner transition reward to 0 when adding them to the SQILReplayBuffer instead of modifying them on-the-fly when sampling.
* Fix docstring of SQILReplayBuffer.sample()
* Switch back to the CartPole-v1 environment in the SQIL examples
* Only train for 1k steps in the SQIL example so the doctests don't run for too long.
* Fix cell metadata for tutorial notebook.
* Notebook formatting fixes.
* Fix typing error in SQIL implementation.
* Fix isort issue.
* Clarify that our variant of the SQIL implementation is not really "soft".
* Fix link in experts documentation.
* Remove support for transition mappings.
* Remove data_loader from SQIL test cases.
* Bump number of demonstrations in SQIL performance test to reduce flakiness.
* Adapt hyperparameters in test_sqil_performance to reduce flakiness
* Fix seeds for flaky test_sqil_performance
* Increase coverage in test_sqil.py
* Pass kwargs to SQIL.train to DQN.learn - also set default tb_log_name to "SQIL"
* Pass parameters as kwargs for multi-ary methods in sqil.py
* Make test for exceptions raised by SQIL constructor more specific - also: adjust imports to conform with style guide
---------
Co-authored-by: Adam Gleave <adam@gleave.me>
Co-authored-by: Maximilian Ernestus <maximilian@ernestus.de>
Co-authored-by: Jason Hoelscher-Obermaier <jason.hoelscherobermaier@gmail.com>
See #727
Resubmit of #739 to work around branch permission issues. Credit to @RedTachyon for this PR.