Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify CleanRL is a non-modular library #200

Merged
merged 4 commits into from
Jun 18, 2022
Merged

Conversation

vwxyzjn
Copy link
Owner

@vwxyzjn vwxyzjn commented Jun 10, 2022

Description

Closes #197.

Types of changes

  • Documentation

@vwxyzjn vwxyzjn requested a review from dosssman June 10, 2022 23:05
@vercel
Copy link

vercel bot commented Jun 10, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
cleanrl ✅ Ready (Inspect) Visit Preview Jun 17, 2022 at 4:33PM (UTC)

Copy link
Contributor

@cool-RR cool-RR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Personally I'd put the phrase "reference implementation" into the title of the readme and the "about" of the repo (where it says "High-quality single file implementation".) But it's your decision.

docs/index.md Outdated
Good luck have fun 🚀
Good luck have fun :rocket:

⚠️ **NOTE**: CleanRL is *not* a modular library and therefore it is not meant to be imported. At the cost of duplicate code, we make all implementation details of a DRL algorithm variant easy to understand, so CleanRL comes with its own pros and cons. You should consider using CleanRL if you want to 1) understand all implementation details of an algorithm's varaint or 2) do quick prototypes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

varaint -> variant

I'm not sure that "do quick prototypes" makes sense here. Running from clearrl import PPO would be quick. Reading the algorithm and copy-pasting it into my code is slow.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "doing prototypes" is not really a well-defined notion as there are many types of prototypes. Being able to do prototypes quickly largely depends on the use case.

While things like from stable_baselines3 import PPO is quick but if you want to prototype advanced features that SB3 does not support, it could be more difficult as discussed in #197 with the invalid action masking example.

Maybe I can clarify as "do prototypes that can't be achieved by just combining components in modular DRL libraries"? I am really unsure what the phrasing would be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if you want to prototype advanced features that SB3 does not support" Keep in mind that 95% of people just want something that works, not advanced features. But in any case, this PR is good and I think you should make any changes that result in it being merged.

@cool-RR
Copy link
Contributor

cool-RR commented Jun 17, 2022

Lost interest?

@vwxyzjn
Copy link
Owner Author

vwxyzjn commented Jun 17, 2022

Hey sorry for the delay. This week is my end-of-internship and a move-out week.

Yes, let me think of a place to put the reference implementation. Please let me know what you think of my reply above regarding the prototypes comment.

README.md Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CleanRL can't be used by importing?
2 participants