[Feature Request] Custom environment tutorial #919

Closed
1 task done
viktor-ktorvi opened this issue Feb 16, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@viktor-ktorvi

viktor-ktorvi commented Feb 16, 2023

Motivation

Hey, I've had a rough time snooping through the source code to figure out how to make a custom environment using EnvBase, so I've made an early draft of a tutorial based on my experience :). It's a simple control problem with a linear system. I'm not sure whether this is the right way to submit it or whether I should open a pull request instead, so please let me know.

Solution

Here's the Jupyter notebook. I think it's relatively short and understandable, and I'd appreciate feedback, i.e. is it actually implemented the way it was intended to be? (I've managed to train PPO on it, so it works, but it might not follow best practices.)

Alternatives

/

Additional context

/

Checklist

  • I have checked that there is no similar issue in the repo (required)
@viktor-ktorvi viktor-ktorvi added the enhancement New feature or request label Feb 16, 2023
@vmoens
Contributor

vmoens commented Feb 16, 2023

Thanks for this! Wonderful!
I like that it's short and to the point.

Do you think we should make a tutorial out of it, or an example?

Can you have a look at this:
#911

It's a tutorial to code the pendulum, sort of the simplest thing I could think of.
LMK what you think.

@viktor-ktorvi
Author

I guess an example is shorter than a tutorial, so this could go along with the custom env tutorial: since you mention stateless and stateful environments, this could be an example of a stateful one. Having a short, to-the-point example may also benefit users who just want answers quickly. I'd be happy to make any adjustments in that regard.

As for the pendulum tutorial, I think it's going to be good; I like having everything explained in detail. I'm just reading it as code in the .py file, though. Do you have a link to the .html? Right now it's hard to get a good feel for whether it's easy to read.
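
To make the stateful versus stateless distinction mentioned above concrete, here is a minimal plain-Python sketch. It is not TorchRL code and not the notebook's implementation: the scalar system `x_next = A * x + B * u`, the constants `A` and `B`, the quadratic reward, and the names `stateless_step` and `StatefulLinearEnv` are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical scalar linear system: x_next = A * x + B * u.
# A and B are illustrative values, not taken from the notebook.
A, B = 0.9, 0.5


def stateless_step(x: float, u: float) -> Tuple[float, float]:
    """Pure function: the caller passes the state in and gets the next state back."""
    x_next = A * x + B * u
    reward = -(x_next ** 2)  # assumed quadratic penalty for being away from the origin
    return x_next, reward


@dataclass
class StatefulLinearEnv:
    """Keeps the state on the object, so step() only needs the action."""

    x: float = 1.0

    def step(self, u: float) -> float:
        # Reuse the same dynamics, but store the new state internally.
        self.x, reward = stateless_step(self.x, u)
        return reward


# Both variants produce the same transition from the same starting point.
env = StatefulLinearEnv(x=1.0)
r_stateful = env.step(0.0)
x_next, r_stateless = stateless_step(1.0, 0.0)
```

In the stateless variant everything the step needs travels with the call, so the same function can be applied to many states at once; the stateful variant is shorter to call but ties each state to one object.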

@vmoens
Contributor

vmoens commented Feb 16, 2023

Yep, it needs some polishing, but here it is:
https://pytorch.org/rl/tutorials/pendulum.html#training-loop

@viktor-ktorvi
Author

Comments:

  • As for the stateless environment: I got that it has some flexibility benefits, but I didn't quite get the details of how it helps batched execution.
  • _step is great; I'd maybe add the equation in LaTeX, just for aesthetics.
  • In specs:
    • When mentioning batch size: does an environment have a non-empty batch size when there are multiple parallel environments? Did I get that correctly?
    • How the batch size plays into the reward shape is also not clear. So the reward shape should be (env_batch_size, actual_reward_size)?
    • It's not clear what the difference between observation specs and input specs is.
    • Why is the shape of the observation specs an empty tuple? I would expect it to be 1.

The rest is simple and understandable :)
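
For readers following the batch-size and shape questions above, here is a small plain-Python sketch of one common convention: batch dimensions come first, followed by the per-sample feature shape. It assumes this leading-batch convention applies, which should be checked against the TorchRL documentation; `expected_shape` is a hypothetical helper, not a TorchRL function.

```python
from typing import Tuple


def expected_shape(batch_size: Tuple[int, ...], feature_shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Concatenate leading batch dimensions with the per-sample feature shape."""
    return tuple(batch_size) + tuple(feature_shape)


# A single environment has an empty batch size, so only the feature shape remains.
single_env_reward = expected_shape((), (1,))

# Four parallel environments (an assumed count) add one leading batch dimension.
parallel_env_reward = expected_shape((4,), (1,))

# A scalar observation has an empty feature shape: () alone, (4,) when batched.
scalar_obs_single = expected_shape((), ())
scalar_obs_parallel = expected_shape((4,), ())
```

Under this convention an empty observation shape simply describes a scalar entry in an unbatched environment, while a shape of 1 would describe a one-element vector.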

@svnv-svsv-jm

One year and this is not a thing yet? :D Just copy-paste that notebook into the official repo. How can that hurt?

@vmoens
Contributor

vmoens commented Jan 31, 2024

We do have a tutorial now, have you had a chance to check it out? https://pytorch.org/rl/tutorials/pendulum.html

@vmoens vmoens closed this as completed Jan 31, 2024

3 participants