Implement Gymnasium-compliant PPO script #320
CI passed. @dtch1997 would you mind running the first round of benchmark? Don't worry about capturing videos yet because of upstream issues.
export WANDB_ENTITY=openrlbenchmark
poetry install --with mujoco
OMP_NUM_THREADS=1 xvfb-run -a python -m cleanrl_utils.benchmark \
    --env-ids HalfCheetah-v4 Walker2d-v4 Hopper-v4 InvertedPendulum-v4 Humanoid-v4 Pusher-v4 \
    --command "poetry run python cleanrl/gymnasium_support/ppo_continuous_action.py --cuda False --track --capture-video" \
    --num-seeds 3 \
    --workers 1
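For anyone estimating how many runs this launches, here is a minimal sketch of how the benchmark command above might fan out into individual runs. It assumes the benchmark utility appends `--env-id` and `--seed` to the base command for every environment and seed; the helper name `expand_commands` and the `start_seed` default are hypothetical and not taken from `cleanrl_utils.benchmark` itself.

```python
from itertools import product


def expand_commands(base_command, env_ids, num_seeds, start_seed=1):
    """Build one command string per (seed, env_id) pair.

    Assumption: each run is the base command plus --env-id and --seed.
    This helper is illustrative, not the actual cleanrl_utils code.
    """
    commands = []
    for seed_offset, env_id in product(range(num_seeds), env_ids):
        seed = start_seed + seed_offset
        commands.append(f"{base_command} --env-id {env_id} --seed {seed}")
    return commands


env_ids = [
    "HalfCheetah-v4",
    "Walker2d-v4",
    "Hopper-v4",
    "InvertedPendulum-v4",
    "Humanoid-v4",
    "Pusher-v4",
]
base = (
    "poetry run python cleanrl/gymnasium_support/ppo_continuous_action.py "
    "--cuda False --track --capture-video"
)

commands = expand_commands(base, env_ids, num_seeds=3)
print(len(commands))  # 6 environments x 3 seeds = 18 runs
```

With `--workers 1`, these 18 runs would execute one at a time under this assumption.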
Benchmark in progress: https://wandb.ai/openrlbenchmark/cleanrl?workspace=user-dtch1997
Great, thank you!
Executing the following command in https://github.com/vwxyzjn/ppo-atari-metrics
generates
Thank you @dtch1997, would you be interested in helping run some
Hey @dtch1997, I tried running the
@nidhishs The
CI passed, but I had to mark the ubuntu install with
Looking good overall.
Is there any alternative for video logging of the agent with Gymnasium?
@dosssman not right now with wandb. Pending wandb/wandb#4510.
LGTM. Thanks so much @dtch1997!
Description
Types of changes
Checklist:
pre-commit run --all-files passes (required).
mkdocs serve.
If you are adding new algorithm variants or your change could result in performance difference, you may need to (re-)run tracked experiments. See #137 as an example PR.
--capture-video flag toggled on (required).
mkdocs serve.