Commit

Add MuJoCo Robotics Envs HER+TQC trained agents (#71)
* Added HER+TQC robotics benchmarks + update FetchSlide hyperparams to add more time

* add in HER+TQC Fetch env logs

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
sgillen and araffin authored Mar 15, 2021
1 parent ad5f4ec commit bca831b
Showing 6 changed files with 12,013 additions and 2 deletions.
5 changes: 4 additions & 1 deletion benchmark.md
@@ -62,6 +62,10 @@ and also allow users to have access to pretrained agents.*
|dqn |RoadRunnerNoFrameskip-v4 | 40396.350| 7069.131|10M | 603257| 137|
|dqn |SeaquestNoFrameskip-v4 | 2000.290| 606.644|10M | 599505| 69|
|dqn |SpaceInvadersNoFrameskip-v4| 622.742| 201.564|10M | 604311| 155|
|her |FetchPickAndPlace-v1 | -8.921| 6.509|1M | 150000| 3000|
|her |FetchPush-v1 | -10.526| 8.916|1M | 150000| 3000|
|her |FetchReach-v1 | -1.677| 1.069|20k | 150000| 3000|
|her |FetchSlide-v1 | -23.162| 10.625|2M | 150000| 3000|
|her |parking-v0 | -6.970| 2.970|200k | 149980| 7106|
|ppo |Acrobot-v1 | -73.506| 18.201|1M | 149979| 2013|
|ppo |AntBulletEnv-v0 | 2865.922| 56.468|2M | 150000| 150|
@@ -91,7 +95,6 @@ and also allow users to have access to pretrained agents.*
|qrdqn|BeamRiderNoFrameskip-v4 | 17122.941| 10769.997|10M | 596483| 17|
|qrdqn|BreakoutNoFrameskip-v4 | 393.600| 79.828|10M | 579711| 40|
|qrdqn|CartPole-v1 | 500.000| 0.000|50k | 150000| 300|
|qrdqn|EnduroNoFrameskip-v4 | 3231.200| 1311.801|10M | 585728| 5|
|qrdqn|LunarLander-v2 | 70.236| 225.491|100k | 149957| 522|
|qrdqn|MountainCar-v0 | -106.042| 15.536|120k | 149943| 1414|
|qrdqn|PongNoFrameskip-v4 | 20.492| 0.687|10M | 597443| 63|
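
The four HER rows added to `benchmark.md` can be read programmatically. The sketch below parses them with the standard library only. The column names (`algo`, `env_id`, `mean_reward`, `std_reward`, `n_timesteps`, `eval_timesteps`, `eval_episodes`) are an assumption, since the table header lies outside the hunks shown, and `BenchmarkRow` and `parse_row` are hypothetical helpers, not part of the repository.

```python
from dataclasses import dataclass

# Rows copied verbatim from the benchmark.md hunk above.
ROWS = """\
|her |FetchPickAndPlace-v1 | -8.921| 6.509|1M | 150000| 3000|
|her |FetchPush-v1 | -10.526| 8.916|1M | 150000| 3000|
|her |FetchReach-v1 | -1.677| 1.069|20k | 150000| 3000|
|her |FetchSlide-v1 | -23.162| 10.625|2M | 150000| 3000|
"""


@dataclass
class BenchmarkRow:
    # Field names are assumed; the header row is not part of the diff.
    algo: str
    env_id: str
    mean_reward: float
    std_reward: float
    n_timesteps: str  # kept as the abbreviated label, e.g. "1M" or "20k"
    eval_timesteps: int
    eval_episodes: int


def parse_row(line: str) -> BenchmarkRow:
    """Split one pipe-delimited table row into typed fields."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    algo, env_id, mean, std, budget, eval_steps, eval_eps = cells
    return BenchmarkRow(
        algo, env_id, float(mean), float(std), budget,
        int(eval_steps), int(eval_eps),
    )


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

for row in rows:
    # Average evaluation episode length, derived from the last two columns.
    steps_per_episode = row.eval_timesteps / row.eval_episodes
    print(f"{row.env_id}: {row.mean_reward} +/- {row.std_reward} "
          f"({steps_per_episode:.0f} steps per episode)")
```

Under the assumed column meanings, each Fetch row works out to 150000 / 3000 = 50 evaluation steps per episode.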
2 changes: 1 addition & 1 deletion hyperparams/her.yml
@@ -52,7 +52,7 @@ FetchPush-v1:
 FetchSlide-v1:
   env_wrapper:
     - sb3_contrib.common.wrappers.TimeFeatureWrapper
-  n_timesteps: !!float 1e6
+  n_timesteps: !!float 2.5e6
   policy: 'MlpPolicy'
   model_class: 'tqc'
   n_sampled_goal: 4
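
The `!!float` tag on `n_timesteps` matters if the file is loaded with PyYAML, which is an assumption here: its YAML 1.1 resolver does not treat an untagged `1e6` as a number. A minimal sketch of the difference follows. It requires the third-party `pyyaml` package, and the final `int(...)` cast is illustrative rather than taken from the repository.

```python
import yaml  # PyYAML, third-party (assumed to be installed)

# Untagged scientific notation versus the explicitly tagged form from her.yml.
PLAIN = "n_timesteps: 1e6"
TAGGED = "n_timesteps: !!float 2.5e6"

plain_value = yaml.safe_load(PLAIN)["n_timesteps"]
tagged_value = yaml.safe_load(TAGGED)["n_timesteps"]

# Without the tag, PyYAML leaves the scalar as a string.
print(type(plain_value).__name__, repr(plain_value))

# With the tag, the scalar is constructed as a Python float.
print(type(tagged_value).__name__, repr(tagged_value))

# A consumer that needs a step count would typically cast to int (illustrative).
n_timesteps = int(tagged_value)
print(n_timesteps)
```

Keeping the tag on every `n_timesteps` entry written in scientific notation avoids a silent string value under that loader.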