I've used PPO with a similar Go1 environment for another project, but I haven't used it with the go1_go_fast env yet. The code you found is an artifact from that other project. Based on my results in the other environment, I suspect PPO's performance would be similar to or worse than SAC's. Let me know if you'd like me to add PPO functionality; I have the code in another private repo.
Having a PPO implementation as a baseline would be fantastic! Many reinforcement learning pipelines for robots (particularly humanoid robots) are built on Isaac Gym's PPO framework. When adapting an algorithm to a humanoid model in SSRL, the first step is typically to transfer the reward function from Isaac Gym and test it with PPO, given PPO's advantages in parallelism and fast convergence. Including a PPO baseline would therefore be immensely helpful. Thank you for considering this!
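In case it helps while this is being considered, below is a minimal sketch of how such a PPO baseline might be wired up with Brax's built-in PPO trainer. The environment name `go1_go_fast` is assumed to be registered with `brax.envs`, and all hyperparameters are illustrative placeholders rather than values from this repo:

```python
# Minimal sketch of a PPO baseline using Brax's built-in trainer.
# Assumptions: 'go1_go_fast' is registered with brax.envs, and the
# hyperparameters below are illustrative, not tuned for this task.
import functools

from brax import envs
from brax.training.agents.ppo import train as ppo

env = envs.get_environment('go1_go_fast')  # assumed registration name

train_fn = functools.partial(
    ppo.train,
    num_timesteps=30_000_000,
    num_evals=10,
    episode_length=1000,
    num_envs=2048,            # PPO's main draw: massive parallel rollouts
    batch_size=1024,
    num_minibatches=32,
    unroll_length=20,
    num_updates_per_batch=4,
    discounting=0.97,
    learning_rate=3e-4,
    entropy_cost=1e-2,
    normalize_observations=True,
    seed=0,
)

def progress(step, metrics):
    # Log eval reward so PPO and SAC/SSRL learning curves can be compared.
    print(f"step={step} eval/episode_reward={metrics['eval/episode_reward']:.2f}")

make_inference_fn, params, _ = train_fn(environment=env, progress_fn=progress)
```

The returned `make_inference_fn` and `params` could then be evaluated with the same rollout script used for the SAC/SSRL policies, so the comparison uses identical episode lengths and reward terms.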
Have you compared SSRL with PPO? I found the following code in your repository:
How do the two compare in performance?