[Question] Observation in Humanoid/Ant-v2 #1636
Comments
Going to close this in favor of #585 but in this particular case, it looks like the zeros are in
Hi,
PS: I think this is actually a rather important issue, as contact forces are part of the reward function and should remain unchanged across the tested combinations of mjpy and mujoco versions.
I agree with @anyboby; I'm not sure this should be closed.
Thanks @anyboby, your comment really helps.
@johnnylin110 As for the reward functions: given equal states, this code produced the same rewards as mujoco150 + mjpy 200 for me. But again, there is no guarantee, and for absolute certainty you would have to use the same versions as a referenced paper.
@anyboby
Hi,
Recently I've been working on some experiments using MuJoCo/OpenAI Gym.
While checking the values returned by env.step() on Humanoid-v2 and Ant-v2, I found that most entries of the returned vector are zeros, so I investigated the source code a bit more.
I also read issue #585, but no one there seems to be asking about the problem I have right now: the actual values in the observations from Humanoid/Ant are dominated by zeros.
So I wonder whether anyone else gets the same observations as I do?
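For anyone who wants to compare against their own setup, here is a minimal sketch of how the zeros could be counted and located. The helper name `zero_summary` and the sample vector are hypothetical and only for illustration; the function is plain Python and accepts any 1-D sequence of numbers, so the observation returned by env.step() could be passed in directly (that part is an assumption about your environment and is only shown as a comment).

```python
def zero_summary(obs, tol=0.0):
    """Summarize how many entries of a 1-D observation are (near) zero.

    `zero_summary` is a hypothetical helper, not part of Gym or mujoco-py.
    Returns the vector size, the zero count, the zero fraction, and the
    contiguous index ranges (inclusive) where the zeros sit.
    """
    values = [float(v) for v in obs]

    runs = []      # list of (start, end) index pairs, both inclusive
    start = None   # start index of the zero run currently being scanned
    zeros = 0
    for i, v in enumerate(values):
        if abs(v) <= tol:
            zeros += 1
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(values) - 1))

    fraction = zeros / len(values) if values else 0.0
    return {"size": len(values), "zeros": zeros,
            "fraction": fraction, "zero_runs": runs}


# Synthetic stand-in for an observation; the numbers are made up.
sample_obs = [0.5, -1.2, 0.0, 0.0, 0.0, 0.3]
print(zero_summary(sample_obs))

# With a real environment (assumed, not run here) this would look like:
#   obs, reward, done, info = env.step(env.action_space.sample())
#   print(zero_summary(obs))
```

Posting the `zero_runs` ranges together with the installed mujoco and mujoco-py versions should make it easier to see whether the zeros fall in the same slice of the observation for everyone.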
=== Info of my env ===