len(tmp_observations) < 2 on PPO raises ValueError: The parameter probs has invalid values #26
Comments
Could you please provide a minimal reproducible example, the full stack trace, and any other log output?
Here is a minimal example that reproduces the error: testPPO.txt. Setting max_steps = 0 in the provided example gives len(tmp_observations) == 1 and raises the following error:
Traceback (most recent call last):
Process finished with exit code 1
Sorry for the late reply. That is definitely a boundary case that needs to be fixed. I am working on a paper at the moment and my computer was also hacked, so this problem might stay open a little longer. I'm really sorry for the inconvenience.
It seems that your code raises an error if the trajectory length is less than 2 (len(tmp_observations) < 2). I tested this with PPO; I don't know whether it happens with all algorithms.
The error:
ValueError: The parameter probs has invalid values
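The thread does not identify the root cause. The message resembles the argument-validation error that PyTorch distributions raise when probabilities contain NaN, so one plausible explanation, offered only as an assumption, is that a statistic computed over the trajectory (for example a sample standard deviation used for normalization) is undefined for a single step and the resulting NaN reaches the action probabilities. The sketch below illustrates that failure mode and a possible guard in plain Python; `normalize` and `is_valid_probs` are hypothetical helpers, not functions from this library.

```python
import math


def normalize(values, eps=1e-8):
    """Standardize a list of floats (hypothetical helper, not the library's code)."""
    n = len(values)
    if n < 2:
        # The sample standard deviation (n - 1 denominator) is undefined for
        # fewer than two values, so return zeros instead of propagating NaN.
        return [0.0] * n
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    std = math.sqrt(var)
    return [(v - mean) / (std + eps) for v in values]


def is_valid_probs(probs, tol=1e-6):
    """Return True if probs are finite, non-negative, and sum to one."""
    return (
        all(math.isfinite(p) and p >= 0.0 for p in probs)
        and abs(sum(probs) - 1.0) <= tol
    )
```

A check such as `is_valid_probs` before constructing the action distribution would at least surface the problem closer to its source; whether this matches the actual bug here is unconfirmed.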