Are old_log_probs and log_probs always equal? The actor parameters are the same and the input is the same...
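I think they should be equal; I had the same question here. Take a look at the PPO algorithm described in the "Mushroom Book" EasyRL (蘑菇书EasyRL): in that algorithm, right after old_log_probs is obtained there is a for loop over epochs, and inside that loop log_probs is computed and the self.actor network is then updated.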
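The loop structure described in this reply can be illustrated with a minimal pure-Python sketch. This is not the code from this repository or from EasyRL. It assumes a state-independent softmax policy with three hypothetical logits `theta`, made-up sampled actions and advantages, and a plain unclipped surrogate objective. Under those assumptions, old_log_probs is computed once before the loop, so the ratio exp(log_probs - old_log_probs) is 1 on the first pass and moves away from 1 once the parameters have been updated.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def ppo_ratio_demo(num_epochs=3, lr=0.5):
    # All values below are illustrative assumptions, not data from the issue.
    theta = [0.1, -0.2, 0.3]             # hypothetical actor parameters (logits)
    actions = [0, 2, 1, 2]               # made-up sampled actions
    advantages = [1.0, -0.5, 0.8, 0.3]   # made-up advantage estimates

    # old_log_probs is computed ONCE, before the update loop.
    logp = log_softmax(theta)
    old_log_probs = [logp[a] for a in actions]

    ratios_per_epoch = []
    for _ in range(num_epochs):
        # log_probs is recomputed from the CURRENT parameters every epoch.
        logp = log_softmax(theta)
        probs = [math.exp(v) for v in logp]
        log_probs = [logp[a] for a in actions]
        ratios = [math.exp(new - old)
                  for new, old in zip(log_probs, old_log_probs)]
        ratios_per_epoch.append(ratios)

        # Gradient ascent on mean(ratio * advantage); clipping is omitted
        # here for brevity. For a softmax policy,
        # d ratio / d theta_j = ratio * (1[a == j] - pi_j).
        grad = [0.0] * len(theta)
        for a, r, adv in zip(actions, ratios, advantages):
            for j in range(len(theta)):
                indicator = 1.0 if a == j else 0.0
                grad[j] += r * adv * (indicator - probs[j]) / len(actions)
        theta = [t + lr * g for t, g in zip(theta, grad)]

    return ratios_per_epoch


ratios = ppo_ratio_demo()
for epoch, r in enumerate(ratios):
    print(epoch, [round(x, 4) for x in r])
```

In this sketch the two quantities coincide only before the first parameter update inside the loop; from the second epoch on, log_probs reflects the updated actor while old_log_probs stays fixed.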