Question about the Chapter 20 multi-agent introductory program #97

Open
lz200202 opened this issue Nov 17, 2024 · 1 comment

Comments

@lz200202

[attached image]
Are old_log_probs and log_probs always equal? Same actor parameters, same input...

@ZisongXu

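I think they should be equal. I had the same question here.
You can look at the PPO algorithm presented in "蘑菇书EasyRL" (the "Mushroom Book", EasyRL). In that algorithm, right after old_log_probs is obtained there is a for loop over epochs; inside the loop, log_probs is computed and then the self.actor network is updated.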
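
To make the point about the epoch loop concrete, here is a minimal sketch in plain Python. It is not the code from the book or from EasyRL: the function name `ppo_ratios`, the single-state softmax policy, the learning rate, and the clip range are all assumptions chosen for illustration. It records the ratio exp(log_probs - old_log_probs) at the start of each epoch. On the first pass the ratio is exactly 1, because the parameters have not changed yet; it only moves away from 1 after the actor has been updated at least once.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ppo_ratios(logits, action, advantage, epochs, lr=0.1, eps=0.2):
    """Return the ratio pi_new(a) / pi_old(a) observed at the start of each epoch.

    Hypothetical toy setup: one state, one sampled action, and the policy
    parameters are the logits themselves.
    """
    logits = list(logits)
    # Snapshot taken once before the loop, playing the role of old_log_probs.
    old_log_prob = math.log(softmax(logits)[action])
    ratios = []
    for _ in range(epochs):
        probs = softmax(logits)
        log_prob = math.log(probs[action])  # recomputed with current parameters
        ratio = math.exp(log_prob - old_log_prob)
        ratios.append(ratio)
        # The clipped surrogate min(r*A, clip(r)*A) has zero gradient once the
        # ratio has left the clip range in the direction the advantage favors.
        clipped = (advantage > 0 and ratio > 1 + eps) or \
                  (advantage < 0 and ratio < 1 - eps)
        if not clipped:
            for k in range(len(logits)):
                indicator = 1.0 if k == action else 0.0
                # d(ratio * A) / d logit_k = ratio * A * (1[k == a] - pi_k)
                grad = ratio * advantage * (indicator - probs[k])
                logits[k] += lr * grad  # gradient ascent on the surrogate
    return ratios

if __name__ == "__main__":
    for i, r in enumerate(ppo_ratios([0.0, 0.0], action=0, advantage=1.0, epochs=5)):
        print(f"epoch {i}: ratio = {r:.4f}")
```

Under these assumptions, an update that runs only one pass (no epoch loop) always sees a ratio of 1, so the clipping never activates. The gradient is still nonzero, though: at ratio 1 the gradient of ratio * advantage equals the gradient of log_prob * advantage, so the step behaves like an ordinary policy-gradient update.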
