I have read your TRPO paper and I'm trying to reproduce the Fisher-vector product calculation in C. Lines 36-37 in agentzoo.py confuse me. I copied the weights into my code, fed ob_no into the network, and checked its outputs against prob_np. It turned out that the mean values in prob_np are the raw neural network outputs, not multiplied by 0.1. (I use the Theano backend, the Swimmer-v1 test case, and an 8-64-64-2 network.) Also, the *0.1 factor is not mentioned in the TRPO paper. Could you shed some light on this?
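For anyone who wants to repeat the check described above, here is a minimal sketch in plain Python. It is not the code from agentzoo.py: the tanh hidden activations, the random weights, and the helper names (`mlp_forward`, `random_layers`, `max_abs_diff`) are assumptions made for illustration, and `reference_mean` is only a stand-in for the mean values in prob_np. The idea is to run the same forward pass with and without an output factor of 0.1 and see which one matches the reference.

```python
import math
import random


def mlp_forward(x, layers, out_scale=1.0):
    """Forward pass of a small MLP.

    Assumption: tanh on hidden layers and a linear output layer.
    `layers` is a list of (W, b) pairs; W holds one row per output unit.
    """
    h = x
    last = len(layers) - 1
    for i, (W, b) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + bias
             for row, bias in zip(W, b)]
        h = z if i == last else [math.tanh(v) for v in z]
    # Optional factor applied to the outputs (the "*0.1" being tested).
    return [out_scale * v for v in h]


def random_layers(sizes, rng):
    """Hypothetical weights; in a real check these are the copied weights."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
             for _ in range(n_out)]
        b = [rng.uniform(-0.1, 0.1) for _ in range(n_out)]
        layers.append((W, b))
    return layers


def max_abs_diff(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


rng = random.Random(0)
layers = random_layers([8, 64, 64, 2], rng)   # 8-64-64-2 network
ob = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # one observation row

# Stand-in for the mean values in prob_np (here: the unscaled outputs).
reference_mean = mlp_forward(ob, layers)

err_unscaled = max_abs_diff(mlp_forward(ob, layers, 1.0), reference_mean)
err_scaled = max_abs_diff(mlp_forward(ob, layers, 0.1), reference_mean)
print("unscaled error:", err_unscaled)
print("scaled error:  ", err_scaled)
```

If the unscaled pass is the one that matches, that only rules out a factor applied to the outputs. A factor applied to the weights themselves (for example at initialization) would already be contained in the copied weights and would not appear as a separate output scaling.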
Oh yeah, that's a known bug in my code. I haven't looked at this code in a while. Are you sure it's not mentioned? I thought we said something about that.
Hi John,
Thanks for your quick reply! I read the latest version (v4) of the TRPO paper, downloaded from arXiv, from beginning to end, and it doesn't seem to mention that...
Patrick
I'm interested in the reasoning behind this 0.1 multiplier as well. It seems it was never added to the draft? If it was, would you mind pointing me to it, or explaining the reasoning here?