Help understanding how to read the code #30

Open
ryanmaxwell96 opened this issue Apr 1, 2020 · 6 comments

@ryanmaxwell96

Hello,

Just a quick question. In policy.py, class Policy uses Keras to call "get_layer". Does this return the output layer? Also, I sent an email about this, so feel free to ignore this part if you have already answered it: I see from the TRPO paper that the NN is supposed to compute only the mean, and that a separate set of parameters, a vector with one entry per action dimension, is used for the standard deviation. But the paper does not make clear to me how the stdev is actually computed or updated, and in this code all of it seems to be computed under the hood in Keras.

Any help on this would be greatly appreciated!

Ryan
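For readers following the stdev question, here is a minimal NumPy sketch of one common way to set up such a policy: the network outputs only the mean, and a separate trainable vector of log-variances (one entry per action dimension, independent of the state) sets the spread. This is an illustration under stated assumptions, not the code in policy.py: the linear "network", the sizes, and all function names below are hypothetical. The last function shows why the log-variances can be updated by the same gradient step as the network weights, since the log-probability is differentiable with respect to them.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, act_dim = 4, 2  # illustrative sizes, not taken from the repository

# Hypothetical stand-in for the mean network: a single linear layer.
W = rng.normal(scale=0.1, size=(obs_dim, act_dim))


def policy_mean(obs):
    """Mean of the Gaussian policy, one value per action dimension."""
    return obs @ W


def sample_action(obs, log_vars):
    """Draw an action: mean plus state-independent Gaussian noise."""
    mean = policy_mean(obs)
    std = np.exp(0.5 * log_vars)  # stdev = exp(log_var / 2)
    return mean + std * rng.standard_normal(mean.shape)


def log_prob(obs, act, log_vars):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    mean = policy_mean(obs)
    return -0.5 * np.sum(
        log_vars + (act - mean) ** 2 / np.exp(log_vars) + np.log(2.0 * np.pi),
        axis=-1,
    )


def grad_log_prob_wrt_log_vars(obs, act, log_vars):
    """Analytic gradient of log_prob with respect to each log-variance."""
    mean = policy_mean(obs)
    return 0.5 * ((act - mean) ** 2 / np.exp(log_vars) - 1.0)


log_vars = np.full(act_dim, -1.0)  # trainable parameters, one per action
obs = rng.normal(size=(1, obs_dim))
act = sample_action(obs, log_vars)
logp = log_prob(obs, act, log_vars)
grad = grad_log_prob_wrt_log_vars(obs, act, log_vars)
```

In an automatic-differentiation framework the gradient would not be written by hand; declaring the log-variance vector as a trainable variable is enough for the optimizer to update it alongside the network weights.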

@ryanmaxwell96
Author

Also, can you help me understand why there are multiple neural network outputs? I tested the HalfCheetah code and found that the observation has shape (1, 27), but a (1, 6) vector of means is returned, and I'm at a loss as to why there are 6 means.

@ryanmaxwell96
Author

ryanmaxwell96 commented Apr 4, 2020

Unless it refers to the 6 half-cheetah joints that can move, depending on the 27-dimensional state. So for whichever state it is in, the policy tells it what position each of these joints should be in via the means and log-variances.
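The shapes reported above are consistent with a network whose input width equals the observation dimension and whose output width equals the action dimension, giving one mean per action component. A small sketch of that shape relationship follows, assuming a hypothetical two-layer tanh network; the hidden size of 64 and the random weights are arbitrary and not taken from the repository.

```python
import numpy as np

rng = np.random.default_rng(0)
# 27 and 6 are the shapes reported in this thread; 64 is an arbitrary hidden size.
obs_dim, act_dim, hidden = 27, 6, 64

W1 = rng.normal(scale=0.1, size=(obs_dim, hidden))
W2 = rng.normal(scale=0.1, size=(hidden, act_dim))


def mean_network(obs):
    """Map a batch of observations (N, obs_dim) to action means (N, act_dim)."""
    h = np.tanh(obs @ W1)
    return h @ W2


single = mean_network(np.zeros((1, obs_dim)))          # one observation
batch = mean_network(rng.normal(size=(32, obs_dim)))   # a batch of 32
print(single.shape, batch.shape)  # (1, 6) (32, 6)
```

The first axis is the batch size and the second is the action dimension, so a single observation of shape (1, 27) produces a (1, 6) array of means.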

@pat-coady
Owner

pat-coady commented Apr 5, 2020 via email

@ryanmaxwell96
Author

OK, thank you. Sorry, I have another question: where does the name "policy_nn" come from? I'm guessing it is the last layer, correct?

@ryanmaxwell96
Author

ryanmaxwell96 commented Apr 14, 2020

Can you please explain why value.py has an output Dense layer of size 1? Shouldn't it be the same size as the action dimension?
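As general background on this question: a state-value function estimates the expected discounted return from a state, which is a single number per state regardless of how many action dimensions there are. The sketch below illustrates that with a hypothetical linear value function and made-up rewards; it is not the code in value.py, and the names, the discount of 0.5, and the use of returns minus values as an advantage estimate are assumptions for illustration.

```python
import numpy as np


def discounted_returns(rewards, gamma):
    """Discounted sum of future rewards at each time step."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


rng = np.random.default_rng(0)
obs_dim, T = 27, 5

# Hypothetical linear value function: the output width is 1, not act_dim.
w = rng.normal(scale=0.1, size=(obs_dim, 1))


def value(obs):
    """Scalar value estimate per observation, shape (T, 1)."""
    return obs @ w


obs = rng.normal(size=(T, obs_dim))
rewards = np.ones(T)                      # made-up rewards for illustration
returns = discounted_returns(rewards, gamma=0.5)
values = value(obs).squeeze(-1)           # shape (T,)
advantages = returns - values             # one scalar per time step
```

Because the regression target (the return) is a scalar per time step, a size-1 output is what lines up with it; the action dimension only enters through the policy network.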

@ryanmaxwell96
Author

Also, how do you use plotting.py? I don't see it currently being used in any of the code.
