Guidance #3
Hello Hamza,

Thanks for showing interest in my code! The reward function is quite simple: the

As for your question about why I used orientation-based rewards: I did that because it had not been done in the literature (at least at that time), and because I thought it was a suitable reward system to stimulate flocking behaviour without explicitly rewarding it or building it into the system (as the Vicsek model does). You can read more about the model in my thesis, which is publicly available here: https://studenttheses.universiteitleiden.nl/access/item%3A2711425/view.

Best,
André
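For readers following along, an orientation-based alignment reward of the kind described above can be sketched as follows. This is a minimal illustration, assuming headings are angles in radians and "neighbors" are agents within some interaction radius; the function name `alignment_reward` and the cosine formula are my assumptions for illustration, not necessarily the exact code from this repository or the thesis:

```python
import numpy as np

def alignment_reward(own_heading: float, neighbor_headings: list[float]) -> float:
    """Hypothetical orientation-based reward: mean cosine similarity
    between an agent's heading and the headings of its neighbors.
    Returns 1.0 for perfect alignment, -1.0 for opposite headings,
    and 0.0 when the agent has no neighbors."""
    if len(neighbor_headings) == 0:
        return 0.0
    # cos(theta_j - theta_i) rewards agents for pointing the same way
    # as each neighbor, without explicitly rewarding flock formation.
    return float(np.mean(np.cos(np.asarray(neighbor_headings) - own_heading)))
```

For example, `alignment_reward(0.0, [0.0, 0.0])` gives 1.0 (fully aligned), while `alignment_reward(0.0, [np.pi])` gives -1.0. The appeal of such a reward is exactly what the reply describes: alignment is scored locally per agent, so any global flocking that emerges is not hard-coded the way the Vicsek update rule hard-codes it.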
I am trying to replicate your reward function for an open-ended environment where I vary acceleration.

1. As I understand it, you are just using alignment, right?
2. Do you terminate an episode, and if so, under what conditions?
3. How do you ensure sufficient separation?

Thanks for the help.
Hello Andre,
I was wondering if you had a mathematical equation available for the reward function or could point it out to me in the code file. Much appreciated.
Also, why did you use only orientation-based rewards (as mentioned in the repo)?
Regards,
Hamza