Why is success not a criterion in the reward function? #508
I'm trying to understand why the authors didn't include the success criterion in the reward function. Consider the following scenario: I terminate the episode as soon as success is reached. In that case, every agent would be incentivised not to reach the success point, because doing so cuts off all future reward and so lowers the total return. If an agent can collect more reward by staying close to the success point without entering it (and thereby avoiding termination), it will prefer to hover nearby rather than complete the task. Does anyone have a good suggestion for how I can make reaching the success point more rewarding in all 50 environments without having to change each one manually?
Replies: 1 comment
It is included in the reward function for certain environments. One example is the assembly environment.
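For the environments where success is not part of the reward, one option that avoids editing each of them might be a thin wrapper that adds a bonus whenever the environment reports success. The following is only a minimal sketch, not something from the benchmark itself: it assumes each environment's `step()` returns an `info` dict with a `"success"` entry, and `SuccessBonusWrapper`, `ToyEnv`, and the `bonus` value are hypothetical names and numbers chosen for illustration.

```python
class SuccessBonusWrapper:
    """Hypothetical wrapper: adds a one-off bonus and ends the episode on success."""

    def __init__(self, env, bonus, success_key="success"):
        self.env = env
        self.bonus = bonus              # assumption: chosen by the user, see note below
        self.success_key = success_key  # assumption: key under which success is reported

    def reset(self, *args, **kwargs):
        return self.env.reset(*args, **kwargs)

    def step(self, action):
        # Works for both 4-tuple (obs, reward, done, info) and
        # 5-tuple (obs, reward, terminated, truncated, info) step results.
        obs, reward, *flags, info = self.env.step(action)
        if info.get(self.success_key, 0.0) > 0:
            reward += self.bonus
            flags[0] = True  # mark the episode as done/terminated
        return (obs, reward, *flags, info)


class ToyEnv:
    """Stand-in environment: reward 1.0 per step, reports success on the third step."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        info = {"success": float(self.t >= 3)}
        return self.t, 1.0, False, False, info


env = SuccessBonusWrapper(ToyEnv(), bonus=500.0)
env.reset()
for _ in range(3):
    obs, reward, terminated, truncated, info = env.step(None)
print(reward, terminated)
```

For terminating to beat hovering, the bonus has to exceed the reward the agent could still collect by staying near the goal. With a maximum per-step reward `r_max` and `H` steps remaining, a bonus larger than `r_max * H` is sufficient in the undiscounted case. Both values depend on the environment and should be checked rather than assumed.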