For explainability, is there a way to inspect the expected reward at each timestep of the test set? I believe the agent (let's say SAC) chooses its action based on the expected reward, so it would be nice to see that underlying data. Please let me know if there is a way to do that.
Yes. But aren't there usually many actions the agent can take in each state? Why does it pick one specific action? Let's say we use a deterministic policy. The agent is presumably looking at the set of all available actions and choosing the one with the highest expected reward. So if I could get access to the underlying expected reward for every state-action pair (the Q-table), could I understand the agent's actions better? Thanks a lot for replying.
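To make the idea concrete, here is a minimal sketch of what logging per-timestep value estimates could look like. It is not this library's API: `toy_q`, `explain_step`, and `explain_episode` are hypothetical names, and `toy_q` is a made-up stand-in for a trained critic. Two caveats, as far as I understand SAC: the actor network outputs the action directly rather than searching over all actions, and the critic estimates the expected discounted return (plus an entropy term), not the immediate reward. With continuous actions there is also no finite Q-table, so the sketch scores a small set of candidate actions instead.

```python
from typing import Callable, Dict, List, Sequence, Tuple

State = Tuple[float, ...]
Action = float
QFunction = Callable[[State, Action], float]


def toy_q(state: State, action: Action) -> float:
    """Hypothetical stand-in for a trained critic Q(s, a).

    It peaks when the action equals the first state feature.
    """
    return -((action - state[0]) ** 2)


def explain_step(
    q_fn: QFunction, state: State, candidates: Sequence[Action]
) -> Tuple[Action, List[Tuple[Action, float]]]:
    """Score every candidate action; return the best one and all scores."""
    scores = [(a, q_fn(state, a)) for a in candidates]
    best_action = max(scores, key=lambda pair: pair[1])[0]
    return best_action, scores


def explain_episode(
    q_fn: QFunction, states: Sequence[State], candidates: Sequence[Action]
) -> List[Dict[str, object]]:
    """Build one log record per timestep of a test trajectory."""
    log = []
    for t, state in enumerate(states):
        chosen, scores = explain_step(q_fn, state, candidates)
        log.append({"t": t, "state": state, "chosen": chosen, "q_values": scores})
    return log


if __name__ == "__main__":
    # Assumed test states and a coarse grid of candidate actions.
    test_states = [(0.5,), (-1.0,)]
    action_grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
    for record in explain_episode(toy_q, test_states, action_grid):
        print(record)
```

With a real agent, you would replace `toy_q` with a call into the trained critic and compare the action the policy actually took against the scored candidates.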