[bug-fix] Fix stats reporting for reward signals in SAC #3606

ervteng · 2020-03-10T23:09:30Z

Proposed change(s)

When the Optimizer was split from the Policy, SAC was still using the Policy's reward signals to write the reward values to TensorBoard. This meant that the Policy's reward stats (e.g. Extrinsic Reward, GAIL Reward) were always 0.

This PR fixes the bug and removes the unnecessary reward_signals dict in Policy.
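
For readers unfamiliar with the failure mode, here is a minimal sketch of how a stale `reward_signals` dict can produce all-zero stats. The class and function names (`RewardSignal`, `Policy`, `Optimizer`, `report_stats`) are hypothetical stand-ins for illustration, not the actual ml-agents classes or the code changed in this PR.

```python
from typing import Dict, List


class RewardSignal:
    """Hypothetical reward signal that accumulates the values it evaluates."""

    def __init__(self) -> None:
        self.values: List[float] = []

    def record(self, value: float) -> None:
        self.values.append(value)

    def mean(self) -> float:
        # No recorded values means the reported stat is 0.
        return sum(self.values) / len(self.values) if self.values else 0.0


class Policy:
    def __init__(self) -> None:
        # Assumed leftover from before the split: nothing ever updates this dict.
        self.reward_signals: Dict[str, RewardSignal] = {"extrinsic": RewardSignal()}


class Optimizer:
    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        # After the split, the optimizer owns the reward signals that get updated.
        self.reward_signals: Dict[str, RewardSignal] = {"extrinsic": RewardSignal()}

    def update(self, rewards: List[float]) -> None:
        for reward in rewards:
            self.reward_signals["extrinsic"].record(reward)


def report_stats(reward_signals: Dict[str, RewardSignal]) -> Dict[str, float]:
    """Collect the mean value of each reward signal for logging."""
    return {name: signal.mean() for name, signal in reward_signals.items()}


policy = Policy()
optimizer = Optimizer(policy)
optimizer.update([1.0, 2.0, 3.0])

buggy_stats = report_stats(policy.reward_signals)     # reads the stale dict
fixed_stats = report_stats(optimizer.reward_signals)  # reads the updated dict
print(buggy_stats, fixed_stats)
```

In this sketch, reading from the policy's dict always yields 0, while reading from the optimizer's dict reflects the recorded rewards. Removing the dict from the policy altogether, as this PR does, makes the stale read impossible.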

Types of change(s)

Checklist

Added tests that prove my fix is effective or that my feature works
Updated the changelog (if applicable)
Updated the documentation (if applicable)
Updated the migration guide (if applicable)

Other comments

andrewcoh

LGTM

Ervin Teng added 2 commits March 10, 2020 16:00

Remove reward signals from Policy in SAC

0c561b7

Remove debug, add test

9c4f979

ervteng requested a review from andrewcoh March 10, 2020 23:09

andrewcoh approved these changes Mar 11, 2020

ervteng merged commit d25348c into master Mar 11, 2020

delete-merged-branch bot deleted the develop-fixsacrewardreporting branch March 11, 2020 17:55

github-actions bot locked as resolved and limited conversation to collaborators May 15, 2021
