
Update documentation for reward scaling wrappers #1285

Open · wants to merge 1 commit into main

Conversation

@keraJLi commented Jan 2, 2025

Description

Changes the documentation of the reward scaling wrappers, mainly removing incorrect or unsubstantiated information.
The affected wrappers are in wrappers/stateful_reward.py and wrappers/vector/stateful_reward.py.

Fixes #1272
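
For reference, the wrapper these files implement is NormalizeReward. A minimal usage sketch, assuming the current Gymnasium API (the environment id is an arbitrary choice for illustration; gamma and epsilon shown are the wrapper's defaults):

```python
import gymnasium as gym
from gymnasium.wrappers import NormalizeReward

# Wrap an environment so that each reward is divided by a running
# estimate of the standard deviation of the discounted return.
env = gym.make("MountainCarContinuous-v0")  # arbitrary example env
env = NormalizeReward(env, gamma=0.99, epsilon=1e-8)

obs, info = env.reset(seed=0)
for _ in range(100):
    action = env.action_space.sample()
    # reward here is the raw reward scaled by the running std of the
    # discounted return estimate; no mean is subtracted.
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
```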

Type of change


  • Documentation only change (no code changed)

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes

@pseudo-rnd-thoughts (Member) left a comment

Thanks for the PR @keraJLi. To clarify, what do you mean by "their exponential moving average"?
To me, it isn't clear what the expected mean is, or what exactly the rewards are normalised by.
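
For context, here is a simplified sketch of the update the wrapper performs, based on a reading of the Gymnasium implementation (the class and variable names are illustrative, not the actual ones, and the real wrapper also resets the running return at episode boundaries):

```python
import math

class RunningReturnScaler:
    """Illustrative re-creation of NormalizeReward's core update.

    Keeps an exponentially discounted running return and divides each
    raw reward by the running standard deviation of that return. No
    mean is subtracted, so the scaled rewards are not zero-centered.
    """

    def __init__(self, gamma=0.99, epsilon=1e-8):
        self.gamma = gamma
        self.epsilon = epsilon
        self.running_return = 0.0
        # Running statistics of the discounted return (Welford's method).
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0

    def scale(self, reward):
        # Discounted running return: G_t = gamma * G_{t-1} + r_t
        self.running_return = self.gamma * self.running_return + reward
        # Update the running variance of the discounted return.
        self.count += 1
        delta = self.running_return - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (self.running_return - self.mean)
        var = self.m2 / self.count if self.count > 1 else 1.0
        # Divide by the std of the return estimate (epsilon avoids /0).
        return reward / math.sqrt(var + self.epsilon)
```

If this reading is right, the "exponential moving average" refers to the discounted return estimate G_t, and rewards are normalised by its running standard deviation rather than shifted toward any expected mean.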

Development

Successfully merging this pull request may close these issues:

[Proposal/Question] Incorrect documentation of NormalizeReward wrapper