Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib; Offline RL] Store episodes in state form. #47294

Merged

Conversation

simonsays1980
Copy link
Collaborator

@simonsays1980 simonsays1980 commented Aug 23, 2024

Why are these changes needed?

Storing with ray.data episodes as instances results in pickled instances that are maybe not compatible with later python versions. This PR tries to develop a stream that is compatible with later Python versions by simplifying objects in their get_state methods and tries to reproduce them in their static from_state method.
It uses mgspack to serialize objects (and msgpack-numpy to serialize numpy array) which is a serialization protocol independent of Python versions.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

…dified 'SingleAgentEpisode's 'get_state' and 'set_state' to convert to and from lookback buffers.

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
…dded '__eq__' method to 'InfiniteLookBackBuffer' together with 'get/set_state' methods. Tested serialization and desrialization with recording offline data and in unit tests.

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
@simonsays1980 simonsays1980 marked this pull request as ready for review August 26, 2024 16:48
@sven1977 sven1977 changed the title [RLlib; Offline RL]† - Store episodes in state form. [RLlib; Offline RL] Store episodes in state form. Aug 27, 2024
@@ -34,6 +35,77 @@ def __init__(
self.space_struct = None
self.space = space

def __eq__(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dumb question: why do we need this method?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for testing purposes to make comparisons of instances before writing and after writing.

Copy link
Contributor

@sven1977 sven1977 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR. Thanks @simonsays1980 . Just one question on eq and a handful of nits.

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
@simonsays1980 simonsays1980 requested a review from a team as a code owner August 30, 2024 12:35
…ort path for 'PPO' in 'test_usage_stats.py' as it was failing due to missing package 'msgpack_numpy'.

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
@simonsays1980 simonsays1980 added rllib RLlib related issues rllib-offline-rl Offline RL problems labels Sep 6, 2024
simonsays1980 and others added 2 commits September 9, 2024 16:56
Signed-off-by: sven1977 <svenmika1977@gmail.com>
@sven1977 sven1977 enabled auto-merge (squash) September 9, 2024 16:33
Signed-off-by: sven1977 <svenmika1977@gmail.com>
sven1977 and others added 2 commits September 9, 2024 19:23
Signed-off-by: sven1977 <svenmika1977@gmail.com>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
@sven1977 sven1977 enabled auto-merge (squash) September 10, 2024 14:40
@github-actions github-actions bot added the go add ONLY when ready to merge, run all tests label Sep 10, 2024
@sven1977 sven1977 merged commit aa7179a into ray-project:master Sep 10, 2024
6 of 7 checks passed
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests rllib RLlib related issues rllib-offline-rl Offline RL problems
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants