[RLlib; Offline RL] Store episodes in state form. #47294

simonsays1980 · 2024-08-23T06:59:45Z

Why are these changes needed?

Storing episodes as instances with ray.data produces pickled objects that may not be compatible with later Python versions. This PR aims for a storage format that stays compatible across Python versions: objects are reduced to simple state in their get_state methods and rebuilt in their static from_state methods.
It uses msgpack to serialize objects (and msgpack-numpy to serialize NumPy arrays), a serialization protocol that is independent of the Python version.
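
The state round trip described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `ToyBuffer` is a hypothetical stand-in for an episode or lookback buffer, and the standard-library `json` module stands in for msgpack so the example has no third-party dependencies.

```python
import json


class ToyBuffer:
    """Hypothetical stand-in for an episode or lookback buffer."""

    def __init__(self, data=None, lookback=0):
        self.data = list(data or [])
        self.lookback = lookback

    def get_state(self):
        # Reduce the instance to plain built-in types only.
        return {"data": self.data, "lookback": self.lookback}

    @staticmethod
    def from_state(state):
        # Rebuild a fresh instance from the plain state dict.
        return ToyBuffer(data=state["data"], lookback=state["lookback"])


buffer = ToyBuffer(data=[0.1, 0.2, 0.3], lookback=1)

# Serialize the state, not the instance: the payload carries no class
# reference, so reading it back does not depend on how Python pickles objects.
payload = json.dumps(buffer.get_state())

# Later, possibly under a different Python version: decode and rebuild.
restored = ToyBuffer.from_state(json.loads(payload))
```

The point of the design is that only dicts, lists, and numbers cross the storage boundary; the class itself is needed only at read time, when from_state rebuilds the instance.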

Related issue number

Checks

I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(


…dded '__eq__' method to 'InfiniteLookBackBuffer' together with 'get/set_state' methods. Tested serialization and deserialization with recording offline data and in unit tests. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

rllib/env/single_agent_episode.py

sven1977 · 2024-08-27T09:40:16Z

rllib/env/utils/infinite_lookback_buffer.py

@@ -34,6 +35,77 @@ def __init__(
        self.space_struct = None
        self.space = space

+    def __eq__(


dumb question: why do we need this method?

Just for testing purposes: it lets us compare instances before and after writing.
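
A minimal sketch of why such a comparison needs `__eq__`, assuming a hypothetical `ToyBuffer` class rather than the real `InfiniteLookbackBuffer`: without a custom `__eq__`, Python compares instances by identity, so an instance rebuilt from its state would never equal the original.

```python
class PlainBuffer:
    """Hypothetical buffer without __eq__: compared by identity."""

    def __init__(self, data):
        self.data = list(data)


class ToyBuffer:
    """Hypothetical buffer with a value-based __eq__."""

    def __init__(self, data, lookback=0):
        self.data = list(data)
        self.lookback = lookback

    def __eq__(self, other):
        if not isinstance(other, ToyBuffer):
            return NotImplemented
        return self.data == other.data and self.lookback == other.lookback


# Two separately constructed instances with the same content, as one would
# get after writing an instance's state and rebuilding it on read.
original = ToyBuffer([1, 2, 3], lookback=1)
restored = ToyBuffer([1, 2, 3], lookback=1)

same_by_value = original == restored
same_by_identity = PlainBuffer([1, 2, 3]) == PlainBuffer([1, 2, 3])
```

Here `same_by_value` is `True` and `same_by_identity` is `False`, which is why a test along the lines of "assert restored == original" only works once `__eq__` is defined.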

sven1977

Great PR. Thanks @simonsays1980. Just one question on __eq__ and a handful of nits.


…ort path for 'PPO' in 'test_usage_stats.py' as it was failing due to missing package 'msgpack_numpy'. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>

simonsays1980 added 4 commits August 23, 2024 08:54

Added 'get_state' and 'from_state' to 'InfiniteLookbackBuffer' and mo…

7f7decb

…dified 'SingleAgentEpisode's 'get_state' and 'set_state' to convert to and from lookback buffers. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

Merge branch 'master' into offline-store-episodes-in-state-form

7c5f67a

Added 'msgpack' and 'msgpack_numpy' to the 'rllib-test-requirements.txt'.

37d37a0

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

simonsays1980 marked this pull request as ready for review August 26, 2024 16:48

simonsays1980 requested review from sven1977 and ArturNiederfahrenhorst as code owners August 26, 2024 16:48

Merge branch 'master' into offline-store-episodes-in-state-form

13e4d71

sven1977 changed the title ~~[RLlib; Offline RL]† - Store episodes in state form.~~ [RLlib; Offline RL] Store episodes in state form. Aug 27, 2024