[RLlib; Offline RL] Store episodes in state form. #47294
…dified 'SingleAgentEpisode's 'get_state' and 'set_state' to convert to and from lookback buffers. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
…dded '__eq__' method to 'InfiniteLookBackBuffer' together with 'get/set_state' methods. Tested serialization and deserialization with recording offline data and in unit tests. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
@@ -34,6 +35,77 @@ def __init__(
        self.space_struct = None
        self.space = space

    def __eq__(
dumb question: why do we need this method?
Just for testing purposes to make comparisons of instances before writing and after writing.
Great PR. Thanks @simonsays1980. Just one question on `__eq__` and a handful of nits.
…ort path for 'PPO' in 'test_usage_stats.py' as it was failing due to missing package 'msgpack_numpy'. Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
Why are these changes needed?
Storing episodes as instances with `ray.data` results in pickled instances that may not be compatible with later Python versions. This PR tries to develop a stream that is compatible with later Python versions by simplifying objects in their `get_state` methods and reproducing them in their static `from_state` methods.
It uses `msgpack` to serialize objects (and `msgpack-numpy` to serialize numpy arrays), a serialization protocol that is independent of Python versions.
Related issue number
Checks
I've signed off every commit (`git commit -s`) in this PR.
I've run `scripts/format.sh` to lint the changes in this PR.
If I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.