[rllib] Add support for complex observations in `SingleAgentEpisode` #57017

pseudo-rnd-thoughts · 2025-09-29T22:19:39Z

Why are these changes needed?

`SingleAgentEpisode.concat_episode` only supported NumPy-array observations, because it compared the overlapping observations with `np.all(old_episode.observations[-1] == new_episode.observations[0])`.
I've changed the implementation to use `tree.assert_same_structure` and `np.all` on the flattened structures, so that observations are verified as equivalent even for complex (nested) observation structures.
In addition, I've added a test using a dict-observation environment to verify this works.
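
For readers unfamiliar with the limitation, here is a minimal sketch of one way the original check can break, using plain NumPy and no RLlib. The helper name `old_check` and the observation values are hypothetical and only mirror the original assertion; the failure reported in the linked issue may have looked different.

```python
import numpy as np


def old_check(last_obs, first_obs):
    """Mirror of the original assertion: `==` wrapped in np.all."""
    return bool(np.all(last_obs == first_obs))


# Flat array observations: `==` is element-wise, so the check works as intended.
flat_a = np.array([0.1, 0.2, 0.3])
flat_b = flat_a.copy()
flat_ok = old_check(flat_a, flat_b)

# Dict observations holding equal but distinct arrays: Python compares the dicts
# value by value and needs a single bool per value. A multi-element array cannot
# provide one, so NumPy raises a ValueError before np.all is even reached.
dict_a = {"position": np.array([1.0, 2.0]), "velocity": np.array([0.0, 0.5])}
dict_b = {key: value.copy() for key, value in dict_a.items()}
try:
    old_check(dict_a, dict_b)
    dict_error = None
except ValueError as err:
    dict_error = err
```

In this sketch `flat_ok` is `True`, while the dict comparison ends up in the `except` branch.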

Related issue number

Closes #54659

Checks

- [x] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- [x] I've run scripts/format.sh to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
  - [ ] I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - [ ] Unit tests
  - [ ] Release tests
  - [x] This PR is not tested :(

Note

Use structure-aware equality for observations during episode concatenation and add a test with dict observations; minor docstring tweaks.

rllib/env:
- single_agent_episode.py:
  - concat_episode: Replace np.all(a == b) with tree.assert_same_structure and per-leaf np.array_equal to compare complex/nested observations.
  - Add tree import.
  - Minor docstring wording tweaks for len_lookback_buffer.
Tests:
- rllib/env/tests/test_single_agent_episode.py:
  - Add DictTestEnv and test_concat_episode_with_complex_obs to validate concatenation with dict observations.
  - Fix test class name typo.

Written by Cursor Bugbot for commit dc4856f. This will update automatically on new commits.

Signed-off-by: Mark Towers <mark@anyscale.com>

gemini-code-assist

Code Review

This pull request aims to add support for complex observation structures in SingleAgentEpisode.concat. The approach of using tree.flatten is a good direction, but the current implementation of the equality check is flawed and will raise a ValueError for nested observations containing numpy arrays. I've provided a critical fix for this issue. Additionally, the new test case to verify this functionality is not as robust as it could be, as it compares an object to itself rather than checking for value equality. I've added a comment with a suggestion to improve the test's reliability.

rllib/env/single_agent_episode.py

gemini-code-assist · 2025-09-29T22:21:10Z

rllib/env/tests/test_single_agent_episode.py

+        assert len(episode_1) == 4
+
+        # cut episode 1 to create episode 2
+        episode_2 = episode_1.cut()


The test for concatenating episodes with complex observations is not as robust as it could be. By creating episode_2 using episode_1.cut(), the overlapping observation (episode_2.observations[0]) is a reference to episode_1.observations[-1], not a deep copy. This means the assertion in concat_episode is testing for object identity rather than value equality.

To make this test stronger and ensure it correctly validates the logic for complex observation structures, consider constructing episode_2 in a way that it holds a deep copy of the overlapping observation. This will properly test the value comparison logic.
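
A small sketch of the distinction drawn in this comment, using plain dicts rather than real episodes. That `cut()` shares the reference is taken from the comment above and is not verified here; the observation contents and variable names are made up for illustration.

```python
import copy

import numpy as np

# Hypothetical dict observation standing in for `episode_1.observations[-1]`.
last_obs = {"image": np.zeros((2, 2)), "state": np.array([1.0, 2.0, 3.0])}

# Same object under a second name: any equality check is trivially satisfied.
shared_obs = last_obs
# Independent copy with equal values: only a value comparison can succeed.
copied_obs = copy.deepcopy(last_obs)

shares_reference = shared_obs["state"] is last_obs["state"]
copy_is_distinct = copied_obs["state"] is not last_obs["state"]
values_match = all(
    np.array_equal(copied_obs[key], last_obs[key]) for key in last_obs
)

# Changing the copy must make a value comparison fail, which a test built on a
# shared reference could never observe.
copied_obs["state"][0] = 99.0
values_match_after_edit = all(
    np.array_equal(copied_obs[key], last_obs[key]) for key in last_obs
)
```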

Signed-off-by: Mark Towers <mark@anyscale.com>

simonsays1980

LGTM. Thanks for these improvements and the test @pseudo-rnd-thoughts

simonsays1980 · 2025-10-02T09:33:42Z

rllib/env/single_agent_episode.py


        # Make sure, end matches other episode chunk's beginning.
-        assert np.all(other.observations[0] == self.observations[-1])
+        tree.assert_same_structure(other.observations[0], self.observations[-1])


simonsays1980 · 2025-10-02T09:34:06Z

rllib/env/single_agent_episode.py

+        tree.assert_same_structure(other.observations[0], self.observations[-1])
+        # Use tree.map_structure with np.array_equal to check that every leaf node is equivalent,
+        #   then np.all on the flattened result to validate that all are True
+        assert np.all(


simonsays1980 · 2025-10-02T09:35:31Z

rllib/env/tests/test_single_agent_episode.py

        # id(episode_1.observations[105]))

+    def test_concat_episode_with_complex_obs(self):
+        """Tests if concatenation of two `SingleAgentEpisode`s works with complex observations (e.g. dict)."""


This is great! The Dict space cases are not fully covered, yet. Thanks for the initiative!

…ay-project#57017)

## Why are these changes needed?

`SingleAgentEpisode.concat` would only support numpy array based observations due to `np.all(old_episode.observations[-1] == new_episode.observations[0])`.
I've changed the implementation to use `tree.assert_same_structure` and `np.all` on the flatten structures to verify that observations are equivalent even for complex observation structures.
In addition, I've added a test using a dict obs environment to verify this works.

## Related issue number

Closes ray-project#54659

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
  - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - [ ] Unit tests
  - [ ] Release tests
  - [x] This PR is not tested :(

---

> [!NOTE]
> Use structure-aware equality for observations during episode concatenation and add a test with dict observations; minor docstring tweaks.
>
> - **rllib/env**:
>   - **`single_agent_episode.py`**:
>     - `concat_episode`: Replace `np.all(a == b)` with `tree.assert_same_structure` and per-leaf `np.array_equal` to compare complex/nested observations.
>     - Add `tree` import.
>     - Minor docstring wording tweaks for `len_lookback_buffer`.
> - **Tests**:
>   - **`rllib/env/tests/test_single_agent_episode.py`**:
>     - Add `DictTestEnv` and `test_concat_episode_with_complex_obs` to validate concatenation with dict observations.
>     - Fix test class name typo.
>
> Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit dc4856f. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).

---------

Signed-off-by: Mark Towers <mark@anyscale.com>
Co-authored-by: Mark Towers <mark@anyscale.com>
Co-authored-by: Kamil Kaczmarek <kaczmarek.poczta@gmail.com>
Signed-off-by: xgui <xgui@anyscale.com>


[rllib] Add support for complex observations in SingleAgentEpisode

43fbf8d

Signed-off-by: Mark Towers <mark@anyscale.com>

pseudo-rnd-thoughts requested a review from a team as a code owner September 29, 2025 22:19

pseudo-rnd-thoughts added the rllib RLlib related issues label Sep 29, 2025

gemini-code-assist bot reviewed Sep 29, 2025

pseudo-rnd-thoughts assigned kamil-kaczmarek Sep 29, 2025

pseudo-rnd-thoughts added the rllib-envrunners Issues around the sampling backend of RLlib label Sep 29, 2025

Fix test for normal arrays

dc4856f

Signed-off-by: Mark Towers <mark@anyscale.com>

simonsays1980 approved these changes Oct 2, 2025

simonsays1980 added the go add ONLY when ready to merge, run all tests label Oct 2, 2025

kamil-kaczmarek and others added 5 commits October 2, 2025 18:56

Merge branch 'master' into issue-54659

eb21297

Merge branch 'master' into issue-54659

c9caba3

Merge branch 'master' into issue-54659

49745c1

Merge branch 'master' into issue-54659

3916f60

Merge branch 'master' into issue-54659

88f35bd

simonsays1980 merged commit ee8d890 into ray-project:master Oct 21, 2025
6 checks passed
