[Feature Request] Support for Truncated Gym API from Gym>= 0.25 #1053

tlpss · 2022-09-07T13:10:11Z

🚀 Feature

Support Gym's new Truncation API from release 0.25 to disambiguate between true terminal states and truncated terminations.

Motivation

In the Bellman equation, we have to back up with (reward + value function of the next state) for all transitions except those into terminal states of the MDP, where the target is the reward alone, as discussed in the release notes of Gym here and in section 3 of this paper.
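As a sketch of what this means for a one-step target (the notation is assumed here for illustration and is not taken from the linked release notes or paper: $\gamma$ is the discount factor, $V$ the value estimate, and $d_t$ a flag that is 1 only when $s_{t+1}$ is a true terminal state):

$$
y_t = r_t + \gamma \, (1 - d_t) \, V(s_{t+1})
$$

If an episode is merely truncated, $d_t$ stays 0 and the target still bootstraps from $V(s_{t+1})$. Setting $d_t = 1$ on truncation would treat the cut-off state as if it had zero future value.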

However, many environments (and hence learning algorithms) do not distinguish between true terminations and truncations of an infinite-horizon MDP that are introduced to increase exploration; both are currently passed through the single done signal.

To mitigate this, starting from Gym 0.25 the step function returns separate terminated and truncated booleans, which makes it possible to distinguish between the two cases. This has been found to improve both asymptotic performance and stability with respect to the chosen episode truncation length, both of which seem valid reasons to include it in this repo.
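To illustrate how the two flags could be used differently, here is a minimal sketch in plain Python. The helper names `td_target` and `episode_over` are hypothetical and are not part of Gym or of this repo; the point is only that bootstrapping depends on `terminated` alone, while resetting the environment depends on either flag.

```python
def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target (hypothetical helper, not a Gym or repo API).

    Bootstraps from next_value unless the episode truly terminated.
    A truncated episode still bootstraps, because the underlying MDP
    would have continued past the cut-off.
    """
    bootstrap = 0.0 if terminated else 1.0
    return reward + gamma * bootstrap * next_value


def episode_over(terminated, truncated):
    """The environment needs a reset in either case."""
    return terminated or truncated
```

With a single done flag, a time-limit cut-off and a real terminal state would both zero out the bootstrap term. Keeping the flags separate avoids that.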

For backward compatibility, one could check the number of values returned by the step function and map terminated and truncated back to a single done during rollout collection. I would be willing to help with this under the guidance of someone more experienced.
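The return-length check could look roughly like the sketch below. Both `unpack_step` and `to_done` are hypothetical helpers, not functions from Gym or this repo. The sketch also assumes that an old-style environment wrapped in Gym's `TimeLimit` reports truncation through the `"TimeLimit.truncated"` key of `info`; environments that do not set this key would be treated as never truncated.

```python
def unpack_step(result):
    """Normalize a step() result to (obs, reward, terminated, truncated, info).

    Hypothetical helper: accepts either the 4-tuple (old API) or the
    5-tuple (new API) and always returns the 5-tuple form.
    """
    if len(result) == 5:
        obs, reward, terminated, truncated, info = result
        return obs, reward, bool(terminated), bool(truncated), info
    if len(result) == 4:
        obs, reward, done, info = result
        # Assumption: truncation is signalled via the TimeLimit info key.
        truncated = bool(info.get("TimeLimit.truncated", False))
        terminated = bool(done) and not truncated
        return obs, reward, terminated, truncated, info
    raise ValueError(f"unexpected step() result of length {len(result)}")


def to_done(terminated, truncated):
    """Collapse the two flags into the legacy done signal."""
    return terminated or truncated
```

Rollout code written against the 5-tuple form can then call `unpack_step(env.step(action))` regardless of which API the environment implements, and use `to_done` wherever a legacy done flag is still expected.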

More information can be found in this issue in the Gym repo.


tlpss · 2022-09-07T13:17:49Z

ah, it seems I'm far too late with my request -> #780.

tlpss added the enhancement New feature or request label Sep 7, 2022

tlpss closed this as completed Sep 7, 2022
