Releases
v0.4.9
Bug Fix
Fix save_checkpoint_fn so that it returns checkpoint_path (#659, @Trinkle23897)
Fix an off-by-one bug in the trainer iterator (#659, @Trinkle23897)
Fix a bug in Discrete SAC evaluation; default to deterministic mode (#657, @nuance1979)
Fix a trainer bug where the test reward was not logged because self.env_step is not set in the offline setting (#660, @nuance1979)
Fix an exception when watching Pistonball environments (#663, @ycheng517)
Use env.np_random.integers instead of env.np_random.randint in Atari examples (#613, @ycheng517); see the sketch after this list
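In recent gym versions, env.np_random is a numpy Generator rather than a legacy RandomState, so randint gives way to integers. A minimal sketch of the rename, assuming gym >= 0.23.1 and substituting CartPole-v1 for the Atari envs; the noop range here is only illustrative of the usual 30-noop start convention:

```python
import gym

env = gym.make("CartPole-v1")
env.reset(seed=0)

# old RandomState API: env.np_random.randint(1, 31)
# new Generator API:
n_noops = env.np_random.integers(1, 31)
```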
API Change
Upgrade gym to >=0.23.1; support seed and return_info arguments for reset (#613, @ycheng517), as shown below
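A minimal sketch of the upgraded reset API, assuming gym 0.23.1; CartPole-v1 is just a placeholder env:

```python
import gym

env = gym.make("CartPole-v1")

# Seeding now goes through reset() instead of a separate env.seed() call.
obs = env.reset(seed=42)

# With return_info=True, reset() returns an (obs, info) tuple.
obs, info = env.reset(seed=42, return_info=True)
```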
New Features
Add BranchDQN for large discrete action spaces (#618, @BFAnas)
Add show_progress option for trainer (#641, @michalgregor)
Add support for clipping to DQNPolicy (#642, @michalgregor)
Implement TD3+BC for offline RL (#660, @nuance1979); see the loss sketch after this list
Add a MultiDiscrete-to-Discrete gym action space wrapper (#664, @BFAnas); see the wrapper sketch after this list
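TD3+BC adds a behavior-cloning regularizer to the TD3 actor objective. A hedged sketch of the actor loss from the TD3+BC paper (Fujimoto & Gu, 2021); the function name is hypothetical and Tianshou's actual implementation in #660 may differ:

```python
import torch
import torch.nn.functional as F

def td3_bc_actor_loss(q, policy_action, dataset_action, alpha=2.5):
    # lambda = alpha / mean|Q| adaptively scales the Q term (alpha=2.5 is
    # the paper's default); the MSE term keeps the policy close to the
    # dataset actions.
    lam = alpha / q.abs().mean().detach()
    return -lam * q.mean() + F.mse_loss(policy_action, dataset_action)
```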
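The wrapper's flattening idea, sketched under assumed names (the class added in #664 may differ in name and details): map MultiDiscrete([n1, ..., nk]) onto Discrete(n1 * ... * nk) and decode flat actions by mixed-radix inversion:

```python
import numpy as np
import gym

class MultiDiscreteToDiscreteSketch(gym.ActionWrapper):
    """Hypothetical sketch, not the #664 implementation."""

    def __init__(self, env):
        super().__init__(env)
        self.nvec = env.action_space.nvec  # size of each sub-action dimension
        self.action_space = gym.spaces.Discrete(int(np.prod(self.nvec)))

    def action(self, act):
        # Invert the flat index into one sub-action per dimension.
        return np.array(np.unravel_index(act, self.nvec))
```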
Enhancement
Use envpool in the ViZDoom example (#634, @Trinkle23897); see the sketch after this list
Add Atari (discrete) SAC examples (#657, @nuance1979)
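A minimal envpool sketch; "Pong-v5" stands in for the ViZDoom task id used in the example, which may differ:

```python
import numpy as np
import envpool

# envpool runs a C++-vectorized batch of envs behind a gym-style API.
envs = envpool.make_gym("Pong-v5", num_envs=8)
obs = envs.reset()
act = np.random.randint(envs.action_space.n, size=8)
obs, rew, done, info = envs.step(act)
```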