JORLDY Beta 0.2.0
Pre-release
❗Important
- The Atari wrapper has been modified with reference to the OpenAI Baselines wrapper (#92)
  - EpisodicLifeEnv, MaxAndSkipEnv, and ClipRewardEnv (sign) are applied
  - Reference: https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py
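As a rough illustration of what two of these wrappers do, here is a minimal plain-Python sketch of sign-based reward clipping and max-and-skip frame handling. It is not the JORLDY or Baselines code: `ToyEnv`, `clip_reward`, and `max_and_skip_step` are hypothetical names, frames are modeled as flat lists of pixel values, and the early-termination handling is simplified.

```python
def clip_reward(reward):
    """Map a raw reward to -1, 0, or +1 according to its sign."""
    return (reward > 0) - (reward < 0)


class ToyEnv:
    """Hypothetical stand-in for an Atari env; each frame is a list of pixels."""

    def __init__(self, frames, rewards):
        self.frames, self.rewards, self.t = frames, rewards, 0

    def step(self, action):
        obs, reward = self.frames[self.t], self.rewards[self.t]
        self.t += 1
        done = self.t >= len(self.frames)
        return obs, reward, done


def max_and_skip_step(env, action, skip=4):
    """Repeat `action` for up to `skip` frames, sum the rewards, and
    return the pixel-wise max over the last two frames seen."""
    total, last_two, done = 0.0, [], False
    for _ in range(skip):
        obs, reward, done = env.step(action)
        total += reward
        last_two = (last_two + [obs])[-2:]  # keep only the two newest frames
        if done:
            break
    max_frame = [max(px) for px in zip(*last_two)]
    return max_frame, total, done


env = ToyEnv(frames=[[0, 1], [2, 0], [1, 5], [3, 2]], rewards=[0, 10, 0, -3])
frame, total, done = max_and_skip_step(env, action=0)
clipped = clip_reward(total)
```

Taking the max over two consecutive frames is meant to counter sprite flicker, and sign clipping keeps reward scales comparable across games. In this sketch the clipping is applied to the summed reward of the skipped frames, which is an assumption about wrapper ordering.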
🛠️ Fixes & Improvements
- Error in Drone Delivery Env Mac build is fixed (#94)
- Mujoco is supported in docker (#96)
- PPO algorithm debugging is complete (#103)
  - Implement value clipping
  - Update the log-probability calculation to prevent gradient divergence: prob_tensor.log() → Categorical.log_prob()
  - Change the advantage standardization order: before value calculation → after value calculation
- Add custom LR scheduler (DQN, PPO) (#103)
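For readers unfamiliar with the PPO changes listed above, the following plain-Python sketch shows one common form of value clipping, together with one possible reading of the standardization reordering: value targets are computed from the raw advantages first, and the advantages are standardized only afterwards. The function names, the `clip_eps` default, and that reading are assumptions for illustration, not the JORLDY implementation.

```python
import math


def standardize(advantages, eps=1e-8):
    """Shift and scale advantages to roughly zero mean and unit variance."""
    mean = sum(advantages) / len(advantages)
    var = sum((a - mean) ** 2 for a in advantages) / len(advantages)
    return [(a - mean) / (math.sqrt(var) + eps) for a in advantages]


def clipped_value_loss(values, old_values, returns, clip_eps=0.2):
    """Value loss where the new prediction may move at most `clip_eps`
    away from the old one; the larger of the two squared errors is used."""
    losses = []
    for v, v_old, ret in zip(values, old_values, returns):
        v_clipped = v_old + max(-clip_eps, min(clip_eps, v - v_old))
        losses.append(max((v - ret) ** 2, (v_clipped - ret) ** 2))
    return sum(losses) / len(losses)


# Hypothetical rollout data.
old_values = [0.5, 0.0]
raw_advantages = [1.5, 0.0]

# Value targets come from the raw (unstandardized) advantages ...
returns = [a + v for a, v in zip(raw_advantages, old_values)]
# ... and only then are the advantages standardized for the policy loss.
advantages = standardize(raw_advantages)

loss = clipped_value_loss(values=[1.0, 0.0], old_values=old_values, returns=returns)
```

Standardizing before the targets are formed would distort the value targets, which is why the order matters under this reading.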
⏰ Known Issues
- ICM PPO and RND PPO performance has degraded since PPO was modified. This needs to be fixed.
🙏 Acknowledgement
- Thanks to everyone who contributed to JORLDY v0.2.0: @leonard-q, @ramanuzan