JORLDY Beta 0.2.0

Pre-release

leonard-q released this 27 Jan 05:09

❗Important

The Atari wrapper is modified with reference to the OpenAI Baselines wrapper (#92)
- EpisodicLifeEnv, MaxAndSkipEnv, and ClipRewardEnv (sign) are applied
- reference: https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py
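
To illustrate what two of these wrappers do, here is a minimal, framework-free sketch. It is not the JORLDY or Baselines code: `clip_reward`, `max_and_skip`, and the `step_fn` callable are hypothetical names used only for illustration, and observations are assumed to be flat lists of pixel values.

```python
def clip_reward(reward):
    """Sign clipping: map any raw reward to -1, 0, or +1."""
    return (reward > 0) - (reward < 0)


def max_and_skip(step_fn, action, skip=4):
    """Repeat `action` for up to `skip` frames.

    `step_fn(action)` is a hypothetical callable returning
    (obs, reward, done). Rewards are summed over the skipped frames,
    and the returned observation is the element-wise max of the last
    two frames seen (simplified relative to the Baselines wrapper).
    """
    total_reward = 0.0
    last_two = []
    done = False
    for _ in range(skip):
        obs, reward, done = step_fn(action)
        last_two = (last_two + [obs])[-2:]  # keep only the two newest frames
        total_reward += reward
        if done:
            break
    if len(last_two) == 2:
        max_frame = [max(a, b) for a, b in zip(*last_two)]
    else:
        max_frame = list(last_two[0])
    return max_frame, total_reward, done
```

Sign clipping keeps reward scales comparable across games, and taking the max over two consecutive frames is commonly used to counter sprite flicker in Atari rendering.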

🛠️ Fixes & Improvements

An error in the Mac build of the Drone Delivery Env is fixed (#94)
MuJoCo is supported in Docker (#96)
PPO algorithm debugging is done (#103)
- Implement value-clip
  - reference: https://github.com/openai/baselines/blob/ea25b9e8b234e6ee1bca43083f8f3cf974143998/baselines/ppo2/model.py#L133
- Update the log-probability calculation to prevent gradient divergence: prob_tensor.log() → Categorical.log_prob()
- Change the advantage standardization order: before value calc → after value calc
- Add custom LR scheduler (DQN, PPO) (#103)
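
The value-clip item above can be sketched in plain Python as follows. This follows the formula in the referenced Baselines `ppo2` model (the maximum of the unclipped and clipped squared errors, averaged and halved). The function name `clipped_value_loss` and the default `clip_range` are assumptions for illustration, not JORLDY's actual API.

```python
def clipped_value_loss(values, old_values, returns, clip_range=0.2):
    """PPO value loss with clipping, over plain lists of floats.

    values:     current value predictions V(s)
    old_values: value predictions recorded when the data was collected
    returns:    empirical return targets
    """
    losses = []
    for v, v_old, ret in zip(values, old_values, returns):
        # Limit how far the new prediction may move from the old one.
        delta = max(-clip_range, min(clip_range, v - v_old))
        v_clipped = v_old + delta
        # Take the pessimistic (larger) of the two squared errors.
        losses.append(max((v - ret) ** 2, (v_clipped - ret) ** 2))
    return 0.5 * sum(losses) / len(losses)
```

Taking the larger of the two errors means a prediction that jumps far from its old value gets no credit for the part of the move beyond `clip_range`, which mirrors the clipping applied to the policy ratio.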

⏰ Known Issues

The performance of ICM PPO and RND PPO has degraded since PPO was modified. This needs to be fixed.

🙏 Acknowledgement

Thanks to everyone who contributed to JORLDY v0.2.0: @leonard-q, @ramanuzan
