
Releases: kakaoenterprise/JORLDY

JORLDY Beta 0.5.0

18 Apr 02:00
652ff18
Pre-release

❗Important

  • The JORLDY arXiv paper is published! (link)
  • Algorithm description is added! (#168) (link)

🛠️ Fixes & Improvements

  • PPO continuous debugging is done (#157)
  • Initialize the actor networks from the learner network (#165)

🔩 Minor fix

  • Reset the rollout buffer stamp to 0 (#165)

⏰ Known Issues

  • R2D2 needs to be optimized
  • IQN-based algorithms still need to be debugged
  • VMPO performance is unstable (#164)

🙏 Acknowledgement

JORLDY Beta 0.4.0

04 Apr 01:18
ae40d36
Pre-release

🛠️ Fixes & Improvements

  • Update PyTorch version to 1.10 and update other packages (#139)
  • ICM and RND debugging is done (#145)
  • APE-X debugging is done (#147)
  • SAC discrete implemented (#150)

🔩 Minor fix

  • Update Readme (contributors) (#138)
  • Update distributed architecture flowchart and timeline (#143)
  • Learning rate decay can be set as optional (#151)
  • Split optimizer of ICM and RND from PPO (#152)
  • Modify the async step calculation (#154)

⏰ Known Issues

  • R2D2 needs to be optimized
  • IQN-based algorithms have to be evaluated

🙏 Acknowledgement

JORLDY Beta 0.3.0

10 Mar 09:01
b1b7828
Pre-release

❗Important

  • Integrate scripts into one main script (#125)
  • TD3 is implemented (#127)
  • R2D2 is implemented, but it needs to be optimized (#104)

🛠️ Fixes & Improvements

  • Change the stamp step calculation: instead of resetting to 0, subtract the period step (-= period step) (#130)
  • Implement a gather thread that handles get calls on the queue, and update the manage process to use it (#130)
  • Integrate the DQN network, deterministic policy actor, and critic (#129)
  • Add a learning rate scheduler to all RL algorithms (#108)
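
The stamp change in #130 can be illustrated with a small sketch. This is not JORLDY's code: `count_triggers`, `carry_over`, and the step sizes are hypothetical names and values chosen for illustration. It only shows why subtracting the period keeps the leftover steps, while resetting to 0 discards them.

```python
def count_triggers(step_increments, period, carry_over):
    """Count how often a periodic event fires as steps accumulate.

    carry_over=False mimics resetting the stamp to 0.
    carry_over=True mimics subtracting the period (stamp -= period).
    Hypothetical helper, not part of JORLDY.
    """
    stamp = 0
    triggers = 0
    for inc in step_increments:
        stamp += inc
        if stamp >= period:
            triggers += 1
            # Keep the overshoot when carrying over, otherwise drop it.
            stamp = stamp - period if carry_over else 0
    return triggers


# Steps arrive in chunks of 7 and the event should fire every 10 steps.
increments = [7] * 10  # 70 steps in total
reset_count = count_triggers(increments, period=10, carry_over=False)
carry_count = count_triggers(increments, period=10, carry_over=True)
print(reset_count, carry_count)
```

With these assumed numbers, the reset variant fires less often than the 70 / 10 = 7 times one would expect, because the overshoot is thrown away at every trigger.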

🔩 Minor fix

  • Delete unused variable in ddqn (#128)

⏰ Known Issues

  • ICM PPO and RND PPO performance has degraded since PPO was modified; this needs to be fixed
  • R2D2 needs to be optimized
  • APE-X debugging has to be done
  • IQN-based algorithms have to be evaluated

🙏 Acknowledgement

JORLDY Beta 0.2.0

27 Jan 05:09
2411b77
Pre-release

❗Important

🛠️ Fixes & Improvements

⏰ Known Issues

  • ICM PPO and RND PPO performance has degraded since PPO was modified; this needs to be fixed

🙏 Acknowledgement

JORLDY Beta 0.1.0

23 Dec 12:28
5335c13
Pre-release

❗Important

  • Unit test codes are implemented!
  • M-DQN and M-IQN are implemented! (#79)
  • MuJoCo envs are supported! (#83)

🛠️ Fixes & Improvements

  • The RND code refactoring (#52) caused a fatal error; it is fixed by renaming an RND parameter (#71)
  • Change default initialization method (Xavier → Orthogonal) (#81)
  • Change Softmax to exp(log_softmax) (#82)
  • Unit test for MuJoCo env is done (#93)
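
The Softmax to exp(log_softmax) change (#82) can be sketched in plain Python. The release notes do not state the motivation or show the code; one common reason for this pattern is that log-probabilities and probabilities then come from a single, numerically stable computation. The function names below are hypothetical and the sketch uses only the standard library rather than JORLDY's PyTorch code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax using the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def probs_from_log_softmax(logits):
    """Recover probabilities as exp(log_softmax(logits))."""
    return [math.exp(lp) for lp in log_softmax(logits)]


# Large logits: a naive math.exp(1000.0) would raise OverflowError.
logits = [1000.0, 1001.0, 1002.0]
log_probs = log_softmax(logits)
probs = probs_from_log_softmax(logits)
print(log_probs, probs)
```

Because the probabilities are derived from the log-probabilities, the two stay consistent with each other when both are needed, for example for sampling and for a log-likelihood term.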

🙏 Acknowledgement

  • Thanks to everyone who contributed to JORLDY v0.1.0: @leonard-q @ramanuzan @lkm2835

JORLDY Beta 0.0.3

23 Nov 06:11
a908e29
Pre-release
  • Important
    • A GitHub Action now checks Python code style (PEP 8). Please refer to the style guide in CONTRIBUTING.md
    • New environment: Drone Delivery ML-Agents Environment is added! 🛸
    • ML-Agents server builds are removed! The Linux build with the no_graphics option can be run on a server. (#58)
  • Fixes & Improvements
    • JORLDY supports envs that provide multi-modal input (image, vector)
    • mlagents Windows issue
      • Issue #44 occurred when mlagents envs were run on Windows
      • #46 solved this problem (Thank you so much @zenoengine)
    • mlagents Linux build issue
      • mlagents envs raised errors because .gitignore contained *.so, which dropped all the .so files from the mlagents envs. The .so files are restored and .gitignore is modified.
    • ICM and RND code is refactored to remove duplicated functions (#52)
    • ICM PPO bug fix: remove the softmax applied before calculating cross-entropy (#49)
    • *_timers.json files in mlagents envs caused conflicts when using git, so *_timers.json files are added to .gitignore (#59)
    • Benchmark is developed! → config, script, spec are added
  • Acknowledgement
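
The ICM PPO fix listed above (#49) removes a softmax applied before the cross-entropy calculation. The notes do not show the affected code, so the following is only a standard-library sketch of the general bug pattern, with hypothetical function names: a cross-entropy that already applies log-softmax internally (as PyTorch's `torch.nn.functional.cross_entropy` does, since it expects unnormalized logits) ends up normalizing twice if it is fed softmax outputs.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def softmax(logits):
    """Softmax computed as exp(log_softmax)."""
    return [math.exp(lp) for lp in log_softmax(logits)]


def cross_entropy_from_logits(logits, target):
    """Cross-entropy that applies log-softmax internally."""
    return -log_softmax(logits)[target]


logits = [4.0, 0.0, 0.0]  # confident, correct prediction for class 0
correct_loss = cross_entropy_from_logits(logits, target=0)
# Bug pattern: softmax outputs are passed where logits are expected,
# so the normalization is effectively applied twice.
double_softmax_loss = cross_entropy_from_logits(softmax(logits), target=0)
print(correct_loss, double_softmax_loss)
```

In this sketch the doubly normalized loss stays well above the correct one even for a confident, correct prediction, because values squeezed into [0, 1] can never produce a sharply peaked distribution.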

JORLDY Beta 0.0.2

06 Nov 02:14
1d85e72
Pre-release

📢 Important

  • Now JORLDY fully supports Windows, Mac and Linux!

🛠️ Fixes & Improvements

  • README minor fixes
    • Remove $, >
    • Fix typos
  • Modify .gitignore; add the Python gitignore template
  • Support WSL, Windows and Mac
    • Change agent instantiation code #28
    • Custom dict can be pickled
    • Replace multiprocessing qsize() with empty() and full()
  • Remove _nomp.py files
    • Solve the multiprocessing issue on all OSes
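
The qsize() replacement can be sketched as follows. A likely motivation, not stated in the notes, is that `multiprocessing.Queue.qsize()` may raise `NotImplementedError` on platforms such as macOS, whereas `empty()` and `full()` are available everywhere. The helpers `put_if_room` and `get_all` are hypothetical and are not JORLDY functions.

```python
import multiprocessing as mp
import queue


def put_if_room(q, item):
    """Put an item unless the queue reports full; never calls qsize()."""
    if q.full():
        return False
    try:
        q.put_nowait(item)
        return True
    except queue.Full:  # full() is only a snapshot, so still guard the put
        return False


def get_all(q, timeout=0.5):
    """Drain the queue, stopping once nothing arrives within the timeout."""
    items = []
    while True:
        try:
            items.append(q.get(timeout=timeout))
        except queue.Empty:
            return items


q = mp.Queue(maxsize=2)
accepted = [put_if_room(q, i) for i in range(3)]  # third put is rejected
items = get_all(q)
print(accepted, items, q.empty())
```

Note that `empty()` and `full()` are only momentary snapshots on a multiprocessing queue, which is why the sketch still catches `queue.Full` and `queue.Empty` instead of relying on the checks alone.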

🙏 Acknowledgement

JORLDY Beta 0.0.1

03 Nov 08:42
ec5b910
Pre-release

Hello WoRLd! ✋ This is the first version of JORLDY, an open-source Reinforcement Learning (RL) framework provided by KakaoEnterprise! We hope that JORLDY helps researchers and students who study RL. The features of JORLDY are as follows ⭐.

  • 20+ RL algorithms and various RL environments are provided
  • Algorithms and environments can be added and customized
  • An RL algorithm and environment can be run with a single command
  • Distributed RL algorithms are provided using Ray
  • The algorithms are benchmarked in many RL environments

🤖 The implemented algorithms are as follows:

  • Deep Q Network (DQN), Double DQN, Dueling DQN, Multistep DQN, Prioritized Experience Replay (PER), C51, Noisy Network, Rainbow (DQN, IQN), QR-DQN, IQN, Curiosity Driven Exploration (ICM), Random Network Distillation (RND), APE-X, REINFORCE, DDPG, PPO, SAC, MPO, V-MPO

🌎 The provided environments are as follows:

  • GYM classic control, Unity ML-Agents, Procgen
    • GYM Atari and Super Mario Bros are excluded from the requirements because of license issues. You should install these environments manually.