Transfer Algorithms to RLFarm #1028

Closed
8 tasks
jeremiahpslewis opened this issue Mar 6, 2024 · 17 comments

Comments

@jeremiahpslewis
Member

jeremiahpslewis commented Mar 6, 2024

Description

Much of RLZoo needs to be migrated to the latest RLCore.jl / Flux.jl syntax.

The goal is as follows:

  1. Upgrade / refactor code in RLZoo/src/algorithms so that it:
    • Uses the latest Flux.jl syntax
    • Seamlessly supports GPU where sensible
    • Has unit tests
  2. Add to new library ReinforcementLearningFarm

I think a good approach would be to take this folder by folder, e.g. cfr, dqns, etc., and where possible reactivate the corresponding experiment in RLExperiments. One file / algorithm = one pull request, to keep things manageable / reviewable.

It probably makes sense to start with the folders / files which are not commented out before moving on to those which currently are, like policy_gradient, where the code is less well maintained and will require more work.

Status

  • cfr
  • dqns
  • exploitability_descent
  • nfsp
  • offline_rl
  • policy_gradient
  • searching
  • tabular
@jeremiahpslewis
Member Author

@joelreymont Would love your help!

jeremiahpslewis pinned this issue Mar 6, 2024
@joelreymont
Contributor

It will be done! 🫡

@joelreymont
Contributor

joelreymont commented Mar 7, 2024

By reactivating do you mean uncommenting these in src/algorithms/algorithms.jl?

Can you point me to a description or examples of the latest RLCore.jl / Flux.jl syntax?

@joelreymont
Contributor

joelreymont commented Mar 7, 2024

The experiments have been deleted. Is it safe to restore and comment them out?

@joelreymont
Contributor

Is there an example of seamless GPU support in RLZoo?

@jeremiahpslewis
Member Author

Yep — and then going through the ones which are currently uncommented, adding tests and updating them.

The policies, learners and explorers shipped in RLCore are a good starting place for the latest syntax: https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/main/src/ReinforcementLearningCore/src/policies/q_based_policy.jl

As for Flux / gpu, this is a good example: https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/main/src/ReinforcementLearningZoo/src/algorithms/dqns/basic_dqn.jl

In general, you can use the gpu function from Flux.jl to move objects to the GPU when one is available; if no GPU is active, it simply returns the object unchanged. Most of the current implementations may use a GPU but don't necessarily benefit from it, so perhaps we could use https://chairmarks.lilithhafner.com/v1.1.0/ to check performance...
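For illustration, that passthrough behavior might look like this (a hypothetical sketch, not taken from the RLZoo sources — Flux's gpu, Chain, and Dense are real, but the variable names are made up):

```julia
using Flux

# A plain CPU array.
x = rand(Float32, 4, 4)

# `gpu` moves `x` to device memory when a functional GPU backend
# (e.g. CUDA.jl) is loaded; without one, it returns `x` unchanged,
# so the same code path works on CPU-only machines.
y = gpu(x)

# Whole models can be moved the same way before training:
model = gpu(Chain(Dense(4 => 8, relu), Dense(8 => 2)))
```

Because gpu is effectively the identity without an active backend, algorithm code doesn't need separate CPU and GPU branches.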

The experiments I'm referring to are ones here: https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/main/src/ReinforcementLearningExperiments/test/runtests.jl

@joelreymont
Contributor

Are the experiments auto-generated? I don't see them in the repo.

@jeremiahpslewis
Member Author

oh, sorry, it's super confusing and I have absolutely no idea why it's set up this way (maybe something to do with dependencies), but the experiments are here: https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/tree/main/src/ReinforcementLearningExperiments/deps/experiments

maybe we should move the experiments to the tests folder or otherwise make them more accessible

@joelreymont
Contributor

joelreymont commented Mar 7, 2024

This is a rather large elephant for me to eat all at once so I'm gonna try small chunks and lots of questions!

Also, I need to learn about elephants, e.g. Flux, RL and reinforcement learning.

@HenriDeh
Member

HenriDeh commented Mar 7, 2024

oh, sorry, it's super confusing and I have absolutely no idea why its setup this way (maybe something to do with dependencies), but the experiments are here: https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/tree/main/src/ReinforcementLearningExperiments/deps/experiments

Experiments are set up in this way because they used to automatically generate plots in the package's documentation. These files are not actually run directly; they are used to generate more source code when the package is built. Additionally, the experiments are called with macros, which I think is meant to mimic the naming of experiments in another Python RL package. So yeah, RL.jl is already a fairly complex codebase--too complex in my opinion--and RLExperiments is wrapped in two additional layers of complexity.

@joelreymont
Contributor

joelreymont commented Mar 11, 2024

I decided to start with CFR so I'm going through the RL blog posts and docs and will read the CFR paper when done.

In the meantime, I've been comparing the model q_based_policy.jl code with the CFR algorithm implementation. This is way over my head at the moment so I'm gonna ask very basic questions...

Are there more precise examples that would help me learn the difference between the old RL Core and/or Flux syntax and the new one?

@HenriDeh
Member

This page may help you
https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/main/docs/src/How_to_implement_a_new_algorithm.md

jeremiahpslewis unpinned this issue Mar 11, 2024
@joelreymont
Contributor

I think I understand what has to be done. Working on it...

@jeremiahpslewis
Member Author

Just a note, algorithms that you work on / fix up will end up in RLFarm, not in RLZoo.

jeremiahpslewis changed the title from "Reactivating RLZoo" to "Transfer Algorithms to RLFarm" on Mar 12, 2024
@joelreymont
Contributor

And what's gonna happen with RLZoo? Is it going to be left as is and removed eventually?

@jeremiahpslewis
Member Author

jeremiahpslewis commented Mar 12, 2024

It will be kept in RL.jl as an archive for a certain period of time (3 months? 6 months?), then transferred to a separate repository as a cold archive. The same goes for DistributedRL and RLExperiments (where a new replacement package will be spun up, separate from this repo).

Right now the issue is that a change to RLCore doesn't pass tests or meet the definition of done unless RLZoo and RLExperiments have been brought up to speed. Given the state of the code in Zoo / Experiments, this makes it impossible to move RLCore forward.

jeremiahpslewis closed this as not planned Mar 27, 2024
Development

No branches or pull requests

3 participants