Add curriculum learning example using simple adder #47

Merged (3 commits) on Aug 29, 2023

jillianmclements
Collaborator

Description

This example shows how to train an RL agent with curriculum learning in a custom Gymnasium environment, either on your local machine or on Azure Machine Learning (AML). It also shows how to modify a custom Gymnasium simulation environment and the RLlib training code to support curriculum learning. Note that RLlib and the literature use the term “task” where Bonsai uses “lesson”.
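For readers new to the idea, here is a minimal sketch of what "task-settable" means on the environment side. It mirrors the three methods that RLlib's `TaskSettableEnv` interface asks for (`sample_tasks`, `get_task`, `set_task`) but deliberately imports neither Ray nor Gymnasium. The class name `AdderEnvSketch`, the adder dynamics, the reward values, and the task range 1 to 5 are all assumptions made for illustration; they are not the code in this PR.

```python
import random
from typing import Dict, List, Tuple


class AdderEnvSketch:
    """Hypothetical stand-in for a task-settable adder environment.

    The agent adds -1, 0, or +1 to a running total and tries to hit a target.
    The task (difficulty level) controls how far away the target can be.
    """

    def __init__(self, task: int = 1, max_steps: int = 20, seed: int = 0):
        self.task = task
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.total = 0
        self.target = 0
        self.steps = 0

    def sample_tasks(self, n_tasks: int) -> List[int]:
        # Draw random difficulty levels in the assumed range 1..5.
        return [self.rng.randint(1, 5) for _ in range(n_tasks)]

    def get_task(self) -> int:
        return self.task

    def set_task(self, task: int) -> None:
        # Takes effect on the next reset().
        self.task = task

    def reset(self) -> Tuple[int, int]:
        self.total = 0
        self.steps = 0
        # Harder tasks allow targets further from zero.
        self.target = self.rng.randint(1, 3 * self.task)
        return (self.total, self.target)

    def step(self, action: int) -> Tuple[Tuple[int, int], float, bool, bool, Dict]:
        if action not in (-1, 0, 1):
            raise ValueError("action must be -1, 0, or 1")
        self.total += action
        self.steps += 1
        terminated = self.total == self.target
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = 1.0 if terminated else -0.01
        return (self.total, self.target), reward, terminated, truncated, {}
```

A real environment would subclass the Gymnasium and RLlib base classes and declare observation and action spaces; the point here is only that the difficulty lives in one attribute that the training loop can change between episodes.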

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

  • Tested locally
  • Tested on AML

Checklist:

  • I have squashed my previous commits into one commit and added a meaningful commit message.
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@jillianmclements jillianmclements changed the title Add curriculum learning example for simple adder Add curriculum learning example using simple adder Aug 1, 2023
@juanvergaramunoz
Collaborator

juanvergaramunoz commented Aug 5, 2023

Hi @jillianmclements. Great job with the sample. A couple of comments while I continue reviewing the PR:

  • I see we have tensorflow_probability (==0.19.0) in conda.yml. It seems RLlib requires it in ray\rllib\algorithms\bandit\bandit_tf_model.py. Do we know if this is a bug? Or is tf_prob something that PyTorch doesn't have?
  • I see we also have plato@main in conda.yml. Can we remove this dependency?
  • We use tune.Tuner in this sample, but we use PPOConfig in getting-started-with-aml. Is this change needed in order to introduce callbacks?

@jillianmclements
Collaborator Author

Thanks, Juan! For reference, this is an adaptation of https://github.com/ray-project/ray/blob/master/rllib/examples/curriculum_learning.py.

  1. "Do we know if this is a bug?" - Yes, it's an RLlib bug. It has since been fixed, but the fix is not in Ray 2.5.0. See ray-project/ray#31327, "[RLLib] Importing bandits without tensorflow installed causes an import error".
  2. I will remove the plato@main dependency.
  3. PPOConfig doesn't seem to support env_task_fn, which is required for curriculum learning.
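To make the role of `env_task_fn` concrete, here is a rough sketch of a curriculum function. The three-argument shape (training results, the task-settable environment, the environment context) follows the RLlib example linked above, where RLlib calls the function after a training iteration and applies the returned task to the environment. The threshold value, the maximum task level, the promotion rule, and the `StubTaskEnv` helper are assumptions for illustration, not the logic used in this sample.

```python
from typing import Any, Dict

MAX_TASK = 5            # assumed highest difficulty level
REWARD_THRESHOLD = 0.8  # assumed mean reward needed to advance


def curriculum_fn(train_results: Dict[str, Any], task_settable_env: Any, env_ctx: Any) -> int:
    """Return the task the environment should use next.

    Promotes by one level once the mean episode reward clears the threshold,
    otherwise keeps the current task.
    """
    current = task_settable_env.get_task()
    mean_reward = train_results.get("episode_reward_mean", float("-inf"))
    if mean_reward >= REWARD_THRESHOLD and current < MAX_TASK:
        return current + 1
    return current


class StubTaskEnv:
    """Hypothetical minimal object exposing get_task(), used only to try the function."""

    def __init__(self, task: int):
        self._task = task

    def get_task(self) -> int:
        return self._task


if __name__ == "__main__":
    env = StubTaskEnv(task=2)
    print(curriculum_fn({"episode_reward_mean": 0.9}, env, env_ctx={}))
    print(curriculum_fn({"episode_reward_mean": 0.1}, env, env_ctx={}))
```

Advancing one level at a time keeps the agent from skipping tasks after a single lucky iteration; a real curriculum might also demote the agent or smooth the reward over several iterations.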

@juanvergaramunoz
Collaborator

@ucabmir - The PR is now ready for user review.
Note: the Ray training code in main.py has been aligned with the workflow shown in hyperparameter-tuning-and-monitoring.

Contributor

@jazmiahenry jazmiahenry left a comment

Running locally on an M1 Mac fails with: "zsh: illegal hardware instruction python main.py --test-local".

This needs investigation to determine whether the error is specific to M1 Macs, unique to the reviewer's machine, or also affects Windows.

The example runs well on AML.

@jazmiahenry jazmiahenry merged commit 2702410 into main Aug 29, 2023
14 checks passed
3 participants