Initialize-from custom checkpoints #5525

Merged: 18 commits merged from develop-init-custom-checkpoint into main on Sep 14, 2021

Conversation

@maryamhonari (Contributor) commented on Sep 1, 2021

Proposed change(s)

Allow training to initialize from a user-specified checkpoint, including older ones, rather than only the most recent checkpoint of the previous run.

Usage:

behaviors:
  BigWallJump:
    init_path: BigWallJump-654098.pt # specify a checkpoint name in the behavior directory
    trainer_type: ppo
  MediumWallJump:
    init_path: results/previous_run/MediumWallJump/MediumWallJump-654098.pt # full path is also supported
    trainer_type: ppo
  SmallWallJump: # if none specified, will initialize from the most recent checkpoint
    trainer_type: ppo
checkpoint_settings:
  initialize_from: previous_run
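
The resolution rule implied by the three cases above could be sketched as below. This is a minimal illustration, not ml-agents' actual API: the function name, the results_dir default, and the checkpoint.pt fallback file name are assumptions.

import os

def resolve_init_path(behavior_name, init_path, initialize_from, results_dir="results"):
    """Resolve a behavior's init_path setting to a full checkpoint path."""
    behavior_dir = os.path.join(results_dir, initialize_from, behavior_name)
    if init_path is None:
        # No checkpoint named: fall back to the most recent one.
        return os.path.join(behavior_dir, "checkpoint.pt")
    if os.path.basename(init_path) == init_path:
        # A bare file name refers to a checkpoint inside the behavior directory.
        return os.path.join(behavior_dir, init_path)
    # Anything else is treated as a full (or cwd-relative) path.
    return init_path

# e.g. resolve_init_path("BigWallJump", "BigWallJump-654098.pt", "previous_run")
# -> "results/previous_run/BigWallJump/BigWallJump-654098.pt"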

TODO: extra tests, documentation, changelog.


Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)


maryamhonari changed the title from "[WIP] Initialize-from checkpoints" to "[WIP] Initialize-from custom checkpoints" on Sep 2, 2021.
Comment on lines -103 to -104:

if init_path is not None:
    trainer_settings.init_path = os.path.join(init_path, brain_name)

@maryamhonari (Contributor, Author) commented on Sep 8, 2021

Not sure why init_path is passed up to here; was there a specific reason to set this in the trainer factory?
I moved this logic to learn.py, and if it's harmless I'll remove the init_path attribute.

@vincentpierre (Contributor) commented:

Have you tried using init_path to see if it does what we expect out of it?
Reading a bit of the code in torch_model_saver.py, it looks like setting init_path in the trainer settings initializes the policy from a checkpoint. Is that correct?
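
For reference, "initializes the policy from a checkpoint" here means the usual PyTorch weight restore. A generic sketch, not torch_model_saver.py's actual code:

import torch
from torch import nn

def initialize_from_checkpoint(policy_network: nn.Module, init_path: str) -> None:
    # Load the serialized weights and copy them into the live network.
    state_dict = torch.load(init_path, map_location="cpu")
    policy_network.load_state_dict(state_dict)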

@maryamhonari (Contributor, Author) replied:

Correct. Previously trainer_settings.init_path = "result_dir/run_id/brain_name", and we added the checkpoint name at the end in torch_model_saver.py.
This PR sets the full path trainer_settings.init_path = "result_dir/run_id/brain_name/checkpoint_name.pt" in learn.py:102.
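
Schematically, with the example names above (a runnable illustration, not the actual diff):

import os

init_path = "result_dir/run_id"   # directory derived from checkpoint_settings
brain_name = "BigWallJump"

# Before: only the behavior directory was stored; torch_model_saver.py
# appended a checkpoint file name when loading.
old_style = os.path.join(init_path, brain_name)

# After this PR: learn.py resolves the complete checkpoint path up front,
# so torch_model_saver.py loads exactly the file the user named.
new_style = os.path.join(init_path, brain_name, "BigWallJump-654098.pt")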

maryamhonari marked this pull request as ready for review on September 8, 2021.

@vincentpierre (Contributor) left a comment:

I made some comments.
I would like some clarification on what init_path in the trainer parameters does. It seems a bit redundant with the checkpoint settings.

Five review comments on ml-agents/mlagents/trainers/learn.py (outdated, resolved).
maryamhonari and others added 4 commits September 8, 2021 13:03
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>

@vincentpierre (Contributor) left a comment:

I approve of the code.
One question before you merge: if the user fills in both init_path and the checkpoint_list, which one gets priority?
(Please add documentation and a changelog entry before merging.)

maryamhonari changed the title from "[WIP] Initialize-from custom checkpoints" to "Initialize-from custom checkpoints" on Sep 10, 2021.
Review comment on com.unity.ml-agents/CHANGELOG.md (outdated, resolved).
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
maryamhonari merged commit 6b2c127 into main on Sep 14, 2021.
The delete-merged-branch bot deleted the develop-init-custom-checkpoint branch on September 14, 2021.
maryamhonari added a commit that referenced this pull request Sep 24, 2021
* init from any checkpoint including older ones
* moving init_path logic ahead to learn.py
* fixing pytest to take the full path
* doc & changelog
sini pushed a commit that referenced this pull request Sep 29, 2021
* init from any checkpoint including older ones
* moving init_path logic ahead to learn.py
* fixing pytest to take the full path
* doc & changelog
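
The "fixing pytest to take the full path" bullet could correspond to a check like the following: a hypothetical test reusing the resolve_init_path sketch from earlier, not the repository's actual test code.

import os

def test_init_path_resolves_to_full_checkpoint_path():
    # A bare checkpoint name should expand to results/<run>/<behavior>/<file>.
    full = resolve_init_path("BigWallJump", "BigWallJump-654098.pt", "previous_run")
    assert full == os.path.join("results", "previous_run", "BigWallJump", "BigWallJump-654098.pt")
    # With no name given, the most recent checkpoint is picked.
    assert resolve_init_path("SmallWallJump", None, "previous_run").endswith("checkpoint.pt")
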
The github-actions bot locked this conversation as resolved and limited it to collaborators on Sep 15, 2022.