add legacy load utility #9166

Merged
merged 64 commits into master from feature/callback-state/legacy on Sep 23, 2021

Conversation

@awaelchli (Contributor) commented Aug 27, 2021

What does this PR do?

Splits changes off #8558

Adds a context manager for handling legacy checkpoints whose pickled contents reference attributes that were removed from Lightning in newer versions. The currently known ones are:

  • pytorch_lightning.utilities.argparse_utils (renamed, slated for removal)
  • pytorch_lightning.utilities.argparse._gpus_arg_default (dead code)

We can remove these dead code pieces and instead dynamically patch modules to re-route the imports.
The following works:

    with pl_legacy_patch():
        # this would normally result in ImportError
        from pytorch_lightning.utilities.argparse import _gpus_arg_default
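
For illustration, here is a minimal sketch of the dynamic-patching pattern (the name `_legacy_patch_sketch` and the no-op stand-in are illustrative assumptions, not the exact implementation merged here). Unpickling resolves module and attribute references through `sys.modules`, so injecting an alias module and restoring the removed attribute for the duration of the context is enough:

    import sys
    from contextlib import contextmanager
    from types import ModuleType

    @contextmanager
    def _legacy_patch_sketch():
        # Re-create the renamed module under its old import path; pickle
        # resolves module references through sys.modules when unpickling.
        legacy = ModuleType("pytorch_lightning.utilities.argparse_utils")
        sys.modules["pytorch_lightning.utilities.argparse_utils"] = legacy

        # Restore the removed attribute; a no-op stand-in suffices for unpickling.
        import pytorch_lightning.utilities.argparse as argparse_module
        argparse_module._gpus_arg_default = lambda x: x
        legacy._gpus_arg_default = argparse_module._gpus_arg_default
        try:
            yield
        finally:
            # Undo both patches so the modules look unmodified afterwards.
            del sys.modules["pytorch_lightning.utilities.argparse_utils"]
            del argparse_module._gpus_arg_default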

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following bullet list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

I made sure I had fun coding 🙃

@mergify bot added the ready (PRs ready to be merged) and has conflicts labels and removed the has conflicts label Aug 30, 2021
@awaelchli (Contributor, Author) commented Sep 1, 2021

shall we include such a case in the testing of legacy checkpoints?

@Borda sorry I missed your question.
Are you suggesting to add a 1.2.8 legacy checkpoint to https://pl-public-data.s3.amazonaws.com/legacy/checkpoints.zip ?

Of course, I made sure that the legacy patch works with the old checkpoints. Below is the code I used to verify it.
I can provide the checkpoint file and we can write a test for it, but that seems a bit overkill IMO.

Since we know what the legacy format was, the unit tests that I added should suffice.

import torch
from argparse import ArgumentParser
from torch.utils.data import Dataset

from pytorch_lightning import LightningModule, Trainer


class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len


class BoringModel(LightningModule):
    def __init__(self, **kwargs):
        super().__init__()
        self.save_hyperparameters(kwargs)
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def loss(self, batch, prediction):
        # An arbitrary loss so that the model weights update during `Trainer.fit` calls
        return torch.nn.functional.mse_loss(prediction, torch.ones_like(prediction))

    def step(self, x):
        x = self.layer(x)
        out = torch.nn.functional.mse_loss(x, torch.ones_like(x))
        return out

    def training_step(self, batch, batch_idx):
        output = self.layer(batch)
        loss = self.loss(batch, output)
        return {"loss": loss}

    def training_step_end(self, training_step_outputs):
        return training_step_outputs

    def training_epoch_end(self, outputs) -> None:
        torch.stack([x["loss"] for x in outputs]).mean()

    def configure_optimizers(self):
        optimizer = torch.optim.SGD(self.layer.parameters(), lr=0.1)
        lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1)
        return [optimizer], [lr_scheduler]


def run_1_2_7():
    # RUN WITH LIGHTNING 1.2.7

    # fake data
    train_data = torch.utils.data.DataLoader(RandomDataset(32, 64))

    # model
    parser = ArgumentParser()
    parser = Trainer.add_argparse_args(parser)
    args = parser.parse_args()
    trainer = Trainer.from_argparse_args(args, default_root_dir="legacy_logs1.2.7", max_steps=1)
    model = BoringModel(**vars(args))
    trainer.fit(model, train_data)
    print(trainer.checkpoint_callback.best_model_path)


def run_master():
    # RUN WITH LIGHTNING MASTER
    from pytorch_lightning.utilities.migration import pl_legacy_patch

    with pl_legacy_patch():  # without this, pickle error!
        # can unpickle!
        x = torch.load("legacy_logs1.2.7/lightning_logs/version_2/checkpoints/epoch=0-step=0.ckpt")
        assert callable(x["hyper_parameters"]["gpus"])


if __name__ == "__main__":
    run_master()
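
For context, the verification is a two-step workflow: run `run_1_2_7()` in an environment with Lightning 1.2.7 to produce the checkpoint, then switch to master and run `run_master()` (as the `__main__` block does) to confirm the checkpoint only unpickles inside `pl_legacy_patch()`.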

@awaelchli awaelchli mentioned this pull request Sep 9, 2021
@stale bot added the won't fix (This will not be worked on) label Sep 16, 2021
@awaelchli awaelchli added this to the v1.5 milestone Sep 16, 2021
@stale bot removed the won't fix (This will not be worked on) label Sep 16, 2021
@Borda (Member) left a comment

Great solution, and a pattern for future similar changes!
@awaelchli mind resolving the conflicts? 🐰

@Lightning-AI Lightning-AI deleted a comment from stale bot Sep 23, 2021
@mergify mergify bot removed the has conflicts label Sep 23, 2021
@awaelchli awaelchli enabled auto-merge (squash) September 23, 2021 08:48
@awaelchli awaelchli merged commit 87b11fb into master Sep 23, 2021
@awaelchli awaelchli deleted the feature/callback-state/legacy branch September 23, 2021 09:52
Labels
checkpointing (Related to checkpointing) · feature (Is an improvement or enhancement) · ready (PRs ready to be merged)