
Pass epoch outputs to callback hooks on_validation_epoch_end and on_test_epoch_end #5508

Closed · rsaite opened this issue Jan 14, 2021 · 11 comments · Fixed by #6120
Labels: design (Includes a design discussion), feature (Is an improvement or enhancement), help wanted (Open to be worked on)
@rsaite commented Jan 14, 2021

🚀 Feature

The callback hook on_validation_epoch_end(self, trainer, pl_module) should receive the outputs of the validation epoch, changing its signature to on_validation_epoch_end(self, trainer, pl_module, outputs); the same applies to the hook on_test_epoch_end. This functionality has already been implemented for on_train_epoch_end.
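
A rough sketch of what such a callback could then look like (the OutputsWriter name and body are made up here purely to illustrate the requested signature):

from pytorch_lightning import Callback

class OutputsWriter(Callback):
    # Requested (hypothetical) signature: the epoch-level validation outputs
    # are passed in, mirroring what on_train_epoch_end already receives.
    def on_validation_epoch_end(self, trainer, pl_module, outputs):
        ...  # e.g. write predictions to a dataframe, create plots

    def on_test_epoch_end(self, trainer, pl_module, outputs):
        ...  # same post-processing for test outputs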

Motivation

I have different models where I want to do the same things with the validation/test outputs after each epoch: write predictions out to a dataframe and create plots based on the outputs. Separating this functionality from my modules by using a callback (instead of polluting all my modules with this code) seems to be the proper way.

This feature request is related to the open issues #4689 and #4369 concerning a bug in on_train_epoch_end.

@rsaite added the feature and help wanted labels on Jan 14, 2021
@github-actions (Contributor)

Hi! Thanks for your contribution, great first issue!

@hackgoofer

#4369 fixes it!

@rsaite (Author) commented Jan 15, 2021

No, it does not. #4369 fixes an existing bug in the hook on_train_epoch_end so that it correctly receives the training outputs.

This issue requests that the validation and test outputs should be passed to the respective hooks on_validation_epoch_end and on_test_epoch_end. This is currently not implemented and is not resolved by #4369.

@stale bot commented Feb 14, 2021

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, PyTorch Lightning Team!

The stale bot added the won't fix label on Feb 14, 2021
@kaushikb11 self-assigned this on Feb 15, 2021
The stale bot removed the won't fix label on Feb 15, 2021
@kaushikb11 (Contributor)

@Borda @tchaton Is there a reason why we don't pass epoch outputs to on_validation_epoch_end and on_test_epoch_end? Or is it something on the roadmap?

@ananthsub (Contributor)

@kaushikb11 @rsaite Just curious: why didn't validation_epoch_end or test_epoch_end inside the LightningModule work for you? Those already have the outputs passed to them. Are you doing different post-processing in each of these sets of hooks? See #6865 for more context.

@rsaite (Author) commented Apr 7, 2021

Using validation_epoch_end or test_epoch_end might have worked. I'm doing model-specific post-processing and logging in these methods, while the callback is model-independent and just joins all the outputs into a dataframe and dumps them to a file. So, it probably would've been possible to do something like

class SuperModel(LightningModule):
    def test_epoch_end(self, outputs):
        ...  # write outputs to df

class Model(SuperModel):
    def test_epoch_end(self, outputs):
        # model specific post-processing
        super().test_epoch_end(outputs)

or even (to avoid having to include this super call all the time)

class SuperModel(LightningModule):
    def test_epoch_end(self, outputs):
        self.model_specific_test_epoch_end(outputs)
        # write outputs to df

    def model_specific_test_epoch_end(self, outputs):
        raise NotImplementedError()

class Model(SuperModel):
    def model_specific_test_epoch_end(self, outputs):
        ...  # model-specific post-processing

Maybe there's a more elegant solution, but it always seems to require messing around with the implementation of the LightningModule. Ultimately, it's a design choice, and from reading the Lightning documentation I got the impression that

> Separating this functionality from my modules by using a callback (instead of polluting all my modules with this code) seems to be the proper way.

@ghost commented Nov 19, 2021

I would also find this very helpful.

@daeseoklee

It seems you can use the hook on_validation_batch_end to accumulate the outputs separately in your callbacks and use them in on_validation_epoch_end. But I agree that having the outputs as an argument of on_validation_epoch_end would be cleaner.
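
For example, a minimal sketch of that workaround (the callback name is made up; the hook signatures follow the Lightning 1.x Callback API):

from pytorch_lightning import Callback

class ValidationOutputsCollector(Callback):
    def on_validation_epoch_start(self, trainer, pl_module):
        # reset the buffer at the start of every validation epoch
        self.outputs = []

    def on_validation_batch_end(self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx):
        # `outputs` is whatever validation_step returned for this batch
        self.outputs.append(outputs)

    def on_validation_epoch_end(self, trainer, pl_module):
        # all per-batch outputs of the epoch are now available here
        ...  # e.g. join them into a dataframe and dump to a file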

@rsaite (Author) commented Dec 9, 2021

> It seems you can use the hook on_validation_batch_end to accumulate the outputs separately in your callbacks and use them in on_validation_epoch_end.

Yes, I think this is also what I ended up doing, and (I realize now) it's probably the preferred way.

> But I agree that having the outputs as an argument of on_validation_epoch_end would be cleaner.

As far as I understand, it's for performance reasons, so that all the outputs don't have to be cached; see #7338 and #7339.

@ZeyuSun commented Jul 3, 2022

I would also find this very helpful. I have a lot of code that analyzes the outputs, creates plots, and logs them. That code does not depend on specific models. It would be nice to pull it out into a callback (similar to what rsaite said above in #5508 (comment)).
