
Missing training_step outputs in training_epoch_end #2327

Conversation

@mmiakashs commented Jun 23, 2020

Possible bug fix of #2320

…ss all the batch outputs to training_epoch_end (if the user defined this method)
mergify bot requested a review from a team, June 23, 2020 04:58
Borda added the bug (Something isn't working) label, Jun 23, 2020
@Borda (Member) left a comment


Mind adding a test for this case? Probably some simple example from #2320.

mergify bot requested a review from a team, June 23, 2020 05:17
codecov bot commented Jun 23, 2020

Codecov Report

Merging #2327 into master will increase coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #2327   +/-   ##
======================================
  Coverage      88%     88%           
======================================
  Files          70      70           
  Lines        5501    5503    +2     
======================================
+ Hits         4834    4836    +2     
  Misses        667     667           

Borda added the good first issue (Good for newcomers) label, Jun 23, 2020
Borda added the ready (PRs ready to be merged) label, Jun 23, 2020
@williamFalcon (Contributor) commented Jun 23, 2020

@mmiakashs mind trying master now? The solution in this PR wasn't quite 100% right and needed more testing.

This PR is likely not needed anymore but we need to add you as co-author to #2328 @Borda

mergify bot (Contributor) commented Jun 23, 2020

This pull request is now in conflict... :(

@Borda (Member) commented Jun 23, 2020

> This PR is likely not needed anymore but we need to add you as co-author to #2328 @Borda

To be done, this PR should have been merged into the other PR, but as this one is closed and the other merged, there is nothing to do... please ping me next time before closing/merging 🐰

@williamFalcon (Contributor) commented

Yes... but this PR was incorrect.

@mmiakashs (Author) commented

> @mmiakashs mind trying master now? the solution in this PR wasn't quite 100% right and needed more testing.
>
> This PR is likely not needed anymore but we need to add you as co-author to #2328 @Borda

@williamFalcon Thanks a lot for the PR. One point of confusion: I just noticed that all the training_step log metrics are combined under the dict key 'log_metrics', whereas the validation log metrics are combined under the dict key 'log'. Is this difference intentional?
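For context, the asymmetry described in this comment can be sketched with plain dicts (no pytorch_lightning import; the key names below are taken from the comment itself, and the helper names are illustrative, not library API):

```python
# Sketch of the two output-dict shapes discussed above: training-step log
# metrics reportedly land under 'log_metrics', validation metrics under 'log'.

def training_step_output(loss):
    # Hypothetical stand-in for a training_step return value.
    return {"loss": loss, "log_metrics": {"train_loss": loss}}

def validation_step_output(loss):
    # Hypothetical stand-in for a validation_step return value.
    return {"val_loss": loss, "log": {"val_loss": loss}}

train_out = training_step_output(0.5)
val_out = validation_step_output(0.4)

# The asymmetry: different top-level keys carry the metrics meant for the logger.
print(sorted(train_out))  # ['log_metrics', 'loss']
print(sorted(val_out))    # ['log', 'val_loss']
```

Any downstream code that looks only for 'log' would silently drop the training metrics, which is why the key mismatch matters.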

@mmiakashs (Author) commented

@williamFalcon I debugged again and found that issue #2320 still occurs, but only for training_step outputs: the outputs from the first optimizer iteration are missing, while the outputs from the second optimizer iteration are merged properly.
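The expected behaviour being reported can be simulated framework-free (all names here are illustrative, not pytorch_lightning internals): with two optimizers, training_epoch_end should receive one output per (batch, optimizer) pair, so the reported bug is detectable by counting outputs per optimizer index.

```python
# Framework-free simulation of collecting training_step outputs across
# batches and optimizers, then handing them all to an epoch-end hook.

def training_step(batch_idx, optimizer_idx):
    # Hypothetical per-step output; the real step would compute a loss.
    return {"loss": float(batch_idx + optimizer_idx), "optimizer_idx": optimizer_idx}

def run_epoch(num_batches, num_optimizers):
    epoch_outputs = []
    for batch_idx in range(num_batches):
        for opt_idx in range(num_optimizers):
            # Every (batch, optimizer) step output must be kept.
            epoch_outputs.append(training_step(batch_idx, opt_idx))
    return epoch_outputs

def training_epoch_end(outputs):
    # Count outputs per optimizer; the reported bug would leave
    # optimizer_idx == 0 with no entries at all.
    per_opt = {}
    for out in outputs:
        per_opt.setdefault(out["optimizer_idx"], []).append(out)
    return {idx: len(outs) for idx, outs in sorted(per_opt.items())}

counts = training_epoch_end(run_epoch(num_batches=3, num_optimizers=2))
print(counts)  # correct aggregation gives {0: 3, 1: 3}
```

A regression test along these lines (asserting equal counts for every optimizer index) would catch the missing-first-optimizer symptom described above.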

@Borda (Member) commented Jun 25, 2020

@mmiakashs do you see a fix for it? Mind sending a PR?

Labels: bug (Something isn't working), good first issue (Good for newcomers), ready (PRs ready to be merged)
3 participants