Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Move logger v_num information out of progress bar and into simple log statement #12175

Open
daniellepintz opened this issue Mar 2, 2022 · 6 comments
Labels
Milestone

Comments

@daniellepintz
Copy link
Contributor

daniellepintz commented Mar 2, 2022

Proposed refactor

Remove v_num from the progress bar here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/5da065e287b44e2c1fe4f7951003813ed45365c9/pytorch_lightning/callbacks/progress/base.py#L217-L223

And add a simple Python log statement at the start of training with the version number for each logger.

Motivation

  1. It is a bit unclear to me why we have the logger v_num on the progress bar - there is no direct relation between the two. Why not have this information in a simple log statement at the start of training, and leave more space in the progress bar for other information? As @carmocca put it, the progress bar is "prime real estate".

  2. When we have multiple loggers, we have the strange behavior of concatenating the two version numbers (relic of LoggerCollection) and if the concatenated version numbers is longer than 4 digits we truncate it:
    https://github.com/PyTorchLightning/pytorch-lightning/blob/5da065e287b44e2c1fe4f7951003813ed45365c9/pytorch_lightning/callbacks/progress/base.py#L220-L222

Instead, we can just have log statements like this at the start of training:

Initialized TensorBoardLogger with version X.
Initialized WandbLogger with version Y.

cc @justusschock @awaelchli @rohitgr7 @edward-io @Borda @ananthsub @kamil-kaczmarek @Raalsky @Blaizzy @carmocca

@daniellepintz daniellepintz added refactor logger Related to the Loggers progress tracking (internal) Related to the progress tracking dataclasses labels Mar 2, 2022
@daniellepintz daniellepintz added this to the 1.6 milestone Mar 2, 2022
@daniellepintz daniellepintz changed the title Move logger v_num information out of progress bar and into simple log statement [RFC] Move logger v_num information out of progress bar and into simple log statement Mar 2, 2022
@daniellepintz daniellepintz self-assigned this Mar 2, 2022
@daniellepintz daniellepintz moved this to Waiting in Lightning RFCs Mar 2, 2022
@semaphore-egg
Copy link
Contributor

semaphore-egg commented Mar 3, 2022

If my guess is correct:

My understanding:

When multiple training tasks are launched and we want to stop one of them based on the logging information, we need to idenfity which console corresponding to a target log/tensorboard.

Without an identifier, all consoles/terminals will look the same.
Of course, we can locate the console based on the training process (current batch index), but that might not be accurate if multiple task progresses are similar.

My Concern for proposed refactor:

Log statements at the start of training has some limitations: If too many information printed in the console, it can be diffcult to find out something printed in the beginning of training process. When
virtual terminal, e.g., screen command is used, it is impossible to scroll up to find out previous prints.

My Suggestion:

However, I feel v_num does not convey an obvious meaning. Something like training ID, logger ID or training version might be more clear for us, although they are too long for our precious single-line space.

Since this v_num is fixed in a training process, moving it to the begining of a line might be another choice. In this way, I prefer to totally remove v_num= and warp the value with a bracket.
From:
Epoch 0: 100%|########| 50/50 [00:25<00:00, 2.00it/s, loss=0.0468, v_num=7a22]
To:
[7a22] Epoch 0: 100%|########| 50/50 [00:25<00:00, 2.00it/s, loss=0.0468]

This modification follows the idea that: fixed items come first, mutable values follow.

@carmocca carmocca added progress bar: tqdm and removed progress tracking (internal) Related to the progress tracking dataclasses labels Mar 3, 2022
@daniellepintz
Copy link
Contributor Author

@semaphore-egg thanks for your input! @awaelchli @carmocca do you have any thoughts?

@awaelchli
Copy link
Contributor

I agree with @semaphore-egg and his/her understanding is correct. We have the v_num for exactly this reason. I agree v_num is just a different name for "ver" or "version", so why not call it version directly? A removal in my opinion is not desired, as I have expressed in the past. I consider this a good default. @carmocca is in favor of removing it.

@carmocca
Copy link
Contributor

it can be diffcult to find out something printed in the beginning of training process

But this is a problem of the tool used to launch the script (e.g. screen). I wouldn't design our interface around their specific papercuts considering all better alternatives that exist today.

prefer to totally remove v_num= and warp the value with a bracket.

Note that doing this or renaming it to version or ver doesn't address the point that the progress bar row length is very limited so it's better to move out everything we can so log(prog_bar=True) values can have more space.

@carmocca is in favor of removing it.

Yes 😁

@carmocca
Copy link
Contributor

A removal in my opinion is not desired, as I have expressed in the past

Note that the OP advocates for moving it, not removing it. Did you mean that moving it off the pbar is also not desired?

@carmocca carmocca modified the milestones: 1.6, future Mar 21, 2022
@daniellepintz
Copy link
Contributor Author

daniellepintz commented Mar 21, 2022

Thanks @awaelchli and @carmocca for providing your thoughts. I agree with @carmocca and still think we should move this out of the pbar.

When multiple training tasks are launched and we want to stop one of them based on the logging information, we need to idenfity which console corresponding to a target log/tensorboard.

I could be wrong, but this seems to me like not a super common case, and when this is the case the user can still easily add the logger version to the pbar. That is why it is customizable to the user's preferences.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Status: Waiting
Development

No branches or pull requests

4 participants