Use .comet.config file for CometLogger #1913
Hello @neighthan! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-08-07 07:46:22 UTC
This pull request is now in conflict... :(
Codecov Report
@@           Coverage Diff           @@
##           master   #1913    +/-   ##
=======================================
+ Coverage      89%     90%    +1%
=======================================
  Files          79      79
  Lines        7302    7197   -105
=======================================
- Hits         6514    6501    -13
+ Misses        788     696    -92
It seems that it is not backward-compatible with older Comet versions...
@neighthan ping :)
Sorry for the delay; I just updated the comet_ml version in
pytorch_lightning/loggers/comet.py
raise MisconfigurationException("CometLogger requires either api_key or save_dir during initialization.")
# for backwards-compatibility, we have to check api_key then save_dir and
# only then use ~/.comet.config or the environment variable
api_key = get_api_key(None, get_config())
I think this can be checked above, before the line "if api_key is not None:", to avoid the duplicate code below:
    if api_key is None:
        api_key = get_api_key(None, get_config())
Will it break backward compatibility?
I was trying to preserve the previous behavior:
1. If you pass in an API key, it is used (regardless of save_dir).
2. If no API key is given but save_dir is, an offline experiment is used.
Step 3 before would have been to error; now step 3 is to use the API key from the environment / the config file if given, and only error if not. If we load the API key at the start, then behavior 2 can't be done anymore (i.e. if I have a ~/.comet.config but still want the experiment to be saved locally, I could pass in save_dir and that would be used instead). I like keeping explicit local saving as an option, so if we're going to change the behavior, I'd suggest:
- If save_dir is given, do an offline experiment (regardless of API key status; breaks backwards-compatibility).
- If not, use the given API key if there is one, or load it from the env variable / file if not.
- If none of these work, throw the error.
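The two orderings described in this comment can be compared side by side. The following is only a rough sketch, not the code in the PR: resolve_current and resolve_proposed are hypothetical names, config_key stands in for whatever get_api_key(None, get_config()) would return, and ValueError stands in for MisconfigurationException so the snippet runs without pytorch_lightning or comet_ml installed.

```python
from typing import Optional, Tuple

ERROR_MSG = "CometLogger requires either api_key or save_dir during initialization."


def resolve_current(api_key: Optional[str], save_dir: Optional[str],
                    config_key: Optional[str]) -> Tuple[str, str]:
    """Backwards-compatible order: explicit key, then save_dir, then config/env key."""
    if api_key is not None:
        return ("online", api_key)
    if save_dir is not None:
        return ("offline", save_dir)
    if config_key is not None:
        # Previously this case raised; the config file only fills that gap.
        return ("online", config_key)
    raise ValueError(ERROR_MSG)


def resolve_proposed(api_key: Optional[str], save_dir: Optional[str],
                     config_key: Optional[str]) -> Tuple[str, str]:
    """Suggested order: save_dir always means offline, otherwise look for a key."""
    if save_dir is not None:
        return ("offline", save_dir)
    key = api_key if api_key is not None else config_key
    if key is not None:
        return ("online", key)
    raise ValueError(ERROR_MSG)
```

The only inputs on which the two functions disagree are those where both api_key and save_dir are passed, which is exactly the backwards-compatibility break mentioned above.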
Yeah! The suggested workflow looks like a better way to make this work. The current one is also correct, just not optimal (it repeats code).
What do you think about throwing an error if save_dir and api_key are both given? This also breaks backwards-compatibility but not silently. By disallowing this, we remove potential confusion about which argument has precedence if both are given. And then we can proceed with the second option above.
Yeah, I was thinking the same. For now, api_key is given priority if both are given. As a user, I don't think I would bother to give a save_dir if I want to run in online mode. I would give save_dir only when I want to run in offline mode, even if a key is already present on my system; otherwise I would give an api_key. So the suggested flow looks better, to me at least, but you should talk to one of the core contributors before changing it further.
save_dir actually has another effect as well: setting it changes where the model weights are saved. I just ran into a case where I was doing offline experiments using save_dir and then wanted to switch to online ones, but I think it's more inconvenient and error-prone to have to change the save directory somewhere else when I already have save_dir set up for that. Thus I think it actually could make sense to specify save_dir even if you want online mode. What about adding another argument to specify whether the experiment should be online?
- If save_dir is given and no API key can be found, we can infer it should be offline.
- If save_dir is not given, then it's online.
- If desired, explicitly giving an API key could lead to online mode.
- If an API key can be found (e.g. in ~/.comet.config) but save_dir is also given, then we use the new argument to determine whether to be online or not. This lets users use save_dir to control where checkpoints are stored even when doing online experiments and lets users do offline experiments despite having a ~/.comet.config.
We'd also want to change it so that save_dir is always saved as an instance variable (instead of only if no API key is given).
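As a minimal sketch of the rules listed above, assuming the new argument is a boolean called offline (the comment does not name it, so that name, the Resolved tuple, and resolve_with_flag are all hypothetical). config_key again stands in for a key found in ~/.comet.config or the environment, and ValueError replaces MisconfigurationException to keep the snippet dependency-free. The optional third rule (an explicit API key forcing online mode) is deliberately left out.

```python
from typing import NamedTuple, Optional


class Resolved(NamedTuple):
    mode: str                 # "online" or "offline"
    api_key: Optional[str]    # key used for online mode, if any was found
    save_dir: Optional[str]   # always kept, whatever the mode


def resolve_with_flag(api_key: Optional[str], save_dir: Optional[str],
                      config_key: Optional[str], offline: bool = False) -> Resolved:
    """Pick online/offline mode; `offline` only matters when both a key and save_dir exist."""
    key = api_key if api_key is not None else config_key
    if save_dir is None:
        # No save_dir: online is the only option, so a key is required.
        if key is None:
            raise ValueError("CometLogger requires either api_key or save_dir.")
        return Resolved("online", key, None)
    if key is None:
        # save_dir given and no key can be found: infer offline.
        return Resolved("offline", None, save_dir)
    # Both available: the hypothetical flag decides.
    return Resolved("offline" if offline else "online", key, save_dir)
```

With this shape, resolve_with_flag(None, "logs", "cfg") stays online while still recording save_dir for checkpoints, and passing offline=True with the same arguments switches to an offline experiment despite the config file.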
For now, if save_dir is None, then in online mode where are the models saved? I think if someone wants to explicitly specify the model weights directory, they should use ModelCheckpoint rather than specifying the model directory in the logger instance.
Status summary: this should all work as-is and be backwards-compatible right now (the config file is only used in cases where an error would have been thrown before). The only remaining thing that I see right now is to decide whether we want to change how Alternative 0 (current behavior) -
Alternative 1 -
Alternative 2 -
I prefer alternative 2 because I like using
I would also go with Alternative 2 🐰 @PyTorchLightning/core-contributors
I'd also prefer Option 2
Okay, I just pushed an implementation of Alternative 2. I set I'm not sure why that TPU test is failing, but I couldn't see how it would be related to this PR.
This adds a new argument to the logger to control the online / offline mode explicitly so that if you give an API key and a save_dir (e.g. to control where checkpoints go while having ~/.comet.config) you can specify which mode you want.
For consistency with other loggers.
@neighthan I have rebased on master, let's finish it...
Nice!
Nice addition
Before submitting
What does this PR do?
Fixes #1810.