Experimental hf logger #1456
Conversation
The documentation is not available anymore as the PR was closed or merged.
quite cool!
One suggestion: maybe we should not just publish the mixin but also directly publish the mixed-in implementation(s) for the most famous loggers, i.e. also export

```py
class TensorBoardHFLogger(HFLoggerMixin, SummaryWriter):
    pass
```

WDYT?
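If exported, usage would then be a drop-in replacement for the regular writer. A hypothetical sketch (the constructor arguments are my assumption, with `repo_id` consumed by `HFLoggerMixin`):

```py
# Hypothetical usage of the exported class defined above.
# First argument is the usual local log dir; repo_id would tell the
# mixin which Hub repo to push the event files to.
logger = TensorBoardHFLogger("runs/demo", repo_id="user/my-training-logs")
logger.add_scalar("train/loss", 0.42, global_step=1)
logger.close()
```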
Example:

```py
from lightning.pytorch.loggers import TensorBoardLogger
```
I think maybe better to use the "official" tensorboard one as an example?
(I think it's `SummaryWriter`? Not sure anymore tbh, it's been a while.)
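For reference, the stock `SummaryWriter` usage looks like this (this is the `torch.utils.tensorboard` flavor; `tensorboardX` exposes the same API with `logdir` instead of `log_dir`):

```py
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/demo")  # event files are written locally
writer.add_scalar("train/loss", 0.42, global_step=1)
writer.close()
```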
Yeah, that's definitely where I need feedback from people who actually know about it! I'm up for integrating it with any logger now that the base work is done.
Yup, that would make sense; agree we should do that. I wanted to at first but then realized that both …
@julien-c I've updated the PR description to showcase how it works with the base `SummaryWriter`.

Before moving forward on this PR, it'd be very nice to extensively test it on real trainings. I'm a bit "afraid" of how it can behave when the `log_dir` becomes large. There is also the question of running in the main thread versus in the background. @thomwolf can I let you see with your team and get back to me if you're interested / if someone's testing it?
100% agree
Made an update to use …
Codecov Report

Patch coverage:

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1456       +/-   ##
===========================================
+ Coverage   51.76%   81.68%   +29.92%
===========================================
  Files          56       56
  Lines        6047     5957      -90
===========================================
+ Hits         3130     4866    +1736
+ Misses       2917     1091    -1826
```

☔ View full report in Codecov by Sentry.
```py
def log_dir(self) -> str:
    return self.logdir
```

```py
logger = HFTensorBoardLogger("training/results", repo_id="test_hf_logger", path_in_repo="tensorboard")
```
Would it be weird not to assign a `log_dir`? I think it's confusing that we need to point both somewhere on the Hub and somewhere on the disk. Shouldn't it write to a temporary disk location instead, so that basically the Hub acts as the source of truth?
I have not thought of this use case before TBH, but it makes total sense. At first I wanted the logger to be as close as possible to the default logger (i.e. write locally as well). But yes, we can make `log_dir` optional and, if it is not provided, create a temporary directory.
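Roughly something like this (a minimal sketch, not the actual PR code; the constructor signature and the assumption that `HFLoggerMixin` consumes `repo_id` are mine):

```py
import tempfile

from tensorboardX import SummaryWriter


class HFTensorBoardLogger(HFLoggerMixin, SummaryWriter):  # HFLoggerMixin from this PR
    def __init__(self, log_dir=None, *, repo_id, **kwargs):
        # If no local directory is given, fall back to a temporary one so
        # that the Hub repo effectively acts as the source of truth.
        if log_dir is None:
            log_dir = tempfile.mkdtemp(prefix="hf_logger_")
        super().__init__(log_dir, repo_id=repo_id, **kwargs)
```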
@thomasw21 Actually, by default `SummaryWriter` creates a local directory under `runs/`. I will keep this behavior as it is instead of creating a temporary directory by default. If we really want a tmp dir in the future, we can always add it as a new feature.
Following this Slack thread (private), I started to prototype a logger to push training data to the Hub.
EDIT: I removed the Mixin and only added an `HFSummaryWriter` that does the job. Let's optimize for our use case first and revisit later if we want to generalize. The logger inherits from `tensorboardX.SummaryWriter`.
EDIT 2: updated again to integrate `CommitScheduler`. No need for regular `.push_to_hub()` calls; the logger automatically takes care of it. This also makes the commits more robust to concurrent read/write. Uploads happen asynchronously in a background thread and are queued to avoid concurrency issues.

I ran an MNIST training to get a toy example. I took the script from this article and simply replaced the stock writer with the new one, roughly as sketched below (the exact arguments are placeholders reusing the example above):
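```py
# Before: the stock writer from the MNIST script (local event files only).
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter("training/results")

# After: the writer from this PR. `logdir` follows the tensorboardX naming
# since HFSummaryWriter inherits from tensorboardX.SummaryWriter; the exact
# signature here is a sketch and may not match the final API.
from huggingface_hub import HFSummaryWriter
writer = HFSummaryWriter(repo_id="test_hf_logger", logdir="training/results")
```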
And that's it! It got me this tensorboard on the Hub.