
Experimental hf logger #1456

Merged
merged 23 commits into from
Jun 13, 2023
Conversation

Wauplin
Contributor

@Wauplin Wauplin commented Apr 27, 2023

Following this slack thread (private) I started to prototype a logger to push training data to the Hub.

EDIT: I removed the Mixin and only added an HFSummaryWriter that does the job. Let's optimize for our use case first and revisit later if we want to generalize. The logger inherits from tensorboardX.SummaryWriter.

EDIT 2: updated again to integrate CommitScheduler. No need for regular .push_to_hub calls, the logger automatically takes care of it. This also makes the commits more robust to concurrent reads/writes. Uploads happen asynchronously in a background thread and are queued to avoid concurrency issues.
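The queued, asynchronous upload pattern described above can be sketched in plain Python. This is a conceptual illustration only (the class and method names here are hypothetical, not the CommitScheduler API): log writes enqueue jobs, and a single background worker drains the queue so no two uploads ever race each other.

```python
import queue
import threading

class BackgroundUploader:
    """Hypothetical sketch: serialize uploads through one background thread."""

    def __init__(self, push_fn):
        # push_fn stands in for whatever actually uploads a folder to the Hub.
        self._push_fn = push_fn
        self._jobs = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def schedule(self, payload):
        # Called from the training loop; returns immediately.
        self._jobs.put(payload)

    def _drain(self):
        while True:
            payload = self._jobs.get()
            if payload is None:  # sentinel: stop the worker
                break
            self._push_fn(payload)  # one upload at a time, in FIFO order

    def close(self):
        # Flush remaining jobs, then stop the worker.
        self._jobs.put(None)
        self._worker.join()
```

Because a single worker owns the queue, the training loop never blocks on network I/O and commits are applied in order.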

I ran an MNIST training as a toy example. I took the script from this article and simply replaced

writer = SummaryWriter()

by

writer = HFSummaryWriter(repo_id="test-logger-MNIST", commit_every=1)

And that's it! It got me this tensorboard on the Hub.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Apr 27, 2023

The documentation is not available anymore as the PR was closed or merged.

Member

@julien-c julien-c left a comment


quite cool!

One suggestion is that maybe we should not just publish the mixin but also directly the mixed-in implem(s) for the most famous loggers, i.e. also export the

class TensorBoardHFLogger(HFLoggerMixin, SummaryWriter):
    pass

WDYT?
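The mixed-in composition suggested above can be sketched with stand-ins. Everything here is illustrative (HFLoggerMixin and push_to_hub are hypothetical names from this discussion, and the SummaryWriter below is a minimal stub, not tensorboardX): the mixin layers Hub-pushing onto any logger via cooperative inheritance, and the concrete class is exported ready to use.

```python
class SummaryWriter:
    """Minimal stub standing in for tensorboardX.SummaryWriter."""

    def __init__(self):
        self.scalars = []

    def add_scalar(self, tag, value, step):
        self.scalars.append((tag, value, step))

class HFLoggerMixin:
    """Hypothetical mixin adding Hub-pushing on top of any logger."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.pushed = []

    def push_to_hub(self):
        # In a real logger this would upload the log directory to the Hub;
        # here we just record a snapshot of the logged scalars.
        self.pushed.append(list(self.scalars))

# The ready-made class exported alongside the mixin, as suggested:
class TensorBoardHFLogger(HFLoggerMixin, SummaryWriter):
    pass
```

Power users can still mix HFLoggerMixin into whichever logger they prefer, while most users import the pre-composed class.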


Example:
```py
from lightning.pytorch.loggers import TensorBoardLogger
```
Member


i think maybe better to use the "official" tensorboard one as an example?

Member


(i think it's SummaryWriter? not sure anymore tbh, it's been a while)

Contributor Author


Yeah, that's definitely where I need feedback from people who actually know about it! I'm up for integrating it with any logger now that the base work is done.

@Wauplin
Contributor Author

Wauplin commented Apr 27, 2023

One suggestion is that maybe we should not just publish the mixin but also directly the mixed-in implem(s) for the most famous loggers

Yup that would make sense, agree we should do that. I wanted to at first but then realized that both lightning.pytorch.loggers.TensorBoardLogger and lightning.fabric.loggers.TensorBoardLogger exist and have slightly different usage so I let the user decide. But if we can select a few (or even 1) loggers that are popular, that's even better (and why not expose the Mixin for power users if they really want it).

@Wauplin
Contributor Author

Wauplin commented Apr 28, 2023

@julien-c I've updated the PR description to showcase how it works with the base SummaryWriter (and it works exactly the same, out-of-the-box). Here is the generated tensorboard: https://huggingface.co/Wauplin/test_hf_logger/tensorboard. I think exposing both the main HFTensorBoardLogger (which encapsulates SummaryWriter) and the mixin would be best, to let users decide if they want a default solution or a flexible one.

Before moving forward on this PR, it'd be very nice to extensively test it on real trainings. I'm a bit "afraid" of how it can behave when the log_dir becomes large. There is also the limitation of running in the main thread vs. in the background. @thomwolf can I let you see with your team and get back to me if you're interested/if someone's testing it?

@julien-c
Member

I think exposing both the main HFTensorBoardLogger (that encapsulate SummaryWriter) and the mixin would be best

100% agree

@Wauplin
Contributor Author

Wauplin commented May 19, 2023

Made an update to use run_as_future=True (#1458). Data is now pushed in the background without stopping the training.

@codecov

codecov bot commented May 19, 2023

Codecov Report

Patch coverage: 6.81% and project coverage change: +29.92% 🎉

Comparison is base (8eabd33) 51.76% compared to head (73f868a) 81.68%.

❗ Current head 73f868a differs from pull request most recent head e5477c0. Consider uploading reports for the commit e5477c0 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1456       +/-   ##
===========================================
+ Coverage   51.76%   81.68%   +29.92%     
===========================================
  Files          56       56               
  Lines        6047     5957       -90     
===========================================
+ Hits         3130     4866     +1736     
+ Misses       2917     1091     -1826     
Impacted Files Coverage Δ
src/huggingface_hub/__init__.py 75.75% <ø> (ø)
src/huggingface_hub/_tensorboard_logger.py 0.00% <0.00%> (ø)
src/huggingface_hub/utils/__init__.py 100.00% <ø> (ø)
src/huggingface_hub/utils/_runtime.py 55.71% <40.00%> (+4.60%) ⬆️
src/huggingface_hub/utils/_experimental.py 100.00% <100.00%> (+25.00%) ⬆️

... and 37 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

def log_dir(self) -> str:
    return self.logdir

logger = HFTensorBoardLogger("training/results", repo_id="test_hf_logger", path_in_repo="tensorboard")


Would it be weird not to assign a log_dir? I think it's confusing that we need to point both somewhere on the hub and somewhere on the disk. Shouldn't it write on a temporary disk instead? so that basically the hub acts as the source of truth?

Copy link
Contributor Author


I have not thought of this use case before TBH but it makes total sense. At first I wanted the logger to be as close as possible to the default logger (i.e. write locally as well). But yes we can make log_dir optional and if not provided we create a temporary directory.

Copy link
Contributor Author


@thomasw21 actually by default SummaryWriter creates a local directory under /runs. I will keep this behavior as it is instead of creating a temporary directory by default. If we really want a tmp dir in the future, we can always add it as a new feature.
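The default-log_dir decision above can be sketched as a small helper. This is an assumption-laden illustration (resolve_log_dir is a hypothetical name, not part of the PR): when the caller passes no log_dir, fall back to a timestamped folder under runs/, mirroring SummaryWriter's default behavior rather than a temporary directory.

```python
import datetime
import os
import socket

def resolve_log_dir(log_dir=None, root="runs"):
    """Hypothetical sketch: keep an explicit log_dir, else default to runs/."""
    if log_dir is not None:
        return log_dir
    # Timestamp + hostname, similar to SummaryWriter's default folder naming.
    stamp = datetime.datetime.now().strftime("%b%d_%H-%M-%S")
    return os.path.join(root, f"{stamp}_{socket.gethostname()}")
```

A tmp-dir default could later be added as an opt-in feature without breaking this behavior.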

@Wauplin Wauplin changed the title [Draft] Experimental hf logger Experimental hf logger May 31, 2023
@Wauplin Wauplin marked this pull request as ready for review May 31, 2023 11:12
@Wauplin Wauplin merged commit b4e01b0 into main Jun 13, 2023
@Wauplin Wauplin deleted the experimental-hf-logger branch June 13, 2023 10:18