CSVLogger fails if save_dir is an s3 path #16196
Comments
Fixed formatting issues in the description.
I also have this issue, and I noticed another problem that I think is related. When I run my I happened to have this working in PyCharm, so I ran a quick debug session and confirmed that This makes me think that some other subroutine is being passed a local filesystem and is creating the requisite paths. Since S3 is a key-value store, you must have some special function to create those empty directories on S3 in order to get that check to pass, right (assuming this worked properly in the past)? I'm very new to Lightning (slowly dragging myself away from all of my very old Keras code), so I don't know the code base well enough to just dive in and patch this, but it is definitely a serious irritant to my workflow.
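The suspicion in the comment above (a routine treating the S3 URL as a local path) can be illustrated with the standard library alone. This is a sketch of the general failure mode, not a confirmed trace of Lightning's code; `make_log_dir_locally` and the bucket name are hypothetical, and the behavior shown assumes a POSIX system.

```python
import os
import tempfile


def make_log_dir_locally(save_dir, cwd):
    """Hypothetical helper: create a log directory using local-filesystem calls only.

    A remote URL such as "s3://my-bucket/logs" is not interpreted as a URL here;
    it is treated as an ordinary relative path under `cwd`.
    """
    target = os.path.join(cwd, save_dir)
    os.makedirs(target, exist_ok=True)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_log_dir_locally("s3://my-bucket/logs", tmp)
        # On POSIX this lists a local directory literally named "s3:".
        print(os.listdir(tmp))
```

Nothing is written to S3 in this sketch: the local call succeeds and quietly creates a directory tree on disk, which is one way a later existence check against the remote path could fail.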
Why is this labeled as a feature and not a bug?
Thank you @Borda. I appreciate that.
This is considered a feature and not a bug because fsspec support for the CSV logger is not implemented. If it were implemented but not working properly, we would consider it a bug.
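For readers unfamiliar with fsspec, here is a minimal sketch of what fsspec-backed CSV writing could look like. It is not Lightning's implementation: `write_metrics_csv` and the `metrics.csv` file name are hypothetical, and the example uses fsspec's in-memory filesystem so it runs without credentials. An `s3://` URL would additionally need the `s3fs` package and valid AWS credentials.

```python
import csv

from fsspec.core import url_to_fs


def write_metrics_csv(save_dir, rows, fieldnames):
    """Hypothetical helper: write metric rows to <save_dir>/metrics.csv via fsspec."""
    # url_to_fs picks the filesystem from the URL protocol and returns
    # the path with the protocol stripped.
    fs, path = url_to_fs(save_dir)
    fs.makedirs(path, exist_ok=True)
    metrics_path = f"{path}/metrics.csv"
    # newline="" lets the csv module control line endings.
    with fs.open(metrics_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return fs, metrics_path


if __name__ == "__main__":
    rows = [{"step": 0, "loss": 1.5}, {"step": 1, "loss": 1.0}]
    fs, metrics_path = write_metrics_csv("memory://logs/run1", rows, ["step", "loss"])
    print(fs.cat(metrics_path).decode())
```

The point of the sketch is that directory creation and file opening both go through the filesystem object returned for the URL, rather than through `os` and the built-in `open`.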
Oh, that's interesting. Maybe the documentation needs to be changed then? I was going by the Remote Filesystems documentation page, which has this example at the top of the page:
If my understanding is correct, that example should use the Thank you very much for the fast PR though. Maybe it's fixing a docs bug and adding a new feature?
Bug description
Cloud checkpoints are cool! But I also want CSVLogger to periodically write to cloud storage. This doesn't work.
Related bug: #16195. See 'More info' at the bottom of this issue.
There are some related issues:
#14325
#5935
#11769
https://github.com/Lightning-AI/lightning/issues/15539
#2318
#2161
but I haven't found this specifically.
How to reproduce the bug
Here is a Google Colab notebook that reproduces this bug and a related one. I share the code for both because it's easier to configure the AWS credentials once and see both bugs together.
Copying and pasting the most important bit (but see the Colab for a full minimal reproduction):
Error messages and logs
Environment
More info
What I really want for Christmas this year, all packaged together:
trainer.default_root_dir
that also saves checkpoints to s3.
cc @Borda