Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add str method to datamodule #20301

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

MrWhatZitToYaa
Copy link

@MrWhatZitToYaa MrWhatZitToYaa commented Sep 25, 2024

What does this PR do?

Added a str function to the DataModule in order to be able to print Datasets included in the module instead of an adress.

Fixes #9947

Before submitting
  • Was this discussed/agreed via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist
  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

📚 Documentation preview 📚: https://pytorch-lightning--20301.org.readthedocs.build/en/20301/

@github-actions github-actions bot added the pl Generic label for PyTorch Lightning package label Sep 25, 2024
@MrWhatZitToYaa
Copy link
Author

Will add documentation once code is fixed

@MrWhatZitToYaa MrWhatZitToYaa marked this pull request as ready for review October 3, 2024 15:22
@MrWhatZitToYaa
Copy link
Author

I have been waiting for quite some time. How do I go on? Do I just wait some more until someone looks at this?

@lantiga lantiga changed the title Feature/9947 Add str method to datamodule Add str method to datamodule Nov 12, 2024
@lantiga
Copy link
Collaborator

lantiga commented Nov 12, 2024

hey @MrWhatZitToYaa thanks for taking this up and sorry for the wait

can you add a bit more detail on what do we think it's good printing from a datamodule?

the original design in #9947 was a bit different, as was #9967, but happy to consider alternatives after there's some discussion on the rationale

@lantiga lantiga added waiting on author Waiting on user action, correction, or update lightningdatamodule pl.LightningDataModule labels Nov 12, 2024
@MrWhatZitToYaa
Copy link
Author

hey @MrWhatZitToYaa thanks for taking this up and sorry for the wait

can you add a bit more detail on what do we think it's good printing from a datamodule?

the original design in #9947 was a bit different, as was #9967, but happy to consider alternatives after there's some discussion on the rationale

The idea originally came to me when I had to to use the Datamodule to implement some datasets we have in our lab. It was really annoying to deal with the fact that you never really know what you are dealing with and makes debugging really hard. I talked to some colleagues and they agreed that they ran into the same problem. That's why I would like to add a str functionality for the Datamodule: primarily for debugging / logging.
I think the simplest approach is to scan the whole module for instances of datasets and then return a string where each line contains the name of the dataset and the length if it is available. This would be a big quality of life improvement when dealing with / constructing Datamodules.

@github-actions github-actions bot added docs Documentation related ci Continuous Integration fabric lightning.fabric.Fabric dependencies Pull requests that update a dependency file dockers package labels Nov 20, 2024
Added alternative Boring Data Module implementations
Added test cases for all possible options
Added additional check for NotImplementedError in string function of DataModule
Made changes to comply with requested suggestions
Switched from hardcoded \n to more general os.linesep
lantiga and others added 2 commits November 21, 2024 00:49
Corrected the annotation for the internal function and the list that is suppsoed to store the information on the datasets
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Continuous Integration dependencies Pull requests that update a dependency file dockers docs Documentation related fabric lightning.fabric.Fabric lightningdatamodule pl.LightningDataModule package pl Generic label for PyTorch Lightning package waiting on author Waiting on user action, correction, or update
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support str(datamodule)
3 participants