Generalize GridPatchDataset and PatchDataset. Potentially introduce Persistent and/or Cache IterableDatasets. #6904

Closed
ibro45 opened this issue Aug 24, 2023 · 3 comments

Comments

@ibro45
Contributor

ibro45 commented Aug 24, 2023

Is your feature request related to a problem? Please describe.

  • GridPatchDataset's data argument must be a sequence of already loaded data. It would be more consistent with the other datasets to accept other kinds of input and load or manipulate them through a transform. For example, you might pass a dict of image and label paths and load them with LoadImaged().
  • PatchDataset should follow the same pattern. Moreover, its transform argument does the same job as patch_transform in GridPatchDataset, so it should adopt that name.
  • Finally, it would be useful to have Persistent or Cache versions of these (or just any IterableDataset) - is that something that you would want to have?
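
To make the caching request in the last point concrete, here is a minimal pure-Python sketch of an in-memory cache around an iterable source. `CachedIterable` is a hypothetical name used only for illustration; it is not an existing MONAI class, and a persistent variant would write to disk instead of a list.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class CachedIterable:
    """Hypothetical wrapper: applies `transform` once per item, then replays the results."""

    def __init__(self, data: Iterable, transform: Optional[Callable] = None):
        self.data = data
        self.transform = transform
        self._cache: List = []
        self._complete = False

    def __iter__(self) -> Iterator:
        if self._complete:
            # Later passes: replay cached results without touching the source.
            yield from self._cache
            return
        fresh: List = []
        for item in self.data:
            out = self.transform(item) if self.transform else item
            fresh.append(out)
            yield out
        # Mark the cache as usable only after a full pass, so an interrupted
        # pass never leaves a truncated cache behind.
        self._cache = fresh
        self._complete = True
```

With this sketch, the expensive `transform` (for example, loading a file) runs during the first full pass only; every later pass iterates over the stored results.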

Describe the solution you'd like

  • GridPatchDataset currently has patch_transform, but should have two transforms: transform and patch_transform. The transform is to be used the same as in the IterableDataset and is applied prior to patching, while patch_transform is applied to the patch.
  • PatchDataset should follow the same.
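
The two-stage behaviour proposed above could look roughly like the following pure-Python sketch. It only illustrates the order in which the two transforms are applied; `TwoStagePatchDataset` and `iter_grid_patches` are hypothetical names, nested lists stand in for image arrays, and nothing here reflects the actual MONAI implementation.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Sequence


def iter_grid_patches(image: Sequence[Sequence[float]], patch_size: int) -> Iterator[List[List[float]]]:
    """Yield non-overlapping square patches in row-major order (edge remainders are dropped)."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - patch_size + 1, patch_size):
        for c in range(0, cols - patch_size + 1, patch_size):
            yield [list(row[c:c + patch_size]) for row in image[r:r + patch_size]]


class TwoStagePatchDataset:
    """Hypothetical dataset: `transform` runs per item before patching, `patch_transform` per patch."""

    def __init__(
        self,
        data: Iterable,
        patch_size: int,
        transform: Optional[Callable] = None,
        patch_transform: Optional[Callable] = None,
    ):
        self.data = data
        self.patch_size = patch_size
        self.transform = transform
        self.patch_transform = patch_transform

    def __iter__(self) -> Iterator:
        for item in self.data:
            # Stage 1: item-level transform, e.g. turning a file path into an image.
            image = self.transform(item) if self.transform else item
            for patch in iter_grid_patches(image, self.patch_size):
                # Stage 2: patch-level transform, applied to each extracted patch.
                yield self.patch_transform(patch) if self.patch_transform else patch
```

In this sketch, `data` can hold anything the item-level `transform` knows how to load (such as paths), which is the consistency with the other datasets that the proposal asks for.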

Describe alternatives you've considered
None

Additional context
I started looking into these patch-based datasets because I want to perform validation on large images during my training. Specifically, images are sometimes too big to run SlidingWindowInferer on the GPU, and using cpu_thresh just results in a very long computation. I hope to use these datasets as a replacement, which should be much faster if a Persistent/Cache mode is available too. The approach basically emulates sliding window inference: you still split the image into patches with some overlap and pass them through the network to get predictions, with the caveat that the predicted patches are never merged and are instead evaluated individually against the corresponding label patches.
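
As a rough illustration of the patch-wise validation described above, the sketch below works on 1D lists in pure Python. All names (`window_starts`, `dice`, `patchwise_scores`) are hypothetical, `model` is any callable that maps a patch to a binary prediction, and Dice is chosen as the metric only as an example.

```python
from typing import Callable, List, Sequence


def window_starts(length: int, patch: int, overlap: int) -> List[int]:
    """Start indices of windows of size `patch` that cover [0, length)."""
    if not 0 <= overlap < patch:
        raise ValueError("overlap must satisfy 0 <= overlap < patch")
    if length <= patch:
        return [0]
    step = patch - overlap
    starts = list(range(0, length - patch + 1, step))
    if starts[-1] != length - patch:
        # Shift the final window back so that it ends exactly at the boundary.
        starts.append(length - patch)
    return starts


def dice(pred: Sequence[int], label: Sequence[int]) -> float:
    """Dice score for binary sequences; two empty masks count as a perfect match (a convention)."""
    intersection = sum(p * l for p, l in zip(pred, label))
    total = sum(pred) + sum(label)
    return 1.0 if total == 0 else 2.0 * intersection / total


def patchwise_scores(
    image: Sequence[float],
    label: Sequence[int],
    model: Callable[[Sequence[float]], Sequence[int]],
    patch: int,
    overlap: int,
) -> List[float]:
    """Score each predicted patch against its label patch; predictions are never merged."""
    scores = []
    for start in window_starts(len(image), patch, overlap):
        pred = model(image[start:start + patch])
        scores.append(dice(pred, label[start:start + patch]))
    return scores
```

Because the windows overlap and each patch is scored on its own, the mean of these scores is generally not equal to the metric computed on a merged whole-image prediction.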

I am happy to make a PR for the first two points. I am also interested in implementing the third point if you agree with it and if we discuss how we should approach it.

@KumoLiu
Contributor

KumoLiu commented Aug 25, 2023

Hi, @ibro45, thanks for your proposal!

Thanks!

@wyli
Contributor

wyli commented Aug 26, 2023

there are multiple feature requests for caching:

I think we should add this feature @ericspod @KumoLiu @Nic-Ma

see also related report #6585

@ericspod
Member

I think the patch dataset classes can definitely be improved, though I wonder how much this now overlaps with the WSI dataset and reader concepts. Unlike WSI, the patch dataset classes load whole files and sample from them, but otherwise there's some similarity. I do see some inefficiencies, such as with caching and continually indexing datasets; looking at the WSI implementation could give us some better ideas of what to improve here. CC @drbeh

I'm adding this ticket to our backlog and I'd suggest we look at this for 1.4 as we're trying for a faster 1.3 release.

KumoLiu added a commit to KumoLiu/MONAI that referenced this issue Nov 1, 2023
Signed-off-by: KumoLiu <yunl@nvidia.com>
ericspod added a commit that referenced this issue Nov 17, 2023
Part of #6904

### Description
- Fix inefficient patching in `PatchDataset`
- Add cache option in `GridPatchDataset`

### Types of changes
- [x] Non-breaking change (fix or new feature that would not break
existing functionality).
- [ ] Breaking change (fix or new feature that would cause existing
functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Integration tests passed locally by running `./runtests.sh -f -u
--net --coverage`.
- [ ] Quick tests passed locally by running `./runtests.sh --quick
--unittests --disttests`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated, tested `make html` command in the `docs/`
folder.

---------

Signed-off-by: KumoLiu <yunl@nvidia.com>
Signed-off-by: YunLiu <55491388+KumoLiu@users.noreply.github.com>
Co-authored-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@vikashg vikashg closed this as completed Dec 20, 2023
marksgraham pushed a commit to marksgraham/MONAI that referenced this issue Jan 30, 2024
Signed-off-by: Mark Graham <markgraham539@gmail.com>
juampatronics pushed a commit to juampatronics/MONAI that referenced this issue Mar 25, 2024
Signed-off-by: Juan Pablo de la Cruz Gutiérrez <juampatronics@gmail.com>
Yu0610 pushed a commit to Yu0610/MONAI that referenced this issue Apr 11, 2024
Signed-off-by: Yu0610 <612410030@alum.ccu.edu.tw>
Projects
Status: Done

5 participants