
[FEA] Make CuImage pickleable #326

Open
luzuku opened this issue Jul 4, 2022 · 9 comments

@luzuku

luzuku commented Jul 4, 2022

Is your feature request related to a problem? Please describe.
CuImage is currently incompatible with PyTorch's built-in multiprocessing (specifically with DistributedDataParallel) because it cannot be pickled.

Describe the solution you'd like
I'd like CuImage to be pickleable out of the box. Do you think this is possible?

Describe alternatives you've considered
The only real alternative is to use PyTorch Lightning's own implementation of DDP, which does not have this pickle requirement. In my case this would mean completely re-implementing third-party code, where a lot could go wrong, and I'd rather avoid that if possible.

@luzuku luzuku added the feature request New feature or request label Jul 4, 2022
@luzuku luzuku changed the title [FEA] Make CuImage pickable [FEA] Make CuImage pickleable Jul 4, 2022
@gigony
Contributor

gigony commented Jul 6, 2022

Hi @luzuku,
Thanks for your interest and question!

A CuImage object serves two purposes:

  1. A data loader object that holds internal state such as the file name to load and a system-specific internal TIFF loader object.
  2. An array data object that holds the image data loaded by the read_region() method.

In the following example, the img variable corresponds to 1) and the patch variable corresponds to 2).
Calling the read_region() method on patch does not load from the file; it just returns a cropped image from the image data already held by patch
(variable patch2).

It would be possible for CuImage to support pickle by serializing/deserializing only the metadata and image data (case 2: variables patch or patch2).

However, if you don't need such metadata, I think it is better to convert the CuImage object to a NumPy/CuPy array and pass that to DDP, as shown in the following example.
(Converting CuImage data held in CPU memory (the default) to a NumPy array does not copy memory; it is zero-copy and just passes a pointer to the memory block along with the array information.)

Could you please let us know whether converting to NumPy/CuPy would solve the issue, or give us a minimal concrete example or a detailed use case that prevents you from using DDP with cuCIM? I would like to know whether, if we support pickle for the CuImage class, serializing/deserializing the image data would be enough, or whether we need more than that (e.g., whether recreating a CuImage object with a TIFF loader object from the provided file path on a distributed system is needed).

Thank you!

from cucim import CuImage
import cupy as cp
import numpy as np
import pickle

img: CuImage = CuImage("./notebooks/input/image.tif")  # 1) data loader object
patch: CuImage = img.read_region((0, 0), (256, 256))   # 2) array data object

patch2 = patch.read_region((0, 0), (128, 128))  # crops from data already in patch; no file access

patch_arr1 = np.asarray(patch)  # zero-copy for CPU-resident data
patch_arr2 = cp.asarray(patch)

with open('test.pickle', 'wb') as out_file:
    pickle.dump(patch_arr1, out_file)  # patch_arr2 can be pickled the same way

print(patch.shape)
print(patch2.shape)
# [256, 256, 3]
# [128, 128, 3]
>>> patch2.metadata
{'cucim': {'associated_images': [],
  'channel_names': ['R', 'G', 'B'],
  'coord_sys': 'LPS',
  'dims': 'YXC',
  'direction': [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
  'dtype': {'bits': 8, 'code': 1, 'lanes': 1},
  'ndim': 3,
  'origin': [0.0, 0.0, 0.0],
  'path': './notebooks/input/image.tif',
  'resolutions': {'level_count': 1,
   'level_dimensions': [[128, 128]],
   'level_downsamples': [1.0],
   'level_tile_sizes': [[128, 128]]},
  'shape': [128, 128, 3],
  'spacing': [1.0, 1.0, 1.0],
  'spacing_units': ['micrometer', 'micrometer', 'color'],
  'typestr': '|u1'}}
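
For reference, the pickled array can be read back and checked against the original; a minimal round-trip check, assuming the script above has already been run:

# Round-trip check of the pickled NumPy array (uses the test.pickle file
# written above; not part of the original example).
with open('test.pickle', 'rb') as in_file:
    restored = pickle.load(in_file)

print(restored.shape, restored.dtype)        # (256, 256, 3) uint8
print(np.array_equal(restored, patch_arr1))  # True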

@gigony gigony self-assigned this Jul 6, 2022
@luzuku
Author

luzuku commented Jul 7, 2022

Hi @gigony, thanks for the quick reply!

My use case is basically using cuCIM/CuImage to generate random patches on the fly from ~10,000 TIFF files.

Usually one would pre-generate those patches, save them as JPEGs, and then just have a list of those to loop over for training.
I want to skip this entirely and have a dataset like this:

class Slide:
    def __init__(self, path, region_size):
        self.image = cucim.CuImage(path)
        self.regions = self.find_all_viable_regions(region_size)
        # self.regions = [
        #    [location, size],
        #    ...
        # ]
        
    def read_region(self, *args, **kwargs):
        return np.asarray(self.image.read_region(*args, **kwargs))
        

class SlideDataset(torch.utils.data.Dataset):
    def __init__(self, slide_paths, region_size):
        self.slides = [Slide(path, region_size) for path in slide_paths]
        self.samples = self.make_samples(...)
        # self.samples = [
        #    [slide_idx, region_idx],
        #    ...
        # ]
        
    def __len__(self):
        return len(self.samples)
        
    def __getitem__(self, idx):
        slide_idx, region_idx = self.samples[idx]
        slide = self.slides[slide_idx]
        location, size = slide.regions[region_idx]
        patch = slide.read_region(location, size)
        return patch

When trying to use this with DDP, I just get TypeError: cannot pickle 'cucim.clara._cucim.CuImage' object.

If you need more detail about the use case just let me know!

@jakirkham
Member

IDK if you have already seen this blogpost, but maybe that would be helpful here?

@luzuku
Author

luzuku commented Jul 8, 2022

Thanks @jakirkham! I will have a look at dask_image. But as you seem to be one of the main authors, you might already know the answers to my questions:

  1. Can it handle .svs files with any encoding?
  2. Is the JPEG2000 decoder as fast as the one from cuCIM? Because this is my biggest bottleneck.

@jakirkham
Member

jakirkham commented Jul 8, 2022

Ah, actually I wasn't focused on using dask-image for this specifically (though that could be one path forward), but rather on borrowing the technique from the blogpost to simplify management of all of these TIFF files. It could be used with cuCIM, for example. I know folks who have used this strategy in other fields outside of microscopy as well.

@luzuku
Author

luzuku commented Jul 8, 2022

Interesting, thanks for your input! I'll have to familiarize myself with Dask then and see how it could improve or speed up my approach. So far I was actually pretty content with my naive handling and really just a little unsatisfied with the pickle situation. I found a way to work around it in my case with LightningLite, which supports DDP without the pickle requirement, but it feels messy. And if it is not too much work to make CuImage pickleable, which I can't really assess, then why not do it? 🤷

@gigony
Contributor

gigony commented Jul 8, 2022

Thanks @luzuku for sharing the example! It's a clever approach (loading patches directly instead of saving them to the file system in an intermediate format), and that use case is exactly where cuCIM can help!

It seems that you want to keep a CuImage object that holds the file path information and use it to load partial images (patches) of a particular whole-slide image.

Since DDP needs to send objects to other nodes via serialization/deserialization, the object being sent cannot initialize or store data that cannot be pickled. Instead, we can either

  1. store pre-processed data that is picklable, or
  2. store the parameters needed to recreate the data and do lazy initialization when it is used on the distributed nodes.

Since 1) does not apply to your use case (the data size is too large), we can try the second approach; the following change is what you could try:

class Slide:
    def __init__(self, path, region_size):
-        self.image = cucim.CuImage(path)
+        self.path = path   
+        self.image = None
        self.regions = self.find_all_viable_regions(region_size)
        # self.regions = [
        #    [location, size],
        #    ...
        # ]
    def read_region(self, *args, **kwargs):
        return np.asarray(self.image.read_region(*args, **kwargs))


class SlideDataset(torch.utils.data.Dataset):
    def __init__(self, slide_paths, region_size):
        self.slides = [Slide(path, region_size) for path in slide_paths]
        self.samples = self.make_samples(...)
        # self.samples = [
        #    [slide_idx, region_idx],
        #    ...
        # ]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        slide_idx, region_idx = self.samples[idx]
        slide = self.slides[slide_idx]
+        if slide.image is None:
+            slide.image = cucim.CuImage(slide.path)
        location, size = slide.regions[region_idx]
        patch = slide.read_region(location, size)
        return patch

This blog post (https://junyonglee.me/research/pytorch/How-to-use-LMDB-with-PyTorch-DataLoader) helped me come up with the solution above.

Please let me know if the above approach would work.
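
As a complementary option (standard Python, not a cuCIM feature): if a Slide might already hold an open CuImage by the time DDP pickles the dataset, __getstate__/__setstate__ hooks can drop the handle at pickle time and reopen it lazily on first use. A minimal sketch based on your Slide class (find_all_viable_regions is assumed from your example):

import cucim
import numpy as np

class Slide:
    def __init__(self, path, region_size):
        self.path = path
        self.image = cucim.CuImage(path)
        # find_all_viable_regions comes from the original Slide sketch above.
        self.regions = self.find_all_viable_regions(region_size)

    def __getstate__(self):
        # Drop the unpicklable CuImage handle; keep only plain-Python state.
        state = self.__dict__.copy()
        state['image'] = None
        return state

    def __setstate__(self, state):
        # Restore plain state; the handle is reopened lazily in read_region().
        self.__dict__.update(state)

    def read_region(self, *args, **kwargs):
        if self.image is None:
            self.image = cucim.CuImage(self.path)
        return np.asarray(self.image.read_region(*args, **kwargs))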

@luzuku
Author

luzuku commented Jul 11, 2022

Lazy-loading is such a simple and great idea! The overhead is also minimal. Thanks, I will try it out :) Should I close this issue then or do you still think it is worth your while to make CuImage pickleable?

@jakirkham
Member

It might also be worth checking out the h5pickle project, which has some cleverness around minimizing the number of file handles that are open at a time.


Pickling CuImage objects could make sense, and I understand this could be useful in a variety of scenarios, though we would need to think about it a bit. A few tricky points that come to mind are:

  1. Pickles can be saved to disk (what does that mean, especially if the original file is later removed?).
  2. Without a shared filesystem, pickling may not work between workers (the file may be missing).

We would need to figure out how we want to address those issues. I should add that this isn't unique to cuCIM; other file libraries grapple with similar issues.
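
To make the second point concrete: if pickling were implemented by storing only the file path and reopening it on load, a hypothetical wrapper along these lines (PathPickledImage is illustrative only, not part of cuCIM) would produce pickles that are only usable where that path exists:

import pickle
from cucim import CuImage

class PathPickledImage:
    # Hypothetical wrapper: the pickle carries only the file path and the
    # image is reopened on unpickling.
    def __init__(self, path):
        self.path = path
        self._image = CuImage(path)

    def __reduce__(self):
        # Reconstruction fails if the original file was removed (point 1)
        # or the unpickling node does not share the filesystem (point 2).
        return (PathPickledImage, (self.path,))

    def read_region(self, *args, **kwargs):
        return self._image.read_region(*args, **kwargs)

data = pickle.dumps(PathPickledImage("./notebooks/input/image.tif"))
# pickle.loads(data) only works where ./notebooks/input/image.tif exists.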

@caryr35 caryr35 added this to cucim Nov 15, 2022