
DALI support #608

Open
moskomule opened this issue Sep 20, 2018 · 13 comments

@moskomule
Contributor

Hi, is there any plan to integrate DALI (https://docs.nvidia.com/deeplearning/sdk/dali-developer-guide/docs/index.html) into torchvision for faster preprocessing? I found that Chainer is trying to integrate it (chainer/chainer#5067).

@fmassa
Member

fmassa commented Sep 21, 2018

Hi,
Thanks for opening the issue. I'll have a look at this

@moskomule
Contributor Author

Thank you. Lately I've found that the image preprocessing steps are the bottleneck. I'll try DALI myself and report how much it speeds up processing.

@sotte
Contributor

sotte commented Oct 2, 2018

albumentations is also a contender for faster image augmentation.

In my experience, IO is often a bigger bottleneck than a "slow" pre-processing library. SSDs and NVMe drives(!) help a lot.
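
For reference, a minimal albumentations pipeline looks roughly like this (a sketch only; it assumes an HWC uint8 NumPy image, and the exact transform set is just an example):

import albumentations as A
import numpy as np

# A sketch of a basic augmentation pipeline; Resize, HorizontalFlip and
# Normalize are standard albumentations transforms.
transform = A.Compose([
    A.Resize(256, 256),
    A.HorizontalFlip(p=0.5),
    A.Normalize(),
])

image = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)  # dummy HWC image
augmented = transform(image=image)["image"]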

@msaroufim
Member

Hi @datumbox, it's been a while since this issue has had any discussion. I'm curious whether there are any plans to make this happen?

@datumbox
Contributor

@msaroufim we are currently working to improve the data loading process using PyTorch Data. We do not have immediate plans to integrate DALI directly at the moment, but we can revisit this in the future. As we have very limited resources, I think it's more realistic that such an investigation happens after the release of the new Datasets API.

cc'ing @NicolasHug and @pmeier, who lead the work on datasets.

@msaroufim
Member

msaroufim commented Apr 25, 2022

Oh interesting, so the way you'd integrate new backends in the future is to integrate them within torch.data? Also, where can I learn more about the new Datasets API?

cc @VitalyFedyunin @ejguan @wenleix

@pmeier
Collaborator

pmeier commented Apr 26, 2022

Oh interesting so the way you'd integrate new backends in the future is to integrate them within torch.data?

Not sure what you mean by "backends" here. In general you are right though. torchdata is the way to go for the new datasets.

Also where can I learn more about the new Datasets API?

There is no public documentation yet. However, we already have quite a large collection of datasets ported to the new structure. You can access them with torchvision.prototype.datasets.load(name), where name is the name of the dataset you want to load. For example:

from torchvision.prototype import datasets

dataset = datasets.load("voc")

The dataset object is a regular IterDataPipe defined by torchdata. To transform it you can use the .map method. It takes a callable that will be executed for each sample in the dataset. This sample will be a dictionary with str keys. For example, a simple data pipeline could look like this:

from torchvision.prototype import transforms

transform = transforms.Compose(
    transforms.DecodeImage(),
    transforms.Resize(256),
    transforms.CenterCrop(256),
)

for sample in dataset.map(transform):
    ...

For everything else, please also have a look at the torchdata documentation.
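
As a rough sketch of how such a pipe plugs into training (assuming the dataset and transform from the snippet above), further datapipe operations can be chained before handing the result to a DataLoader; shuffle and batch are standard IterDataPipe operations:

from torch.utils.data import DataLoader

# Sketch only: .shuffle() and .batch() are built-in datapipe operations;
# batch_size=None disables the DataLoader's own batching, since the pipe
# already yields lists of 4 samples.
pipe = dataset.map(transform).shuffle().batch(4)
loader = DataLoader(pipe, batch_size=None)

for batch in loader:
    ...  # batch is a list of 4 sample dicts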

@abhi-glitchhg
Contributor

Adding to @pmeier's comment, this tutorial might help you.

@msaroufim
Member

msaroufim commented Apr 26, 2022

@pmeier to clarify, by backend I mean one of these: https://github.com/pytorch/vision#image-backend, i.e. Pillow, accimage, Pillow-SIMD, etc.

Overall the new interface for adding datasets looks good, but I'm more curious about adding new backends like DALI. In particular, DALI has accelerated image processing kernels and accelerated image decoding, which I think would be very useful to integrate into torchvision directly; it feels too domain-specific to live in torch.data IMHO, and it is similar enough to other backends like accimage to belong in vision. What's the process for adding a new backend? If it's similar to the one for accimage (https://github.com/pytorch/vision/blob/main/torchvision/transforms/functional.py#L13), I can make a PR for this.
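
For context, the existing backend switch is essentially a global flag plus per-call dispatch; a DALI-backed loader could hypothetically hook into the same pattern (a sketch only: the "dali" backend name and load_with_dali helper below do not exist in torchvision, only the PIL/accimage dispatch does):

import torchvision

# Real torchvision API: the image backend is a global switch between "PIL" and "accimage".
torchvision.set_image_backend("accimage")  # requires accimage to be installed

def load_image(path):
    backend = torchvision.get_image_backend()
    if backend == "accimage":
        import accimage
        return accimage.Image(path)   # how torchvision's accimage loader works
    if backend == "dali":             # hypothetical backend name, illustrative only
        return load_with_dali(path)   # hypothetical helper
    from PIL import Image
    return Image.open(path).convert("RGB")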

The other option is to integrate the DALI data loader as a DataPipe in torch.data.

Here's a good primer on DALI and its value proposition https://cceyda.github.io/blog/dali/cv/image_processing/2020/11/10/nvidia_dali.html
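
For a sense of the value proposition, a minimal DALI pipeline sketch (assuming nvidia-dali is installed and JPEGs live under a hypothetical /data/images directory) moves both decoding and resizing onto the GPU:

from nvidia.dali import pipeline_def, fn
from nvidia.dali.plugin.pytorch import DALIGenericIterator

@pipeline_def(batch_size=32, num_threads=4, device_id=0)
def decode_and_resize():
    # device="mixed" reads files on the CPU and decodes on the GPU (nvJPEG).
    jpegs, labels = fn.readers.file(file_root="/data/images", random_shuffle=True, name="reader")
    images = fn.decoders.image(jpegs, device="mixed")
    images = fn.resize(images, resize_x=256, resize_y=256)
    return images, labels

pipe = decode_and_resize()
pipe.build()
loader = DALIGenericIterator(pipe, ["images", "labels"], reader_name="reader")

for batch in loader:
    images = batch[0]["images"]  # already a CUDA tensor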

@VitalyFedyunin @wenleix please chime in on where you think the most natural place for a DALI integration is

@ejguan
Contributor

ejguan commented Apr 26, 2022

The other option is to integrate the DALI data loader as a data pipe in torch.data

Thanks @msaroufim, I had the same feeling about making it a separate DataPipe, because it requires different behavior compared with datapipe.map, such as making sure this DataPipe only runs in a single process to prevent the CUDA context from being copied around. It definitely needs a deeper look at DALI itself.
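
A rough sketch of that shape, purely illustrative (the class and argument names are made up; the single-process constraint from the comment above is only noted in the docstring):

from torch.utils.data import IterDataPipe

class DALIPipelineDataPipe(IterDataPipe):
    """Hypothetical wrapper exposing a built DALI iterator as an IterDataPipe.

    It must run in a single process/worker so the CUDA context created by
    DALI is not copied into DataLoader workers.
    """

    def __init__(self, dali_iterator):
        self.dali_iterator = dali_iterator  # e.g. a DALIGenericIterator

    def __iter__(self):
        for batch in self.dali_iterator:
            yield batch[0]  # one dict of CUDA tensors per GPU
        self.dali_iterator.reset()  # allow re-iteration across epochs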

@msaroufim
Member

Seems like there's a good workaround too: NVIDIA/DALI#3081 (comment). I'll take a more thorough look.

@pmeier
Collaborator

pmeier commented Apr 27, 2022

@msaroufim

to clarify, by backend I mean one of these: https://github.com/pytorch/vision#image-backend, i.e. Pillow, accimage, Pillow-SIMD, etc.

The new datasets will return a features.EncodedImage, which is a 1D uint8 tensor just storing the raw bytes. You can decode it however you want. Right now, transforms.DecodeImage() uses PIL as the backend:

class DecodeImage(Transform):
    def _transform(self, input: Any, params: Dict[str, Any]) -> Any:
        if isinstance(input, features.EncodedImage):
            output = F.decode_image_with_pil(input)
            return features.Image(output)
        else:
            return input

def decode_image_with_pil(encoded_image: torch.Tensor) -> torch.Tensor:
    image = torch.as_tensor(np.array(PIL.Image.open(ReadOnlyTensorBuffer(encoded_image)), copy=True))
    if image.ndim == 2:
        image = image.unsqueeze(2)
    return image.permute(2, 0, 1)

but you can use arbitrary backends there.
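
For instance, a drop-in decoder built on torchvision.io instead of PIL might look like this (a sketch; torchvision.io.decode_image is an existing function, the wrapper name is made up):

import torch
from torchvision.io import decode_image

def decode_image_with_torchvision_io(encoded_image: torch.Tensor) -> torch.Tensor:
    # `encoded_image` is the 1D uint8 tensor of raw bytes described above;
    # decode_image returns a CHW uint8 tensor, matching the PIL-based path.
    return decode_image(encoded_image)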

@abhi-glitchhg
Contributor

Similar issue on the torchdata repo: pytorch/data#761.
Might be good to keep an eye on this :)
