PIL version check for enum change appears to break SIMD versions #6153
Comments
The version string for Pillow-SIMD is of the form `9.0.0.post1`, which the new version check does not handle.
We should definitely use `packaging.version`:

```python
from packaging.version import Version

v = Version("9.0.0.post1")
v > Version("9.1.0")
# False
```

However, the main drawback with …
@vfdev-5 I suggested for the PR in timm that the user check for the existence of the type instead; that check is True for 9.1.0 and False for 9.0.0.
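A minimal sketch of that feature check, assuming the enum in question is `Image.Resampling` (added in Pillow 9.1.0):

```python
from PIL import Image

# Prefer the new enum when it exists; fall back to the old
# module-level constant on older Pillow / Pillow-SIMD builds.
if hasattr(Image, "Resampling"):
    BILINEAR = Image.Resampling.BILINEAR
else:
    BILINEAR = Image.BILINEAR
```

This works regardless of how the version string is formatted.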
Thanks for the suggestion. I hope the Pillow guys won't move it elsewhere in 10.X :)
PyTorch Lightning also uses @rwightman's approach, checking for the existence of the value (in a try-except or with `hasattr`).
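For illustration, the try-except flavor of the same check could look like this (a sketch, not Lightning's actual code):

```python
# Import the new enum if it exists, otherwise fall back to the
# old constant; no version-string parsing involved.
try:
    from PIL.Image import Resampling  # Pillow >= 9.1.0
    BICUBIC = Resampling.BICUBIC
except ImportError:
    from PIL.Image import BICUBIC  # older Pillow / Pillow-SIMD
```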
@adamjstewart it's an extra dep though, as it's not builtin. And why would the warning message still exist? If the type exists you'd use the new one and never the old... how could it warn?
Oh true, if you use the newer one first that should be fine. Disregard what I said, both are valid approaches. |
@adamjstewart If pytorch was bringing …
@rwightman by the way, any particular reason for using Pillow-SIMD instead of tensor images and tensor data aug pipelines?
@vfdev-5 Pillow-SIMD was still significantly faster last time I checked (last fall I think). I'm often CPU bound in dataloader pre-processing so it has a big impact. Doing it on GPU eats precious GPU memory.
It's often underrated how fast Pillow-SIMD actually is: when you add the proper filtering to OpenCV (to prevent aliasing on downsample), Pillow-SIMD is usually faster than cv2 as well for typical pre-proc / aug pipelines. It's just a shame it's not portable or easy to integrate into most environments...
@rwightman what kind of ops do you typically do in the data aug pipeline? Resizing, padding, cropping, maybe some radiometric transforms (e.g. color jitter)?
@vfdev-5 yes, those basic ones for inference, but a wider array for train aug: solarize, posterize, other LUTs, rotate, skew, other geometric ops... but even just the resize is a big one. Pillow-SIMD is heavily optimized to do many (most?) of the ops in uint8; last I looked, many of the tensor ops necessitated using float, which is a big hit. I also try to move to the GPU whenever I can before converting to float to keep bandwidth down.
Yeah, doing a resize on uint8 would be great. Right now, here are a few numbers for PIL (without SIMD) vs torch interpolate CPU (multithreaded) vs torch interpolate CUDA:
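A minimal sketch of how such a comparison can be run (the sizes, mode, and iteration count here are assumptions, not the setup behind the numbers above):

```python
import time

import torch
from PIL import Image

img = Image.new("RGB", (1024, 1024))
t = torch.rand(1, 3, 1024, 1024)  # float32; uint8 paths were limited at the time

def bench(fn, iters=100):
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1e3  # ms per call

pil_ms = bench(lambda: img.resize((224, 224), Image.BILINEAR))
cpu_ms = bench(lambda: torch.nn.functional.interpolate(
    t, size=(224, 224), mode="bilinear", antialias=True))
print(f"PIL: {pil_ms:.2f} ms, torch CPU: {cpu_ms:.2f} ms")
```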
cool, thanks for the numbers. I measured Pillow-SIMD to be 8x faster than Pillow (I think that was bicubic), so that would maybe put it ~3x faster than the CPU tensor numbers.
Well, there's room for improvement for pytorch :)
yup! you've been making great progress, I just need ALL the CPU munch munch :) |
@vfdev-5 one last thing to keep in mind re the gap and future performance characterization: the Pillow-SIMD numbers are single threaded (I don't believe it uses any thread parallelization), so if 8 threads are pinned for those tensor numbers, that is significant... esp when you're often pinning 16-48 cores with pre-processing...
Yes, you are right, PIL is single threaded. Numbers for a single thread are a bit different: torch interpolate with AA on CPU (3 channels, float32) vs PIL RGB (uint8) are almost similar, with torch a bit slower: pytorch/pytorch#68819. So, ~8x slower than PIL-SIMD according to your estimate. In the previous numbers I posted, I wanted to show the difference between CUDA and multithreaded CPU.
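For reference, a single-thread comparison just needs torch's intra-op parallelism pinned first:

```python
import torch

# Pin torch to one thread so CPU numbers are comparable to
# single-threaded PIL / Pillow-SIMD.
torch.set_num_threads(1)
```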
@rwightman Thanks for reporting. Is there any specific reason you propose checking with `hasattr` rather than looking the name up in `dir`?

@vfdev-5 Thanks for the ultra fast fix. I left this comment on the PR. I've tested with SIMD, PIL 9.0 and 9.1, and it works. Let me know what you think, thanks!
@datumbox no strong reason, hasattr works. I think in this case there wouldn't be any dir gotchas as it's an enum in a module; hasattr does have stronger guarantees re attribute resolution in all cases though.
hasattr also seems to conceptually be a better fit for checking for a single thing, as dir creates an inventory of all symbols just to check for one... |
@t-vi yeah, true, forgot that dir is a list (I was thinking of vars, that's a dict), so the lookup is meh. hasattr should be faster.
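The difference in a nutshell:

```python
from PIL import Image

# dir() materializes a sorted list of every attribute name and the
# membership test then scans that list; hasattr() is a single
# attribute lookup wrapped in a try/except.
"Resampling" in dir(Image)    # builds the full inventory first
hasattr(Image, "Resampling")  # one getattr
```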
🐛 Describe the bug
This change appears to break the current Pillow-SIMD version: #5898

Amusingly enough, I warned against this approach in a user's PR in timm: huggingface/pytorch-image-models#1256

Would be nice to have it fixed before 1.12 is finalized; I just hit this trying out the RC.
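For context, the failure mode looks roughly like this (an assumed reconstruction, not the exact code from #5898):

```python
import PIL

# Pillow-SIMD reports a version like "9.0.0.post1", so a naive
# numeric parse raises ValueError on the "post1" component before
# any comparison can even happen.
version = tuple(int(part) for part in PIL.__version__.split("."))
```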
Versions
PT 1.12 RC, TV 0.13.0