torchvision.io.read_image does not always fail gracefully #3613
Comments
Thanks for the report! We will be looking into fixing this!
Thanks for the information @apisutilis, I'll take a detailed look into this one!
It seems the error happens when the PNG reading function tries to destroy the PNG read structure after catching the error. In other words, torchvision does catch the error, but it then segfaults when calling
Which in turn calls https://github.com/glennrp/libpng/blob/a37d4836519517bdce6cb9d956092321eca3e73b/pngread.c#L948, where
In my reproduction scenario, torchvision was able to load the image once, but the second call caused the segfault and produced the message
The proposed solution involves downgrading the zlib version (which I haven't verified myself). I'll try to compile zlib as well as libpng to see if we can get more information.
@andfoy did you have the chance to look at this again?
@fmassa I haven't tried to compile zlib locally, I'll give it a go tomorrow!
@NicolasHug yes, it would be good to have an issue to track supporting PNGs with more than 8 bits.
🐛 Bug
torchvision.io.read_image() will sometimes segfault or abort in other uncatchable ways on malformed images, rather than failing gracefully (e.g. with a RuntimeError).
To Reproduce
Steps to reproduce the behavior:
torchvision.io.read_image:
Expected behavior
I expected that trying to read an unsupported or malformed image would instead raise a RuntimeError or other catchable error so that it could be handled in code, rather than aborting.
Environment
PyTorch version: 1.8.1+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
CMake version: version 3.20.0
Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1050
Nvidia driver version: 460.67
cuDNN version: /usr/local/cuda-10.2/lib64/libcudnn.so.7.6.4
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.20.1
[pip3] torch==1.8.1
[pip3] torchvision==0.9.1
Additional context
Something even stranger also happens with this particular image: setting the mode to ImageReadMode.RGB allows it to be read once, but attempting to read it a second time fails as above (i.e. torchvision.io.read_image is not idempotent). I'm not sure whether this behavior is related, but whatever the root cause is, it would be nice to be able to just catch an error, e.g. to log the filename and skip the image during processing.
Some quick investigation shows that the problematic images that exhibit this behavior are usually PNGs with a depth of 16 bits. OpenCV and PIL do not appear to have problems reading them.
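Since the problematic files are usually 16-bit PNGs, one possible stopgap is to inspect the PNG header before handing the file to torchvision. The sketch below is only an illustration of that idea, not part of torchvision: the helper names png_bit_depth and has_at_most_8_bit_depth are hypothetical, and the byte offsets assume the standard PNG layout (an 8-byte signature followed by the IHDR chunk).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_bit_depth(header: bytes):
    """Return the IHDR bit depth, or None if this does not look like a PNG header.

    Hypothetical helper. Expects at least the first 29 bytes of the file:
    8-byte signature + 4-byte chunk length + 4-byte chunk type + 13-byte IHDR data.
    """
    if len(header) < 29:
        return None
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    # IHDR data starts at offset 16: width (4 bytes), height (4 bytes),
    # bit depth (1 byte), color type (1 byte), all big-endian.
    _width, _height, bit_depth, _color_type = struct.unpack(">IIBB", header[16:26])
    return bit_depth


def has_at_most_8_bit_depth(path: str) -> bool:
    """Return True only for files with a readable PNG header and bit depth <= 8."""
    with open(path, "rb") as f:
        header = f.read(29)
    depth = png_bit_depth(header)
    return depth is not None and depth <= 8
```

A processing loop could log and skip any PNG for which has_at_most_8_bit_depth returns False. This only screens for the 16-bit case reported here; other malformed files could presumably still crash the reader.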
Additionally, the error message changes sometimes, e.g. to "Segmentation fault" or "double free or corruption (out)".
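Because a segfault or a glibc abort kills the whole interpreter, a try/except in the same process cannot intercept it. Until the reader fails gracefully, one workaround is to attempt the read in a child process and look at its exit status. The following is a minimal sketch under that assumption; runs_without_crashing and can_read_image are hypothetical names, and the child snippet assumes torchvision is importable by the same interpreter.

```python
import subprocess
import sys


def runs_without_crashing(snippet: str, *args: str, timeout: float = 60.0) -> bool:
    """Run a Python snippet in a child interpreter.

    Returns True only if the child exits with status 0. A segfault, abort,
    uncaught exception, or timeout all count as failure.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet, *args],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def can_read_image(path: str) -> bool:
    """Probe whether torchvision can decode the file without killing the process."""
    # With `python -c`, extra arguments are available as sys.argv[1:].
    snippet = (
        "import sys\n"
        "from torchvision.io import read_image\n"
        "read_image(sys.argv[1])\n"
    )
    return runs_without_crashing(snippet, path)
```

The caller can then log the filename and skip the image when can_read_image returns False. Starting a new interpreter and importing torchvision for every file is slow, so this is better suited to a one-off dataset cleaning pass than to a training loop.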