premature end of JPEG images #916
This is caused by a corrupted image. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
@glenn-jocher If the error occurs at the beginning of training and shows "Premature end of JPEG file", is it due to a corrupted image? |
@seekFire it's not an error, it's a message, and it's self-descriptive. |
Dear @glenn-jocher, |
@jaqub-manuel I think this is a very low-level C++ warning in the cv2 image loader. It does not produce an error, and it is not possible to tag these images as corrupted in any way that I currently know of. Stack Overflow has a few conversations on the topic. The result is that the image will only partially load, and the rest of the area will be black. One or two images with this problem should not harm your dataset. |
Many Thanks for clarification... |
@jacklinquan @glenn-jocher
Does it mean only for these many files? |
@sramakrishnan247 it looks like 8 of your files have 'premature end of JPEG'. This is a low level warning, and is not caught by python asserts or cv2 loading errors, so these files will all be used for training. |
@glenn-jocher |
Can anyone please let us know how to find which images have this problem or how can we fix these images? |
@madr3z this is a low level warning, and is not caught by python asserts or cv2 loading errors, so these files will all be used for training. There is currently no way to identify them, though you could always debug this by printing each image name as it's cached and observing which coincides with the messages. |
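The debugging approach suggested above could be sketched roughly as follows. This is only an illustration: `load_with_trace` is a hypothetical helper that is not part of YOLOv5, and it assumes the decoder writes its warning to stderr, so the warning should appear right after the name of the file that triggered it.

```python
import sys
from typing import Callable, Iterable


def load_with_trace(paths: Iterable[str], load: Callable[[str], object]) -> int:
    """Print each path to stderr immediately before loading it.

    `load` is whatever function reads the image (for example cv2.imread).
    Returns the number of files processed.
    """
    count = 0
    for path in paths:
        # flush=True so the name is written out before any warning the loader emits
        print(f"loading {path}", file=sys.stderr, flush=True)
        load(path)
        count += 1
    return count
```

For example, `load_with_trace(image_paths, cv2.imread)` would print one line per file, where `image_paths` is your own list of image files. Any "Premature end of JPEG file" message should then follow the line naming the affected file.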
It may occur when the image file was not downloaded completely.
you can try this. |
@xiaowk5516 that's an interesting piece of code! We may be able to integrate this into the dataset checks if the speed is fast and it works as intended. The correct location for this would be here: Lines 1054 to 1061 in 65f81bf
|
@xiaowk5516 I think the following image scanning code should work based on your idea. Can you submit a PR to help integrate this code into master to help everyone with this problem?
# verify images
im = Image.open(im_file)
im.verify()  # PIL verify
shape = exif_size(im)  # image size
assert (shape[0] > 9) & (shape[1] > 9), f'image size {shape} <10 pixels'
assert im.format.lower() in img_formats, f'invalid image format {im.format}'
if im.format.lower() in ('jpg', 'jpeg'):
    with open(im_file, 'rb') as f:
        f.seek(-2, 2)
        assert f.read() == b'\xff\xd9', 'corrupted JPEG' |
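For anyone who wants to check a dataset outside of training, the same end-of-file test can be sketched as a standalone scan using only the standard library. The names `ends_with_eoi` and `find_suspect_jpegs` are hypothetical and not part of YOLOv5. The check is a heuristic: it only looks at the last two bytes, so a file with extra bytes after the EOI marker is also flagged even if it decodes correctly.

```python
from pathlib import Path
from typing import List

JPEG_EOI = b"\xff\xd9"  # JPEG End Of Image marker


def ends_with_eoi(path: Path) -> bool:
    """Return True if the last two bytes of the file are the EOI marker."""
    with open(path, "rb") as f:
        f.seek(0, 2)  # jump to the end to learn the file size
        if f.tell() < 2:
            return False  # too short to hold a marker at all
        f.seek(-2, 2)  # two bytes before the end
        return f.read(2) == JPEG_EOI


def find_suspect_jpegs(root: str) -> List[Path]:
    """Recursively list .jpg/.jpeg files under root that do not end with EOI."""
    suffixes = {".jpg", ".jpeg"}
    return sorted(
        p
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes and not ends_with_eoi(p)
    )
```

Calling `find_suspect_jpegs('path/to/images')` returns the files worth inspecting or re-downloading; the directory name here is a placeholder.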
Of course! I will submit it soon. |
@xiaowk5516 great! |
@ImsuperSH @seekFire @sramakrishnan247 @jacklinquan @madr3z good news 😃! Your original issue may now be fixed ✅ in PR #3638. This PR adds JPEG corruption error checking by @xiaowk5516 to the YOLOv5 train and testloaders. To receive this update:
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀! |
Hello, this might be a little late, but I found a way to fix the premature end error. I'll leave it here in case anyone needs it in the future. In short, reading the image with OpenCV and then saving it with OpenCV fixes the image and adds the EOI code 'D9' at the end of the file. |
@Poulinakis-Konstantinos thanks for the idea! Do you know if PIL Image saving also resolves the issue? The reason I ask is the images are already opened with PIL as Lines 866 to 876 in 2da6444
|
@glenn-jocher I just tested it with PIL. Yes, saving the image with PIL does restore the image's EOI mark! Adding a save command for when a corrupted image is detected would probably be beneficial. |
@Poulinakis-Konstantinos hmm interesting. Ok, we need to be very careful about saving the images as PIL includes a default compression level, cv2 I'm not sure, but we want to make sure the new JPG pixel values are not altered in any way. If we can get some corrupted images to pass an |
Maybe something like this: im.save(im_file, format='JPEG', subsampling=0, quality=100) |
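Putting the re-save idea and the save parameters above together, a repair step might look roughly like the sketch below. It assumes Pillow is installed. `resave_jpeg` is a hypothetical name, and temporarily enabling `ImageFile.LOAD_TRUNCATED_IMAGES` is an assumption made here so that Pillow does not refuse to decode a file with missing trailing data; the thread itself does not mention that flag.

```python
def resave_jpeg(path: str, quality: int = 100) -> None:
    """Re-encode a JPEG in place so that the file ends with an EOI marker."""
    # Imported inside the function so the module still loads without Pillow.
    from PIL import Image, ImageFile

    previous = ImageFile.LOAD_TRUNCATED_IMAGES
    ImageFile.LOAD_TRUNCATED_IMAGES = True  # tolerate missing trailing data
    try:
        with Image.open(path) as im:
            im.load()  # decode fully before the file is overwritten
            decoded = im.copy()  # detach the pixels from the open file
    finally:
        ImageFile.LOAD_TRUNCATED_IMAGES = previous  # restore the global flag

    # subsampling=0 and quality=100 keep re-encoding loss as small as JPEG allows
    decoded.save(path, format="JPEG", subsampling=0, quality=quality)
```

Because this re-encodes the image, pixel values may still change slightly, so it is safer to run it on a copy of the dataset first.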
@Poulinakis-Konstantinos I can't figure out how to save a JPG without altering it. I created a script here that shows significant differences in pixel values on both cv2 and PIL saving. Do you have any ideas?
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
fp = '000000000034.jpg' # original image file path
fp_pil = fp + '.PIL.jpg'
fp_cv2 = fp + '.cv2.jpg'
# Read and write cv2 and PIL JPGs
Image.open(fp).save(fp_pil)
cv2.imwrite(fp_cv2, cv2.imread(fp))
# Read new JPGs and compare
im = cv2.imread(fp)
im_pil = cv2.imread(fp_pil)
im_cv2 = cv2.imread(fp_cv2)
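# Cast to a signed type first: subtracting uint8 arrays wraps around instead of going negative
dp = (im.astype(np.int16) - im_pil.astype(np.int16)).ravel()
dc = (im.astype(np.int16) - im_cv2.astype(np.int16)).ravel()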
print(np.allclose(im, im_pil))
print(np.allclose(im, im_cv2))
# Plot
fig, ax = plt.subplots(1, 2, figsize=(8, 4), tight_layout=True)
ax[0].hist(dp, 255)
ax[1].hist(dc, 255)
plt.savefig('results.jpg') |
@Poulinakis-Konstantinos I've opened a PR with a fix in #4548. Can you review please? |
@glenn-jocher JPEG (.jpg/.jpeg) is a lossy compression format for digital images. That is, its compression is irreversible, so the pixel values obtained after decompressing and recompressing an image will differ from the original. |
@ImsuperSH @Poulinakis-Konstantinos @seekFire @jaqub-manuel @xiaowk5516 good news 😃! Your original issue may now be fixed ✅ in PR #4548. This PR automatically restores and saves corrupted JPEGs before training starts, and all images are now used for training, including the restored JPEGs. To receive this update:
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀! |
❔Question
Epoch gpu_mem GIoU obj cls total targets img_size
1/99 2.87G 0.05456 0.04197 0 0.09652 10 640: 100% 157/157 [00:52<00:00, 2.98it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 0% 0/157 [00:00<?, ?it/s]Premature end of JPEG file
Class Images Targets P R mAP@.5 mAP@.5:.95: 100% 157/157 [00:19<00:00, 8.21it/s]
all 2.5e+03 1e+04 0.362 0.777 0.684 0.338
The log shows "Premature end of JPEG file" during validation. What causes this?