-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimizing images on upload is too slow #1051
Comments
I'll measure what this on staging as well. |
(maybe bug isn't the right label, it's more a performance regression) |
Ah interesting! Thanks for filling this. |
@miketaylr Do you remember which format you were trying to upload? |
Also in your comment up there I see
When the current code is:
https://github.com/webcompat/webcompat.com/blob/master/webcompat/api/uploads.py#L86 which code are we talking about? |
I wonder if you had issues with PNG images This discussion is useful. Specifically the |
PNG.
The code you linked to, I had changed it to |
I think we should consider removing optimizations and move towards a solution that happens after upload (using a worker queue or similar). Having to wait 20 seconds is really, really bad UX. |
(Also, testing the |
In another thread, I think I mentioned that. save is already doing optimizations by default. What I want to understand before we back off is to really identify the source of the issue. If there is no reasonable solution, indeed we should find another way. Instead of workers, I would recommend a server side optimization of the image using a C lib for compressing images without involving at all the Web Interactions. ^_^ |
@miketaylr Do you have a test image when this happened for PNG. I want also to test if the impact is the same for JPEG (my hunch is that it is not happening for JPEG). |
Yeah, but it wasn't 15x slower. :/
The current optimizations are taking too long to be beneficial, IMO. Having to wait 20 seconds to file a desktop issue makes me think the app is broken. We can turn it off, find a better solution, and then turn it on.
I'm not sure what you mean by Web Interactions. I'm talking about using worker queues a la RabbitMQ/ https://www.digitalocean.com/community/tutorials/how-to-use-celery-with-rabbitmq-to-queue-tasks-on-an-ubuntu-vps |
(OK, let me slow down and stop writing comments that appear so rude. ❤️ ) |
The site I was testing with is: https://www.mozilla.org/en-US/firefox/46.0.1/firstrun/learnmore/ (using the webcompat reporter add-on, which takes a screenshot and submits a base64 string (which is converted to a PNG on the server side). Here's a PNG that is created from the raw base64 string, without touching Pillow (or our upload code): https://miketaylr.com/misc/original.png Here's the PNG that is created with the commented out |
WONDERFUL! Thanks Mike. |
Thanks! |
ok these are your files saved locally.
ok I created a code to save the file as you did.
I also checked we had the right images. Now let's go on profiling. I have cut the boring parts aka 0sec and replace it with "…" :)
ok a very quick thing
save_parameters creates an overhead of ~16s
It is called 83 times (not sure why) The percall column is interesting too.
Trying to run the code with only the method original_save (our base)
to compare with
The number of calls to the encoder is the same but the percall is skyrocketing. The encoder options are given in https://github.com/python-pillow/Pillow/blob/d3727b523ece4f75def72bfc960e1abd104b5dcb/PIL/PngImagePlugin.py#L667 # encoder options
if "dictionary" in im.encoderinfo:
dictionary = im.encoderinfo["dictionary"]
else:
dictionary = b""
im.encoderconfig = ("optimize" in im.encoderinfo,
im.encoderinfo.get("compress_level", -1),
im.encoderinfo.get("compress_type", -1),
dictionary) I slightly modified the code to just see what is the object before erasing it. print image_object.encoderconfig
print image_object.encoderinfo
del image_object OOOOOooooooh…
You see what I see… the parameter for optimize False is ignored. This explains that we get the same results in between
So the save_normal is increasing the size BUT the time spent on it… is a lot better. On the other we could not expect a lot better for optimization, because screenshots are most of the time, not simple logo, graphs. PNG is good at big uniform area of colors, not that much at complicated patterns. I want to test a couple more things
Stay tuned. |
Compression levels and PNGI modified the code a bit again: def optim_compress(IMAGE_PATH, IMAGE_DEST, level):
time_beg = time.time()
image_file = '%s/original-compress-l%s.png' % (IMAGE_DEST, str(level))
image_object = Image.open(IMAGE_PATH)
image_object.save(image_file, compress_level=level)
print image_object.encoderconfig
print image_object.encoderinfo
del image_object
time_end = time.time()
print 'Compress Level %s: %s' % (str(level), time_end - time_beg) and
These are the results with the first one being the normal save. First the files
Then the execution time.
|
The original file size is
If we do not compress at all, we are very fast, but the file is huge. 0.51s for ❗️For PNG files, I guess we should completely avoid any kind of optimization and we should even avoid the Pillow library altogether. |
JPEG testFor the JPEG test, I will use The objective is to understand if optimize has any effect on time and/or size. Size results
Good Stuff already, we gain weight.
Yeah! Clap clap! The time overhead is insignificant. I checked the quality difference. Nothing visibly damaging. So let's keep the code for JPEG files at least. |
PNG to JPEGMy last test or trick ;) as you prefer. Let's take @miketaylr's image and convert it to JPEG.
Wooot! ;) Very good for size. What about time?
Very Nice. ok! ok! But quality must be crappy. Let's see. Original. Just save. Optimize true |
I don't know for you… but apart of Animated GIF which needs to be directly saved without Pillow. I put the full code of my test there. Not compact on purpose. |
>>> file_dest = 'blahblah/foobar.png'
>>> 'jpg'.join(file_dest.rsplit('png',1))
'blahblah/foobar.jpg'
>>> file_dest = 'blahblah/foobar.jpg'
>>> 'jpg'.join(file_dest.rsplit('png',1))
'blahblah/foobar.jpg' |
Fixes #1051. Converting PNGs to JPEG
Yep, I agree. I'll file a bug to save animated GIF in a non-Pillow path. Thanks so much @karlcow! |
😍 These findings are so awesome @karlcow. The difference in size and speed for JPEG is sort of mind-boggling. |
@miketaylr Thanks. My naive interpretation is that PNG decoder needs to read the full picture pixel by pixel (easy), but the PNG encoder needs to rearrange all these pixels by linear family (aka sorting before, reordering) before rewriting. And this can take a long time. PNG is a losless format. |
Pillow's optimization is slow. :(
I need to measure more but right now local full-size screenshots are taking forever:
save function took 20237.716 ms
Commenting out the following block in upload.py:
save function took 1704.766 ms
cc @karlcow
The text was updated successfully, but these errors were encountered: