Optimized ImageStat.Stat.count #7599

florath · 2023-12-04T09:49:58Z

This change speeds up ImageStat.Stat.count. The new implementation uses "sum" instead of the construct "functools.reduce(operator.add, ...)". Tests showed that the new function is about three times faster than the original. It is also shorter and easier to read, and the dependency on the functools and operator modules can be removed.

Changes proposed in this pull request:

Performance optimized ImageStat _getcount function

The new implementation uses "sum" instead of the construct "functools.reduce(operator.add, ...)". Tests showed that the new function is about three times faster than the original. It is also shorter and easier to read.

Signed-off-by: Andreas Florath <andreas@florath.net>

florath · 2023-12-04T09:54:39Z

The setup and measurement were done in the same way as for #7593.

Setting up a virtualenv with the original and optimized function side by side:

    def _getcount_orig(self):
        """Get total number of pixels in each layer"""

        v = []
        for i in range(0, len(self.h), 256):
            v.append(functools.reduce(operator.add, self.h[i : i + 256]))
        return v

    def _getcount(self):
        """Get total number of pixels in each layer"""

        return [sum(self.h[i: i + 256]) for i in range(0, len(self.h), 256)]
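
To make the slicing concrete, here is a minimal standalone sketch. It assumes, as the step of 256 in the code above implies, that the histogram is a flat list with 256 bins per band. The helper names `band_histogram` and `getcount` and the tiny random "image" are made up for illustration and are not part of Pillow.

```python
import random

BINS = 256  # bins per band, matching the step of 256 in the code above


def band_histogram(values):
    """Count how often each value 0..255 occurs in one band."""
    hist = [0] * BINS
    for v in values:
        hist[v] += 1
    return hist


def getcount(h):
    """Standalone copy of the optimized function, taking the flat histogram."""
    return [sum(h[i : i + BINS]) for i in range(0, len(h), BINS)]


# Hypothetical 4x3 image with two bands, filled with random pixel values.
random.seed(0)
n_pixels = 4 * 3
bands = [[random.randrange(BINS) for _ in range(n_pixels)] for _ in range(2)]

# Flat histogram: band 0 occupies h[0:256], band 1 occupies h[256:512].
h = band_histogram(bands[0]) + band_histogram(bands[1])

counts = getcount(h)
print(counts)  # each band's bins add up to the 12 pixels of the image
```

Every pixel lands in exactly one bin of its band, so each 256-entry chunk sums to the number of pixels regardless of the pixel values.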

Running the tests on the ImageNet dataset revealed a speedup of about a factor of 3. Here is the adapted script, which can be run using the Pillow test images:

    import pathlib
    import timeit
    from PIL import Image, ImageStat

    IMAGEDIR = "../Pillow/Tests/images"

    testdir = pathlib.Path(IMAGEDIR)

    NUMBER = 10000
    REPEAT = 10

    for image_file_name in testdir.rglob("*"):
        # Skip broken images
        try:
            img = Image.open(image_file_name)
            stat = ImageStat.Stat(img)
        except Exception:
            continue

        # Check for correctness
        res_orig = stat._getcount_orig()
        res_opt = stat._getcount()
        assert res_orig == res_opt

        # Measure improvement factor
        exec_times_orig = timeit.repeat(
            stmt=stat._getcount_orig, repeat=REPEAT, number=NUMBER)
        exec_times_opt = timeit.repeat(
            stmt=stat._getcount, repeat=REPEAT, number=NUMBER)

        print("%10.4f - %s" % (
            min(exec_times_orig) / min(exec_times_opt), image_file_name))

A typical output (partial):

    3.2243 - ../Pillow/Tests/images/itxt_chunks.png
    4.0136 - ../Pillow/Tests/images/tiff_wrong_bits_per_sample_3.tiff
    4.2918 - ../Pillow/Tests/images/hopper.iccprofile.tif
    3.3560 - ../Pillow/Tests/images/mmap_error.bmp
    4.2634 - ../Pillow/Tests/images/hopper.dds
    3.7895 - ../Pillow/Tests/images/uncompressed_rgb.png
    2.8594 - ../Pillow/Tests/images/imagedraw_polygon_kite_L.png
    3.5236 - ../Pillow/Tests/images/colr_bungee.png
    3.2473 - ../Pillow/Tests/images/imagedraw_rounded_rectangle_corners_yyny.png
    3.3909 - ../Pillow/Tests/images/clipboard_target.png
    3.7058 - ../Pillow/Tests/images/bc5s.png
    4.2684 - ../Pillow/Tests/images/hopper.sgi
    3.9689 - ../Pillow/Tests/images/pil123rgba.qoi
    3.2465 - ../Pillow/Tests/images/imagedraw_polygon_kite_RGB.png
    3.5031 - ../Pillow/Tests/images/dispose_bgnd_transparency.gif
    3.1712 - ../Pillow/Tests/images/palette_sepia.png
    3.3890 - ../Pillow/Tests/images/imagedraw_rounded_rectangle_corners_ynyn.png
    4.0848 - ../Pillow/Tests/images/balloon.jpf
    2.5656 - ../Pillow/Tests/images/no_palette_with_transparency.gif
    2.8980 - ../Pillow/Tests/images/imagedraw2_text.png
    4.2880 - ../Pillow/Tests/images/test_anchor_multiline_mm_right.png

The first number is the speedup factor, i.e. how many times faster the proposed function ran compared to the original.
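
For readers without a patched Pillow at hand, the same comparison can be sketched on a synthetic histogram. This is only an approximation of the measurement above: `getcount_orig`, `getcount_opt` and `speedup` are hypothetical standalone copies that take the histogram list as an argument, and the three-band input is made up. The ratio it prints depends on the machine and Python version, so no particular number should be expected.

```python
import functools
import operator
import timeit


def getcount_orig(h):
    """Standalone copy of the original reduce-based implementation."""
    v = []
    for i in range(0, len(h), 256):
        v.append(functools.reduce(operator.add, h[i : i + 256]))
    return v


def getcount_opt(h):
    """Standalone copy of the proposed sum-based implementation."""
    return [sum(h[i : i + 256]) for i in range(0, len(h), 256)]


def speedup(h, number=1000, repeat=5):
    """Return min(original time) / min(optimized time), as in the script above."""
    # Both variants must agree before timing them.
    assert getcount_orig(h) == getcount_opt(h)
    t_orig = min(timeit.repeat(lambda: getcount_orig(h), repeat=repeat, number=number))
    t_opt = min(timeit.repeat(lambda: getcount_opt(h), repeat=repeat, number=number))
    return t_orig / t_opt


# Made-up flat histogram for three bands (3 * 256 bins).
h = [(i * 7) % 50 for i in range(3 * 256)]

print("%10.4f - synthetic 3-band histogram" % speedup(h))
```

Taking the minimum over the repeats, as the original script does, reduces the influence of other processes on the measured ratio.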

florath · 2023-12-04T13:10:25Z

Hmmm.
Strange failure of the codecov check. All lines of the patched function are covered. Any idea?

hugovk · 2023-12-04T14:44:52Z

ImageStat.py is 100% covered: https://app.codecov.io/gh/python-pillow/Pillow/pull/7599/blob/src/PIL/ImageStat.py

Coverage percentage can decrease if you delete covered lines.

For example, imagine 3 covered out of 4 total lines = 75%.
Delete 1 covered line: 2 / 3 = 67%.

So we're fine for coverage :)
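
The arithmetic in the example above can be checked with a few lines of Python. `coverage_percent` is a hypothetical helper for illustration, not how Codecov computes its report.

```python
def coverage_percent(covered, total):
    """Percentage of lines that are covered."""
    return 100.0 * covered / total


# 3 covered out of 4 total lines.
before = coverage_percent(3, 4)

# Deleting one covered line removes it from both the covered and the total count.
after = coverage_percent(3 - 1, 4 - 1)

print(f"{before:.0f}% -> {after:.0f}%")  # 75% -> 67%
```

The uncovered line stays in place, so it weighs more heavily in the smaller total even though no new uncovered code was added.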

nulano · 2023-12-04T14:57:23Z

Additionally, looking at the codecov report for the PR, most of the Windows jobs are missing coverage.
Most of the missing coverage under "indirect changes" seems to be due to these missing Windows uploads.

I can see an error during the upload: https://github.com/python-pillow/Pillow/actions/runs/7084804763/job/19280550192?pr=7599#step:31:37

[2023-12-04T10:04:53.331Z] ['info'] => Project root located at: D:/a/Pillow/Pillow
[2023-12-04T10:04:53.335Z] ['info'] -> No token specified or token is empty
[2023-12-04T10:04:53.513Z] ['info'] Searching for coverage files...
[2023-12-04T10:04:53.565Z] ['info'] => Found 1 possible coverage files:
  ./coverage.xml
[2023-12-04T10:04:53.565Z] ['info'] Processing ./coverage.xml...
[2023-12-04T10:04:53.589Z] ['info'] Detected GitHub Actions as the CI provider.
[2023-12-04T10:04:54.163Z] ['info'] Pinging Codecov: https://codecov.io/upload/v4?package=github-action-3.1.4-uploader-0.7.1&token=*******&branch=ImageStat_getcount_opt&build=7084804763&build_url=https%3A%2F%2Fgithub.com%2Fpython-pillow%2FPillow%2Factions%2Fruns%2F7084804763&commit=90e1e945303a8a05f1dee3a6821bd40af2d2a384&job=Test+Windows&pr=7599&service=github-actions&slug=python-pillow%2FPillow&name=Windows+Python+3.12&tag=&flags=GHA_Windows&parent=
[2023-12-04T10:04:54.371Z] ['error'] There was an error running the uploader: Error uploading to https://codecov.io: Error: There was an error fetching the storage URL during POST: 404 - {'detail': ErrorDetail(string='Unable to locate build via Github Actions API. Please upload with the Codecov repository upload token to resolve issue.', code='not_found')}

But it did not happen for all of the Windows jobs; pypy3.10 is fine.
The codecov report for main also looks complete.

hugovk

Thanks! Nice to drop two imports as well :)

radarhere · 2023-12-06T23:14:20Z

I've created #7605 to include this in the release notes.

florath added 4 commits December 1, 2023 18:52

Removed functools and operator import which are not needed anymore

f7d40ce

Signed-off-by: Andreas Florath <andreas@florath.net>

Added space before colon

e01354a

Signed-off-by: Andreas Florath <andreas@florath.net>

Merge branch 'python-pillow:main' into ImageStat_getcount_opt

90e1e94

radarhere added the Performance label Dec 4, 2023

homm approved these changes Dec 4, 2023


florath mentioned this pull request Dec 4, 2023

Optimize ImageStat.Stat.extrema #7593

Merged

homm mentioned this pull request Dec 4, 2023

Use list comprehensions to create transformed lists #7597

Merged

radarhere approved these changes Dec 4, 2023


hugovk approved these changes Dec 4, 2023


hugovk merged commit fe26900 into python-pillow:main Dec 4, 2023
52 of 53 checks passed

radarhere changed the title ~~Optimization of ImageStat.Stat._getcount method~~ Optimized ImageStat.Stat.count Dec 4, 2023

radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 6, 2023

Added release notes for python-pillow#7599 and python-pillow#7593

df13c61

radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 7, 2023

Added release notes for python-pillow#7599 and python-pillow#7593

a7f339d

radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 7, 2023

Added release notes for python-pillow#7599 and python-pillow#7593

afae568
