Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Reader.read_row for decoding into caller-provided buffer. #493

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

anforowicz
Copy link
Contributor

Add Reader.read_row for decoding into caller-provided buffer.

Reader.read_row is an alternative to Reader.next_row.

  • Reader.next_row is convenient when the caller doesn't want to manage
    their own buffer, and doesn't mind having extended borrows on Reader
    to accommodate Row data.
  • OTOH the new Reader.read_row avoids an extra copy in some scanarios
    which may lead to better runtime performance - see the benchmark
    results below.

The PR adds a row-by-row/128x128-4k-idat benchmark (initially based on
Reader.next_row which requires an extra copy). Using the new
Reader.read_row results in the following performance gains in the
benchmark:

time: [-16.414% -16.196% -15.969%] (p = 0.00 < 0.05)
time: [-15.570% -15.218% -14.856%] (p = 0.00 < 0.05)
time: [-16.101% -15.864% -15.629%] (p = 0.00 < 0.05)

@anforowicz
Copy link
Contributor Author

This PR replaces #421. (This PR can be seen as a non-breaking-change alternative of the other, earlier PR.)

Copy link
Member

@HeroicKatora HeroicKatora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach much better as well.

src/decoder/mod.rs Outdated Show resolved Hide resolved
@Shnatsel
Copy link
Contributor

There doesn't seem to be any reason not to merge this. @anforowicz could you resolve the conflicts? I'll push the merge button as soon as that's done.

@anforowicz
Copy link
Contributor Author

@anforowicz could you resolve the conflicts?

Ack. I'll do that later today. My apologies for missing this message earlier.

There doesn't seem to be any reason not to merge this.

Let me share some of the lingering doubts I have regarding this PR:

  • Cons of this PR are small, but non-zero:
    • New public API (and public APIs are "forever"; or at least changing the APIs requires the friction of a new major version release and/or imposing some pain on crate users)
    • A bit of an unclear story wrt keeping 1) keeping next_row / next_interlaced_row and also read_row from this PR, vs 2) deprecating next_row / next_interlaced_row and only exposing read_row in some future major version release
  • Pros of this PR are unclear:
    • Helps to avoid an extra copy to widen the scanline in SkPngRustCodec::expandDecodedInterlacedRow, but interlaced images are relatively rare + SkPngRustCodec has other inefficiencies around processing them.
    • This PR definitely helps if the output of png doesn't need any further transformations before being usable by the crate client. But... I now think that maybe this is not the case for Skia/Chromium, which requires alpha-premultiplied output. Depending on png output:
      • Indexed: can't save the extra copy / palette expansion hop (this hop is done at the Skia/Chromium layer - see https://crbug.com/356882657)
      • Rgb, Gray or GrayAlpha: can't save the extra copy (need to transform into RGBA)
      • Rgba (~54% of images from top 500 websites):
        • Even with this PR there would still be a need for an extra transformation loop for alpha-premul (and sometimes for rgba=>bgra, or for gamma/iccp). (I have my brief notes about post-decoding transforms in a doc here.)
        • Maybe this PR can save a memcpy and then apply alpha-premul and/or rgba=>bgra transformation in-place. But this is tricky, because SkSwizzler doesn't support in-place transformations, so we'd have to switch to skcms_Transform in some cases.

`Reader.read_row` is an alternative to `Reader.next_row`.

* `Reader.next_row` is convenient when the caller doesn't want to manage
  their own buffer, and doesn't mind having extended borrows on `Reader`
  to accommodate `Row` data.
* OTOH the new `Reader.read_row` avoids an extra copy in some scanarios
  which may lead to better runtime performance - see the benchmark
  results below.

The commit results in the followin performance gains seen in the
recently introduced `row-by-row/128x128-4k-idat` benchmark:

time:   [-16.414% -16.196% -15.969%] (p = 0.00 < 0.05)
time:   [-15.570% -15.218% -14.856%] (p = 0.00 < 0.05)
time:   [-16.101% -15.864% -15.629%] (p = 0.00 < 0.05)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants