Jpeg downscaling decoding #2076

Merged
merged 34 commits into from
Jul 17, 2022

br3aker
Contributor

@br3aker br3aker commented Mar 26, 2022

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

TODO

  • API
  • 1/2 scaling implementation
  • 1/4 scaling implementation
  • Fix resulting image size remainder artifacts
  • Rollback ProfilingSandbox/Program.cs

Description

This PR implements built-in scaling in the Jpeg decoder via scaled IDCT. Currently only 8-to-8 and 8-to-1 (1/8) scaling is supported; I'll implement 1/2 and 1/4 scaling as part of this PR a bit later. The current code is fairly stable in terms of internal API: 1/2 and 1/4 scaling will be based on the already implemented abstract class.

This code is WIP and contains development leftovers.
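
For readers unfamiliar with scaled IDCT, here is a minimal sketch of the 8-to-1 (1/8) case. It relies on the standard JPEG DCT normalization, where the dequantized DC coefficient of an 8x8 block equals 8 times the block's mean sample value (before the +128 level shift for 8-bit data). The names `ScaledIdctSketch` and `DecodeBlockAt1To8` are hypothetical and are not the code in this PR; the integer rounding is an assumption borrowed from common integer IDCT implementations.

```csharp
using System;

public static class ScaledIdctSketch
{
    // Hypothetical helper (not ImageSharp API): reduces one 8x8 block to a
    // single 8-bit sample using only its quantized DC coefficient.
    public static byte DecodeBlockAt1To8(short quantizedDc, ushort dcQuantizer)
    {
        // Undo quantization of the DC term.
        int dequantized = quantizedDc * dcQuantizer;

        // Dequantized DC == 8 * block mean, so divide by 8 with rounding.
        // The shift is arithmetic, so negative values round consistently.
        int mean = (dequantized + 4) >> 3;

        // Undo the JPEG level shift and clamp to the 8-bit sample range.
        return (byte)Math.Clamp(mean + 128, 0, 255);
    }
}
```

Because no AC coefficient contributes to the result, each output pixel is the average of its source block, which is why this mode behaves much like a box filter.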

API

I'm proposing two new methods for the IImageDecoder interface:

/// <summary>
/// Decodes the image from specified stream to an <see cref="Image"/> with specified size.
/// </summary>
/// <typeparam name="TPixel">The pixel format.</typeparam>
/// <param name="configuration">The configuration for the image.</param>
/// <param name="stream">The <see cref="Stream"/> containing image data.</param>
/// <param name="targetSize">Target size of the output image.</param>
/// <param name="resampler">Resampler used for resizing.</param>
/// <param name="cancellationToken">The token to monitor for cancellation requests.</param>
/// <returns>The <see cref="Image"/>.</returns>
Image<TPixel> DecodeInto<TPixel>(Configuration configuration, Stream stream, Size targetSize, IResampler resampler, CancellationToken cancellationToken)
    where TPixel : unmanaged, IPixel<TPixel>;

Image DecodeInto(Configuration configuration, Stream stream, Size targetSize, IResampler resampler, CancellationToken cancellationToken);

And a new method for Image<TPixel>:

// same docs as for DecodeInto above
public static Image<TPixel> LoadInto<TPixel>(Stream stream, Size targetSize, IResampler resampler, IImageDecoder decoder)
    where TPixel : unmanaged, IPixel<TPixel>
    {}

I've certainly missed other endpoints affected by the proposed new API; the reason for this proposal is to start a discussion :)

Why are new methods needed?
For code versatility. As ImageSharp already supports quite a few image formats, I'd expect this piece of code to be very popular:

using var img = Image.Load(input);
img.Mutate(ctx => ctx.Resize(thumbnailSize));
img.Save(output);

To use the new Jpeg resizing API, users would need to somehow identify the image format first and only then call it directly, parsing the image stream twice. The proposed API eliminates this performance loss:

using var img = Image.LoadInto(input, thumbnailSize);
img.Save(output);

Benchmarks

Master

| Method | Threads | Mean | Error | StdDev | Ratio |
|---|---|---|---|---|---|
| SystemDrawing | 1 | 3,728.3 ms | 2,858.88 ms | 156.70 ms | 2.25 |
| ImageSharp | 1 | 1,654.3 ms | 204.45 ms | 11.21 ms | 1.00 |
| Magick | 1 | 4,040.4 ms | 210.98 ms | 11.56 ms | 2.44 |
| MagicScaler | 1 | 888.4 ms | 165.17 ms | 9.05 ms | 0.54 |
| SkiaBitmap | 1 | 2,873.0 ms | 135.61 ms | 7.43 ms | 1.74 |
| SkiaBitmapDecodeToTargetSize | 1 | 888.0 ms | 108.37 ms | 5.94 ms | 0.54 |
| NetVips | 1 | 805.6 ms | 163.64 ms | 8.97 ms | 0.49 |
| SystemDrawing | 8 | 1,246.6 ms | 565.78 ms | 31.01 ms | 2.82 |
| ImageSharp | 8 | 442.8 ms | 341.56 ms | 18.72 ms | 1.00 |
| Magick | 8 | 1,082.2 ms | 172.97 ms | 9.48 ms | 2.45 |
| MagicScaler | 8 | 346.6 ms | 164.54 ms | 9.02 ms | 0.78 |
| SkiaBitmap | 8 | 746.7 ms | 176.11 ms | 9.65 ms | 1.69 |
| SkiaBitmapDecodeToTargetSize | 8 | 235.5 ms | 41.48 ms | 2.27 ms | 0.53 |
| NetVips | 8 | 232.3 ms | 71.85 ms | 3.94 ms | 0.53 |

This PR

| Method | Threads | Mean | Error | StdDev | Ratio |
|---|---|---|---|---|---|
| SystemDrawing | 1 | 3,600.3 ms | 534.99 ms | 29.32 ms | 3.22 |
| ImageSharp | 1 | 1,116.5 ms | 126.91 ms | 6.96 ms | 1.00 |
| Magick | 1 | 4,034.3 ms | 243.95 ms | 13.37 ms | 3.61 |
| MagicScaler | 1 | 890.1 ms | 20.20 ms | 1.11 ms | 0.80 |
| SkiaBitmap | 1 | 2,912.9 ms | 353.28 ms | 19.36 ms | 2.61 |
| SkiaBitmapDecodeToTargetSize | 1 | 890.6 ms | 81.24 ms | 4.45 ms | 0.80 |
| NetVips | 1 | 803.1 ms | 99.84 ms | 5.47 ms | 0.72 |
| SystemDrawing | 8 | 1,396.3 ms | 471.21 ms | 25.83 ms | 4.41 |
| ImageSharp | 8 | 316.8 ms | 32.03 ms | 1.76 ms | 1.00 |
| Magick | 8 | 1,164.4 ms | 636.66 ms | 34.90 ms | 3.68 |
| MagicScaler | 8 | 356.2 ms | 278.55 ms | 15.27 ms | 1.12 |
| SkiaBitmap | 8 | 732.9 ms | 142.91 ms | 7.83 ms | 2.31 |
| SkiaBitmapDecodeToTargetSize | 8 | 236.7 ms | 44.69 ms | 2.45 ms | 0.75 |
| NetVips | 8 | 232.9 ms | 128.47 ms | 7.04 ms | 0.74 |

@br3aker br3aker changed the title Dp/jpeg downscaling decode Jpeg downscaling decoding Mar 26, 2022
@JimBobSquarePants JimBobSquarePants added this to the 3.0.0 milestone Mar 27, 2022
@codecov

codecov bot commented Mar 29, 2022

Codecov Report

❗ No coverage uploaded for pull request base (main@2c42e41). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 75dc96a differs from pull request most recent head 7a9cf87. Consider uploading reports for the commit 7a9cf87 to get more accurate results

@@          Coverage Diff           @@
##             main   #2076   +/-   ##
======================================
  Coverage        ?     88%           
======================================
  Files           ?     997           
  Lines           ?   53874           
  Branches        ?    6891           
======================================
  Hits            ?   47444           
  Misses          ?    5250           
  Partials        ?    1180           
Flag Coverage Δ
unittests 88% <0%> (?)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c42e41...7a9cf87. Read the comment docs.

@br3aker
Contributor Author

br3aker commented Apr 12, 2022

@JimBobSquarePants it's actually done, and the downscaling IDCT is covered by tests. Do we need tests with an actual image comparison between decode -> resize and decode_downscale to prove correctness? The result is almost 100% identical to Box resizing.

The only blocking thing left is the API; what's the best approach to discuss it?

@tocsoft
Member

tocsoft commented Apr 12, 2022

I've had a little idea on how we could handle this as an API... basically I think we could handle this by convention during load.

The premise is that we add an Action<ImageProcessingContext> callback to the Image.Load APIs to allow applying mutations during loading. As part of that, we can detect whether the first operation applied to the context is a ResizeProcessor. If it is, and the decoder supports target-size decoding, we trigger that code path and then apply the remaining processors to the now decoded/resized image, so that all processors still work in that context.

The limitation here is that the resize would have to be the first operation, but apart from that it would work... and given that this feels like an optimisation target, I don't think it's too hard a limitation.

The benefit of this approach is that it would allow us to add the feature without adding a lot of new APIs (only new overloads of existing methods, exposing more of the existing APIs).

Here is a spike on how it could be implemented.
br3aker/ImageSharp@dp/jpeg-downscaling-decode...tocsoft:tocsoft/pipeline-load#diff-f6a7a772c9597206b85c603ae51aa6947ba4c98dfd23441ed62e05ba8934ee1c

@tocsoft
Member

tocsoft commented Apr 12, 2022

so instead of calling

using var img = Image.Load("/path.jpg");
img.Mutate(ctx=>ctx.Resize(120, 100));
img.Save("/out.png");

you can instead call

using var img = Image.Load("/path.jpg", ctx=>ctx.Resize(120, 100));
img.Save("/out.png");

but this would also work for any set of mutations applied, i.e.

using var img = Image.Load("/path.jpg", ctx=>ctx
  .Resize(120, 100)
  .Pixelate(...)
  .DrawText(...)
  .Resize(20, 10));
img.Save("/out.png");

which would functionally translate to

using var img = Image.Load("/path.jpg", new ResizeProcessor(120, 100));
img.Mutate(ctx=>ctx
  .Pixelate(...)
  .DrawText(...)
  .Resize(20, 10)); // notice that only the first resize is stripped off and inlined into loading
img.Save("/out.png");
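
The rewrite described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Operation`, `Resize`, `Other` and `LoadPipelineSketch.Split` are hypothetical stand-ins for queued processors, not ImageSharp types, and the linked spike may do this differently.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for queued processors; these are NOT ImageSharp types.
public abstract record Operation;
public sealed record Resize(int Width, int Height) : Operation;
public sealed record Other(string Name) : Operation;

public static class LoadPipelineSketch
{
    // Splits the queued operations into an optional decode-time resize and
    // the operations that must still run on the decoded image.
    public static (Resize? DecodeTarget, IReadOnlyList<Operation> Remaining) Split(
        IReadOnlyList<Operation> operations,
        bool decoderSupportsTargetSize)
    {
        // Only a leading resize can be folded into decoding; a later resize
        // depends on the output of the operations before it.
        if (decoderSupportsTargetSize
            && operations.Count > 0
            && operations[0] is Resize first)
        {
            return (first, operations.Skip(1).ToList());
        }

        // Nothing to fold in: decode at full size and run everything.
        return (null, operations);
    }
}
```

With the queue Resize(120, 100), Pixelate, DrawText, Resize(20, 10), only the first resize becomes the decode target and the other three operations stay in the mutation queue.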

@JimBobSquarePants
Member

I thought we would just provide an overload for Load\LoadAsync that takes a Size.

IImageDecoder could get a default implementation for Decode with a Size if we didn't fancy implementing the approach for all the types.

Decoding to a specific size is a very specific and unique function. I can't ever see us implementing other mutations on load, so I think it's best to keep things locked down and specific. I wouldn't even allow setting a custom IResampler since we want the output to be roughly equivalent for all formats.

@br3aker
Contributor Author

br3aker commented Apr 13, 2022

I wouldn't even allow setting a custom IResampler since we want the output to be roughly equivalent for all formats.

I don't think it's a problem.

Interface:

Image LoadResize(Stream, Size size, IResampler resampler);

Jpeg implementation:

Image LoadResize(Stream stream, Size size, IResampler resampler) 
{
    Image img;

    // jpeg idct resizing is very close to Box resampler
    if(resampler == KnownResamplers.Box)
    {
        img = this.DecodeResizeImplementation(stream, size);
    }
    else 
    {
        img = this.Decode(stream);
    }
    
    // post decode resize if decoding didn't resize due to unsupported resampler 
    // or didn't resize to exact size user asked for
    if(img.Size != size)
    {
        // Note: MagicScaler provides a way to use two different resamplers
        // For example, it can use low-quality Box resampler for rough downscaling
        // and then use something better for final downscaling to target size
        // 
        // IMO this is a very cool niche thingy but it's very specific to the jpeg decoder 
        // that it would be better to cut it or put into JpegDecoder as a specific method at max
        img.Mutate(ctx => ctx.Resize(size, resampler));
    }

    return img;
}

If any other format provides a 'native' resizing operation, we can reuse this boilerplate, swapping in whichever resampler check matches that format's native scaling support. And if the format has no such support:

Image img = this.Decode(stream);
img.Mutate(ctx => ctx.Resize(size, resampler));
return img;

The only limitation is the obligation to provide a resampler: providing anything but KnownResamplers.Box in a thumbnail scenario would cut a huge chunk of performance.

@br3aker
Contributor Author

br3aker commented Apr 15, 2022

Soooo how do we decide?

@@ -326,11 +325,13 @@ private void ParseBaselineData()

if (this.scanComponentCount != 1)
{
this.spectralConverter.PrepareForDecoding();
Contributor Author

@br3aker br3aker May 1, 2022

Okay, we've got a little problem here. The ITU spec allows markers to be placed between the frame and scan markers, which can break the current architecture.

I've decided to split this up: the SOF scan callback now simply saves references to the frame and jpeg data variables, and there is a separate PrepareForDecoding() callback which is called explicitly for single-scan images and implicitly by SpectralConverter for multi-scan images. HuffmanScanDecoder and ArithmeticScanDecoder are affected by this change.

I'm planning to do a refactoring PR for the entire jpeg decoder after this is merged. The arithmetic decoding PR and this scaling decoding PR contain rather large changes, so it's the perfect time to clean it up! :)

@br3aker
Contributor Author

br3aker commented May 1, 2022

@tocsoft you might be interested in this:

image

Build log: https://github.com/SixLabors/ImageSharp/runs/6247356714

@br3aker
Contributor Author

br3aker commented May 1, 2022

Well, it's finally done and ready for final review and merging, thanks everyone for initial feedback!

Current changes are internal so users won't be able to use it.

Updated benchmarks for load-resize-save benchmark:

| Method | Threads | Mean | Ratio | Allocated |
|---|---|---|---|---|
| SystemDrawing | 1 | 3,626.5 ms | 3.21 | 12 KB |
| ImageSharp | 1 | 1,130.6 ms | 1.00 | 708 KB |
| Magick | 1 | 4,038.6 ms | 3.57 | 42 KB |
| MagicScaler | 1 | 884.8 ms | 0.78 | 105 KB |
| SkiaBitmap | 1 | 2,873.6 ms | 2.54 | 42 KB |
| SkiaBitmapDecodeToTargetSize | 1 | 870.2 ms | 0.77 | 47 KB |
| NetVips | 1 | 800.2 ms | 0.71 | 42 KB |
| SystemDrawing | 8 | 1,248.8 ms | 3.87 | 15 KB |
| ImageSharp | 8 | 322.6 ms | 1.00 | 728 KB |
| Magick | 8 | 1,104.8 ms | 3.43 | 44 KB |
| MagicScaler | 8 | 335.7 ms | 1.04 | 279 KB |
| SkiaBitmap | 8 | 751.7 ms | 2.33 | 45 KB |
| SkiaBitmapDecodeToTargetSize | 8 | 235.1 ms | 0.73 | 49 KB |
| NetVips | 8 | 232.2 ms | 0.72 | 43 KB |

Nothing has really changed since the first benchmark for this PR. There's still a lot to work on which should bring a noticeable performance boost:

  1. the huffman decoder only needs to parse the DC value during scan decoding, which would yield better performance and smaller memory buffers
  2. spectral -> color conversion doesn't have sse/avx support right now

Here are the thumbnails resulting from a benchmark run

@Webreaper
Copy link

You said it's internal so users can't use it. Is there a way I can pick up an alpha build and try it?

@br3aker
Contributor Author

br3aker commented May 1, 2022

You said it's internal so users can't use it. Is there a way I can pick up an alpha build and try it?

That's what we decided above as this API is highly experimental. Personally I don't have anything against making it public if @antonfirsov and @JimBobSquarePants agree :)

@br3aker
Contributor Author

br3aker commented May 1, 2022

Update: I've deleted some old jpeg-specific benchmarks as they are never used and we have a Load-Resize-Save across different .net libs anyway.

@br3aker
Contributor Author

br3aker commented May 2, 2022

Played a bit with parsing only DC component without changing internal structure of how we allocate and store spectral blocks and got this beauty:

| Method | Threads | Mean | Ratio | Allocated |
|---|---|---|---|---|
| SystemDrawing | 1 | 3,743.0 ms | 3.46 | 12 KB |
| ImageSharp | 1 | 1,080.5 ms | 1.00 | 707 KB |
| Magick | 1 | 4,353.5 ms | 4.03 | 42 KB |
| MagicScaler | 1 | 896.3 ms | 0.83 | 105 KB |
| SkiaBitmap | 1 | 3,017.7 ms | 2.79 | 42 KB |
| SkiaBitmapDecodeToTargetSize | 1 | 898.3 ms | 0.83 | 47 KB |
| NetVips | 1 | 815.4 ms | 0.75 | 42 KB |
| SystemDrawing | 8 | 1,318.8 ms | 4.40 | 14 KB |
| ImageSharp | 8 | 300.2 ms | 1.00 | 1,734 KB |
| Magick | 8 | 1,135.5 ms | 3.79 | 44 KB |
| MagicScaler | 8 | 378.0 ms | 1.26 | 248 KB |
| SkiaBitmap | 8 | 825.6 ms | 2.76 | 45 KB |
| SkiaBitmapDecodeToTargetSize | 8 | 253.3 ms | 0.84 | 48 KB |
| NetVips | 8 | 260.4 ms | 0.87 | 43 KB |

NetVips is still very fast, but we are already only 20% slower in the multithreaded scenario.

@JimBobSquarePants
Member

Very nice speed up! Can’t review til Thursday night cos on holiday but during the review process we’ll likely make the bits needed public.

@antonfirsov
Member

antonfirsov commented May 3, 2022

What I mean is that if we manage to figure how to build a simplified, codec-invariant LoadResize API in the 3.0 timeframe (which I still believe we can do), that may impact the shape of the Jpeg-specific API in ways we cannot predict now.

I think this is a very good use case for [RequiresPreviewFeatures], so we can make the Jpeg API public for early adopters like @Webreaper without a promise of stability.

@br3aker
Contributor Author

br3aker commented May 3, 2022

Played a bit more with some possible optimizations: Avx spectral -> color conversion and skipping huffman values got us to almost 10% slower than NetVips in the multithreaded scenario:

| Method | Threads | Mean | Ratio |
|---|---|---|---|
| ImageSharp | 1 | 1,046.7 ms | 1.00 |
| NetVips | 1 | 811.7 ms | 0.78 |
| ImageSharp | 8 | 275.2 ms | 1.00 |
| NetVips | 8 | 244.2 ms | 0.89 |

And I don't really understand why the multithreaded scenario looks better in terms of 'how close we are to NetVips'; I thought it could only be the other way around :D

EDIT:

Now I got it: the current benchmark setup contains progressive jpegs, which NetVips doesn't handle well. With only baseline jpegs we get the expected gap of about 25%:

| Method | Threads | Mean | Ratio |
|---|---|---|---|
| ImageSharp | 1 | 907.7 ms | 1.00 |
| NetVips | 1 | 688.5 ms | 0.76 |
| ImageSharp | 8 | 300.1 ms | 1.00 |
| NetVips | 8 | 221.3 ms | 0.74 |

EDIT 2:

Ran a progressive-only thumbnail benchmark:

| Method | Threads | Mean | Ratio | Allocated |
|---|---|---|---|---|
| ImageSharp | 1 | 980.1 ms | 1.00 | 693 KB |
| MagicScaler | 1 | 1,530.6 ms | 1.56 | 104 KB |
| SkiaBitmap | 1 | 2,349.4 ms | 2.40 | 46 KB |
| SkiaBitmapDecodeToTargetSize | 1 | 854.5 ms | 0.87 | 47 KB |
| NetVips | 1 | 849.4 ms | 0.87 | 42 KB |
| ImageSharp | 8 | 312.5 ms | 1.00 | 697 KB |
| MagicScaler | 8 | 505.8 ms | 1.62 | 186 KB |
| SkiaBitmap | 8 | 639.7 ms | 2.05 | 48 KB |
| SkiaBitmapDecodeToTargetSize | 8 | 285.4 ms | 0.92 | 48 KB |
| NetVips | 8 | 311.5 ms | 1.00 | 43 KB |

Yep, that's it: we are almost the fastest .NET lib in the multithreaded scenario for progressive jpeg decoding and resizing. But baseline is not that cool, unfortunately. It's gonna be a very tough thing to optimize.

@JimBobSquarePants
Member

@br3aker I'll review this next.

@JimBobSquarePants
Member

JimBobSquarePants commented Jun 25, 2022

@br3aker If I'm reading this correctly your decoder is simply decoding to the closest possible scaled size to the target yeah?

If so I'm happy to get this merged once the conflict is fixed. We can figure out naming and make things public when we figure out the public API.

I think we can provide a best of both worlds scenario btw.

  1. Global Decode with options (Decode to closest target, 2nd pass resize)
  2. JpegDecoder specific (Decode to closest target only, user decides what to do next)
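
A minimal sketch of the "decode to closest target" step, assuming "closest" means the smallest of the 1/8, 1/4, 1/2 and 1/1 scales whose output is still at least as large as the target in both dimensions. `ScaleSelectionSketch`, `ChooseBlockSize` and `ScaledLength` are hypothetical names for illustration, not the PR's actual code.

```csharp
using System;

public static class ScaleSelectionSketch
{
    // Length of one image dimension after scaling by blockSize / 8,
    // rounded up so a partial trailing block still yields output pixels.
    public static int ScaledLength(int length, int blockSize)
        => ((length * blockSize) + 7) / 8;

    // Returns the IDCT output block size (1, 2, 4 or 8 pixels per 8x8
    // block), i.e. a scale of blockSize / 8.
    public static int ChooseBlockSize(
        int sourceWidth, int sourceHeight, int targetWidth, int targetHeight)
    {
        // Try the strongest downscale first and stop at the first scale
        // that does not undershoot the requested size.
        foreach (int blockSize in new[] { 1, 2, 4 })
        {
            if (ScaledLength(sourceWidth, blockSize) >= targetWidth
                && ScaledLength(sourceHeight, blockSize) >= targetHeight)
            {
                return blockSize;
            }
        }

        // No reduced scale is large enough: decode at full size.
        return 8;
    }
}
```

Under this assumption an 800x600 source with a 150x150 target decodes at 1/4 scale to 200x150, and an optional second-pass resize (option 1 above) would bring it down to the exact target.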

@br3aker
Contributor Author

br3aker commented Jun 25, 2022

@JimBobSquarePants if I recall correctly - yes, but I'll re-check and respond just to be sure. Will resolve conflicts tomorrow.

@br3aker
Contributor Author

br3aker commented Jun 26, 2022

your decoder is simply decoding to the closest possible scaled size to the target yeah?

Yes.

If so I'm happy to get this merged once the conflict is fixed.

Resolved.

I think we can provide a best of both worlds scenario btw.

  1. Global Decode with options (Decode to closest target, 2nd pass resize)
  2. JpegDecoder specific (Decode to closest target only, user decides what to do next)

I like it!

@br3aker
Contributor Author

br3aker commented Jun 27, 2022

Hmmm, I think I saw this error somewhere before; the latest commit only added comments and the previous commit passed all tests. Didn't we have a net7-only bug somewhere?

image

@brianpopow
Collaborator

Hmmm, I think I saw this error somewhere before; the latest commit only added comments and the previous commit passed all tests. Didn't we have a net7-only bug somewhere?

image

Yes, that is #2117 haunting us again. I will try restarting the CI.

Member

@JimBobSquarePants JimBobSquarePants left a comment

Very excited about this! I have plans for a general API I've already started working on.

@JimBobSquarePants JimBobSquarePants merged commit e29a9e8 into SixLabors:main Jul 17, 2022
@br3aker br3aker deleted the dp/jpeg-downscaling-decode branch July 17, 2022 09:39

7 participants