Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text segment ReadOnlySpan<byte> initialization #1133

Merged
merged 21 commits into from
Mar 6, 2020

Conversation

Sergio0694
Copy link
Member

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following matches the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This PR includes a couple of minor optimizations:

  • Replaced some static readonly byte[] arrays with static ReadOnlySpan<byte> values that directly map over constant data in the .text segment (see here).
  • Removed some bound checks when accessing those static sections (added some DEBUG checks)
  • Removed some type initialization calls to Encoding.ASCII.GetBytes, no longer necessary
  • Switched the Stream.Write calls to the ReadOnlySpan<byte> overloads.

No rush at all for this, just saw those byte[] arrays and thought I'd have some fun optimizing them away. Pinging @antonfirsov for the reviews as he's the official performance guru. 😄

@Sergio0694 Sergio0694 self-assigned this Feb 27, 2020
@Sergio0694 Sergio0694 added this to the 1.0.0 milestone Feb 27, 2020
Copy link
Member

@antonfirsov antonfirsov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's prove that the new unsafe code does not open a security hole! Otherwise looks good.

Comment on lines 366 to 376
DebugGuard.MustBeLessThan(toReverse & 0xF, Bit4Reverse.Length, nameof(toReverse));
DebugGuard.MustBeLessThan((toReverse >> 4) & 0xF, Bit4Reverse.Length, nameof(toReverse));
DebugGuard.MustBeLessThan((toReverse >> 8) & 0xF, Bit4Reverse.Length, nameof(toReverse));
DebugGuard.MustBeLessThan(toReverse >> 12, Bit4Reverse.Length, nameof(toReverse));

ref byte bit4ReverseRef = ref MemoryMarshal.GetReference(Bit4Reverse);

return (short)(Unsafe.Add(ref bit4ReverseRef, toReverse & 0xF) << 12
| Unsafe.Add(ref bit4ReverseRef, (toReverse >> 4) & 0xF) << 8
| Unsafe.Add(ref bit4ReverseRef, (toReverse >> 8) & 0xF) << 4
| Unsafe.Add(ref bit4ReverseRef, toReverse >> 12));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DebugGuard won't save us in production. Where does toReverse come from? Can a malicious input result in buffer overflow?

Copy link
Member Author

@Sergio0694 Sergio0694 Feb 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I mostly added those to validate the code path in our CI tests. I had not considered possible attempts to exploit this from a security standpoint, you're right 😅

What if we figured out the upper/lower bounds of those 4 indices, and validated just those? It would still result in just 2 conditional jumps, instead of 4 (one for each access).

Actually, now that I think about it:

  1. The first three indices are always computed with a final & 0xF, which means the maximum possible value is 0xF, so 15, which is always in range. So those first 3 accesses should always be necessarily valid, without the need to test them
  2. That last >> 12 could result in an out of bound index, either if the value is negative and below -1 << 12, or if it's a positive value greater than 1 << 12. I guess we could just check this last one just to be extra sure, as I don't know where exactly the input is coming from here, as you mentioned.

That's still just a single branch compared to 4, so... Yay? 😁
What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, let's do it like that!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, done in 0577690. Also added a detailed comment for future reference 😊

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment on BitLengthOrder also applies here. Additionally, if we already on the microoptimization road, we should cache the value of toReverse >> 12 to a local, since we are using it twice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 4b0dfd1.

Also I should really stress out how if we care about micro-optimizations, we should really consider refactoring our Guard APIs with that trick I suggested. Look at all that asm clutter due to that string.Format call we have here:

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8B6YgAgGEIAHATygEsBzACwzIAowBKMgZQYIyAGWzBouMtgB2AEzKRpGRsACuGCQDoAsAChSIhmBjTcMearkwoZDCxhkAgtWxh7h46ZhoyANWu4DBDSZABMmgAMOrp6AAKhAIxxEWSxCZoAIgzYTNIQuBhGuADcyanpWTl5BUWaVLIwjtLYADa0gSVlaZoASpaF+DB1EPjUDC3WfNYAbkYwnbqxKd19SgyDmgCSStY0U1CzxgtxAMypoWQZMABmLdgY1gASqtfX+DJ6AN56ZL9k1IxpvcHGkkGQejBsLIAPLSNp8FzSAA8wFoDwAfGQAEIMDAoCHTAIOAC8mOkMAA7mRUQ8ANoAXTInzIER8AA4fOgyAlQj5edzWWQkD4ElyEj4AJw+ACsIpOPnl3PFZAA7CLpWQAL6lXR/Mg/P6kchI3CqfDvKC0dEG35G8EwQlQMy2dzAXGSCDXKTcsFu9hAlqqIY2sh2pHEU3m7CW626w0GJEuKDYfBkZqDYkAIg0BKJmfRvlaQdsEDIsEdzr9uHDSZTsb1YdiKtwBaLDgpuJY1PdZYdRNk4abLZDtIAsjA7BBZJtRi0OOPJ9PZ9DqIVgrhNI4mExYLhAoTti0GNJj0xuPSQ7EzqCyLgWNB2DiMLmnTAOMf2Dm+6/uCHvnG9V+D8SxfMwemYNg+BYBhrgwLFaB5MhiRA79nXRTEeR1QC/gAcVUaNZE0UdVAKLEYGEeZcAAFRYGRoSgABRABHfCWioiAkVUD90Q4DguKUbgvwrGBwNYDAoJguCEP5BJZTTFMYE9DghKJbhuB1EM9VgL0aQcP18VQkSbiQ3svXHfBoFoUdozvVpNBwicIWuawTGMDgnwM4T1L0TS/ibTg7wfbgOAAVVMbBnM3WRZA4bTuzxUCjOuHwVNfMgADIWQQAAxXgkSRblQl87CSr1QgyDC3AIqGRxoti4z9MSpyfGUiBErIdCyBQXhMoiHK8oKtlitK0rysq6qopiuLGsM5rOFStDMTZHqstysh8q64aRuwsbwsi2qpoa3FPKJOaFpEiDxOg2D4J5NSsL+TU9CemJ9ATSMLStPQ7QABSgCBZgaSRBkXSQNH+f6HjAdgcmwY8CjIY8AwYeRaxBgJojDCMzU+2NaSuNRt0mB5qBo/7VFYC9dFpBiEDAQMGmy/78HqGAqEdHIYCpj9rGaFpyjBJNClac4yDwgivhDMMPujL6ANDAx/EYa4GHmF17hdBxcGoGAwBg1X5ADYsGEkCY93VkJoDIGAWJF8HsDId4EHWM0yCNmApYMGR5Dsf6KUkGRrbpmBVyCEIYMR9gTbTCAMGiBsE2xqMY09400fkjNM3d/MaIcDBoyYCc3bbHwKWgtxb3vVQWnkYAHGR2RgUImtozrVP1vT9MYCzJ2c/cJ2XdTd3NBb5N8HreM09b1Mu6zNGJ2sAA5BS+4cLuyE9TX/mnhebDsDXo/BuuFHsMAAGtzBH4g0Yn20EwwWgdc7hSsyowtAxgVfbEfhxN7seu2xXwfk/aet8FbGhgMHUOwQFDaSzI4KATAzQmAwLTYw0DpD5nbomaecVZ5ZzbJmYgmJo47khA8PetEQj/0dtgZ2+BXbD2wcQSB6C1zSDAWOCc94lzUDnAuHhM4+ErnYRuLcZC9wMAPHCY8p5zyXmvAkMEsQUBkGIqRcilEaJ0UYrbNiHE35th4oYj+xcP4+BMcWJ2Pg0gpHnhQ5egxfzyz1GXFyZBLEOBAGQTYVBRit2ABMJEniwH/lKhHDgw8/FJhgOxDgTteCYgiM40qYTtoeJYH7BBSDBhKGhOoaE1wegyELmgkO7COD2KXgpHwAASTM79iyfHdpqR2JF2DHzNmDKhG8bA21YiWJkTtNSaEzN5Fxj0QwvRHAIqcQj+HcLmcuDBG5F4QEPLI6QZ4qZ6gBFI4EAtUiqLJhACk2TkF5IKUUkpMAykYI4LY7eY9d6OO8OUFIgw9yc14KSYavtTlpkpE4RBFyMD5LBdcrZtyoEVKqVAV5PhPlVULuMl6QA.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is just terrible, wish we have noticed that earlier. Guard.MustBeLessThanOrEqualTo will make this method actually slower than it was before your changes. Can you just use a manual check + ThrowHelper here?

If you have some time, feel free to file a PR in SixLabors/SharedInfrastructure. For the comparison methods I'd rather see a set of duplicate non-generic methods, than TValue : IComparable<TValue>, which also brings some extra overhead I believe.

@@ -40,8 +40,6 @@ internal sealed unsafe class DeflaterHuffman : IDisposable
// probability, to avoid transmitting the lengths for unused bit length codes.
private static readonly int[] BitLengthOrder = { 16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15 };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might well be better off using the ReadOnlySpan<byte> here also and casting.

I know we do that in the jpeg encoder for uint and BitCountLut with, as I recall, a net gain in performance.

Copy link
Member Author

@Sergio0694 Sergio0694 Feb 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 5b7ac75.
Should I add some bounds check here, or are all those accesses guaranteed to be safe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't tell you for certain I'm afraid. That port was an act of desperation over understanding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Should I just switch those accesses to a direct Span<T>[int] access then? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're saying now, yeah 👍
I'm not sure I understand where you're seeing that with Bit4Reverse though.
Here's where that is used:

int toReverseRightShiftBy12 = toReverse >> 12;
Guard.MustBeLessThanOrEqualTo<uint>((uint)toReverseRightShiftBy12, 15, nameof(toReverse));
ref byte bit4ReverseRef = ref MemoryMarshal.GetReference(Bit4Reverse);
return (short)(Unsafe.Add(ref bit4ReverseRef, toReverse & 0xF) << 12
| Unsafe.Add(ref bit4ReverseRef, (toReverse >> 4) & 0xF) << 8
| Unsafe.Add(ref bit4ReverseRef, (toReverse >> 8) & 0xF) << 4
| Unsafe.Add(ref bit4ReverseRef, toReverseRightShiftBy12));

All the indexing values there are int already 🤔
I mean, those are actually the same indices that were being used before as well. I'm not sure I'm following here, where is the byte offsetting peformed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh sorry, I was wrong the values for Bit4Reverse are not used for indexing into data. This is the data that gets indexed.

We should deal with Guard however to make sure we don't regress. Do you plan to fix it globally in SixLabors/SharedInfrastructure, or will you replace it in this PR with a manual check + ThrowHelper calls?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahahah no problem, glad we could figure this out, I was wondering what was I missing there 😄

As for Guard, I'd say that's an unrelated improvement, since it's not something that will impact the functionality in any way, it's just a matter of further optimizations. I guess the best thing would be to just make another PR and fix the entire Guard class, with all the included APIs. Sounds good?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can do it, that would be the best!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, always happy to bring more of my discoveries and optimizations to ImageSharp! 😄

Copy link
Member

@JimBobSquarePants JimBobSquarePants left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sergio0694
Copy link
Member Author

@JimBobSquarePants Whoops, fixed in 68387a7, my bad 😅

@codecov
Copy link

codecov bot commented Feb 28, 2020

Codecov Report

Merging #1133 into master will decrease coverage by <.01%.
The diff coverage is 98.63%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1133      +/-   ##
==========================================
- Coverage   82.24%   82.23%   -0.01%     
==========================================
  Files         678      678              
  Lines       29164    29192      +28     
  Branches     3284     3284              
==========================================
+ Hits        23986    24007      +21     
- Misses       4481     4488       +7     
  Partials      697      697
Flag Coverage Δ
#unittests 82.23% <98.63%> (-0.01%) ⬇️
Impacted Files Coverage Δ
...ats/PixelBlenders/PorterDuffFunctions.Generated.cs 11.73% <ø> (ø) ⬆️
...c/ImageSharp/Common/Extensions/StreamExtensions.cs 93.33% <ø> (ø) ⬆️
src/ImageSharp/Formats/Png/PngConstants.cs 100% <100%> (ø) ⬆️
...rc/ImageSharp/Metadata/Profiles/Exif/ExifWriter.cs 80% <100%> (ø) ⬆️
src/ImageSharp/Formats/Png/PngEncoderCore.cs 97.63% <100%> (ø) ⬆️
src/ImageSharp/Formats/Jpeg/Components/ZigZag.cs 100% <100%> (ø) ⬆️
...ssors/Quantization/OctreeFrameQuantizer{TPixel}.cs 95.56% <100%> (+0.02%) ⬆️
...Sections/GifNetscapeLoopingApplicationExtension.cs 100% <100%> (ø) ⬆️
src/ImageSharp/Formats/Gif/GifEncoderCore.cs 94.28% <100%> (ø) ⬆️
src/ImageSharp/Formats/Gif/GifConstants.cs 100% <100%> (ø) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a9e10e2...20bf5fa. Read the comment docs.

@JimBobSquarePants
Copy link
Member

@Sergio0694 @antonfirsov Are we waiting for this to be completed and merged before finishing here?

SixLabors/SharedInfrastructure#7

@Sergio0694
Copy link
Member Author

@JimBobSquarePants Not sure whether we necessarily have to wait for that. I mean, having that would reduce the overhead in those checks, but it wouldn't actually change anything semantically. We might want to wait for it though in case the changes in this PR alone have too much of a performance regression, but that shouldn't be the case (plus it'd be temporary anyways).

Up to you guys 👍

@antonfirsov
Copy link
Member

@JimBobSquarePants I expect the aggregate result of the changes in DeflaterHuffman.BitReverse() to be a regression, unless we finish SixLabors/SharedInfrastructure#7 or replace Guard with verbose local code.

On the other hand, I'm fine fixing the regression in a follow-up PR.

@JimBobSquarePants
Copy link
Member

@Sergio0694 I found the AccessViolation.

https://github.com/SixLabors/ImageSharp/pull/1133/checks?check_run_id=484873388#step:11:698

It's not us it's the ImageMagick reference codec. @dlemstra You'll want to see this.

@Sergio0694
Copy link
Member Author

@JimBobSquarePants That's awesome, great work! Props to your investigative skills 😄
Also glad to hear it's not actually coming from ImageSharp, that's perfect.

@JimBobSquarePants
Copy link
Member

JimBobSquarePants commented Mar 6, 2020

Ok.... The submodule is now updated and I've added updated relevant tests.

I've also configured the solution to build the T4 templates on build since we don't import the transformed file.

Will merge once this finishes building.

Spoke too soon... The templates don't automatically run on Unix of course because there's no Visual Studio. Will have another think.

Update:
I've added a build target to copy the generated file from the submodule to the main solution to ensure we capture it.

@JimBobSquarePants JimBobSquarePants merged commit a49ebb0 into master Mar 6, 2020
@JimBobSquarePants JimBobSquarePants deleted the sp/text-segment-initialization branch March 6, 2020 10:47
@@ -90,7 +90,7 @@ public void AddNewFrame_Frame_FramesNotBeNull()
this.Collection.AddFrame(null);
});

Assert.StartsWith("Value cannot be null.", ex.Message);
Assert.StartsWith("Parameter \"source\" must be not null.", ex.Message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also do Assert.Throws<ArgumentNullException>("source", () => here instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that was dealt with above?

@JimBobSquarePants JimBobSquarePants modified the milestones: 1.0.0, 1.0.0-rc1 Apr 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants