-
-
Notifications
You must be signed in to change notification settings - Fork 861
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Text segment ReadOnlySpan<byte> initialization #1133
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's prove that the new unsafe code does not open a security hole! Otherwise looks good.
DebugGuard.MustBeLessThan(toReverse & 0xF, Bit4Reverse.Length, nameof(toReverse)); | ||
DebugGuard.MustBeLessThan((toReverse >> 4) & 0xF, Bit4Reverse.Length, nameof(toReverse)); | ||
DebugGuard.MustBeLessThan((toReverse >> 8) & 0xF, Bit4Reverse.Length, nameof(toReverse)); | ||
DebugGuard.MustBeLessThan(toReverse >> 12, Bit4Reverse.Length, nameof(toReverse)); | ||
|
||
ref byte bit4ReverseRef = ref MemoryMarshal.GetReference(Bit4Reverse); | ||
|
||
return (short)(Unsafe.Add(ref bit4ReverseRef, toReverse & 0xF) << 12 | ||
| Unsafe.Add(ref bit4ReverseRef, (toReverse >> 4) & 0xF) << 8 | ||
| Unsafe.Add(ref bit4ReverseRef, (toReverse >> 8) & 0xF) << 4 | ||
| Unsafe.Add(ref bit4ReverseRef, toReverse >> 12)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DebugGuard won't save us in production. Where does toReverse
come from? Can a malicious input result in buffer overflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I mostly added those to validate the code path in our CI tests. I had not considered possible attempts to exploit this from a security standpoint, you're right 😅
What if we figured out the upper/lower bounds of those 4 indices, and validated just those? It would still result in just 2 conditional jumps, instead of 4 (one for each access).
Actually, now that I think about it:
- The first three indices are always computed with a final
& 0xF
, which means the maximum possible value is0xF
, so15
, which is always in range. So those first 3 accesses should always be necessarily valid, without the need to test them - That last
>> 12
could result in an out of bound index, either if the value is negative and below-1 << 12
, or if it's a positive value greater than1 << 12
. I guess we could just check this last one just to be extra sure, as I don't know where exactly the input is coming from here, as you mentioned.
That's still just a single branch compared to 4, so... Yay? 😁
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, let's do it like that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, done in 0577690. Also added a detailed comment for future reference 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment on BitLengthOrder
also applies here. Additionally, if we already on the microoptimization road, we should cache the value of toReverse >> 12
to a local, since we are using it twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 4b0dfd1.
Also I should really stress out how if we care about micro-optimizations, we should really consider refactoring our Guard
APIs with that trick I suggested. Look at all that asm clutter due to that string.Format
call we have here:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well this is just terrible, wish we have noticed that earlier. Guard.MustBeLessThanOrEqualTo
will make this method actually slower than it was before your changes. Can you just use a manual check + ThrowHelper here?
If you have some time, feel free to file a PR in SixLabors/SharedInfrastructure. For the comparison methods I'd rather see a set of duplicate non-generic methods, than TValue : IComparable<TValue>
, which also brings some extra overhead I believe.
src/ImageSharp/Processing/Processors/Quantization/OctreeFrameQuantizer{TPixel}.cs
Show resolved
Hide resolved
@@ -40,8 +40,6 @@ internal sealed unsafe class DeflaterHuffman : IDisposable | |||
// probability, to avoid transmitting the lengths for unused bit length codes. | |||
private static readonly int[] BitLengthOrder = { 16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might well be better off using the ReadOnlySpan<byte>
here also and casting.
I know we do that in the jpeg encoder for uint
and BitCountLut
with, as I recall, a net gain in performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 5b7ac75.
Should I add some bounds check here, or are all those accesses guaranteed to be safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't tell you for certain I'm afraid. That port was an act of desperation over understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Should I just switch those accesses to a direct Span<T>[int]
access then? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we lack understanding, we shall go for safety.
Btw if we use the data in Bit4Reverse
for indexing, elements shouldn't be byte
-s. It's more work for the CPU:
https://sharplab.io/#v2:C4LghgzgtgPgAgJgIwFgBQcAMACbckB0ASgK4B2wAllAKYEDCA9lAA6UA2NATgMrcBulAMY0IAbnTo4AZjwJs9bAG90uXKrUsulfmGA08SAGzYARgE992AJJkAJjQAet4NIQAKLjQBmZywZYAGmxKChD7JwBKDVwVNDUEvAB2bABVMggwbzoAQTs7Tx9sIPCHR0iJeLUAXxjsOq0dPQN8EwsrWzKAIX9C33aA4IHSqLq4xLU4FPTM7II8gq9fEtCyirratGqgA==
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get what you're saying now, yeah 👍
I'm not sure I understand where you're seeing that with Bit4Reverse
though.
Here's where that is used:
ImageSharp/src/ImageSharp/Formats/Png/Zlib/DeflaterHuffman.cs
Lines 384 to 392 in 4b0dfd1
int toReverseRightShiftBy12 = toReverse >> 12; | |
Guard.MustBeLessThanOrEqualTo<uint>((uint)toReverseRightShiftBy12, 15, nameof(toReverse)); | |
ref byte bit4ReverseRef = ref MemoryMarshal.GetReference(Bit4Reverse); | |
return (short)(Unsafe.Add(ref bit4ReverseRef, toReverse & 0xF) << 12 | |
| Unsafe.Add(ref bit4ReverseRef, (toReverse >> 4) & 0xF) << 8 | |
| Unsafe.Add(ref bit4ReverseRef, (toReverse >> 8) & 0xF) << 4 | |
| Unsafe.Add(ref bit4ReverseRef, toReverseRightShiftBy12)); |
All the indexing values there are int
already 🤔
I mean, those are actually the same indices that were being used before as well. I'm not sure I'm following here, where is the byte
offsetting peformed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh sorry, I was wrong the values for Bit4Reverse
are not used for indexing into data. This is the data that gets indexed.
We should deal with Guard however to make sure we don't regress. Do you plan to fix it globally in SixLabors/SharedInfrastructure, or will you replace it in this PR with a manual check + ThrowHelper calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahahah no problem, glad we could figure this out, I was wondering what was I missing there 😄
As for Guard
, I'd say that's an unrelated improvement, since it's not something that will impact the functionality in any way, it's just a matter of further optimizations. I guess the best thing would be to just make another PR and fix the entire Guard
class, with all the included APIs. Sounds good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can do it, that would be the best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, always happy to bring more of my discoveries and optimizations to ImageSharp! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JimBobSquarePants Whoops, fixed in 68387a7, my bad 😅 |
Codecov Report
@@ Coverage Diff @@
## master #1133 +/- ##
==========================================
- Coverage 82.24% 82.23% -0.01%
==========================================
Files 678 678
Lines 29164 29192 +28
Branches 3284 3284
==========================================
+ Hits 23986 24007 +21
- Misses 4481 4488 +7
Partials 697 697
Continue to review full report at Codecov.
|
@Sergio0694 @antonfirsov Are we waiting for this to be completed and merged before finishing here? |
@JimBobSquarePants Not sure whether we necessarily have to wait for that. I mean, having that would reduce the overhead in those checks, but it wouldn't actually change anything semantically. We might want to wait for it though in case the changes in this PR alone have too much of a performance regression, but that shouldn't be the case (plus it'd be temporary anyways). Up to you guys 👍 |
@JimBobSquarePants I expect the aggregate result of the changes in On the other hand, I'm fine fixing the regression in a follow-up PR. |
@Sergio0694 I found the AccessViolation. It's not us it's the ImageMagick reference codec. @dlemstra You'll want to see this. |
@JimBobSquarePants That's awesome, great work! Props to your investigative skills 😄 |
Ok.... The submodule is now updated and I've added updated relevant tests.
Will merge once this finishes building. Spoke too soon... The templates don't automatically run on Unix of course because there's no Visual Studio. Will have another think. Update: |
@@ -90,7 +90,7 @@ public void AddNewFrame_Frame_FramesNotBeNull() | |||
this.Collection.AddFrame(null); | |||
}); | |||
|
|||
Assert.StartsWith("Value cannot be null.", ex.Message); | |||
Assert.StartsWith("Parameter \"source\" must be not null.", ex.Message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we also do Assert.Throws<ArgumentNullException>("source", () =>
here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that was dealt with above?
Prerequisites
Description
This PR includes a couple of minor optimizations:
static readonly byte[]
arrays with staticReadOnlySpan<byte>
values that directly map over constant data in the.text
segment (see here).Encoding.ASCII.GetBytes
, no longer necessaryStream.Write
calls to theReadOnlySpan<byte>
overloads.No rush at all for this, just saw those
byte[]
arrays and thought I'd have some fun optimizing them away. Pinging @antonfirsov for the reviews as he's the official performance guru. 😄