-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Using struct for VectorPacket in PacketTracer benchmark #19662
Conversation
Does this PR also look good to you? @tannergooding @CarolEidt @AndyAyersMS |
@@ -10,7 +10,7 @@ | |||
using System.Runtime.CompilerServices; | |||
using System; | |||
|
|||
internal class VectorPacket256 | |||
internal struct VectorPacket256 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are using in
, this should probably be readonly struct
to ensure you don't incur any hidden copies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to update VectorPacket256
fields in the loop https://github.com/dotnet/coreclr/pull/19662/files#diff-68935896f95ba1d13b35f248c35f1e63R173
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, seems we can optimize here, will try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you may be incurring hidden copies when passing the value as in
. I would recommend checking this with the ErrorProne.NET
analyzer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you may be incurring hidden copies when passing the value as in.
Probably not, that loop just updates a local struct. Will try to eliminate the field update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to eliminate the field update of local structs via updating local vectors then new
the readonly
VectorPacket by the local vectors.
This change makes a little bit slower (0.75s vs 0.74s) and a bit code size regression due to callee-saved SIMD register prolog.
So, I think mutable struct
is okay for this program.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I have updated this PR to use pass-by-value.
What happens if you just change from class to struct but don't change the parameters to |
@AndyAyersMS Will try, thanks. |
@AndyAyersMS The pass-by-value version is a little bit slower than using |
Do you have a |
PMI jit-diff shows 15.39% code size shrink from the struct promotion change (the pass-by-ref version gets 16.26%). The code size regression of
|
Can I use jit-diff to show codegen diff between two different C# programs (pass-by-value vs pass-by-ref) with the same JIT compiler? |
Yes. It requires a few steps, something like:
and/or run your own diff tool on the two files |
@AndyAyersMS Thanks for teaching. The pass-by-value version has 1.64% code size regression.
|
The changes modified signatures so the methods don't quite line up -- but you can still diff the corresponding versions manually. Might be interesting to see if there's a theme that emerges from the diffs. What about perf? |
@AndyAyersMS The pass-by-value version is a little bit slower than using in. Execution time 0.74s v.s 0.72s on Core i7. I will look into the codegen diff and get VTune data later. |
Ah, right -- you already mentioned that. My sense is we should just go with the simpler version. |
@AndyAyersMS Don't you mean just use pass-by-value in this PR? |
Yes, just the change from |
Thanks, will do. |
I agree with @AndyAyersMS that we should go with the pass-by-value version. That said, it would be great to determine whether the main additional cost is due to the lack of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@AndyAyersMS, did you have any additional feedback here, or are we good to merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to merge.
This PR changes the PacketTracer benchmark to use
struct
instead ofclass
forVectorPacket
.This will make the benchmark 31% faster with JIT change #19663 (2x slower without the JIT change) of struct promotion because it reduces the GC overhead mentioned in https://github.com/dotnet/coreclr/issues/19116.