Allow the user to control the MaxVectorTBitWidth #85551
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
This resolves #85543 and allows a bit more fine-grained control without requiring an ISA to be disabled. This will also allow
This doesn't yet cover a corresponding switch for NAOT/R2R.
src/coreclr/vm/codeman.cpp
// Some architectures can experience frequency throttling when
// executing 512-bit width instructions. To account for this we set the
// default preferred vector width to 256-bits in some scenarios. Users
// can override this with `DOTNET_PreferredVectorWith=512`.
LLVM/GCC actually extend this prefer-vector-width=256 behavior all the way up to the latest generation.
However, the general throttling issue has been fixed since Ice Lake and there are only a few instructions with "false dependencies" that can still cause some slowdown if used incorrectly. Since we don't have a general purpose auto-vectorizer and really just use this to control simple memory operations and comparisons, we should be fine limiting it to just the below.
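As a rough illustration of the default described in the code comment above, the following sketch picks a preferred vector width from a hardware description and an optional user override. It is not the runtime's implementation: `CpuInfo`, its fields, `PreferredVectorBitWidth`, and the convention that an override of 0 means "not set" are all hypothetical names and assumptions made for this example.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical description of the CPU; these fields are illustrative only.
struct CpuInfo
{
    bool supports512;    // hardware can execute 512-bit vector instructions
    bool throttlesOn512; // hardware may downclock while executing them
};

// Returns the preferred vector width in bits.
// Assumption: userOverrideBits == 0 means the user did not set an override.
inline uint32_t PreferredVectorBitWidth(const CpuInfo& cpu, uint32_t userOverrideBits)
{
    const uint32_t hardwareMaxBits = cpu.supports512 ? 512u : 256u;

    // An explicit override wins, but can never exceed what the hardware supports.
    if (userOverrideBits != 0)
    {
        return std::min(userOverrideBits, hardwareMaxBits);
    }

    // Without an override, fall back to 256 bits on hardware that may throttle.
    if (cpu.supports512 && cpu.throttlesOn512)
    {
        return 256u;
    }

    return hardwareMaxBits;
}
```

Under these assumptions, a throttling-prone 512-bit machine defaults to 256 bits, and the user can still opt back in to 512 bits explicitly.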
Thanks for this Tanner.
@dotnet/jit-contrib
Resolved merge conflicts
/benchmark fortunes_ef aspnet-citrine-win runtime
Benchmark started for fortunes_ef on aspnet-citrine-win with runtime. Logs: link
Ping @dotnet/jit-contrib, @jkotas for review/feedback
This generally looks good to me. I had a couple suggestions on namings that (unfortunately) would be somewhat pervasive. You can decide whether to take them or not.
Presumably, where before using arm64 altjit on x64 we would set DOTNET_SIMD16ByteOnly=1, now we would set DOTNET_MaxVectorTBitWidth=128?
Have you given any thought to how this should work with AOT compilers?
Right. My thought was that much as you can pass This would function much as the environment variables do and just help instruct how codegen should occur. It would choose the minimum of the value passed in by the user and the largest actually supported given the target ISAs. So, if you said It then functions essentially identically to how the JIT functions in this PR, just in R2R/NAOT instead. What is a bit "unclear" is how exactly
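The "minimum of the value passed in by the user and the largest actually supported" rule can be sketched as below. This is only an illustration under stated assumptions: `EffectiveVectorTBitWidth` is a hypothetical helper, widths are assumed to be one of 128, 256, or 512, and a request of 0 is assumed to mean "no preference".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not an actual runtime or compiler API.
// userMaxBits:          maximum Vector<T> width requested by the user (0 = no preference, an assumption)
// largestSupportedBits: widest vector width the targeted ISAs can actually provide
inline uint32_t EffectiveVectorTBitWidth(uint32_t userMaxBits, uint32_t largestSupportedBits)
{
    // Assumption: every target supports at least 128-bit vectors.
    assert(largestSupportedBits >= 128u);

    if (userMaxBits == 0)
    {
        return largestSupportedBits;
    }

    // The user can shrink the width, but can never exceed what the ISAs support.
    return std::min(userMaxBits, largestSupportedBits);
}
```

With this rule, asking for 512 bits while targeting ISAs that only provide 256-bit vectors yields 256, and asking for 128 bits on a 512-bit capable target yields 128.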
It should work same way as other instruction sets specified at AOT time. The system can treat Vector128/256/512 as a "virtual" vector instruction set. It is defined like that in https://github.com/dotnet/runtime/blob/main/src/coreclr/tools/Common/JitInterface/ThunkGenerator/InstructionSetDesc.txt#L92-L94.
We do have infrastructure to throw away just functions which mismatch (look for
So you're indicating rather than
My understanding is that this doesn't quite work as expected and has a number of larger work items pending. One example is the general issue that if you pre-compile for I expect that trying to plug the vector size information into this same thing may likewise be problematic today. From #61471 (comment):
I've converted this to a draft while I work through the last of the issues. I believe I've nearly got everything handled now.
…GetJitCpuCapabilityFlags
This should be ready for review again. It's been trimmed down to just the
Thank you for the updated checkin comment, and for splitting up this work into the set of PRs you made. It now looks good.
This will also allow Vector<T> to eventually be larger than 256 bits via explicit opt-in, and for the user to explicitly opt in to a smaller size, if desired, without requiring ISA disablement.

To achieve this we've done three primary things:
- … Vector<T> or transitively has a Vector<T> field. Such structs are marked as requiring a type layout check on load.
- … Vector<T> size as an instruction set flag. Only one Vector<T> size is allowed to be specified at a time, and we have a set of asserts validating that this is the case.
- … Vector<T> via an environment variable (JIT) or command line switch (CG2/NAOT). This plays into the selected Vector<T> size.
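The "only one Vector<T> size at a time" invariant from the list above can be checked with a simple bit trick. The sketch below is illustrative only: the flag names and values, `HasAtMostOneVectorTSize`, and `VectorTBitWidthFromFlags` are hypothetical and do not reflect the runtime's real instruction set enumeration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values standing in for the "virtual" Vector<T> instruction sets.
enum VectorTFlag : uint32_t
{
    VectorT128 = 1u << 0,
    VectorT256 = 1u << 1,
    VectorT512 = 1u << 2,
};

// True when zero or one Vector<T> size flag is set.
inline bool HasAtMostOneVectorTSize(uint32_t flags)
{
    const uint32_t sizes = flags & (VectorT128 | VectorT256 | VectorT512);
    // A value with at most one bit set has no bits in common with itself minus one.
    return (sizes & (sizes - 1u)) == 0u;
}

// Maps the selected flag to a width in bits; returns 0 when no size flag is set.
inline uint32_t VectorTBitWidthFromFlags(uint32_t flags)
{
    assert(HasAtMostOneVectorTSize(flags)); // mirrors the kind of assert the PR describes
    if (flags & VectorT512) return 512u;
    if (flags & VectorT256) return 256u;
    if (flags & VectorT128) return 128u;
    return 0u;
}
```

Modeling the size as mutually exclusive flags keeps the check cheap, since a single bitwise test is enough to detect a conflicting configuration.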