JIT: Optimize "X & 1 == 0" to "X & 1" #61412
Tagging subscribers to this area: @JulieLeeMSFT
Issue Details

static bool Test(int x) => (x & 1) == 0;

Just a good first issue for anyone interested in contributing to CLR JIT.

Current codegen:
test cl, 1
sete al
movzx rax, al
ret

Expected codegen:
mov eax, ecx
and eax, 1
ret

I noticed it when I was trying to implement a faster IsNegative for floats:
static bool IsNegative(double x) =>
    (Sse2.MoveMask(Vector128.CreateScalarUnsafe(x)) & 1) != 0;
|
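For readers unfamiliar with the MoveMask trick in the issue description, the same idea can be sketched portably. The snippet below is an illustration in C++ (the language the JIT itself is written in), not code from the issue: `is_negative` is a hypothetical helper, and it assumes `double` is a 64-bit IEEE 754 value. It copies the bits of the double and tests bit 63, the sign bit that MoveMask moves into bit 0 of its result.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not from the issue. Assumes double is a 64-bit
// IEEE 754 value, so the sign lives in bit 63.
bool is_negative(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the bits
    // Move the sign bit down to bit 0 and test it, mirroring
    // "(MoveMask(...) & 1) != 0" from the issue description.
    return ((bits >> 63) & 1) != 0;
}
```

Because only the sign bit is inspected, this sketch reports `true` for `-0.0` and for NaNs whose sign bit is set, which is also what a sign-bit mask gives.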
@EgorBo Hey, I'm interested in working on it, but I'll need your help because I'm not experienced in building and debugging the CLR. |
Cool! Then let me experiment on you to understand how good and first-time-contributor friendly our docs are 🙂 |
oh, I'm interested in contributing as well, I found a lot of minor code generation issues that I'd like to improve... here is an example:
lea instructions are not necessary; rdi+18h should not be read multiple times, etc. I also want more "cmov" instructions instead of chains of jcc blocks with the ? : ternary operator when all are constants. So yeah, I have a lot of asm optimization ideas on my list.

Anyway... I know how to debug the JIT, but is there some documentation on how to extract the JIT generated code? DotnetBenchmark does somewhat of a job, but I don't know how to control the disassembly output. Visual Studio surely also allows copy & paste from the disassembly window, but I was more thinking of some form of automation first to find areas of "bad code" gen. Is there some documentation on how to do this? How did you extract the code snippet above? |
@hopperpl usually this is done by using the various environment variables, e.g. for getting disassembly it would be
I am not aware of such automation. There is an assembly scanner tool: https://github.com/dotnet/jitutils/tree/main/src/AnalyzeAsm, but in general the addition of peepholes in the Jit is done in an ad-hoc manner (with exceptions of course). |
Maybe I'm missing something, but isn't this optimization wrong? Here's what Clang generates: |
@EgorBo Thank you. The guide was helpful even though I needed to do some extra research to make it work :) I managed to set a breakpoint in |
Could you please share more details about what exactly was needed so we can improve the doc?
you can take a look at the existing optimizations in morph.cpp, e.g. this one #52524 (it's for
Yes, thanks, it was a typo
The best way is to file an issue for a specific pattern where we can discuss it and give you pointers where to look at. E.g. 'cmov' related optimizations aren't as easy as they sound. For the assembly code you can try my VS2022 add-in Disasmo: https://github.com/EgorBo/Disasmo |
|
This was my experience... I was not able to build the runtime as instructed here https://github.com/dotnet/runtime/tree/main/docs/workflow/building/coreclr I'm not sure if it is related to #60061 or not. I think I received a different error about an invalid parameter (cmake). It referred me to log files but there were no files in the log directory. This "Send-VsDevShellTelemetry" is quite noisy and confused me a lot until I noticed that this was not the cause of the error, it just pulled all my attention. Then I switched to -msbuild instead as indirectly suggested by the doc. This worked without a problem.

Regarding the "How to Debug" instruction, this worked as described.

A hint about the needed storage size for the build would be nice. I need 2.5 GB for the source and then 24 GB on top for the build. About 8 GB per configuration (Debug, Checked, Release) |
* Add optimization "X & 1 == 1" to "X & 1" (#61412)
* Moved the optimization to the morph phase (#61412)
* Done in post-order (#61412)
* Moved the optimization into fgOptimizeEqualityComparisonWithConst (#61412)
* Some corrections due the comments (#61412)
* Fix of the picture (#61412)
* Add optNarrowTree use (#61412)
* Change narrowing to the type check (#61412)
* Fix regressions (#61412)
* Moved the optimization to the lowering phase (#61412)
* Reverted Morph changes (#61412)
* Moved the optimization into OptimizeConstCompare method (#61412)
* Add GT_EQ check(#61412)
This can be closed @EgorBo |
Was this fully addressed? The PR that was cited as closing this addressed the case:

static bool Equal1(int x) => (x & 1) == 1;

turning this on .NET 6:
test dl,1
setne al
movzx eax,al
ret

into this on .NET 7:
mov eax,edx
and eax,1
ret

but for the original example:

static bool Equal0(int x) => (x & 1) == 0;

on .NET 6 I get:
test dl,1
sete al
movzx eax,al
ret

and on .NET 7 I get:
xor eax,eax
test dl,1
sete al
ret

Wouldn't we expect it to be more like:
mov eax,edx
not eax
and eax,1
ret
?

The PR also didn't seem to address the case of |
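Since `x & 1` is always 0 or 1, the comparison shapes discussed in this thread fall into two groups: `== 1` and `!= 0` match `x & 1`, while `== 0` and `!= 1` match `~x & 1` (the `not`/`and` sequence suggested above). The sketch below checks those identities in C++ over a range of inputs. All helper names are hypothetical, and it says nothing about what the JIT actually emits; it only confirms that the rewrites are value-preserving and that the rewrite in the issue title is not.

```cpp
#include <cstdint>

// Hypothetical helpers: the comparison forms as written in the thread...
constexpr bool eq1_cmp(std::int32_t x) { return (x & 1) == 1; }
constexpr bool eq0_cmp(std::int32_t x) { return (x & 1) == 0; }

// ...and the branch-free forms they are expected to reduce to.
constexpr std::int32_t eq1_bits(std::int32_t x) { return x & 1; }     // mov; and
constexpr std::int32_t eq0_bits(std::int32_t x) { return (~x) & 1; }  // mov; not; and

// Returns true if both rewrites agree with the comparisons on [lo, hi].
// The loop counter is 64-bit so that hi == INT32_MAX does not overflow.
bool rewrites_agree(std::int32_t lo, std::int32_t hi) {
    for (std::int64_t i = lo; i <= hi; ++i) {
        const auto x = static_cast<std::int32_t>(i);
        if (static_cast<std::int32_t>(eq1_cmp(x)) != eq1_bits(x)) return false;
        if (static_cast<std::int32_t>(eq0_cmp(x)) != eq0_bits(x)) return false;
    }
    return true;
}
```

For `x = 0`, `eq0_cmp` is true while `x & 1` is 0, which is the mismatch behind the "it was a typo" exchange earlier in the thread.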
Sorry, this might have been misinformation on my part then. I've been looking at help-wanted issues and some of them were still active but with linked pull requests already merged, like this one. Should we add a new tag like pr-merged-on-trial so new contributors can see that an issue is sorta resolved but needs confirmation? Right now you have to check it manually. Should we reopen this or open a new separate issue? @SkiFoD are you willing to work on this some more? |
Let's reopen this; the case of |
@En3Tho Hey, if you want to continue working on the issue I don't mind :) |
I will try. Thanks. |
I guess if your change is a broader one then we should go with it? Also, you've already received reviews on it. Mine covers only x & 1 =/!= 1/0 and it's a new one. |
Yeah, I think that makes sense. On the other hand, my PR doesn't handle comparisons to zero ( |
Yeah, my PR handles those cases. I guess we should let the team decide what to merge. Maybe you can incorporate my changes somehow (if they are acceptable), dunno. |