-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert math intrinsics to named intrinsics #39730
Convert math intrinsics to named intrinsics #39730
Conversation
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
cc @dotnet/jit-contrib |
x86 timeout failure is #38847. |
@dotnet/jit-contrib Any comments? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looking so may have more feedback later.
Have you run diffs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the switch issue, this looks good. I had a couple other small comments.
Would be good to verify this is no diff for both crossgen and PMI across the Fx set, at least for one architecture (crossgen2 perhaps, also?).
Validating this is a bit more complex with a guid change: you will need to run base and diff individually on the base and diff builds, and compare the results afterwards.
Does $ pushd runtime_base; ./build.sh -c Release ; popd
$ pushd runtime_pr; ./build.sh -c Release ; ./src/coreclr/build-test.sh -generatelayoutonly -release; popd
$ mkdir /tmp/diffs
$ jitutils/bin/jit-diff diff \
--base runtime_base/artifacts/bin/coreclr/OSX.x64.Release \
--diff runtime_pr/artifacts/bin/coreclr/OSX.x64.Release \
--core_root runtime_pr/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root \
--output /tmp/diffs --pmi
No assemblies specified; defaulting to corelib
Beginning PMI CodeSize Diffs for System.Private.CoreLib.dll
- Finished 0/1 Base 0/1 Diff [0.1 sec]Error running /Users/am11/projects/runtime_pr/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root/corerun on /Users/am11/projects/runtime_pr/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root/System.Private.CoreLib.dll
1 errors compiling set.
Dasm command "jit-dasm-pmi --platform runtime_pr/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root --corerun runtime_pr/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root/corerun --jit runtime_base/artifacts/bin/coreclr/OSX.x64.Release/libclrjit.dylib --nocopy --output /tmp/diffs/dasmset_7/base /Users/am11/projects/runtime_pr/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root/System.Private.CoreLib.dll" returned with 1 failures
\ Finished 1/1 Base 1/1 Diff [12.4 sec]
Dasm commands returned 1 base failures, 0 diff failures.
Completed PMI CodeSize Diffs for System.Private.CoreLib.dll in 12.37s
Diffs (if any) can be viewed by comparing: /tmp/diffs/dasmset_7/base /tmp/diffs/dasmset_7/diff
Analyzing CodeSize diffs...
Found 1 files with textual diffs.
PMI CodeSize Diffs for System.Private.CoreLib.dll for default jit
Summary of Code Size diffs:
(Lower is better)
Warning: the base metric is 0, the diff metric is 0, have you used a release version?
0 total files with Code Size differences (0 improved, 0 regressed), 1 unchanged.
0 total methods with Code Size differences (0 improved, 0 regressed), 0 unchanged.
1 files had text diffs but no metric diffs.
System.Private.CoreLib.dasm had 5 diffs
Completed analysis in 0.19s
$ cat /tmp/diffs/dasmset_7/base/System.Private.CoreLib.dasm
Fatal error. Failed to load JIT compiler Disclaimer: this my first time with jit-diff, so i might be missing something rudimentary. 😁 |
Since your I'd try something like
and then repeat the above, leaving out [edit: fixed a couple typos in the above] |
Thanks for the steps @AndyAyersMS. :)
After taking the diffs of base and pr individually, I ended up with $ jitutils/bin/jit-analyze --base /tmp/diffs/dasmset_1/base --diff /tmp/diffs/dasmset_2/diff it failed with these errors: Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/CommandLine.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Dia2Lib.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/ILCompiler.Reflection.ReadyToRun.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.CSharp.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.CodeAnalysis.CSharp.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.CodeAnalysis.VisualBasic.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.CodeAnalysis.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.Diagnostics.FastSerialization.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.Diagnostics.NETCore.Client.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.Diagnostics.Tools.RuntimeClient.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.Diagnostics.Tracing.TraceEvent.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.Extensions.Caching.Abstractions.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist
Couldn't parse --numstat output '1 1 /tmp/diffs/{dasmset_2/diff => dasmset_1/base}/Microsoft.Extensions.Caching.Memory.dasm` : '/tmp/diffs/{dasmset_2/diff' does not exist (note that it is adding a mysterious |
jit-analyze uses git diff under the covers and sometimes git diff decides there are renames and messes up our parsing. You should move the diff directory under dasmset_1 and rerun jit-analyze with the new location. |
Thank you! I first tried creating symlinks manually, but didn't work: $ ln -s /tmp/diffs/dasmset_2/diff/ /tmp/diffs/diff
$ ln -s /tmp/diffs/dasmset_1/base/ /tmp/diffs/base
$ jitutils/bin/jit-analyze --base /tmp/diffs/base --diff /tmp/diffs/diff
Couldn't parse --numstat output '1 1 /tmp/diffs/{diff => base}` : '/tmp/diffs/base' does not exist (looks like jit-analyze is strict about it worked when i moved the directories. PMI $ jitutils/bin/jit-diff diff --base runtime_base/artifacts/bin/coreclr/OSX.x64.Release --base_root runtime_base --core_root runtime_base/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root --output /tmp/diffs --pmi -f
$ jitutils/bin/jit-diff diff --diff runtime_pr/artifacts/bin/coreclr/OSX.x64.Release --diff_root runtime_pr --core_root runtime_pr/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root --output /tmp/diffs --pmi -f
$ mv /tmp/diffs/dasmset_2/diff/ /tmp/diffs/diff
$ mv /tmp/diffs/dasmset_1/base/ /tmp/diffs/base
$ jitutils/bin/jit-analyze --base /tmp/diffs/base --diff /tmp/diffs/diff
Found 265 files with textual diffs.
Summary of Code Size diffs:
(Lower is better)
Warning: the base metric is 0, the diff metric is 0, have you used a release version?
0 total files with Code Size differences (0 improved, 0 regressed), 265 unchanged.
0 total methods with Code Size differences (0 improved, 0 regressed), 0 unchanged.
265 files had text diffs but no metric diffs.
System.Security.Cryptography.Algorithms.dasm had 4 diffs
System.Security.Cryptography.OpenSsl.dasm had 4 diffs
Microsoft.Extensions.Primitives.dasm had 2 diffs
System.Xml.Linq.dasm had 2 diffs
System.ComponentModel.Primitives.dasm had 2 diffs
Dia2Lib.dasm had 2 diffs
System.Net.WebSockets.Client.dasm had 2 diffs
System.IO.Compression.FileSystem.dasm had 2 diffs
System.Data.Common.dasm had 2 diffs
System.Threading.Timer.dasm had 2 diffs
System.Security.Cryptography.Cng.dasm had 2 diffs
System.Runtime.Serialization.Primitives.dasm had 2 diffs
System.Security.dasm had 2 diffs
Microsoft.Extensions.Configuration.Abstractions.dasm had 2 diffs
System.Data.DataSetExtensions.dasm had 2 diffs
System.ComponentModel.Composition.Registration.dasm had 2 diffs
System.Net.WebClient.dasm had 2 diffs
System.Globalization.dasm had 2 diffs
Microsoft.Extensions.FileProviders.Abstractions.dasm had 2 diffs
System.Console.dasm had 2 diffs crossgen: $ jitutils/bin/jit-diff diff --base runtime_base/artifacts/bin/coreclr/OSX.x64.Release --base_root runtime_base --core_root runtime_base/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root --output /tmp/diffs --crossgen runtime_base/artifacts/bin/coreclr/OSX.x64.Release/crossgen -f
$ jitutils/bin/jit-diff diff --diff runtime_pr/artifacts/bin/coreclr/OSX.x64.Release --diff_root runtime_pr --core_root runtime_pr/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root --output /tmp/diffs --crossgen runtime_pr/artifacts/bin/coreclr/OSX.x64.Release/crossgen -f
$ mv /tmp/diffs/dasmset_1/base/ /tmp/diffs/base
$ mv /tmp/diffs/dasmset_2/diff/ /tmp/diffs/diff
$ jitutils/bin/jit-analyze --base /tmp/diffs/base --diff /tmp/diffs/diff
Summary of Code Size diffs:
(Lower is better)
Warning: the base metric is 0, the diff metric is 0, have you used a release version?
0 total files with Code Size differences (0 improved, 0 regressed), 265 unchanged.
0 total methods with Code Size differences (0 improved, 0 regressed), 0 unchanged. Crossgen2 doesn't have an executable produced on macOS, so I tried with crossgen2.dll, which fails: $ jitutils/bin/jit-diff diff --diff runtime_base/artifacts/bin/coreclr/OSX.x64.Release --diff_root runtime_base --core_root runtime_base/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root --output /tmp/diffs --crossgen runtime_base/artifacts/bin/coreclr/OSX.x64.Release/crossgen2/crossgen2.dll -f
Beginning Crossgen CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies
- Finished 0/0 Base 0/265 Diff [0.1 sec]Unhandled exception. System.ComponentModel.Win32Exception (13): Permission denied
at System.Diagnostics.Process.ForkAndExecProcess(String filename, String[] argv, String[] envp, String cwd, Boolean redirectStdin, Boolean redirectStdout, Boolean redirectStderr, Boolean setCredentials, UInt32 userId, UInt32 groupId, UInt32[] groups, Int32& stdinFd, Int32& stdoutFd, Int32& stderrFd, Boolean usesTerminal, Boolean throwOnNoExec)
at System.Diagnostics.Process.StartCore(ProcessStartInfo startInfo)
at System.Diagnostics.Process.Start()
at Microsoft.DotNet.Cli.Utils.Command.Execute() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for checking the diffs.
This message looks suspect. Was there actually disassembly found in the diff files? |
Ah, good point. Please don't pass a path to instead of this:
run this:
and likewise for the diff runs. You will also need to have checked builds of the runtime for both repos. |
With: $ jitutils/bin/jit-diff diff \
--base \
--base_root runtime_base \
--core_root runtime_base/artifacts/tests/coreclr/OSX.x64.Release/Tests/Core_Root \
--output /tmp/diffs --pmi -f it complains: error: didn't find --base default
error: Specify either --base or --diff or both.
I will do that now. |
It worked without $ jitutils/bin/jit-analyze --base /tmp/diffs/base --diff /tmp/diffs/diff
Found 278 files with textual diffs.
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: 360 (0.001% of base)
diff is a regression.
Top file regressions (bytes):
138 : System.Private.CoreLib.dasm (0.003% of base)
79 : System.Private.Xml.dasm (0.002% of base)
57 : System.Data.Common.dasm (0.004% of base)
52 : Newtonsoft.Json.dasm (0.006% of base)
17 : System.ComponentModel.TypeConverter.dasm (0.006% of base)
6 : Microsoft.VisualBasic.Core.dasm (0.001% of base)
4 : System.Runtime.Caching.dasm (0.007% of base)
4 : System.Drawing.Primitives.dasm (0.011% of base)
3 : Microsoft.CodeAnalysis.VisualBasic.dasm (0.000% of base)
3 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (0.000% of base)
Top file improvements (bytes):
-1 : System.Drawing.Common.dasm (-0.000% of base)
-1 : ILCompiler.Reflection.ReadyToRun.dasm (-0.001% of base)
-1 : System.Net.Mail.dasm (-0.001% of base)
13 total files with Code Size differences (3 improved, 10 regressed), 252 unchanged.
Top method regressions (bytes):
80 (3.120% of base) : System.Private.Xml.dasm - BigNumber:DblToRgbFast(double,ref,byref,byref):bool
60 (16.129% of base) : System.Private.CoreLib.dasm - MemoryFailPoint:.ctor(int):this
57 (7.692% of base) : System.Data.Common.dasm - SqlDecimal:.ctor(double):this
31 (140.909% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:Reminder(double,double):double
27 (122.727% of base) : Newtonsoft.Json.dasm - JsonValidatingReader:FloatingPointRemainder(double,double):double
25 (2.660% of base) : Newtonsoft.Json.dasm - JsonValidatingReader:ValidateFloat(JsonSchemaModel):this
22 (4.175% of base) : System.Private.CoreLib.dasm - Vector`1:Ceiling(Vector`1):Vector`1 (6 methods)
22 (4.175% of base) : System.Private.CoreLib.dasm - Vector`1:Floor(Vector`1):Vector`1 (6 methods)
18 (37.500% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:NormalizeLongitude(double):double
11 (18.644% of base) : System.ComponentModel.TypeConverter.dasm - Timer:UpdateTimer():this
8 (1.023% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Int(Object):Object
6 (0.766% of base) : System.Private.CoreLib.dasm - Math:Round(double,int,int):double
6 (0.762% of base) : System.Private.CoreLib.dasm - MathF:Round(float,int,int):float
4 (2.010% of base) : System.Runtime.Caching.dasm - MemoryCacheStore:TrimInternal(int):long:this
4 (22.222% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Int(float):float
3 (0.492% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - CompileTimeCalculations:ConvertFloatingValue(double,ubyte,byref):ConstantValue
3 (0.718% of base) : System.ComponentModel.TypeConverter.dasm - Timer:set_Enabled(bool):this
3 (0.783% of base) : System.ComponentModel.TypeConverter.dasm - Timer:.ctor(double):this
3 (0.177% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - ThreadState:LogBlockingEnd(double,int,int,TraceThread,ThreadTimeStackComputer):this
2 (2.667% of base) : System.Private.Xml.dasm - NumberFunctions:Floor(XPathNodeIterator):double:this
Top method improvements (bytes):
-15 (-0.747% of base) : System.Private.CoreLib.dasm - Number:Dragon4(long,int,int,bool,int,bool,Span`1,byref):int
-8 (-3.922% of base) : System.Private.CoreLib.dasm - TimeSpanToken:NormalizeAndValidateFraction():bool:this
-5 (-1.323% of base) : System.Private.Xml.dasm - NumberingFormat:ConvertToArabic(double,int,int,String):String
-2 (-3.509% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Fix(double):double
-2 (-2.740% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Fix(float):float
-2 (-20.000% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Int(double):double
-1 (-2.083% of base) : System.Drawing.Common.dasm - Font:get_Height():int:this
-1 (-0.067% of base) : ILCompiler.Reflection.ReadyToRun.dasm - GcInfo:GetTransitions(ref,byref):Dictionary`2:this
-1 (-0.361% of base) : System.Net.Mail.dasm - SmtpDateTime:TimeSpanToOffset(TimeSpan):String:this
-1 (-0.559% of base) : System.Private.CoreLib.dasm - Grisu3:GetCachedPowerForBinaryExponentRange(int,int,byref):DiyFp
-1 (-0.885% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:GetGregorianYear(double):int
-1 (-0.435% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:PersianNewYearOnOrBefore(long):long
-1 (-0.252% of base) : System.Private.CoreLib.dasm - PersianCalendar:GetDatePart(long,int):int:this
Top method regressions (percentages):
31 (140.909% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:Reminder(double,double):double
27 (122.727% of base) : Newtonsoft.Json.dasm - JsonValidatingReader:FloatingPointRemainder(double,double):double
18 (37.500% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:NormalizeLongitude(double):double
4 (22.222% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Int(float):float
11 (18.644% of base) : System.ComponentModel.TypeConverter.dasm - Timer:UpdateTimer():this
60 (16.129% of base) : System.Private.CoreLib.dasm - MemoryFailPoint:.ctor(int):this
57 (7.692% of base) : System.Data.Common.dasm - SqlDecimal:.ctor(double):this
22 (4.175% of base) : System.Private.CoreLib.dasm - Vector`1:Ceiling(Vector`1):Vector`1 (6 methods)
22 (4.175% of base) : System.Private.CoreLib.dasm - Vector`1:Floor(Vector`1):Vector`1 (6 methods)
80 (3.120% of base) : System.Private.Xml.dasm - BigNumber:DblToRgbFast(double,ref,byref,byref):bool
2 (2.667% of base) : System.Private.Xml.dasm - NumberFunctions:Floor(XPathNodeIterator):double:this
2 (2.667% of base) : System.Private.Xml.dasm - NumberFunctions:Ceiling(XPathNodeIterator):double:this
25 (2.660% of base) : Newtonsoft.Json.dasm - JsonValidatingReader:ValidateFloat(JsonSchemaModel):this
4 (2.010% of base) : System.Runtime.Caching.dasm - MemoryCacheStore:TrimInternal(int):long:this
2 (1.613% of base) : System.Drawing.Primitives.dasm - Rectangle:Ceiling(RectangleF):Rectangle
1 (1.587% of base) : System.Drawing.Primitives.dasm - Point:Ceiling(PointF):Point
1 (1.587% of base) : System.Drawing.Primitives.dasm - Size:Ceiling(SizeF):Size
8 (1.023% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Int(Object):Object
3 (0.783% of base) : System.ComponentModel.TypeConverter.dasm - Timer:.ctor(double):this
6 (0.766% of base) : System.Private.CoreLib.dasm - Math:Round(double,int,int):double
Top method improvements (percentages):
-2 (-20.000% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Int(double):double
-8 (-3.922% of base) : System.Private.CoreLib.dasm - TimeSpanToken:NormalizeAndValidateFraction():bool:this
-2 (-3.509% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Fix(double):double
-2 (-2.740% of base) : Microsoft.VisualBasic.Core.dasm - Conversion:Fix(float):float
-1 (-2.083% of base) : System.Drawing.Common.dasm - Font:get_Height():int:this
-5 (-1.323% of base) : System.Private.Xml.dasm - NumberingFormat:ConvertToArabic(double,int,int,String):String
-1 (-0.885% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:GetGregorianYear(double):int
-15 (-0.747% of base) : System.Private.CoreLib.dasm - Number:Dragon4(long,int,int,bool,int,bool,Span`1,byref):int
-1 (-0.559% of base) : System.Private.CoreLib.dasm - Grisu3:GetCachedPowerForBinaryExponentRange(int,int,byref):DiyFp
-1 (-0.435% of base) : System.Private.CoreLib.dasm - CalendricalCalculationsHelper:PersianNewYearOnOrBefore(long):long
-1 (-0.361% of base) : System.Net.Mail.dasm - SmtpDateTime:TimeSpanToOffset(TimeSpan):String:this
-1 (-0.252% of base) : System.Private.CoreLib.dasm - PersianCalendar:GetDatePart(long,int):int:this
-1 (-0.067% of base) : ILCompiler.Reflection.ReadyToRun.dasm - GcInfo:GetTransitions(ref,byref):Dictionary`2:this
37 total methods with Code Size differences (13 improved, 24 regressed), 236438 unchanged. |
e.g. the regression in click to expand-; Total bytes of code 22, prolog size 3, PerfScore 29.50, (MethodHash=3ec4c25b) for method CalendricalCalculationsHelper:Reminder(double,double):double
+; Total bytes of code 53, prolog size 7, PerfScore 32.50, (MethodHash=3ec4c25b) for method CalendricalCalculationsHelper:Reminder(double,double):double
; ============================================================
Unwind Info:
@@ -906600,11 +906625,12 @@ Unwind Info:
>> End offset : 0xd1ffab1e (not in unwind data)
Version : 1
Flags : 0x00
- SizeOfProlog : 0x00
- CountOfUnwindCodes: 0
+ SizeOfProlog : 0x04
+ CountOfUnwindCodes: 1
FrameRegister : none (0)
FrameOffset : N/A (no FrameRegister) (Value=0)
UnwindCodes :
+ CodeOffset: 0x04 UnwindOp: UWOP_ALLOC_SMALL (2) OpInfo: 2 * 8 + 8 = 24 = 0x18
; Assembly listing for method CalendricalCalculationsHelper:NormalizeLongitude(double):double
; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
; optimized code
@@ -906612,30 +906638,37 @@ Unwind Info:
; partially interruptible
; Final local variable assignments
;
-; V00 arg0 [V00,T00] ( 9, 8 ) double -> mm0
+; V00 arg0 [V00,T00] ( 9, 8 ) double -> [rsp+0x00]
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
-; V02 tmp1 [V02,T01] ( 2, 2 ) double -> mm1 "Inline stloc first use temp"
+; V02 tmp1 [V02,T01] ( 2, 2 ) double -> mm0 "Inline stloc first use temp"
;
-; Lcl frame size = 0
+; Lcl frame size = 8
G_M36341_IG01:
+ push rax
vzeroupper
- ;; bbWeight=1 PerfScore 1.00
+ ;; bbWeight=1 PerfScore 2.00
G_M36341_IG02:
- vdivsd xmm1, xmm0, qword ptr [reloc @RWD08]
- vroundsd xmm1, xmm1, 9
- vmulsd xmm1, xmm1, qword ptr [reloc @RWD24]
- vsubsd xmm0, xmm0, xmm1
- vxorps xmm1, xmm1
- vucomisd xmm1, xmm0
+ vmovsd qword ptr [rsp], xmm0
+ vdivsd xmm0, xmm0, qword ptr [reloc @RWD08]
+ call Math:Floor(double):double
+ vmulsd xmm0, xmm0, qword ptr [reloc @RWD24]
+ vmovsd xmm1, qword ptr [rsp]
+ vsubsd xmm1, xmm1, xmm0
+ vxorps xmm0, xmm0
+ vucomisd xmm0, xmm1
jbe SHORT G_M36341_IG04
- ;; bbWeight=1 PerfScore 31.33
+ ;; bbWeight=1 PerfScore 27.83
G_M36341_IG03:
- vaddsd xmm0, xmm0, qword ptr [reloc @RWD40]
+ vaddsd xmm1, xmm1, qword ptr [reloc @RWD40]
;; bbWeight=0.50 PerfScore 2.50
G_M36341_IG04:
+ vmovaps xmm0, xmm1
+ ;; bbWeight=1 PerfScore 0.25
+G_M36341_IG05:
+ add rsp, 8
ret
- ;; bbWeight=1 PerfScore 1.00
+ ;; bbWeight=1 PerfScore 1.25
RWD00 dq 0000000000000000h
RWD08 dq 4076800000000000h
RWD16 dq 0000000000000000h
@@ -906644,7 +906677,7 @@ RWD32 dq 0000000000000000h
RWD40 dq 4076800000000000h |
This change is expected to produce zero-diff. These diffs are unexpected and suggest that there is a bug somewhere. Could you please look into it? |
Judging by the regression diff, |
Following five functions are not recognized as intrinsic, i.e. they do not even reach
Based on the previous debugging, i think runtime/src/coreclr/src/vm/jitinterface.cpp Line 6693 in 0bed2de
Investigating floor and ceiling cases, which are new. |
By marking floor and ceiling methods @EgorBo, so far what i have understood is (still trying to digest the flow): if an @jkotas, @AndyAyersMS, we have zero code diffs now. :) PMI $ jitutils/bin/jit-analyze --base /tmp/diffs/base --diff /tmp/diffs/diff
Found 266 files with textual diffs.
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: 0 (0.000% of base)
0 total files with Code Size differences (0 improved, 0 regressed), 265 unchanged.
0 total methods with Code Size differences (0 improved, 0 regressed), 236475 unchanged.
1 files had text diffs but no metric diffs.
System.Private.CoreLib.dasm had 76 diffs Expand to see System.Private.CoreLib.dasm textual diffsdiff -u --recursive /tmp/diffs/base/System.Private.CoreLib.dasm /tmp/diffs/diff/System.Private.CoreLib.dasm
--- /tmp/diffs/base/System.Private.CoreLib.dasm 2020-07-24 13:34:42.000000000 +0300
+++ /tmp/diffs/diff/System.Private.CoreLib.dasm 2020-07-24 13:39:02.000000000 +0300
@@ -174935,6 +174935,44 @@
CodeOffset: 0x05 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: r14 (14)
CodeOffset: 0x03 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: r15 (15)
CodeOffset: 0x01 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: rbp (5)
+; Assembly listing for method SZArrayHelper:IndexOf(ubyte):int:this
+; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
+; optimized code
+; rsp based frame
+; fully interruptible
+; Final local variable assignments
+;
+; V00 this [V00,T00] ( 4, 4 ) ref -> rdi this class-hnd
+; V01 arg1 [V01,T01] ( 3, 3 ) ubyte -> rsi
+;* V02 loc0 [V02 ] ( 0, 0 ) ref -> zero-ref class-hnd
+;# V03 OutArgs [V03 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
+;
+; Lcl frame size = 0
+
+G_M11338_IG01:
+ ;; bbWeight=1 PerfScore 0.00
+G_M11338_IG02:
+ mov ecx, dword ptr [rdi+8]
+ movzx rsi, sil
+ xor edx, edx
+ ;; bbWeight=1 PerfScore 2.50
+G_M11338_IG03:
+ jmp Array:IndexOf(ref,ubyte,int,int):int
+ ;; bbWeight=1 PerfScore 2.00
+
+; Total bytes of code 14, prolog size 0, PerfScore 5.90, (MethodHash=17a0d3b5) for method SZArrayHelper:IndexOf(ubyte):int:this
+; ============================================================
+
+Unwind Info:
+ >> Start offset : 0x000000 (not in unwind data)
+ >> End offset : 0xd1ffab1e (not in unwind data)
+ Version : 1
+ Flags : 0x00
+ SizeOfProlog : 0x00
+ CountOfUnwindCodes: 0
+ FrameRegister : none (0)
+ FrameOffset : N/A (no FrameRegister) (Value=0)
+ UnwindCodes :
; Assembly listing for method TlsOverPerCoreLockedStacksArrayPool`1:Gen2GcCallbackFunc(Object):bool
; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
; optimized code
@@ -176033,44 +176071,6 @@
FrameOffset : N/A (no FrameRegister) (Value=0)
UnwindCodes :
CodeOffset: 0x01 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: rbp (5)
-; Assembly listing for method SZArrayHelper:IndexOf(ubyte):int:this
-; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
-; optimized code
-; rsp based frame
-; fully interruptible
-; Final local variable assignments
-;
-; V00 this [V00,T00] ( 4, 4 ) ref -> rdi this class-hnd
-; V01 arg1 [V01,T01] ( 3, 3 ) ubyte -> rsi
-;* V02 loc0 [V02 ] ( 0, 0 ) ref -> zero-ref class-hnd
-;# V03 OutArgs [V03 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
-;
-; Lcl frame size = 0
-
-G_M11338_IG01:
- ;; bbWeight=1 PerfScore 0.00
-G_M11338_IG02:
- mov ecx, dword ptr [rdi+8]
- movzx rsi, sil
- xor edx, edx
- ;; bbWeight=1 PerfScore 2.50
-G_M11338_IG03:
- jmp Array:IndexOf(ref,ubyte,int,int):int
- ;; bbWeight=1 PerfScore 2.00
-
-; Total bytes of code 14, prolog size 0, PerfScore 5.90, (MethodHash=17a0d3b5) for method SZArrayHelper:IndexOf(ubyte):int:this
-; ============================================================
-
-Unwind Info:
- >> Start offset : 0x000000 (not in unwind data)
- >> End offset : 0xd1ffab1e (not in unwind data)
- Version : 1
- Flags : 0x00
- SizeOfProlog : 0x00
- CountOfUnwindCodes: 0
- FrameRegister : none (0)
- FrameOffset : N/A (no FrameRegister) (Value=0)
- UnwindCodes :
; Assembly listing for method SZArrayHelper:IndexOf(short):int:this
; Emitting BLENDED_CODE for X64 CPU with AVX - Unix
; optimized code crossgen $ jitutils/bin/jit-analyze --base /tmp/diffs/base1 --diff /tmp/diffs/diff1
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: 0 (0.000% of base)
0 total files with Code Size differences (0 improved, 0 regressed), 265 unchanged.
0 total methods with Code Size differences (0 improved, 0 regressed), 182131 unchanged. |
Looks great! Thank you! |
@am11 sorry for the rough edges running diffs. Need to find some time to work on dotnet/jitutils#217. FYI @dotnet/jit-contrib -- another GUID change. |
* Convert math intrinsics to named intrinsics * Annotate Floor and Ceiling with [Intrinsic]
Contributes to #13820.