-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JIT] Cleanup to lsra inspired by #73424 #76481
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsInspired by comments (here and here)
|
Redo of #73589 which was auto-closed |
I verified no-diff (other than memory addresses) on jitdump for a test case with minopts on and off to exercise the dump code in lsra. |
@jakobbotsch PTAL (since you commented on the first version of this), cc @dotnet/jit-contrib |
It looks like there are some diffs, not expected I assume? Also the formatting jobs aren't happy. |
Yes, I will look. I'm not sure if the churn here is making a case for leaving the code alone or strengthening the original case asking for simplification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jit-analyze
output doesn't show up. @jakobbotsch do you know why?
break; | ||
} | ||
if (!continueLoop) | ||
{ | ||
// Avoid loop iteration that will update currentRefPosition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just keep the break
that is at the beginning of the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The complicated expressions were one of the motivations for the PR (the &
iterator thing was the other). When I looked at simplifying them, I noticed that a few included the same set of cases as the logic in the rest of the loop (in this case, the list of RefTypeExpUse, RefTypeDummyDef, and RefTypeBB appears twice), so I tried to combine them.
The printf
in the default case was another clue - it seems to be more like an assertion that we shouldn't get there because the original check is the same as the switch.
I'm open to suggestions for other ideas. That I have some asm diffs both encourages me to do the cleanup since I must have misread the code somewhere but also worries me a bit about churn. Is there an easy way to verify that #73424 didn't have any asm diffs? (The job seems to be gone)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense and cleaner. I was just curious about the reasoning.
[10:03:16] Total bytes of delta: -36 (-0.00% of base)
[10:03:16] Generated asm is located under C:\h\w\A34A08D0\p\artifacts\spmi\asm.coreclr_tests.run.Linux.arm64.checked\base C:\h\w\A34A08D0\p\artifacts\spmi\asm.coreclr_tests.run.Linux.arm64.checked\diff
[10:03:16] No textual differences. Is this an issue with coredistools?
[10:03:16] 88 contexts with differences found (1 improvements, 0 regressions)
Hmm, it is a combination of #53773 and my recent change in #76238 exacerbating this problem. I think that during full replay of all contexts, we are more likely to see the problem #53773, while when we create individual disassembly files we are less likely to see it (due to less churn on the memory allocator and so on). So since we are basing the diff numbers on the first run only now we see the problem more often. @EgorBo do you think there is a chance we can merge #76112 with the main functionality disabled? Or merge the part of it that fixes #53773? |
@markples I don't think you need to worry much about those diffs then, they probably are just spurious. I'd double check the asmdiffs artifacts to check that the diffs look to be around field addresses and run some extra runs locally to check the diffs there. |
@markples - jit format errors are fixed by running the following command locally and then committing the modified files.
|
I see something different between the runs (got another when I pushed the source formatting commit, not sure if I've caused myself problems by rebasing the branch to a more current main..) In the first one, all of the base/diff directories are identical even though the report shows a few size differences. For example, Linux.arm64 shows -36 bytes and 88 "contexts with diffs". Strangely, its base/diff only has 49 pairs of .dasm files. in a more recent one, the size difference is gone, but now I see some line ordering changes and the first numeric one is The first run is confusing to me, but I guess I should ignore it. I'll look at the new one, especially the line order. |
Well looks like I was too quick, I completely skipped that you were quoting me :-) I think you see identical disassembly because SPMI works in two steps: first it does a run over all contexts, finding diffs. Here it allocates a lot of memory and thus static field addresses are going to have higher probability of being different simply by how the allocator is perturbed. In the second step, it creates disassembly for contexts with diffs. This is done with single invocations of superpmi on the contexts that it found in step 1. So the memory addresses it sees here are probably more likely to be equal since there's less noise in the environment. The end result is that things do not look internally consistent since SPMI sees different disassembly in step 1 and step 2. |
Can you show examples of the arm64 diffs (and note which MCH file they came from)? My change #76616 should fix some of the spurious diffs. |
@BruceForstall The code ones so far look like this:
which is similar to the sequence in that link. The line ordering seems scarier, though now I'm seeing that both occur near movz/movk sequences. On the other hand, it would be hard to be far from those in that function.
|
@BruceForstall Sorry, I left this out - I think this is |
@jakobbotsch I plan to merge it this week, the main issue that blocked it has been resolved |
Test failures:
|
eed7dca
to
af8b1d1
Compare
apologies for the force push - in auditing the change before merging, I discovered that I'd resolved a merge conflict backwards and redoing it was the easiest path forward |
Throughput: Linux x64 Overall (-0.03%)
MinOpts (-0.05%)
FullOpts (-0.03%)
windows x86 Overall (-0.02%)
MinOpts (-0.04%)
FullOpts (-0.02%)
|
Inspired by comments (here and here)
RefPosition* currentRefPosition = &reverseIterator
as it creates two variables that are essentially the same, especially since the iterator has operators like->
. In fact, I had to spend a bit of time verifying that there weren't any updates to one but not the other in the code. (And actually Fix undefined behavior found with g++-12 ubsan #73424 did slightly impact this relationship since it moved the assignment from the loop iteration step to the loop body, but I don't believe that this mattered.)I renamed the "iterator" variable to the "position" name to reduce textual churn in the code. This didn't work as well for the range-based loops since they yield -references- to the underlying object so a bunch of C++ punctuation changes.
for
conditions and remove duplication between the condition and the loop body. Search forcontinueLoop
to see them.operator&
on each iterator since they are no longer used and not part of normal iterators.Manually verified no diffs (other than memory addresses) on a jitdump for a test case with minopts on and off to exercise the dump code in lsra. Manually verified asm diffs were spurious (differences in runtime addresses, sometimes also causing changes in the compiler internal representation size and then locations of labels).
This regains the unexplained small throughput loss in #73424.