-
-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve control flow decompilation (fixes #1133) #1176
Conversation
Add a ControlFlowSimplification step after SplitVariables Enable dead code removal in some unit tests
I haven't had time to look at the code yet; but I compared the decompiler output with a few test assemblies, and it's clear that this is a massive improvement. Thanks! Total number of |
That's great to hear. Sorry it took so long, I had to put it on hold for a week and a half as finals hit for uni. I'm not sure what your release schedule is, but there's two more pieces coming, related to switch reconstruction, one of which I can have done in 2 days time. You'll notice that part of the new ConditionDetection involves guessing Branch targets which are continue keywords, as those have different ordering priorities. I'll need to do something similar to switch detection, which means separating it from and moving it after LoopDetection. This would produce another BlockTransform phase, and I don't understand the system enough to know the performance or flow impacts yet. What would you suggest? |
Oh right, switch blocks can have two exits ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall: I like the approach, and think it works great.
Thanks a lot for your work here! I'll merge the PR and add a commit of my own on top of it (with purely cosmetic changes I made during the review).
Most of the remaining goto
s in the round trip tests are in some way related to switch
. There seem to be some bugs in the logic that determines the switch body (particularly with nested switches); but there's also a lot of tricky cases where we over-aggressively detect a switch and cause problems for later pipeline stages. I have no idea on how to solve those (though we could improve things somewhat by fine-tuning the sparse-switch heuristic).
Some other goto
s are caused by us not inlining generated value temporaries (keyword: STRUCT_SPLITTING_IMPROVED
), which in turn makes it impossible to reconstruct the short-circuiting operators. Properly solving that will first require support for ref-like types...
I didn't see any where the goto
would have been avoidable if ConditionDetection
made better choices.
@@ -106,7 +106,8 @@ public void ShortCircuit([ValueSource("defaultOptions")] CSharpCompilerOptions c | |||
public void ExceptionHandling([ValueSource("defaultOptions")] CSharpCompilerOptions cscOptions) | |||
{ | |||
RunForLibrary(cscOptions: cscOptions, decompilerSettings: new DecompilerSettings { | |||
NullPropagation = false | |||
NullPropagation = false, | |||
RemoveDeadCode = !cscOptions.HasFlag(CSharpCompilerOptions.UseRoslyn) | |||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RemoveDeadCode
was introduced for F# (which emits much more dead code than C#).
It's not enabled by default, so it feels weird to have to use it in the C# tests.
As far as I can tell, it's only for while (true)
loops compiled with legacy csc in debug mode?
I guess it's fine to special-case this for the test case... but maybe it would be better to have a transform that removes this specific kind of dead store independent of the RemoveDeadCode
configuration option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A transform would be better, I'll leave that one for you
return forIncrement; | ||
} catch (InvalidOperationException) { | ||
// multiple potential increment blocks. Can get this because we don't check that the while loop | ||
// has a condition first, as we don't need to do too much of HighLevelLoopTransform's job. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetIncrementBlock
should be changed to return null instead of throwing an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally had it return null, but the code looked ugly in comparison, and wasn't necessary inside HighLevelLoopTransform
because of preconditions. Avoiding exceptions as part of normal control flow is a design principle I'm aware of, so I'll favour that.
//save a copy | ||
var trueInst = ifInst.TrueInst; | ||
|
||
if (ifInst != block.Instructions.SecondToLastOrDefault()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving instructions from the current block into a nested block is potentially problematic, because the following block transforms apply only to the current block, any nested blocks are expected to be already completely transformed.
But I guess it's OK in this case because it's only moving instructions that were previously inlined from another block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, works out for us here.
if (exitInst is Branch branch | ||
&& branch.TargetBlock.Parent == currentContainer | ||
&& branch.TargetBlock.IncomingEdgeCount == 1 | ||
&& branch.TargetBlock.FinalInstruction is Nop) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the old ConditionDetection
was originally written, we strengthened the Block
invariants to allow a non-Nop FinalInstruction
only for the inline block types:
branch.TargetBlock
must be directly within a container (viaBranch
invariant)- Any block in a container must have type
ControlFlow
(viaBlockContainer
invariant) ControlFlow
blocks can't have aFinalInstruction
(viaBlock
invariant)
So these checks are redundant now. I'll modify the code to replace them with an assertion.
Although, this doesn't hold for most blocks being checked, since the ThenInst of a IfInstruction could theoretically be an expression represented by an inline block (a ? new C { ... } : ...
).
int falseInstIndex = block.Instructions.IndexOf(ifInst) + 1; | ||
AddExits(block, falseInstIndex, elseExits); | ||
|
||
var commonExits = elseExits.Where(e1 => thenExits.Any(e2 => DetectExitPoints.CompatibleExitInstruction(e1, e2))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quadratic search. Could be optimized using a HashSet with a CompatibleExitInstructionComparer...
But I guess usually there won't be that many exits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct, clearer code first, optimisations if obvious or profiling hotspot, generally not that many exits (especially since it's only looking for exits which could be moved to the block root via inversions.
Thanks for this, switch detection, particularly over-aggressive |
A solution to #1133 involving repeated inlining, and block exit merging with appropriate keyword prioritisation