fix issues with the java bytecode frontend #742

Merged
merged 14 commits into from
Nov 8, 2023

@Timbals Timbals commented Nov 3, 2023

Sorry for the big PR. The bytecode frontend has a lot of shared state, which makes it difficult to change in small steps.

Moved the logic for assigning the values of operands to stack locals out of OperandStack and into Operand.
That means that the assignments are created when the operands are used as a local/immediate with the toLocal and toImmediate methods, instead of when they are popped off the stack with popLocal and popImmediate.
This fixes some issues where the wrong pop method was used, which caused unnecessary stack local variables to be created.
This also results in improved type-safety and gets rid of a bunch of casts, because toLocal and toImmediate actually return the Local and Immediate types instead of a generic Operand.
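
A minimal sketch of the idea of creating the assignment on use rather than on pop. All types here are hypothetical, simplified stand-ins (the class `LazyOperandSketch`, the string-based statement list, and the `$stackN` naming are assumptions for illustration), not the real frontend classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; these are not the real SootUp classes.
final class LazyOperandSketch {
    interface Immediate { String name(); }

    static final class Constant implements Immediate {
        private final String text;
        Constant(String text) { this.text = text; }
        public String name() { return text; }
    }

    static final class Local implements Immediate {
        private final String id;
        Local(String id) { this.id = id; }
        public String name() { return id; }
    }

    static final class Operand {
        private final Immediate immediate; // non-null if the value is already a local or constant
        private final String expression;   // non-null if the value is a compound expression
        private final List<String> body;   // statements emitted so far
        private Local stackLocal;          // created on first use only

        Operand(Immediate immediate, String expression, List<String> body) {
            this.immediate = immediate;
            this.expression = expression;
            this.body = body;
        }

        // Returns the value itself if it is already an immediate, so no assignment is emitted.
        Immediate toImmediate() {
            return immediate != null ? immediate : toLocal();
        }

        // Returns a Local; the assignment to a stack local is emitted at most once.
        Local toLocal() {
            if (immediate instanceof Local) {
                return (Local) immediate;
            }
            if (stackLocal == null) {
                stackLocal = new Local("$stack" + body.size());
                String rhs = immediate != null ? immediate.name() : expression;
                body.add(stackLocal.name() + " = " + rhs);
            }
            return stackLocal;
        }
    }
}
```

In this sketch an operand that already wraps a local never emits an assignment, which corresponds to `switch($l1)` in the example output instead of going through an extra `$stack8`. The return types also show why the casts disappear: callers get a `Local` or `Immediate` directly.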

Significantly simplified merging logic in StackFrame.

All convert methods had a code path for the first time they were called and for merging subsequent calls.
These two code paths have been combined now.

Factored a bunch of duplicated code into the new changeStackLocal in Operand.
The old version actually incorrectly merged operands at times (See SwitchExprWithYieldTest).

This code

int k = 1;
String s = "";

switch (k) {
  case 1:
    s += "single";
  default: // note the fall-through
    s += "somethingElse";
};

would previously produce

$l1 = 1
$l2 = ""
$stack8 = $l1
switch($stack8)
case 1: goto label13
default: goto label14
label13:
$l2 = dynamicinvoke "makeConcatWithConstants" <java.lang.String (java.lang.String)>($l2) <java.lang.invoke.StringConcatFactory: java.lang.invoke.CallSite makeConcatWithConstants(java.lang.invoke.MethodHandles$Lookup,java.lang.String,java.lang.invoke.MethodType,java.lang.String,java.lang.Object[])>("\u0001single")
label14:
$stack4 = $l2
$stack5 = dynamicinvoke "makeConcatWithConstants" <java.lang.String (java.lang.String)>($l2) <java.lang.invoke.StringConcatFactory: java.lang.invoke.CallSite makeConcatWithConstants(java.lang.invoke.MethodHandles$Lookup,java.lang.String,java.lang.invoke.MethodType,java.lang.String,java.lang.Object[])>("\u0001somethingElse")
// there is a bug here: `$l2 = $stack5` is missing

but now produces

$l1 = 1
$l2 = ""
switch($l1)
case 1: goto label13
default: goto label14
label13:
$l2 = dynamicinvoke "makeConcatWithConstants" <java.lang.String (java.lang.String)>($l2) <java.lang.invoke.StringConcatFactory: java.lang.invoke.CallSite makeConcatWithConstants(java.lang.invoke.MethodHandles$Lookup,java.lang.String,java.lang.invoke.MethodType,java.lang.String,java.lang.Object[])>("\u0001single")
label14:
$l2 = dynamicinvoke "makeConcatWithConstants" <java.lang.String (java.lang.String)>($l2) <java.lang.invoke.StringConcatFactory: java.lang.invoke.CallSite makeConcatWithConstants(java.lang.invoke.MethodHandles$Lookup,java.lang.String,java.lang.invoke.MethodType,java.lang.String,java.lang.Object[])>("\u0001somethingElse")

This adds the missing assignment and doesn't produce so many stack local variables.

Fixed updating usages of the value/stackLocal in Operand. This is now part of the changeStackLocal method.
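
A rough sketch of what "change the stack local and update its usages in one place" could look like. Everything here is a simplified assumption (the `MergeSketch`, `Stmt`, and `merge` names and the string-based variables are made up for illustration); the real `changeStackLocal` works on the frontend's own statement and operand types:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: statements are an optional target plus an operation over variable names.
final class MergeSketch {
    static final class Stmt {
        String target;           // variable defined by this statement, or null
        final String op;
        final List<String> args; // variables read by this statement

        Stmt(String target, String op, String... args) {
            this.target = target;
            this.op = op;
            this.args = new ArrayList<>(Arrays.asList(args));
        }

        String render() {
            String call = op + "(" + String.join(", ", args) + ")";
            return target == null ? call : target + " = " + call;
        }
    }

    static final class Operand {
        String stackLocal;                           // variable currently holding the value
        final Stmt definition;                       // statement that assigns the value
        final List<Stmt> usages = new ArrayList<>(); // statements that read the value

        Operand(String stackLocal, Stmt definition) {
            this.stackLocal = stackLocal;
            this.definition = definition;
        }

        // Renames the variable in the defining statement and in every recorded usage.
        void changeStackLocal(String newLocal) {
            String oldLocal = stackLocal;
            stackLocal = newLocal;
            definition.target = newLocal;
            for (Stmt use : usages) {
                use.args.replaceAll(arg -> arg.equals(oldLocal) ? newLocal : arg);
            }
        }
    }

    // At a control-flow join, both incoming operands must end up in the same variable.
    static void merge(Operand kept, Operand other) {
        if (!other.stackLocal.equals(kept.stackLocal)) {
            other.changeStackLocal(kept.stackLocal);
        }
    }
}
```

Keeping the rename of the definition and the usages in a single method is the point of the sketch: a merge cannot update one and forget the other, which is the kind of mistake that left out `$l2 = $stack5` in the old output.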

Simplified duplication instructions (Thanks to Operand.DWORD_DUMMY).

Removed some commented out code.

Fixed an issue in the ReplaceUseStmtVisitor that would produce stale results.

Fixes #698
Fixes #631

TODO:

  • rebase/merge to newest develop
  • fix tests
  • add comments/documentation on how the bytecode frontend works for the next person that looks at this code
  • don't create stack locals when a normal local could be used
  • rename/rework StackFrame (it has nothing to do with stack frames)
  • remove duplicated logic in each instruction conversion method for creating and merging the instruction
  • don't merge operands when they all have the same value
  • find out what addReadOperandAssignments_internal does and maybe remove it
  • improve statement position info for operands and statements

Certain assign statements would produce no result. Because the visitor is stateful, this would result in an old result being used later.
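
To illustrate the stale-result problem, here is a small sketch with made-up types (`StaleResultSketch`, `Assign`, and this `ReplaceUseVisitor` are hypothetical and much simpler than the real `ReplaceUseStmtVisitor`). The assumption is that the visitor keeps its result in a field and that the remedy is to set that field on every visit:

```java
// Hypothetical, simplified visitor; not the real ReplaceUseStmtVisitor.
final class StaleResultSketch {
    static final class Assign {
        final String lhs;
        final String rhs;

        Assign(String lhs, String rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }

        String render() {
            return lhs + " = " + rhs;
        }
    }

    static final class ReplaceUseVisitor {
        private final String oldUse;
        private final String newUse;
        private Assign result; // state that survives between visits

        ReplaceUseVisitor(String oldUse, String newUse) {
            this.oldUse = oldUse;
            this.newUse = newUse;
        }

        void caseAssign(Assign stmt) {
            if (stmt.rhs.equals(oldUse)) {
                result = new Assign(stmt.lhs, newUse);
            } else {
                // Always set a result. Without this branch, getResult() would
                // still return the statement produced by an earlier visit.
                result = stmt;
            }
        }

        Assign getResult() {
            return result;
        }
    }
}
```

If the `else` branch is left out, visiting `a = x` and then `b = z` with the same visitor instance returns the rewritten `a = ...` statement for both visits.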
Moved the logic for assigning the values of operands to stack locals out of `OperandStack` and into `Operand`.
That means that the assignments are created when the operands are used as a local/immediate with the `toLocal` and `toImmediate` methods, instead of when they are popped off the stack with `popLocal` and `popImmediate`.
This fixes some issues where the wrong `pop` method was used, which caused unnecessary stack local variables to be created.
This also results in improved type-safety and gets rid of a bunch of casts, because `toLocal` and `toImmediate` actually return the `Local` and `Immediate` types instead of a generic `Operand`.

Significantly simplified merging logic in `StackFrame`.
Factored a bunch of duplicated code into the new `changeStackLocal` in `Operand`.
The old version actually incorrectly merged operands at times (See `SwitchExprWithYieldTest`). I will add a test for this in a later commit.

Fixed updating usages of the `value`/`stackLocal` in `Operand`. This is now part of the `changeStackLocal` method.

Simplified duplication instructions (Thanks to `Operand.DWORD_DUMMY`).

Removed some commented out code.
This removes the duplicated logic that was present in *every* convert method with any inputs/outputs.
@Timbals Timbals force-pushed the fix/java-bytecode-frontend branch from 557d2b3 to 8aa2a08 Compare November 6, 2023 20:07
…merging

Instead, rely on the fact that all `convert` methods override their respective statements.
That means just changing the incoming operands will automatically correctly update the statement.

Also added a test case to make sure this works with the special case of using a normal local as a stack local.
When converting the instructions for a branch, there might not be a line number node after the branch target.
This would result in incorrect line numbers being used until the next line number node.

The fix is to simply store the line number for branches and restore the current line number when handling that branch.
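
A minimal sketch of storing and restoring the line number per branch. The instruction model (`BranchLineSketch`, `Kind`, `Insn`) and the worklist layout are assumptions made up for this illustration and do not mirror the actual conversion code:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical instruction model used only for this sketch.
final class BranchLineSketch {
    enum Kind { LINE, BRANCH, OP, RETURN }

    static final class Insn {
        final Kind kind;
        final int arg;     // line number for LINE, target index for BRANCH
        final String name; // label for OP, used as the map key

        Insn(Kind kind, int arg, String name) {
            this.kind = kind;
            this.arg = arg;
            this.name = name;
        }
    }

    // Maps each OP to the line number that is current when it is converted.
    static Map<String, Integer> assignLines(List<Insn> insns) {
        Map<String, Integer> lines = new LinkedHashMap<>();
        boolean[] visited = new boolean[insns.size()];
        Deque<int[]> worklist = new ArrayDeque<>(); // entries: {start index, line number}
        worklist.push(new int[] {0, -1});
        while (!worklist.isEmpty()) {
            int[] entry = worklist.pop();
            int pc = entry[0];
            int currentLine = entry[1]; // restore the line stored with the branch
            while (pc < insns.size() && !visited[pc]) {
                visited[pc] = true;
                Insn insn = insns.get(pc);
                if (insn.kind == Kind.RETURN) {
                    break;
                }
                if (insn.kind == Kind.LINE) {
                    currentLine = insn.arg;
                } else if (insn.kind == Kind.BRANCH) {
                    // Store the current line together with the branch target.
                    worklist.push(new int[] {insn.arg, currentLine});
                } else {
                    lines.put(insn.name, currentLine);
                }
                pc++;
            }
        }
        return lines;
    }
}
```

For a method where line 10 branches to a target that has no line number node of its own, and the fall-through path continues on line 11, the instruction at the branch target is attributed to line 10 rather than to the 11 left over from the fall-through path.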
@Timbals Timbals force-pushed the fix/java-bytecode-frontend branch from 10740b2 to e7e565e Compare November 8, 2023 15:27
@Timbals Timbals marked this pull request as ready for review November 8, 2023 15:30
@swissiety swissiety left a comment

looks excellent - thx for cleaning that up ❤️

@swissiety swissiety merged commit f2e1465 into soot-oss:develop Nov 8, 2023
10 of 12 checks passed
@Timbals Timbals deleted the fix/java-bytecode-frontend branch January 19, 2024 14:04
Successfully merging this pull request may close these issues.

Undefined variable but used
AsmMethodSource produces code with undefined but used variables