[WIP][GR-45250][GR-45734] Reachability proofs for reflective operations #11079

graalvmbot · 2025-04-24T09:20:49Z

Currently, the constant reflection analysis used by Native Image depends on which compiler optimizations are applied at build time. This can lead to unexpected results at image run time when using reflection. For example, the Class.forName call in the following snippet will be folded by the analysis:

static boolean isEven(int n) {
    return n % 2 == 0;
}

Class<?> grabClass() throws ClassNotFoundException {
    var className = isEven(4) ? "A" : "B";
    return Class.forName(className); // returns Class A
}

However, adding a simple print statement to isEven or toggling different optimizations at build time can make the method non-inlinable, in which case the Class.forName call is no longer folded:

static boolean isEven(int n) {
    System.out.print("isEven was called");
    return n % 2 == 0;
}

Class<?> grabClass() throws ClassNotFoundException {
    var className = isEven(4) ? "A" : "B";
    return Class.forName(className); // throws ClassNotFoundException / MissingReflectionRegistrationError
}

In order to prevent this behavior, we can run a constant reflection analysis directly on the bytecode provided by JVMCI objects.

Analysis specification

For each instruction of the method, the analysis will record the state of the operand stack and the local variable table prior to the execution of that instruction. Each value on the operand stack and each local variable value in the local variable table is either marked as not a compile time constant or as a compile time constant, in which case it has an abstract representation in the form of a pair <source BCI, inferred value>. The source BCI represents the BCI of the instruction that pushed that value on the operand stack or stored it in the local variable table, while the inferred value represents the actual value which would be placed on the operand stack or in the local variable table during runtime execution, as inferred by the analysis. Further, we distinguish between non-array type and array type compile time constants.
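
The three kinds of abstract value described above could be modeled roughly as follows. This is a minimal sketch, not the data structures used in the PR: the names AbstractValues, AbstractValue, NotConstant, Constant and ArrayConstant are hypothetical, and the BCIs in main are made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical model of the abstract values; all names here are illustrative. */
final class AbstractValues {
    /** An abstract value held by an operand stack slot or a local variable. */
    sealed interface AbstractValue permits NotConstant, Constant, ArrayConstant {}

    /** The analysis could not infer the value. */
    record NotConstant() implements AbstractValue {}

    /** Non-array type compile time constant: <source BCI, inferred value>. */
    record Constant(int sourceBci, Object value) implements AbstractValue {}

    /** Array type compile time constant: <source BCI, inferred elements>. */
    record ArrayConstant(int sourceBci, Object[] elements) implements AbstractValue {
        // Records compare array components by identity, so compare contents instead.
        @Override
        public boolean equals(Object o) {
            return o instanceof ArrayConstant other
                    && sourceBci == other.sourceBci
                    && Arrays.equals(elements, other.elements);
        }

        @Override
        public int hashCode() {
            return 31 * sourceBci + Arrays.hashCode(elements);
        }
    }

    public static void main(String[] args) {
        AbstractValue className = new Constant(0, "A");   // e.g. pushed by LDC "A" at BCI 0
        AbstractValue parameter = new NotConstant();      // e.g. a method parameter
        AbstractValue types = new ArrayConstant(14, new Object[] {String.class, int.class});
        System.out.println(className);
        System.out.println(parameter);
        System.out.println(types.equals(new ArrayConstant(14, new Object[] {String.class, int.class})));
    }
}
```

Keeping the source BCI inside the value is what later allows two values to be compared by origin as well as by content.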

Each instruction of the method is assigned a changed bit. Initially, the changed bit is set only for the first instruction of the method. The operand stack corresponding to the first instruction is empty and the local variables corresponding to the method's parameters are marked as not a compile time constant.

The following steps are then repeated in a loop:

1. Select an instruction with a set changed bit. If no such instruction exists, exit the loop and return the last modeled state of the operand stack and local variable table corresponding to each instruction. Otherwise, the changed bit is turned off for the selected instruction.
2. Model the effects of the selected instruction by changing the state of the current operand stack and local variable table in the following manner:
- If the selected instruction pops values off the operand stack, the same number of abstract values are popped off the modeled operand stack.
- If the selected instruction pushes a constant value on the operand stack (ACONST_NULL, ICONST_M1, ICONST_0, ..., ICONST_5, LCONST_0, LCONST_1, FCONST_0, ..., FCONST_2, DCONST_0, DCONST_1, BIPUSH, SIPUSH, LDC, LDC_W, LDC2_W), an abstract non-array type compile time constant <BCI, value> is pushed on the modeled operand stack, where BCI represents the bytecode offset of the selected instruction, and value the actual value it pushes onto the operand stack.
- If the selected instruction is a GETSTATIC instruction referencing the TYPE field in any of the primitive type wrapper classes (java.lang.Integer, java.lang.Double, ...), a non-array type compile time constant <BCI, primitive class> is pushed on the modeled operand stack, where BCI represents the bytecode offset of the selected instruction, and primitive class the appropriate class object.
- If the selected instruction modifies a local variable (for example, with an ASTORE instruction), an abstract value is assigned to that variable in the modeled local variable table in the following way:
  - If the abstract value that is being assigned to the variable (in the case of ASTORE, its operand) is not a compile time constant, then neither is the abstract value assigned to the variable.
  - If the abstract value that is being assigned to the variable is a non-array type compile time constant <BCI, value>, then a non-array type compile time constant <new BCI, value> is assigned to the variable, where new BCI represents the bytecode offset of the selected instruction.
- If the selected instruction loads a local variable (for example, with an ALOAD instruction), an abstract value is pushed on the operand stack in the following way:
  - If the abstract value referenced by the load instruction is not a compile time constant, then neither is the abstract value pushed on the operand stack.
  - If the abstract value referenced by the load instruction is a non-array type compile time constant <BCI, value>, then a non-array type compile time constant <new BCI, value> is pushed on the operand stack, where new BCI represents the bytecode offset of the selected instruction.
- If the selected instruction is a java.lang.Class.forName(String className) invocation, an abstract value is pushed on the operand stack in the following way:
  - If the operand corresponding to the className parameter is not a compile time constant, then neither is the abstract value pushed on the operand stack.
  - If the operand corresponding to the className parameter is a compile time constant <BCI, class name>, the resolution of the targeted class is attempted. If the class can be found, a compile time constant <new BCI, class> is pushed onto the stack, where new BCI represents the bytecode offset of the selected instruction, and class the appropriate class object. Otherwise, an abstract value that is not a compile time constant is pushed on the stack.
- If the selected instruction is ANEWARRAY, an abstract value is pushed on the operand stack in the following way:
  - If the count operand is a non-array type compile time constant <BCI, integral value>, then an array type compile time constant <new BCI, empty array> is pushed on the operand stack, where new BCI represents the bytecode offset of the selected instruction, and empty array an array of length integral value with all element values set to null.
  - If the count operand is not a compile time constant, then neither is the abstract value pushed on the stack.
- If the selected instruction is AASTORE and its arrayref operand is an array type compile time constant <BCI, array value>, the state of the modeled operand stack is modified in the following way:
  - If the index operand or the value operand of the instruction is not a compile time constant, all of the array type compile time constants <BCI, array value> on the selected instruction's modeled operand stack are marked as not a compile time constant.
  - If the index operand is a compile time constant <index BCI, element index> and the value operand is a compile time constant <element BCI, element value>, all of the array type compile time constants <BCI, array value> on the selected instruction's modeled operand stack are transformed to array type compile time constants <new BCI, new array value>, where new BCI represents the bytecode offset of the selected instruction and new array value is obtained by setting the element of array value at index element index to value element value.
- If the selected instruction is ASTORE, PUTSTATIC or PUTFIELD, and its operands include an array type compile time constant <BCI, array value>, then all of the array type compile time constants <BCI, array value> on the selected instruction's modeled operand stack are marked as not a compile time constant. If the instruction is ASTORE, the stored abstract value is also not a compile time constant.
- If the selected instruction is a method invocation (INVOKEVIRTUAL, INVOKESPECIAL, INVOKESTATIC, INVOKEINTERFACE, INVOKEDYNAMIC), and its operands include an array type compile time constant <BCI, array value>, then all of the array type compile time constants <BCI, array value> on the selected instruction's modeled operand stack are marked as not a compile time constant.
- Instructions which push values on the operand stack or modify the local variable table, but were not mentioned in the previous rules, produce a not a compile time constant abstract value.
3. Determine the successors of the selected instruction. Successor instructions can be either of the following:
- The next instruction, if the current instruction is not an unconditional control transfer instruction.
- The targets of an unconditional or conditional control transfer instruction.
- Exception handlers for the selected instruction.
4. Merge the state of the current operand stack and local variable table into each of the successor instructions (in the special case of control transfer to an exception handler, the operand stack is set to contain a single abstract value that is not a compile time constant):
- If this is the first time the successor instruction has been visited, it is assigned the operand stack and local variable table calculated in step 2. The changed bit for that instruction is set.
- If the successor instruction has already been visited, the operand stack and local variable table calculated in step 2 are merged into its operand stack and local variable table. The changed bit of that instruction is set if any abstract value changes as a result of the merge. Values at matching positions on the operand stack and in the local variable table, referred to below as left and right, are merged in the following way:
  - If either left or right is not a compile time constant, then neither is the merged value.
  - If left is a compile time constant <left BCI, value> and right is a compile time constant <right BCI, value>, the result is a compile time constant <BCI, value> iff left BCI = right BCI = BCI. Otherwise, the merged value is not a compile time constant.
5. Repeat from step 1.
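
The loop above can be sketched on a toy instruction set. This is a simplified illustration under stated assumptions, not the implementation in this PR: the names WorklistSketch, Val, Frame, Op and Insn are hypothetical, the index of an instruction stands in for its BCI, a queue of indices replaces the per-instruction changed bit, and exception handlers, arrays and Class.forName are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Toy fixed-point loop; all names and the instruction set are hypothetical. */
final class WorklistSketch {
    /** Abstract value: NOT_CONSTANT or a <source BCI, inferred value> pair. */
    record Val(int bci, Object value) {
        static final Val NOT_CONSTANT = new Val(-1, null);
    }

    /** Modeled state before an instruction (top of stack is the last element). */
    record Frame(List<Val> stack, List<Val> locals) {}

    enum Op { PUSH, STORE, LOAD, IF, GOTO, RETURN }

    /** Toy instruction; its index in the program doubles as its BCI. */
    record Insn(Op op, Object arg) {}

    /** Stores and loads keep the value but take the BCI of the current instruction. */
    static Val rebase(Val v, int bci) {
        return v.equals(Val.NOT_CONSTANT) ? v : new Val(bci, v.value());
    }

    /** Merge rule: constant only if both sides agree on source BCI and value. */
    static Val merge(Val left, Val right) {
        return left.equals(right) ? left : Val.NOT_CONSTANT;
    }

    static List<Val> mergeAll(List<Val> left, List<Val> right) {
        List<Val> merged = new ArrayList<>();
        for (int i = 0; i < left.size(); i++) {
            merged.add(merge(left.get(i), right.get(i)));
        }
        return merged;
    }

    static Frame[] analyze(Insn[] code, int maxLocals) {
        Frame[] before = new Frame[code.length];
        before[0] = new Frame(List.of(), Collections.nCopies(maxLocals, Val.NOT_CONSTANT));
        Deque<Integer> changed = new ArrayDeque<>(List.of(0));
        while (!changed.isEmpty()) {
            int bci = changed.pop();                                   // step 1
            List<Val> stack = new ArrayList<>(before[bci].stack());
            List<Val> locals = new ArrayList<>(before[bci].locals());
            List<Integer> successors = new ArrayList<>();
            Object arg = code[bci].arg();
            switch (code[bci].op()) {                                  // steps 2 and 3
                case PUSH -> stack.add(new Val(bci, arg));
                case STORE -> locals.set((Integer) arg, rebase(stack.remove(stack.size() - 1), bci));
                case LOAD -> stack.add(rebase(locals.get((Integer) arg), bci));
                case IF -> { stack.remove(stack.size() - 1); successors.add((Integer) arg); }
                case GOTO -> successors.add((Integer) arg);
                case RETURN -> { }
            }
            if (code[bci].op() != Op.GOTO && code[bci].op() != Op.RETURN) {
                successors.add(bci + 1);                               // fall through
            }
            Frame after = new Frame(stack, locals);
            for (int succ : successors) {                              // step 4
                Frame merged = before[succ] == null ? after
                        : new Frame(mergeAll(before[succ].stack(), after.stack()),
                                    mergeAll(before[succ].locals(), after.locals()));
                if (!merged.equals(before[succ])) {
                    before[succ] = merged;
                    changed.push(succ);
                }
            }
        }
        return before;
    }

    /** Local 0 is assigned "A", conditionally reassigned "B", then loaded at BCI 6. */
    static Insn[] sample() {
        return new Insn[] {
            new Insn(Op.PUSH, "A"), new Insn(Op.STORE, 0),
            new Insn(Op.PUSH, 1), new Insn(Op.IF, 6),
            new Insn(Op.PUSH, "B"), new Insn(Op.STORE, 0),
            new Insn(Op.LOAD, 0), new Insn(Op.RETURN, null)
        };
    }

    public static void main(String[] args) {
        Frame[] before = analyze(sample(), 1);
        System.out.println("locals before BCI 4: " + before[4].locals());
        System.out.println("locals before BCI 6: " + before[6].locals());
    }
}
```

In the sample program, BCI 6 is reached both from the branch at BCI 3 and from the store at BCI 5, so local 0 arrives with two different source BCIs and the merge marks it as not a compile time constant. Inside the branch, before BCI 4, it is still the constant stored at BCI 1.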

The invocation of a reflective method (depending on the method, an INVOKESTATIC or INVOKEVIRTUAL instruction) can then be inferred iff all of its operands on the modeled operand stack are compile time constants.
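
As a rough illustration of this final check, the sketch below inspects the top entries of a modeled operand stack. The names FoldCheck, AbstractValue, NotConstant, Constant and canInfer are hypothetical, array constants are omitted for brevity, and the BCIs are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical check applied to the modeled stack in front of a reflective call. */
final class FoldCheck {
    sealed interface AbstractValue permits NotConstant, Constant {}

    record NotConstant() implements AbstractValue {}

    record Constant(int sourceBci, Object value) implements AbstractValue {}

    /**
     * Returns true iff the top operandCount entries of the modeled stack are all
     * compile time constants. For INVOKEVIRTUAL the count includes the receiver.
     */
    static boolean canInfer(List<AbstractValue> stack, int operandCount) {
        List<AbstractValue> operands = stack.subList(stack.size() - operandCount, stack.size());
        return operands.stream().allMatch(v -> v instanceof Constant);
    }

    public static void main(String[] args) {
        // Modeled stack before Integer.class.getDeclaredField("MAX_VALUE"):
        // the receiver and the field name are both constants.
        List<AbstractValue> stack = new ArrayList<>();
        stack.add(new Constant(0, Integer.class));
        stack.add(new Constant(2, "MAX_VALUE"));
        System.out.println(canInfer(stack, 2));

        // The same call with a field name that the analysis could not infer.
        stack.set(1, new NotConstant());
        System.out.println(canInfer(stack, 2));
    }
}
```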

In terms of Java code, the bytecode analysis specification roughly translates to the following rules for what is considered a compile time constant expression:

Constant expressions, as defined in JLS §15.28
Class literals, for example SomeClass.class or int.class
Names which refer to local variables of non-array type and are dominated by an assignment of a compile time constant expression to that variable
A name N referring to a local variable V is dominated by an assignment A if and only if:
- All the paths from the method's entry point to N contain A
- No paths from A to N contain another assignment A' to variable V

String className = "A";
if (someCondition(1, 2)) {
    className = "B";
    Class.forName(className); // Automatically resolved - returns Class B
}
Class.forName(className); // Can't be automatically resolved - className isn't dominated

Direct array initializations where every element is a compile time constant expression

Integer.class.getMethod("parseInt", new Class<?>[]{String.class, int.class});
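
For an array initializer such as the one above, javac emits an ANEWARRAY followed by one DUP / index / value / AASTORE group per element. The sketch below replays that sequence against the ANEWARRAY and AASTORE rules. It is an illustration under assumptions rather than the PR's code: the names ArrayModelSketch, Val, NotConst, Const and ArrayConst are hypothetical, DUP is assumed to copy the abstract value unchanged (the rules above do not spell this out), the bytecode offsets in the comments are illustrative, and the escape rules for ASTORE, PUTSTATIC, PUTFIELD and invocations are not modeled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of the ANEWARRAY and AASTORE rules. */
final class ArrayModelSketch {
    sealed interface Val permits NotConst, Const, ArrayConst {}

    record NotConst() implements Val {}

    record Const(int bci, Object value) implements Val {}

    record ArrayConst(int bci, List<Object> elements) implements Val {}

    /** ANEWARRAY: a constant count yields an array constant filled with nulls. */
    static Val newArray(int bci, Val count) {
        if (count instanceof Const c && c.value() instanceof Integer n && n >= 0) {
            return new ArrayConst(bci, new ArrayList<Object>(Collections.<Object>nCopies(n, null)));
        }
        return new NotConst();
    }

    /** AASTORE applied to a modeled operand stack (top of stack is the last element). */
    static void aastore(int bci, List<Val> stack) {
        Val value = stack.remove(stack.size() - 1);
        Val index = stack.remove(stack.size() - 1);
        Val arrayRef = stack.remove(stack.size() - 1);
        if (!(arrayRef instanceof ArrayConst array)) {
            return;
        }
        Val updated = new NotConst();
        if (index instanceof Const i && i.value() instanceof Integer idx && value instanceof Const v
                && idx >= 0 && idx < array.elements().size()) {
            List<Object> elements = new ArrayList<>(array.elements());
            elements.set(idx, v.value());
            updated = new ArrayConst(bci, elements);
        }
        // Every remaining copy of the same array constant (for example one left by DUP) is rewritten.
        for (int k = 0; k < stack.size(); k++) {
            if (stack.get(k).equals(array)) {
                stack.set(k, updated);
            }
        }
    }

    /** Replays the bytecode of: new Class<?>[]{String.class, int.class} */
    static List<Val> modelClassArrayLiteral() {
        List<Val> stack = new ArrayList<>();
        stack.add(newArray(1, new Const(0, 2)));   //  0: ICONST_2, 1: ANEWARRAY java/lang/Class
        stack.add(stack.get(0));                   //  4: DUP (assumed to copy the abstract value)
        stack.add(new Const(5, 0));                //  5: ICONST_0
        stack.add(new Const(6, String.class));     //  6: LDC String.class
        aastore(8, stack);                         //  8: AASTORE
        stack.add(stack.get(0));                   //  9: DUP
        stack.add(new Const(10, 1));               // 10: ICONST_1
        stack.add(new Const(11, int.class));       // 11: GETSTATIC java/lang/Integer.TYPE
        aastore(14, stack);                        // 14: AASTORE
        return stack;
    }

    public static void main(String[] args) {
        System.out.println(modelClassArrayLiteral());
    }
}
```

After the second AASTORE the stack holds a single array constant whose source BCI is that of the last store and whose elements are String.class and int.class, which is the state a subsequent getMethod invocation would see.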

java.lang.Class.forName(String className) invocations where the className argument is a compile time constant

Preliminary results

Preliminary results we're getting from running our analysis on Spring PetClinic:

Method name	Graph analysis folds	Bytecode analysis folds	Missed folds
java.lang.Class.forName(String)	61	60	1
java.lang.Class.getConstructor(Class[])	3	2	1
java.lang.Class.getDeclaredConstructor(Class[])	5	4	1
java.lang.Class.getMethod(String, Class[])	33	28	5
java.lang.Class.getDeclaredMethod(String, Class[])	19	18	1
java.lang.Class.getField(String)	0	0	0
java.lang.Class.getDeclaredField(String)	16	16	0
java.lang.Class.getFields()	2	2	0
java.lang.Class.getDeclaredFields()	2	2	0
java.lang.Class.getMethods()	1	1	0
java.lang.Class.getDeclaredMethods()	3	3	0
java.lang.Class.getConstructors()	0	0	0
java.lang.Class.getDeclaredConstructors()	0	0	0
java.lang.Class.getRecordComponents()	0	0	0
java.lang.Class.getPermittedSubclasses()	0	0	0
java.lang.Class.getNestMembers()	0	0	0
java.lang.Class.getClasses()	0	0	0
java.lang.Class.getDeclaredClasses()	0	0	0
java.lang.Class.getSigners()	0	0	0
Total	145	136	9

[GR-45250][GR-45734] Add framework for dataflow analysis on JVMCI pro…

beb617d

…vided method bytecode.

oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Apr 24, 2025

graalvmbot commented Apr 24, 2025 • edited by Aca-S
