8320308: C2 compilation crashes in LibraryCallKit::inline_unsafe_access #20033

Closed
wants to merge 22 commits into from

Member

@tobiasholenstein tobiasholenstein commented Jul 4, 2024

We failed in LibraryCallKit::inline_unsafe_access() while trying to inline Unsafe::getShortUnaligned.

return run ? UNSAFE.getShortUnaligned(array, 1) : 0; // after warmup CheckCastPP: speculative=byte[int:>=0]

The reason is that base (the array) is ConP #null hidden behind two CheckCastPP with speculative=byte[int:>=0]

We call make_unsafe_address():

Node* adr = make_unsafe_address(base, offset, type, kind == Relaxed);

  • with base = 147 CheckCastPP
  • 118 ConP === 0 [[ 106 101 71 ]] #null
type

Depending on the offset we go two different paths in LibraryCallKit::make_unsafe_address which both lead to the same error in the end.

  1. For UNSAFE.getShortUnaligned(array, 1_049_000) we get kind = Type::AnyPtr because offset >= os::vm_page_size(). Since we assume base can't be null we insert an assert:

    base = must_be_not_null(base, true);

  2. whereas for UNSAFE.getShortUnaligned(array, 1) we get kind = Type::OopPtr

    int kind = classify_unsafe_addr(uncasted_base, offset, type);

    and insert a null check
    base = null_check_oop(base, &null_ctl, true, true, true);

    In both cases we then call basic_plus_adr(..) on a base that is top(), which returns adr = 1 Con === 0 [[ ]] #top
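
The offset-dependent split described above can be sketched with a toy model. This is not the HotSpot code: `AddrKind`, `BaseInfo`, `classify` and the fixed `kPageSize` are names invented for illustration, and the real classify_unsafe_addr() works on C2 types rather than on a three-valued enum. The sketch only shows why the two offsets from the description end up with different kinds when the base is not recognized as null.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for illustration; these are not HotSpot identifiers.
enum class AddrKind { RawPtr, OopPtr, AnyPtr };
enum class BaseInfo { KnownNull, MaybeNull, NeverNull };

// Assumed page size; the PR text compares the offset against os::vm_page_size().
constexpr std::int64_t kPageSize = 4096;

AddrKind classify(BaseInfo base, std::int64_t offset) {
  if (base == BaseInfo::KnownNull) {
    return AddrKind::RawPtr;  // null + offset can only be an off-heap address
  }
  if (base == BaseInfo::NeverNull) {
    return AddrKind::OopPtr;  // a non-null object base is always on-heap
  }
  // Base may be null: a small non-negative offset is assumed to be a field or
  // element offset, so the access is treated as on-heap.
  if (offset >= 0 && offset < kPageSize) {
    return AddrKind::OopPtr;
  }
  return AddrKind::AnyPtr;  // could be object + offset or null + raw address
}

// Usage: the two offsets from the description, plus a base known to be null.
void demo_classify() {
  assert(classify(BaseInfo::MaybeNull, 1) == AddrKind::OopPtr);
  assert(classify(BaseInfo::MaybeNull, 1049000) == AddrKind::AnyPtr);
  assert(classify(BaseInfo::KnownNull, 1) == AddrKind::RawPtr);
}
```

In this model the crash scenario corresponds to a base that is really `KnownNull` but is passed in as `MaybeNull`, because the null constant is hidden behind casts.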

const TypePtr* adr_type = _gvn.type(adr)->isa_ptr();
=> _gvn.type(adr) is top

Compile::AliasType* alias_type = C->alias_type(adr_type);
=> adr_type is nullptr

BasicType bt = alias_type->basic_type();
if (bt != T_ILLEGAL) {
=> BasicType bt is T_ILLEGAL

} else if (alias_type->adr_type()->isa_oopptr()) {
=> we fail here with SIGSEGV: null pointer dereference because alias_type->adr_type() is nullptr
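
The failing chain can be reproduced in miniature. The sketch below is a toy model, not HotSpot code: `TypePtr`, `AliasType`, `lookup_alias` and `is_oop_alias` are simplified stand-ins invented here. It assumes, as the annotations above state, that an address without a pointer type yields an alias record whose address type is null and whose basic type is T_ILLEGAL.

```cpp
#include <cassert>

// Simplified stand-ins; the real C2 classes are far richer.
enum BasicType { T_SHORT, T_OBJECT, T_ILLEGAL };

struct TypePtr {
  bool is_oop;  // stands in for isa_oopptr() != nullptr
};

struct AliasType {
  const TypePtr* adr_type;  // nullptr when the address had no pointer type
  BasicType basic_type;
};

// Hypothetical lookup: without a pointer type there is nothing to classify.
AliasType lookup_alias(const TypePtr* adr_type) {
  if (adr_type == nullptr) {
    return AliasType{nullptr, T_ILLEGAL};
  }
  return AliasType{adr_type, adr_type->is_oop ? T_OBJECT : T_SHORT};
}

// Guarded form of the `alias_type->adr_type()->isa_oopptr()` test. Without the
// null check, a top address would dereference a null pointer here.
bool is_oop_alias(const AliasType& alias) {
  return alias.adr_type != nullptr && alias.adr_type->is_oop;
}

void demo_alias() {
  const TypePtr oop{true};
  assert(is_oop_alias(lookup_alias(&oop)));
  // Address is top: isa_ptr() gave nullptr, so the record has no address type.
  const AliasType dead = lookup_alias(nullptr);
  assert(dead.basic_type == T_ILLEGAL);
  assert(!is_oop_alias(dead));
}
```

A guard like this only illustrates the failure mode. The fix described below addresses the misclassification upstream, so that the dead address is not produced in the first place.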

Fix (updated on 18th Sep 2024)

The fix modifies the LibraryCallKit::classify_unsafe_addr() method to handle cases where the base might be hidden behind speculative type information.

In the original situation, null_check_oop() detects that the base value is NULL and transforms the check into an unconditional uncommon trap. The problem arises because LibraryCallKit::classify_unsafe_addr() sometimes fails to recognize this due to speculative typing, particularly when the base is hidden behind CheckCastPP nodes.

To resolve this, we add base->uncast() in classify_unsafe_addr(). The uncast() method strips away the speculative information from the base node, allowing the comparison against TypePtr::NULL_PTR to succeed. This ensures that when the base is NULL, the method can properly classify the address and avoid generating dead or incorrect code.

} else if (_gvn.type(base->uncast()) == TypePtr::NULL_PTR) {

In summary, the uncast() fix helps the method recognize when the base is NULL, aligning LibraryCallKit::classify_unsafe_addr() with the behavior of null_check_oop(), preventing errors when inlining unsafe accesses.
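
A minimal sketch of why uncast() makes the comparison succeed, using a toy node graph. Everything here is hypothetical (`Type`, `Node`, `kNullPtrType`, the two `is_null_base_*` helpers); it assumes, as described above, that the cast nodes carry a type object different from the shared null-pointer type because of the speculative part, while the underlying constant carries the shared one.

```cpp
#include <cassert>

struct Type {
  const char* name;
};

// One shared object plays the role of the TypePtr::NULL_PTR singleton.
const Type kNullPtrType{"null"};
// A distinct object: null plus a speculative byte[] part.
const Type kNullWithSpeculative{"null (speculative=byte[int:>=0])"};

struct Node {
  bool is_cast;      // true for a CheckCastPP-like node
  const Node* in;    // input of a cast, nullptr otherwise
  const Type* type;  // type recorded for this node

  // Walk through cast nodes until a non-cast node is reached.
  const Node* uncast() const {
    const Node* n = this;
    while (n->is_cast) {
      assert(n->in != nullptr);
      n = n->in;
    }
    return n;
  }
};

// Identity comparison on the node as given: misses a null hidden behind casts.
bool is_null_base_naive(const Node* base) {
  return base->type == &kNullPtrType;
}

// Identity comparison after stripping casts, mirroring the shape of the fix.
bool is_null_base_uncast(const Node* base) {
  return base->uncast()->type == &kNullPtrType;
}

void demo_uncast() {
  const Node conp{false, nullptr, &kNullPtrType};
  const Node cast1{true, &conp, &kNullWithSpeculative};
  const Node cast2{true, &cast1, &kNullWithSpeculative};
  assert(!is_null_base_naive(&cast2));
  assert(is_null_base_uncast(&cast2));
}
```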

Testing: tier1-4 pass


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8320308: C2 compilation crashes in LibraryCallKit::inline_unsafe_access (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20033/head:pull/20033
$ git checkout pull/20033

Update a local copy of the PR:
$ git checkout pull/20033
$ git pull https://git.openjdk.org/jdk.git pull/20033/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20033

View PR using the GUI difftool:
$ git pr show -t 20033

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20033.diff

@bridgekeeper

bridgekeeper bot commented Jul 4, 2024

👋 Welcome back tholenstein! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jul 4, 2024

@tobiasholenstein This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8320308: C2 compilation crashes in LibraryCallKit::inline_unsafe_access

Reviewed-by: thartmann, kvn, vlivanov, epeter, roland

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 104 new commits pushed to the master branch:

  • 83dcb02: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes
  • d2e7708: 8341367: Problemlist ShapeNotSetSometimes.java on macOS
  • 0314973: 8341060: Cleanup statics in HeapDumper
  • 021bf63: 8340458: Open source additional Component tests (part 2)
  • 9a7817b: 8340988: Update jdk/jfr/event/gc/collection tests to accept "CodeCache GC Threshold" as valid GC reason
  • f2a767f: 8340907: Open source closed frame tests # 2
  • 7b1e6f8: 8337389: Parallel: Remove unnecessary forward declarations in psScavenge.hpp
  • 2120a84: 8341333: [JVMCI] Export JavaThread::_unlocked_inflated_monitor to JVMCI
  • 684d246: 8341242: Shenandoah: LRB node is not matched as GC barrier after JDK-8340183
  • 7cc7c08: 8337493: [JVMCI] Number of libgraal threads might be too low
  • ... and 94 more: https://git.openjdk.org/jdk/compare/279086d4ce7e05972e099022e8045f39680dd4e8...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk

openjdk bot commented Jul 4, 2024

@tobiasholenstein The following labels will be automatically applied to this pull request:

  • graal
  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added graal graal-dev@openjdk.org hotspot-compiler hotspot-compiler-dev@openjdk.org labels Jul 4, 2024
@tobiasholenstein tobiasholenstein changed the title from "JDK-8320308: C2 compilation crashes in LibraryCallKit:: inline_unsafe_acces" to "JDK-8320308: C2 compilation crashes in LibraryCallKit::inline_unsafe_access" Jul 4, 2024
@openjdk openjdk bot changed the title from "JDK-8320308: C2 compilation crashes in LibraryCallKit::inline_unsafe_access" to "8320308: C2 compilation crashes in LibraryCallKit::inline_unsafe_access" Jul 4, 2024
@openjdk

openjdk bot commented Jul 8, 2024

⚠️ @tobiasholenstein This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@tobiasholenstein
Member Author

/label remove graal

@openjdk openjdk bot removed the graal graal-dev@openjdk.org label Jul 9, 2024
@openjdk

openjdk bot commented Jul 9, 2024

@tobiasholenstein
The graal label was successfully removed.

@tobiasholenstein tobiasholenstein marked this pull request as ready for review July 9, 2024 17:36
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 9, 2024
@vnkozlov
Contributor

vnkozlov commented Jul 9, 2024

To understand what happens here. Did null check If control flow in make_unsafe_address() collapse and have only path to uncommon trap? This is what stopped() checks. In such case why it collapsed if NULL is hidden by CheckCastPP?

@iwanowww
Contributor

iwanowww commented Jul 9, 2024

Even though the proposed check in LibraryCallKit::inline_unsafe_access() fixes the crash, IMO the root problem is in LibraryCallKit::classify_unsafe_addr() where base_type == TypePtr::NULL_PTR doesn't hold in presence of speculative part and the base is erroneously classified as on-heap (Type::OopPtr).

I'd prefer to see both places fixed. Seeing make_unsafe_address produce dead code signals a bug, so asserting that it never happens looks like a good idea.

Moreover, there are many places in the code susceptible to the same problem where a type is compared with TypePtr::NULL_PTR.
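
The hazard with pointer-equality tests can be shown on a toy type representation. The names below (`PtrKind`, `PtrType`, `kNullPtr`, the two `is_null_*` helpers) are hypothetical and are not the C2 type classes; the sketch assumes only what is said above, namely that a type carrying a speculative part is a different object from the plain null-pointer type even though both describe a null pointer.

```cpp
#include <cassert>

enum class PtrKind { Null, NotNull, Bottom };

struct PtrType {
  PtrKind kind;
  const PtrType* speculative;  // profile-based guess, nullptr if none
};

// Shared object standing in for the canonical null-pointer type.
const PtrType kNullPtr{PtrKind::Null, nullptr};

// Identity test: only matches the canonical object itself.
bool is_null_by_identity(const PtrType* t) {
  return t == &kNullPtr;
}

// Structural test: looks at the pointer kind and ignores the speculative part.
bool is_null_ignoring_speculative(const PtrType* t) {
  return t != nullptr && t->kind == PtrKind::Null;
}

void demo_null_tests() {
  const PtrType byte_array{PtrKind::NotNull, nullptr};
  const PtrType null_with_spec{PtrKind::Null, &byte_array};
  assert(!is_null_by_identity(&null_with_spec));
  assert(is_null_ignoring_speculative(&null_with_spec));
  assert(is_null_by_identity(&kNullPtr));
}
```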

@tobiasholenstein
Member Author

To understand what happens here. Did null check If control flow in make_unsafe_address() collapse and have only path to uncommon trap? This is what stopped() checks.

Yes, exactly

In such case why it collapsed if NULL is hidden by CheckCastPP?

While casting the base to not null (null_check_oop(..)) we insert 150 CmpP. The following stack trace, from _gvn.transform(chk) down to Node::eqv_uncast, shows where it is determined that the two nodes are equivalent once the casts are stripped:
must_be_not_null
Node::eqv_uncast(n, keep_deps=false) at node.hpp:500
SubNode::Value_common(phase) at subnode.cpp:91
SubNode::Value(phase) at subnode.cpp:101
PhaseGVN::transform(n) at phaseX.cpp:703
GraphKit::null_check_common(value, type, assert_null, null_control, speculative) at graphKit.cpp:1316
GraphKit::null_check_oop(value, null_control, never_see_null, safe_for_replace, speculative) at graphKit.cpp:2455
LibraryCallKit::make_unsafe_address(base, offset, type, can_cast) at library_call.cpp:2090
LibraryCallKit::inline_unsafe_access(is_store, type, kind, unaligned) at library_call.cpp:2361
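
The folding step in that trace can be sketched as follows. This is a toy model with invented names (`Node`, `uncast`, `eqv_uncast`, `compare_value`, `CmpResult`), not the SubNode code. It assumes that the null constant used by the null check and the constant hidden behind the casts are the same shared node, which is why the compare is known to be "equal" as soon as casts are ignored.

```cpp
#include <cassert>

struct Node {
  const char* name;
  const Node* cast_input;  // non-null only for cast nodes
};

// Strip any chain of casts and return the underlying node.
const Node* uncast(const Node* n) {
  while (n->cast_input != nullptr) {
    n = n->cast_input;
  }
  return n;
}

// Two nodes are equivalent if they are the same node once casts are ignored.
bool eqv_uncast(const Node* a, const Node* b) {
  return uncast(a) == uncast(b);
}

enum class CmpResult { AlwaysEqual, Unknown };

// Value of a pointer compare: statically "equal" when both inputs are the
// same node modulo casts, otherwise not known at compile time.
CmpResult compare_value(const Node* a, const Node* b) {
  return eqv_uncast(a, b) ? CmpResult::AlwaysEqual : CmpResult::Unknown;
}

void demo_compare() {
  const Node null_con{"ConP #null", nullptr};
  const Node cast1{"CheckCastPP", &null_con};
  const Node cast2{"CheckCastPP", &cast1};
  const Node other{"Parm", nullptr};
  // base (behind two casts) compared with the null constant folds to "equal",
  // so only the uncommon-trap path of the null check survives.
  assert(compare_value(&cast2, &null_con) == CmpResult::AlwaysEqual);
  assert(compare_value(&other, &null_con) == CmpResult::Unknown);
}
```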

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Sep 24, 2024
Contributor

@eme64 eme64 left a comment

Looks good :)

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 24, 2024
Member

@TobiHartmann TobiHartmann left a comment

Looks good to me too.

So it would make sense to go over the uses of Type*::BOTTOM/Type*::NOTNULL and check they are not tested with pointer equality

What about this concern? Did anyone check yet or should we file a follow-up task?

…ullBase.java

Co-authored-by: Tobias Hartmann <tobias.hartmann@oracle.com>
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Sep 25, 2024
@@ -2362,6 +2362,7 @@ bool LibraryCallKit::inline_unsafe_access(bool is_store, const BasicType type, c
SafePointNode* old_map = clone_map();

Node* adr = make_unsafe_address(base, offset, type, kind == Relaxed);
assert(!stopped(), "Inlining of unsafe access failed: address construction stopped unexpectedly");

if (_gvn.type(base)->isa_ptr() == TypePtr::NULL_PTR) {
Contributor

Why not uncast here?

Member Author

Makes sense to add it here as well. Done.

@tobiasholenstein
Member Author

Looks good to me too.

So it would make sense to go over the uses of Type*::BOTTOM/Type*::NOTNULL and check they are not tested with pointer equality

What about this concern? Did anyone check yet or should we file a follow-up task?

I filed a follow up task: https://bugs.openjdk.org/browse/JDK-8341023

Contributor

@vnkozlov vnkozlov left a comment

Good.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 26, 2024
@tobiasholenstein
Member Author

Thanks @TobiHartmann , @rwestrel , @eme64, @vnkozlov and @iwanowww for the reviews!

If @iwanowww is ok with the changes, this PR is ready to integrate.

I will delegate since I am out of office for the next 3 weeks.

/integrate delegate

@openjdk openjdk bot added the delegated label Sep 27, 2024
@openjdk

openjdk bot commented Sep 27, 2024

@tobiasholenstein Integration of this pull request has been delegated and may be completed by any project committer using the /integrate pull request command.

Contributor

@iwanowww iwanowww left a comment

Looks good.

@iwanowww
Contributor

iwanowww commented Oct 1, 2024

/integrate

@openjdk

openjdk bot commented Oct 1, 2024

Going to push as commit 8d6d37f.
Since your change was applied there have been 104 commits pushed to the master branch; they are the same commits listed in the pre-integration comment above.

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 1, 2024
@openjdk openjdk bot closed this Oct 1, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review delegated labels Oct 1, 2024
@openjdk

openjdk bot commented Oct 1, 2024

@iwanowww Pushed as commit 8d6d37f.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
6 participants