-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8373495: C2: Aggressively fold loads from objects that have not escaped #28812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
👋 Welcome back qamai! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
@merykitty The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
|
Very nice! I definitely prefer the approach here to #28764. I see that the unit test stays the same and there's an adjustment in some other test, so I assume this version is functionally more powerful than #28764 version. Have you had a chance to measure how much it affects compilation speed compared to #28764? (The code is dense and hard to reason about, so some polishing/refactoring to make it more readable. Also, please, think about verification checks.) |
|
@iwanowww Thanks for your comment. I have added a lot more comments to explain in detail the steps of
I have also added some verification that if a step determines that For the runtime cost, I don't see a noticeable difference compared to master. For the unit test, compared to the previous PR, I have removed the |
|
I have made further changes that I believe have made the change pretty rigorous, I don't think I can see any flaw in the reasoning that allows mis-analysis now. |
vnkozlov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable at first glance but I need more time to go through.
|
Looks interesting @merykitty! I will also review this. |
|
I have added a section describing some future work based on this PR that I have come up with. |
|
@merykitty would it be possible to guard the logic added by this patch with a new diagnostic flag, to facilitate reviewing and experimenting? |
|
@robcasloz Done, is it good for you? |
Hi,
This patch is an alternative to #28764 but it does the analysis during IGVN instead.
The current PR:
The current escape analysis mechanism is all-or-nothing: either the object does not escape, or it does. If the object escapes, we lose the ability to analyse the values of its fields completely, even if the object only escapes at return.
This PR tries to find the escape status of an object at a load, and if it is decided that the object has not escaped there, we can try folding the load aggressively, ignoring calls and memory barriers to find a corresponding store that the load observes. Implementation-wise, when walking at
find_previous_store, if we encounter a call or memory barrier, we start looking at all nodes that make the allocation escape. If all such nodes have a control input that is not a transitive control input of the call/barrier we are at, then we can decidedly say that the allocation has not escaped at that call/barrier, and walk past that call/barrier to find a corresponding store.I do not see a noticeable difference in C2 runtime with and without this patch.
Future work:
Consider this case:
Currently,
owill be considered escaped ath.o = o. However, it can be seen thatohas not actually escaped becausehhas not escaped. Luckily, with the current approach, this can be easily achieved, notice how this loop is just "if anything escapes, considerbaseescapes", currently, the "anything" here includesbaseand its aliases. if we include the base of the object at whichois stored, then we can correctly determine ifohas escaped.Phi.This is pretty straightforward. We need to create a value
Phifor each memoryPhiso that we can handle loopPhis.Phi.This can be easy, just give up if we don't encounter a store into that
Phi. However, we can do better. Consider this case:Then,
ashould be able to be folded toPhi(v1, v2)ifp1andp2are known not to alias.Another interesting case:
Then, theoretically, we can fold
atoPhi(v1, v)ifp1andp2are known not to alias.Please take a look and leave your thoughts, thanks a lot.
Progress
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28812/head:pull/28812$ git checkout pull/28812Update a local copy of the PR:
$ git checkout pull/28812$ git pull https://git.openjdk.org/jdk.git pull/28812/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 28812View PR using the GUI difftool:
$ git pr show -t 28812Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28812.diff
Using Webrev
Link to Webrev Comment