Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x86 performance degradation due to vector register preservation #15716

Open
0xdaryl opened this issue Aug 16, 2022 · 9 comments
Open

x86 performance degradation due to vector register preservation #15716

0xdaryl opened this issue Aug 16, 2022 · 9 comments

Comments

@0xdaryl
Copy link
Contributor

0xdaryl commented Aug 16, 2022

A serious performance degradation (10%+) was reported by a user on several workloads running with JDK 8 on hardware with AVX 512 support. After some investigation, the problem was tracked to PR #14632 which preserves the full vector register set across helper calls (in this case it is preserving the full 32 ZMM registers * 64 bytes).

Due to another functional issue (#15011) this preservation code was isolated to JDK 17+ (PR #15072). While this fixes the performance issue for JDK 8 users the significant performance problems still remain for those running JDK 17+.

As this feature is only required to support the Vector API JEP on x86 it is recommended that the feature be disabled (or reverted). While this issue has been seen on Intel Skylake hardware and above, that hardware is ubiquitous enough that the problem is likely to be observed by a user.

Disabling the feature will be done in the master branch but I believe the problem is serious enough that it warrants inclusion in a 0.33.1 respin. Disabling the feature is relatively low risk and isolated to the x86 platform.

Once the feature is disabled, the plan is to investigate alternate schemes that perform better. For example, only preserve those vector registers that are live across helper calls (the number of live registers is expected to be small or zero in the vast majority of circumstances, especially in code that does not use the Vector API).

@DanHeidinga
Copy link
Member

@0xdaryl clarified we are preserving the registers but never actually using them in jit code / vm code. All cost, no benefit.

@pshipton
Copy link
Member

The 0.33.1 part of this is merged, I'll move the issue to the 0.35 milestone for anything remaining.

@hzongaro
Copy link
Member

Peter @pshipton, Brad @BradleyWood won't have any changes for this for the 0.35 milestone, so this can move out to the next release.

BradleyWood added a commit to BradleyWood/openj9 that referenced this issue Sep 28, 2022
This reverts PR eclipse-openj9#14632. A 10%+ performance
regression (issue eclipse-openj9#15716) reported on JDK 8
was isolated to eclipse-openj9#14632. Reverting this PR
should not have any function change, it
was only intended to support the Vector
API JEPs which have not yet been enabled
on x86 in OpenJ9. A previous attempt at
disabling just the extended vector register
preservation did not recoup the performance,
but only completely reverting this PR
restored the regression. This suggests
there is something with the way the helpers
have been restructured that is causing the
performance degradation. This PR will be
reverted while the cause is investigated.
BradleyWood added a commit to BradleyWood/openj9 that referenced this issue Sep 29, 2022
This reverts PR eclipse-openj9#14632. A 10%+ performance
regression (issue eclipse-openj9#15716) reported on JDK 8
was isolated to eclipse-openj9#14632. Reverting this PR
should not have any function change, it
was only intended to support the Vector
API JEPs which have not yet been enabled
on x86 in OpenJ9. A previous attempt at
disabling just the extended vector register
preservation did not recoup the performance,
but only completely reverting this PR
restored the regression. This suggests
there is something with the way the helpers
have been restructured that is causing the
performance degradation. This PR will be
reverted while the cause is investigated.
@vij-singh
Copy link

@vijaysun-omr FYI this is the issue I mentioned

@vij-singh
Copy link

@0xdaryl - I don't think we're attempting any of this for 0.38 so should we move this out to a later milestone?

@0xdaryl
Copy link
Contributor Author

0xdaryl commented Feb 8, 2023

Yes, though a 0.40 milestone hasn't been created yet.

@pshipton
Copy link
Member

pshipton commented Feb 8, 2023

@0xdaryl
Copy link
Contributor Author

0xdaryl commented Jun 5, 2023

Unlikely to be resolved for 0.40. Moving out.

@hzongaro
Copy link
Member

This is going to require more work than can be contained in the 0.41 release. Moving this forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants