Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify ROMclass packing to normalize more #18848

Merged
merged 2 commits into from
Feb 12, 2024

Conversation

cjjdespres
Copy link
Contributor

@cjjdespres cjjdespres commented Jan 30, 2024

When ROM classes are packed to be sent to the JITServer, we now perform two additional transformations, in addition to the existing string inlining:

  1. The return bytecodes in the ROM class are fixed up if the class is not shared.
  2. Any inlined debug info is stripped from the ROM class

Both of these are necessary to ensure that the packed representations of shared and non-shared versions of the same ROM class are identical. Otherwise, the difference in representations will cause them to have different hashes, and that will effectively prevent us from sharing code between JITServer clients that have different mixes of shared and non-shared classes.

@cjjdespres
Copy link
Contributor Author

Attn @mpirvu. This code is exercised by the normal operation of the JITServer, in that packROMClass is used to send the full ROM class to the server with or without an SCC. I've also tested in the context of #18301 that shared and non-shared versions of the same class will pack to hash-identical ROM classes with these changes, but that's difficult to exercise without that PR.

@mpirvu mpirvu self-assigned this Jan 31, 2024
@mpirvu mpirvu added the comp:jitserver Artifacts related to JIT-as-a-Service project label Jan 31, 2024
@cjjdespres
Copy link
Contributor Author

cjjdespres commented Feb 1, 2024

Maybe I should add that this will add a bit of overhead to ROM class packing that isn't strictly necessary at the moment. I could preserve the old implementation for shared ROM classes and have this new one on the side. Then the main PR can switch to this new one.

Also, while I skip fixing the bytecodes when the ROM class is shared, I do the debug info stripping even on shared ROM classes. That might not be totally necessary. I'll have to double-check if a shared ROM class can ever have a mix of inline and out-of-line debug info, but even then we could skip stripping the debug info in that case and and worst suffer a cache miss as a result. If that sounds all right then we can switch entirely to the new implementation right now and I'll just skip all the extra normalization if the ROM class is shared.

@cjjdespres
Copy link
Contributor Author

cjjdespres commented Feb 1, 2024

Looking at the ROM class creation code, it does seem like there are only two scenarios:

  1. The ROM class is stored in the SCC and the debug info will be entirely out-of-line
  2. The ROM class is not stored in the SCC, and the debug info may be inline or out-of-line

(Debug info (source code line numbers and things) will be totally out-of-line in a non-shared ROM class if the ROM class writer is instructed to throw away all debug info. That is actually a valid solution to this problem, but presumably we don't want to throw away all debug info just because we're running with -XX:+UseJITServerAOTCache.)

I was worried that a shared ROM class might have a mix, but I think the only time that the debug info will switch from out-of-line to inline in that case is if the SCC fills up, and if that happens then it looks like the ROM class will fail to be stored in the SCC and it will be rewritten to be non-shared.

@cjjdespres
Copy link
Contributor Author

I've changed the implementation so we only strip debug info and fix the return bytecodes for non-shared classes, and otherwise use the older method.

_segments.push_back({debugOffset + debugInfoSegmentSize, _accumulatedShift});
}

size_t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments would be nice

Copy link
Contributor Author

@cjjdespres cjjdespres Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this a request for documentation for newOffsetFromOld, actually? I'll add it if so.

runtime/compiler/control/JITServerHelpers.cpp Show resolved Hide resolved
// Calculate the size of the portion of a ROM class that occurs before its UTF8 string section (the pre-string section),
// without any padding that may have been added during initial ROM class writing.
//
// Why is this necessary? The sections inside a ROMClass before the UTF8 string section are 32-bit aligned, but the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice comment!

if (ctx->isInline(sectionPtr, romClass))
{
size_t endOffset = ((uint8_t *)sectionPtr - (uint8_t *)romClass) + sectionSize;
ctx->_preStringSize = std::max(ctx->_preStringSize, endOffset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how taking the max ensures alignment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't ensure alignment. We need to get the unaligned (unpadded) size of the ROM class before the string section, so that when we calculate the aligned (padded) size of the packed version of the ROM class here:

https://github.com/eclipse-openj9/openj9/pull/18848/files#diff-d88e87e324a1ed193ea9af7d85837cc118c61bdca9ae8380719b945bbffd0757R611

we don't introduce unnecessary padding.

I'll clarify that in the comment, and I should probably add that

  1. The "pre-string section" isn't a real section that the ROM class walk visits (neither is the UTF8 string section)
  2. The ROM class walker visits sections that are either entirely within the pre-string section or are entirely outside the ROM class. In particular, it doesn't visit anything in the UTF8 string section or the intermediate class data section.
  3. The ROM class walker visits every section before the UTF8 string section.

That's why taking the max actually gives you the size of the pre-string section, and why going through the trouble of taking the max is necessary in the first place. (Maybe "pre-string area" is a better term for it).

The assert here:

https://github.com/eclipse-openj9/openj9/pull/18848/files#diff-d88e87e324a1ed193ea9af7d85837cc118c61bdca9ae8380719b945bbffd0757R605

should hopefully catch any case where someone changes the ROM class format or walker significantly enough that those properties are violated.

@cjjdespres
Copy link
Contributor Author

Force-pushed to bring in latest merges and hopefully address the existing review comments.

@mpirvu
Copy link
Contributor

mpirvu commented Feb 7, 2024

jenkins test sanity plinuxjit,xlinuxjit,zlinuxjit,alinux64jit jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Feb 7, 2024

It's not clear to me why aarch64 build failed. The SDK was actually built.

@mpirvu
Copy link
Contributor

mpirvu commented Feb 7, 2024

jenkins test sanity alinux64jit jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Feb 8, 2024

Crash for xlinux:

 [ERR] Method_being_compiled=java/lang/System.getSysPropBeforePropertiesInitialized(I)Ljava/lang/String;
 [ERR] Target=2_90_20240207_156 (Linux 3.10.0-1160.92.1.el7.x86_64)
 [ERR] CPU=amd64 (4 logical CPUs) (0x1e8cbd000 RAM)
 [ERR] ----------- Stack Backtrace -----------
 [ERR] raise+0x2b (0x00007F99806CC4FB [libpthread.so.0+0xf4fb])
 [ERR] _ZN2TR4trapEv+0x47 (0x00007F997444C5A7 [libj9jit29.so+0x5d45a7])
 [ERR]  (0x00007F997444C632 [libj9jit29.so+0x5d4632])
 [ERR] _ZN16JITServerHelpers12packROMClassEP10J9ROMClassP9TR_MemoryP11TR_J9VMBaseRmm+0x93a (0x00007F997406D52A [libj9jit29.so+0x1f552a])
 [ERR] _ZN16JITServerHelpers22packRemoteROMClassInfoB5cxx11EP7J9ClassP10J9VMThreadP9TR_Memoryb+0x985 (0x00007F9974071D05 [libj9jit29.so+0x1f9d05])
 [ERR] _Z13remoteCompileP10J9VMThreadPN2TR11CompilationEP17TR_ResolvedMethodP8J9MethodRNS1_24IlGeneratorMethodDetailsEPNS1_28CompilationInfoPerThreadBaseE.localalias+0x32b (0x00007F997402FA2B 

@mpirvu
Copy link
Contributor

mpirvu commented Feb 8, 2024

P and Z have crashes too

@cjjdespres
Copy link
Contributor Author

Oh, the xlinux crash is:

15:13:47  Test result: FAILED
15:13:47   [ERR] JVMSHRC256I Persistent shared cache "ShareClassesCMLTests" has been destroyed
15:13:47   [ERR] Assertion failed at /home/jenkins/workspace/Build_JDK17_x86-64_linux_jit_Personal/openj9/runtime/compiler/control/JITServerHelpers.cpp:698: !(debugInfo->srpToVarInfo & 1)
15:13:47   [ERR] VMState: 0x0005ffff
15:13:47   [ERR] 	Shared ROM class                  has inline debug info

and that comes from my new fatal assert here:

https://github.com/eclipse-openj9/openj9/pull/18848/files#diff-d88e87e324a1ed193ea9af7d85837cc118c61bdca9ae8380719b945bbffd0757L355-R701

The ROM class name is blank as well, which is strange. Is there some odd interaction here with shutting down the local SCC?

@cjjdespres
Copy link
Contributor Author

The assert isn't technically necessary. I could keep the old behaviour ("if out of line, then zero out the SRP"). I just had it in there to validate my assumption that shared ROM classes only have out-of-line debug info. If that assumption is violated nothing bad should happen. It would just cause a hash mismatch, maybe.

@cjjdespres
Copy link
Contributor Author

I'm pretty sure that the romClass had to have been considered shared by packROMClass, because I also added the assert

   TR_ASSERT_FATAL(!isArrayROMClass(romClass) || (romClass->romMethodCount == 0), "Array ROM classes are expected not to have methods");

so we'd only be in the body of that loop if we considered the romClass in question to be shared.

@cjjdespres
Copy link
Contributor Author

I see, that test was run with -Xnolinenumbers. I know that influences how the debug info is written. I think all the failing tests had that option.

@cjjdespres
Copy link
Contributor Author

cjjdespres commented Feb 8, 2024

I understand, I think. With -Xnolinenumbers, the debug info area of the SCC isn't created. That means all debug info must be forced inline during ROM class writing, if any is to be saved for the ROM class.

I'm pretty sure, based on the reading of the ROM class writer code, that shouldPreserveLineNumbers() is false (because we're throwing away line numbers) and shouldPreserveLocalVariablesInfo() is true (we're still preserving local variable info) for the ROM class that caused the failure. That means we do in fact record debug info, and so it has to be written inline, because there's nowhere else for it to go.

Okay, so if I preserve the old behaviour then we won't be able to share code between these two types of clients:

  1. A client has a shared class relevant for deserialization/relocation, and that class is in an SCC layer that was created with -Xnolinenumbers
  2. A client has no such shared class (because the SCC wasn't created with -Xnolinenumbers, or the client has no SCC at all).

Is that all right for now? There might be an SCC property that I can query to see if we're in situation (1) and need to use the new packing implementation, to allow sharing between (1) and (2) in future.

When ROM classes are packed to be sent to the JITServer, we now
perform two additional transformations, in addition to the existing
string inlining:

1. The return bytecodes in the ROM class are fixed up if the class is
   not shared
2. Any inlined debug info is stripped from the ROM class

Both of these are necessary to ensure that the packed representations of
shared and non-shared versions of the same ROM class are identical.

Signed-off-by: Christian Despres <despresc@ibm.com>
@cjjdespres
Copy link
Contributor Author

cjjdespres commented Feb 8, 2024

Force-pushed to fix the issue discussed above. Based on our discussion, instead of a single needsDeepMethodNormalization I have two flags needsDebugInfoStripping and needsBytecodeFixing. The first controls whether or not the more elaborate debug info stripping procedure is performed, and is set to true if an initial walk of the ROM methods determines that any of the methods have inline debug info. The second is true exactly when the class is not shared and has at least one ROM method, and determines whether or not fixReturnBytecodes() is run on the packed ROM class.

This is (to summarize) necessary because a class being shared does not necessarily mean that its methods have only out-of-line debug info, so it being shared can't be used as a proxy to determine if we can skip debug info stripping.

This way also has the small benefit that we will now skip the more elaborate debug info stripping if we're packing a ROM class that isn't shared and has zero inline debug info.

@mpirvu
Copy link
Contributor

mpirvu commented Feb 8, 2024

jenkins test sanity all jdk21

Signed-off-by: Christian Despres <despresc@ibm.com>
@cjjdespres
Copy link
Contributor Author

Sorry, I was working on the main PR and realized that this had two issues. Neither were functional, but did affect hashing:

  1. The intermediateClassData WSRP needs to be zeroed out after adjustWSRP is run. Otherwise adjustWSRP gets confused.
  2. The needsBytecodeFixing test was not correct.

I decided to push a new commit in this PR rather than wait, because without it we were double-fixing the bytecodes of shared classes. That's not incorrect, just a waste of time.

@mpirvu
Copy link
Contributor

mpirvu commented Feb 8, 2024

jenkins test sanity all jdk21

@mpirvu
Copy link
Contributor

mpirvu commented Feb 9, 2024

jenkins test sanity plinuxjit,xlinuxjit,zlinuxjit,alinux64jit jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Feb 9, 2024

jenkins test extended,sanity.system,extended.system plinuxjit,xlinuxjit,zlinuxjit jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Feb 9, 2024

A bunch of tests fail on extended.system_x86-64_linux_jit because there is not enough disk space on the machines.
aarch64 and linux have a couple of criu-restore failures that we have seen before.

@mpirvu
Copy link
Contributor

mpirvu commented Feb 12, 2024

CryptoTests_0 timeout for extended.functional_x86-64_linux_jit
I started a 1 iteration grinder for CryptoTests_0 here: https://openj9-jenkins.osuosl.org/job/Grinder/3291/

@cjjdespres
Copy link
Contributor Author

cjjdespres commented Feb 12, 2024

There were timeouts in CryptoTests_0 reported in #16711 that may or may not be related.

@mpirvu
Copy link
Contributor

mpirvu commented Feb 12, 2024

CrytoTest had a timeout even in the grinder, but given that this is a known issue I am going to say that this PR is good to be merged.

@mpirvu mpirvu merged commit 5fc665e into eclipse-openj9:master Feb 12, 2024
25 of 30 checks passed
Copy link
Contributor

@AlexeyKhrabrov AlexeyKhrabrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nitpick (see inline) for future cleanup. Good work figuring out all the intricacies of ROMClass layout @cjjdespres!

packedSize = getArrayROMClassPackedSize(romClass);
}
else if (needsDebugInfoStripping)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the code in this branch and the next one seems identical. The next time we need to make any changes to this function, it would be nice to factor out duplicated code. I guess something to add to the list in #18990.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:jitserver Artifacts related to JIT-as-a-Service project
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants