JITServer AOT caching failing #16099

Closed
cjjdespres opened this issue Oct 13, 2022 · 4 comments · Fixed by #16101
Labels
comp:jitserver Artifacts related to JIT-as-a-Service project

Comments

@cjjdespres
Contributor

Attn @mpirvu, @AlexeyKhrabrov. While testing #15949 with acmeair, I noticed a lot of messages in the log like:

#JITServer: clientUID 293357720754206506 failed to get defining class chain record for 0000000000050300 due to the AOT cache size limit; method 000000000004F7E0 won't be loaded from or stored in AOT cache

without any "AOT cache allocations exceeded maximum..." message earlier in the log. That means that at some point loading the class chain record fails for some reason unrelated to the cache limit, but the missingLoaderInfo flag is not updated properly.

@cjjdespres
Contributor Author

I think this issue might be interfering with AOT caching in my testing of #15949; after applying an acmeair load I get these statistics for the cache:

JITServer AOT cache  statistics:
        stored methods: 0
        class loader records: 52
        class records: 1725
        method records: 0
        class chain records: 0
        well-known classes records: 0
        AOT header records: 1
        cache bypasses: 807
        cache hits: 0
        cache misses: 7111
        deserialized methods: 0
        deserialization failures: 0

and that seems implausible to me.

@cjjdespres cjjdespres changed the title from "JITServer AOT caching failing due to flag not being set" to "JITServer AOT caching failing" Oct 13, 2022
@AlexeyKhrabrov
Contributor

Yeah, this looks pretty bad: class chain record creation is broken. What exact version of the code are these results for?

@cjjdespres
Contributor Author

I tested it with the version of master that the persistence PR is based on, which is commit 628e9db.

@AlexeyKhrabrov
Contributor

I think the bug is here:

bool missingLoaderRecord = false;
bool uncachedClass = false;
classRecords[i] = getClassRecord(ramClassChain[i], missingLoaderRecord, uncachedClass);
if (missingLoaderRecord)
   {
   missingLoaderRecordIndexes[numMissingLoaderRecords++] = i;
   }
else if (uncachedClass)
   {
   uncachedIndexes[uncachedRAMClasses.size()] = i;
   uncachedRAMClasses.push_back(ramClassChain[i]);
   }
else
   {
   // There must have been an allocation failure.
   return NULL;
   }

If getClassRecord() succeeds and classRecords[i] is non-null, neither missingLoaderRecord nor uncachedClass is set, so control falls through to the final else branch and the function returns NULL as if this were an allocation failure. Somehow we missed this in code review for the cache size limit PR. That's why we need tests for the AOT cache (#15401).
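
To illustrate the control flow, here is a minimal, self-contained sketch of one way the branch could be guarded so that a successful lookup is not mistaken for an allocation failure. It is not the change made in #16101. `ClassRecord`, `RamClass`, `Lookup`, `ChainScan`, `scanChain` and the mock `getClassRecord` are hypothetical stand-ins invented for this example; only the variable names are borrowed from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, NOT the OpenJ9 types.
struct ClassRecord { int id; };

// Outcome the mock lookup should simulate for a given class.
enum class Lookup { Found, MissingLoader, Uncached, AllocFailure };

struct RamClass
   {
   Lookup outcome;
   ClassRecord record;
   };

// Mock lookup (assumed contract): returns a record on success; otherwise
// returns null and sets at most one flag. Null with no flag set stands for
// an allocation failure.
static const ClassRecord *getClassRecord(const RamClass &clazz, bool &missingLoaderRecord, bool &uncachedClass)
   {
   switch (clazz.outcome)
      {
      case Lookup::Found:         return &clazz.record;
      case Lookup::MissingLoader: missingLoaderRecord = true; return nullptr;
      case Lookup::Uncached:      uncachedClass = true; return nullptr;
      case Lookup::AllocFailure:  return nullptr;
      }
   return nullptr;
   }

struct ChainScan
   {
   bool ok = true;                                  // false only on allocation failure
   size_t found = 0;                                // records obtained directly
   std::vector<size_t> missingLoaderRecordIndexes;
   std::vector<size_t> uncachedIndexes;
   };

static ChainScan scanChain(const std::vector<RamClass> &ramClassChain)
   {
   ChainScan result;
   for (size_t i = 0; i < ramClassChain.size(); ++i)
      {
      bool missingLoaderRecord = false;
      bool uncachedClass = false;
      const ClassRecord *record = getClassRecord(ramClassChain[i], missingLoaderRecord, uncachedClass);
      assert(!(missingLoaderRecord && uncachedClass));

      if (record)
         {
         // Success: the case that previously fell into the failure branch.
         ++result.found;
         }
      else if (missingLoaderRecord)
         {
         result.missingLoaderRecordIndexes.push_back(i);
         }
      else if (uncachedClass)
         {
         result.uncachedIndexes.push_back(i);
         }
      else
         {
         // Null record and no flag set: treat as an allocation failure.
         result.ok = false;
         return result;
         }
      }
   return result;
   }
```

With this ordering, a chain whose lookups all succeed yields `ok == true`, and only a null record with neither flag set is reported as a failure.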

cjjdespres added a commit to cjjdespres/openj9 that referenced this issue Oct 13, 2022
Fixes: eclipse-openj9#16099
Signed-off-by: Christian Despres <despresc@ibm.com>
@mpirvu mpirvu added the comp:jitserver Artifacts related to JIT-as-a-Service project label Oct 14, 2022