Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple invokedynamic InstrumentationModules to share classloaders #10015

Merged
merged 17 commits into from
Feb 2, 2024

Conversation

JonasKunz
Copy link
Contributor

@JonasKunz JonasKunz commented Dec 6, 2023

Follow-up for this comment. Part of #8999 .

This PR solves an edge case which to my knowledge is currently only needed by the AWS v2 SDK instrumentation:
When a library we instrument consists of multiple modules, which may or may not be present on the classpath (e.g. SQS in case of AWS), we typically need to create a separate InstrumentationModule for each. At the same time, these InstrumentationModules need to communicate with each other, but only for the same classloader they instrumented.

With inlined InstrumentationModules this is done by detecting the presence/absence of helper-classes injected by other InstrumentationModule into the the target classloader.

This PR adds a mechanism with a similar effect: InstrumentationModules may declare a "group name". When multiple modules have the same group name AND target the same application classloader, they will share their InstrumentationModuleClassloader.

By default InstrumentationModules will remain isolated, because the default group name is the FQN of the InstrumentationModule.

With this PR I've had to also adapt the registration logic for invokedynamic InstrumentationModules:

  • Previously, no action happened when a class is instrumented. Only when the first invokedynamic instruction is linked, the InstrumentationModuleClassloader would be initialized
  • Now, whenever a class is instrumented, the corresponding InstrumentationModule is initialized for the instrumented classloader. If the loader for the same module group has already been created, the module will be installed into it.

This had to be done because for example the SQS-instrumentation never actually instruments a method: It is just there to match a class and inject corresponding helpers. This new registration approach ensures that this happens in the same way for indy modules.

@JonasKunz JonasKunz marked this pull request as ready for review December 6, 2023 10:16
@JonasKunz JonasKunz requested a review from a team December 6, 2023 10:16
@JonasKunz
Copy link
Contributor Author

@open-telemetry/java-instrumentation-maintainers Please add the testIndy Label to this PR

@JonasKunz JonasKunz force-pushed the multipe-modules-per-cl branch from b5867b2 to b443b47 Compare December 7, 2023 08:53
@laurit
Copy link
Contributor

laurit commented Dec 8, 2023

This PR solves an edge case which to my knowledge is currently only needed by the AWS v2 SDK instrumentation

There is a similar issue with netty instrumentation where other instrumentations use classes from netty instrumentation that currently fails with class cast

@JonasKunz
Copy link
Contributor Author

This PR solves an edge case which to my knowledge is currently only needed by the AWS v2 SDK instrumentation

There is a similar issue with netty instrumentation where other instrumentations use classes from netty instrumentation that currently fails with class cast

Do you have any pointers for me here by chance to the shared classes? Unfortunately running the tests locally is very cumbersome for me currently, due to unrelated failures.

Maybe in case of netty it might make sense to extract interfaces from those shared classes and to expose those by loading them as "shared state" in agent-Classloader instead (see the classloading section of #8999 for details). It depends on whether those extracted interfaces would need references to netty classes (because then they can't be shared).

if (instrumentedCl != null) {
return instrumentedCl.getResource(resourceName);
} else {
return BOOT_LOADER.getResource(resourceName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't both agent loader and instrumented cl already delegate to boot loader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the instrumented CL: We don't know, the Classloader could be doing something weird regarding delegation (e.g. OSGi).
For the agent loader I think we use a child/self-first loading, so that e.g. if someone puts bytebuddy on the bootstrap it doesn't affect us, right? That's at least what we do in the elastic-apm-agent, because we had that exact problem before. In that case resources from the agent loader might shadow the resource we actually want to lookup here.

That wouldn't exactly be the problem here, because we try the agent loader before anyway and it returned null when we reach this code. But if that ever changes, this could easily introduce a subtile bug here. For that reason I'd prefer to keep it explicit here with the BOOT_LOADER.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the instrumented CL: We don't know, the Classloader could be doing something weird regarding delegation (e.g. OSGi).

We instrument class loaders to force boot delegation for packages with our classes to ensure they are accessible with OSGI class loaders.

For the agent loader I think we use a child/self-first loading, so that e.g. if someone puts bytebuddy on the bootstrap it doesn't affect us, right?

Yes. There may still be ways to break it, if the libraries we use have optional components that we don't have in agent then placing them in boot loader could produce unexpected results.

That wouldn't exactly be the problem here, because we try the agent loader before anyway and it returned null when we reach this code. But if that ever changes, this could easily introduce a subtile bug here. For that reason I'd prefer to keep it explicit here with the BOOT_LOADER.

Imo you are overthinking this, if the agent class loader and the instrumented class loader can't find it then it is probably not meant to be found. I think we can keep it for now.

@laurit laurit merged commit 980d8ea into open-telemetry:main Feb 2, 2024
50 checks passed
steverao pushed a commit to steverao/opentelemetry-java-instrumentation that referenced this pull request Feb 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants