Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal to contribute invokedynamic based instrumentation mechanism #8999

Open
jackshirazi opened this issue Jul 20, 2023 · 5 comments
Open
Labels
enhancement New feature or request

Comments

@jackshirazi
Copy link
Contributor

jackshirazi commented Jul 20, 2023

Tracking

  1. The original proposal is below in this description. This was accepted for proceeding on in Java SIG meeting July 20 2023, after example walkthroughs of instrumentation being moved are provided
  2. A couple of examples of how the proposed changes would affect existing instrumentation, if moved to the new framework were provided for the following SIG meeting July 27 2023, and a decision to proceed on the proposal was then agreed
  3. The proposed architecture outline and an implementation plan outline were added in this comment and PRs for the implementation began - a working POC is available for reference

Related PRs

Note not for inclusion in this issue are preferred post completion task to remove shading from the agent

[COMPLETED] Phase 1 - Enabling invoke dynamic capability for instrumentations

Phase 2a - Migrate simple instrumentation modules

  • Write migration guide (so we can link to it in reviews)
  • Any instrumentation that can be migrated without "Phase 2b features", - this is now detailed in this PR which identifies all simple and non-simple instrumentations

Phase 2b - Additional support for muzzle and complex instrumentation modules

Phase 3 - Migrate remaining instrumentation modules

  • migrate all remaining modules

Proposal

Contribute alternative invokedynamic based instrumentation mechanism to the OpenTelemetry Java agent, with no required changes to the existing OpenTelemetry instrumentation mechanism or instrumentations

Benefits to OpenTelemetry of this contribution and mechanism

  • Allows instrumentations to have breakpoints set in them and be debugged using standard debugging techniques
  • No shading required, allowing standard debugging in IDEs
  • Works cleanly with the Java Module system
  • Clean isolation of instrumentation advice from the application and other instrumentations
  • Uses the approach recommended by Byte Buddy creator Rafael Winterhalter (see background section below)
  • Makes available additional contributors/maintainers from Elastic for OpenTelemetry
  • It's a parallel instrumentation approach (to the existing OpenTelemetry Java agent) that would live alongside the existing one and not impact the existing approach at all
  • Implementors would have the option to use either instrumentation approach, whichever they prefer
  • It will subsequently allow existing instrumentations using this technology to be contributed to the OpenTelemetry agent
  • The contribution will not require any implementation effort from existing contributors (unless existing contributors want to become involved!) except for reviewing, as Elastic will make contributors available for this. (Elastic will also try to reduce reviewing effort by adding maintainers, but this will take time)
  • It is mature technology and has been production tested for several years
  • None of the advice class restrictions apply

Benefits to Elastic

  • It will allow Elastic Java agent to converge to the OpenTelemetry agent so that Elastic can support the OpenTelemetry Java agent

Background

  1. The Elastic APM Java agent is fully open source on the Apache 2.0 license
  2. The agent instrumentation uses Byte Buddy delegated advice (allows breakpoint setting) - the recommended approach by the byte buddy creator Rafael Winterhalter, which reduces agent implementation complexity
  3. To support delegated advice, the agent has implemented custom classloaders that enable access and reversion (the instrumentation has correct isolated access, and can be unloaded and the transformed code fully reverted)
  4. To support the delegated advice being able to access the instrumented methods runtime state, the agent uses Byte Buddy to insert an invokedynamic bytecode instructions -

the invokedynamic instruction can be used to call an advice method that is loaded from a child class loader of the instrumented class' defining class loader ... this allows the agent to hide its classes from the application while providing a way to invoke the isolated methods from the application classes it instruments ... also avoids injecting the advice and helper classes into the target class loader directly

@jackshirazi jackshirazi added the enhancement New feature or request label Jul 20, 2023
@jackshirazi
Copy link
Contributor Author

Per discussion at the Java SIG, one next step is to provide a walkthrough of how an existing instrumentation is expected to change if it moved across to this mechanism after the mechanism was implemented

@jackshirazi
Copy link
Contributor Author

Per discussion at the Java SIG, the instrumentation using this mechanism is isolated and needs no shading but fully making the whole agent non-shaded using the ShadedClassLoader.java is not in this scope, but can be done in a subsequent scope if desired!

@jackshirazi
Copy link
Contributor Author

Per discussion at the Java SIG, one next step is to provide a walkthrough of how an existing instrumentation is expected to change if it moved across to this mechanism after the mechanism was implemented

This was added by @JonasKunz here providing a simple instrumentation change (Cassandra) and a complex one (Elasticsearch)

@jackshirazi
Copy link
Contributor Author

The next step is to provide a roadmap architecture of the implementation, following which we'll aim to start providing PRs of individual components using the architecture as a reference until we get to a fully working instrumentation. After that the intention is to convert all the existing instrumentations to the invokedynamic non-inlined implementation, targeting the 2.0 release in October, which will terminate support for extensions that use inline instrumentation capability

@JonasKunz
Copy link
Contributor

Here's a write-up of what we would envision as target architecture.
In addition I did a working PoC showcasing what things would like in implementation, without all the bells and whistles of course. The PoC compiles and migrates the Cassandra-Module to invokedynamic. You can execute the tests and place a break-point in the Advice code.

Classloader Structure

Based on the experience with the elastic APM agent I would envision the following "ideal" classloader structure for the OpenTelemetry Agent:

otel_cl_ideal drawio

The image shows the agent alongside two separate classloaders of the application being instrumented (App A and App B).
Arrow represent a child-to-parent relationship.
Explanation:

  • The Agent Extension API CL loads all classes against which extensions are compiled, including for example the OTel API
  • The Agent CL is a child of the Extension-API CL. Those two are separated to ensure that implementation details of the Agent are hidden from extensions at runtime just like they are at compile time. This prevents extensions (especially external ones!) from accidentally relying on those implementation details.
  • Both the Agent Extension API CL and the Agent CL should follow the "self-first" instead of "parent-first" delegation model. This ensures that their classes cannot be shadowed by accident by the user, e.g. by providing bytebuddy as a dependency via -Xbootclasspath.
  • InstrumentationModules can share state and classes via the Global-State CL. This should avoid the need for InstrumentationModules to inject classes into the Bootstrap CL, properly isolating those shared classes from the application being instrumented.
  • The Global-State CL is separated from the Agent Extension API CL for easier debugging (e.g. understanding where a class comes from) and to make sure that InstrumentationModules cannot mess up the classpath of the Agent CL. The latter point should matter less due to the Agent CL following the self-first delegation model.
    The Global-State CL should be self-first aswell, so that classes from bootstrap cannot mess it up.
  • We create at most one Instrumentation Module CL per combination of Instrumentation-Module and application classloader. Whenever an Advice is applied, we lazily create the InstrumentationModule CL for the Instrumentation Module containing that Advice and the classloader containging the class being instrumented. If that CL already exists, it is reused.
  • The Instrumentation Module CL uses the following classloading delegation model:
    • If the class to load is an Advice or a referenced helper class (and NOT a global state class), it is loaded by the Instrumentation Module CL in self-first manner. The helper classes are detected by muzzle.
    • Next, the Global-State CL (and therefor its parents) are queried to find the class
    • If still not found, the instrumented application classloader is queried
  • The classloading strategy of the Instrumentation Module CL ensures that application classes cannot shadow agent classes. We certainly need to add filtering for edge cases to this mechanism later (e.g. for bridging the opentelemetry-API from the application to the agent SDK).

To keep things simple for a start, we can keep the Agent CL, Agent Extension API CL and Global State CL fused together as a single Agent CL: This is how it is currently implemented in the elastic APM agent.

otel_cl_simplified drawio

This reduces the isolation benefits explained above, but should be simpler to implement for a start. If in the long run we often run into issues / user questions which would be avoided by the proper isolation, we can always add it afterwards.

Invokedynamic Advice Bootstraping

As discussed in the previous SIG meetings, we will add a switch-method to InstrumentationModules to allow them to be marked as compatible for the invokedynamic approach.

We will adapt the OpenTelemetry Java Agent instrumentation logic to respect this switch-method and insert invokedynamic instructions instead of inlining the Advice code.
The inserted invokedynamic instructions need to point to a bootstraping method which returns the actual target of the invokedynamic instruction in the form of a MethodHandle.

We will inject an IndyBootstrapDispatcher into the Bootstrap CL to forward this call into the actual implementation within the Agent CL. This implementation takes care of setting up the corresponding InstrumentationModule CL.

In the elastic APM agent the IndyBootstrapDispatcher is the only class which is injected into the Bootstrap (or any application classloader). There we shade this class into the java.lang package to guarantee visibility, even if custom module systems, such as OSGi, are used.

AFAIK, the OpenTelemetry Java Agent uses classloader instrumentations to guarantee visibility of injected bootstrap classes. Therefore we won't need to shade the IndyBootstrapDispatcher here. If in the future we get rid of all other use-cases for class injection, we could consider shading the IndyBootstrapDispatcher into the java.lang package, as it would allow us to get rid of specialized classloader instrumentation for individual module systems.

Interaction of Invokedynamic-Advices with Java 9 modules

When instrumenting classes which are encapsulated within Java 9 modules, it is possible that the instrumentation needs to access non-exported contents of the modules. This usually shows by having to add --add-opens compiler flags when compiling the instrumentation with Java 9+.

In order to facilitate these use-cases, InstrumentationModules can declare modules they need to have access to. The invokedynamic bootstrapping will take care of providing the InstrumentationModule CL with the required access via Instrumentation.redefineModule.

Accessing package-private / private members

The inline/injection based instrumentation approach of the OpenTelemetry agent allows to access any package-private members by injecting accessor classes into the desired packages. Because we won't allow invokedynamic-instrumentations to perform this injection, we need a different way of gaining this access.

This can be efficiently done by combining Instrumentation.redefineModule with MethodHandles.privateLookupIn. For Java 8 we can simply use setAccessible(true).
This allows us to add a MemberAccess utility to the extension API with which the advices can then simply declare MethodHandles for the required fields and methods:

private static final MethodHandle fieldGetter = MemberAccess.fieldGetter(SomeSecret.class, "secretField");

Due to the MethodHandle being static final, the performance should be equivalent to direct field accesses / method calls after JIT.

You can find a PoC implementation here.

SPI injection

Another current use-case for injecting classes into application classloaders is to inject ServiceProviders.
In order to provide a consistent development experience, we would need to find a way to make the service providers to "inject" actually live inside the InstrumentationModule CL.
I see two alternative approaches for doing this:

  • Instrument the ServiceLoader methods to return instances from the InstrumentationModule CL
  • Inject a runtime generated proxy into the target-classloader and register it for the ServiceLoader. Upon creation, this proxy creates the actual implementation via invokedynamic in the InstrumentationModule CL and delegates all calls to that instance.

Implementation Plan

Based on the outlined architecture above, I would suggest the following implementation order:

  • Implement the InstrumentationModuleClassloader
  • Adapt muzzle to include the Advices as helper classes for invokedynamic-modules
  • Add the invokedynamic bootstraping and instrumentation logic
  • Migrate a few, simple instrumentation modules to verify the mechanism
  • Add the module-opening mechanism and contribute the LDAP instrumentation module to showcase it
  • Implement the MemberAccess utility explained above and migrate at least one instrumentation modules which requires it
  • Decide on a way to support SPI injection and implement it. Migrate at least one instrumentation module using it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants