Cannot find resource cognitect/aws/endpoints.edn #265

Open
scottbale opened this issue Jan 29, 2025 · 2 comments
Labels
bug Something isn't working http-client

scottbale commented Jan 29, 2025

Description

Some users of the latest releases of aws-api are experiencing an exception with the message “Cannot find resource cognitect/aws/endpoints.edn”.

Exception in thread "async-dispatch-7" clojure.lang.ExceptionInfo: Cannot find resource cognitect/aws/endpoints.edn. {}

Versions

aws-api versions 0.8.710-beta01, 0.8.711, or 0.8.723.

Root Cause(s)

Some context: The aws-api library loads endpoints.edn and other .edn files as resources from the classpath as part of normal operation. Loading endpoints.edn is special in that it must happen asynchronously, after the (asynchronous) retrieval of the region name.
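
To make the dependence on the classloader concrete, here is a minimal sketch. It assumes the lookup resembles a plain clojure.java.io/resource call, whose one-argument form resolves against the current thread's context classloader. The helper names and the isolated-loader stand-in are hypothetical and are not taken from aws-api.

```clojure
(require '[clojure.java.io :as io])
(import '(java.net URL URLClassLoader))

(defn visible-on-current-thread?
  "True when the current thread's context classloader can resolve the resource.
  The one-argument form of io/resource uses that classloader."
  [resource-name]
  (some? (io/resource resource-name)))

(defn visible-to?
  "True when the explicitly supplied classloader can resolve the resource."
  [resource-name ^ClassLoader loader]
  (some? (io/resource resource-name loader)))

;; Hypothetical stand-in for a loader that cannot see application resources:
;; it has no URLs of its own and only the bootstrap loader as its parent.
(def isolated-loader
  (URLClassLoader. (make-array URL 0) nil))
```

With these definitions, (visible-to? "cognitect/aws/endpoints.edn" isolated-loader) returns false, because the isolated loader has no path to any application jar. A thread whose context classloader is similarly limited would get the same answer from the one-argument lookup.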

This issue occurs due to a combination of the following:

  1. aws-api inadvertently performs IO on a thread of the common JDK fork join pool (src). The action’s callback invokes core.async async/put!, which causes the Executor to create a new Thread in the backing pool. The new thread inherits the context classloader of the common fork join pool thread. From then on, the core.async dispatch pool is “tainted”: at least one of its threads has a different context classloader than the others.
  2. Some time later, aws-api attempts to load the endpoints resource file (src). This happens asynchronously because the region must be retrieved first. When this continuation happens to run on one of the “tainted” threads of the core.async thread pool, the resource cannot be found. That thread’s context classloader is the parent/base JDK classloader, unintentionally inherited from the common fork join pool, and it cannot access the resources of the sandboxed application.
  3. The ThreadFactory implementation (src) of the core.async dispatch Executor thread pool (src) has no safeguard against the pool being populated with threads that have different context classloaders and therefore cannot safely be used interchangeably.
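
The inheritance behaviour behind points 1 and 3 can be reproduced with plain JVM interop. The following is a rough sketch only: all function names are hypothetical, and the pinned factory illustrates the kind of safeguard point 3 says is missing. It is not the code of core.async, nor a description of how the issue was actually fixed.

```clojure
(import '(java.util.concurrent ThreadFactory))

(defn child-thread-context-loader
  "Starts a new thread from the calling thread and returns the context
  classloader that the new thread observes."
  []
  (let [result (promise)
        t      (Thread. (fn []
                          (deliver result
                                   (.getContextClassLoader (Thread/currentThread)))))]
    (.start t)
    (deref result 1000 ::timed-out)))

(defn call-with-context-loader
  "Calls f with the calling thread's context classloader temporarily set
  to loader, restoring the original afterwards."
  [^ClassLoader loader f]
  (let [t        (Thread/currentThread)
        original (.getContextClassLoader t)]
    (.setContextClassLoader t loader)
    (try
      (f)
      (finally
        (.setContextClassLoader t original)))))

(defn pinned-loader-thread-factory
  "Hypothetical safeguard: every thread created by this factory gets the
  same context classloader, regardless of which thread requested it."
  [^ClassLoader loader]
  (reify ThreadFactory
    (newThread [_ runnable]
      (doto (Thread. ^Runnable runnable)
        (.setDaemon true)
        (.setContextClassLoader loader)))))
```

Calling child-thread-context-loader inside call-with-context-loader returns the temporary loader, which mirrors how a pool thread created from a fork join pool thread ends up with that thread's classloader. A factory like the pinned one sets the loader explicitly, so the creating thread no longer matters.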

Workaround(s)

There are at least two potential workarounds. Both ensure that the core.async dispatch thread pool is already fully populated before aws-api is used, so that no new threads (tainted or otherwise) are added to the pool in response to aws-api usage.

  1. Decrease the size of the core.async pool, from the default of 8 down to maybe 4 or 5 (src): “value is set via clojure.core.async.pool-size system property”. The goal is a pool that is still large enough for the application’s needs, but small enough that it is already full before the bug can occur.
  2. Force the core.async pool to be fully populated, before aws-api usage, by running N meaningless go blocks, where N is the pool size (default of 8). Something like the following should work:
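(require '[clojure.core.async :as a])

(defn prepopulate-core-async-thread-pool [n]
  ;; Each go block is a separate task handed to the core.async dispatch pool.
  (dotimes [_i n]
    (a/<!! (a/go (a/timeout 100)))))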
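
For the first workaround, the property can be supplied as a JVM option (-Dclojure.core.async.pool-size=4) or set from code. The sketch below assumes the property is read when the dispatch pool is first used, so it has to run before any core.async activity. The helper name is hypothetical, and 4 is only the illustrative value suggested above.

```clojure
(defn set-core-async-pool-size!
  "Sets the clojure.core.async.pool-size system property and returns the
  value the JVM now reports for it. Intended to be called at startup,
  before core.async dispatches anything."
  [n]
  (System/setProperty "clojure.core.async.pool-size" (str n))
  (Long/getLong "clojure.core.async.pool-size"))

;; Illustrative value; pick a size that still fits the application's needs.
(set-core-async-pool-size! 4)
```

Setting the property from code is convenient when JVM options are hard to change, for example in a managed deployment, but it only helps if it runs early enough.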

Necessary Preconditions

The symptom is extremely sporadic. It is not observed except when all of the following preconditions are true:

  • The JDK-native http client, which was recently released and made the default http client in the latest aws-api versions, is being used.
  • The application is deployed within a “sandboxed” (post-delegating) classloader of a multi-classloader environment (e.g. Jetty’s WebAppClassLoader or AWS Lambda’s CustomClassLoader) such that the parent/system classloader cannot access the app’s resources.
  • The core.async dispatch thread pool is not yet fully populated.
  • At least one http request, using the aws-api JDK-native http client, completes prior to the attempt to load the endpoint resources file. (Processing the response of that request is the point at which the “tainted” pool thread is created.)
  • An endpoint, which is not already memoized, is requested of the aws-api Endpoint provider, resulting in the attempt to load the endpoints resource file. This can happen in at least a couple of different scenarios:
    • Two (or more) aws-api clients with different regions or region providers are created. Operations are invoked on each. The first client operation may result in the creation of the tainted pool thread. The second client operation will result in attempting to load the endpoints resource file (because the second client is using a different endpoint which is not yet memoized).
    • The same symptom can occur with a single aws-api client if the region is retrieved via http from IMDS. In that case, the region fetch is the necessary http request that must precede the attempt to load the endpoints resource file.
  • And finally, a "tainted" thread from the core.async pool must be the thread in which aws-api attempts to load the endpoints.edn resource.

@scottbale (Collaborator, Author)

This should be resolved by beta release 0.8.730-beta01.

viesti commented Jan 31, 2025

Thank you! :)
