Cannot find resource cognitect/aws/endpoints.edn #265

scottbale · 2025-01-29T00:31:57Z

Description

Some users of the latest releases of aws-api are experiencing an exception with the message “Cannot find resource cognitect/aws/endpoints.edn”.

Exception in thread "async-dispatch-7" clojure.lang.ExceptionInfo: Cannot find resource cognitect/aws/endpoints.edn. {}

Versions

aws-api versions 0.8.710-beta01, 0.8.711, or 0.8.723.

Root Cause(s)

Some context: The aws-api library loads endpoints.edn and other .edn files as resources from the classpath as part of normal operation. endpoints.edn loading is special in that it must be loaded asynchronously, following the (asynchronous) retrieval of the region name.

This issue occurs due to a combination of:

- aws-api inadvertently performs an I/O action on a thread of the common JDK fork join pool (src). The action’s callback invokes core.async async/put!, which causes the core.async dispatch Executor to create a new Thread in its backing pool. The new thread inherits the context classloader of the common fork join pool thread. Thereafter, the core.async dispatch pool is “tainted”: at least one of its threads has a different context classloader than the others.
- Some time later, aws-api attempts to load the endpoints resource file (src). This happens asynchronously because it depends on first retrieving the region. When this continuation happens to run on one of the “tainted” threads of the core.async thread pool, the resource cannot be found: that thread’s context classloader is the parent/base JDK classloader, unintentionally inherited from the common fork join pool, and it cannot access the resources of the sandboxed application.
- The ThreadFactory implementation (src) of the core.async dispatch Executor thread pool (src) has no safeguard against the pool being populated with threads that have different context classloaders and therefore cannot safely be used interchangeably.
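
The inheritance step in this chain can be reproduced in isolation. The following is a minimal sketch of the general JVM behaviour (a new thread starts with the context classloader of the thread that created it); it is not aws-api or core.async code, and the names `child-context-loader` and `inherited-loader-demo` are hypothetical helpers introduced only for illustration.

```clojure
(defn child-context-loader
  "Starts a new thread from the current thread and returns the context
  classloader that the new thread observes."
  []
  (let [result (promise)
        ^Runnable task (fn [] (deliver result (.getContextClassLoader (Thread/currentThread))))
        child  (Thread. task)]
    (.start child)
    @result))

(defn inherited-loader-demo
  "Temporarily installs a marker classloader as the current thread's context
  classloader, creates a child thread, and reports whether the child saw the
  same marker loader."
  []
  (let [current  (Thread/currentThread)
        original (.getContextClassLoader current)
        ;; An empty URLClassLoader used purely as a recognisable marker.
        marker   (java.net.URLClassLoader. (make-array java.net.URL 0) original)]
    (try
      (.setContextClassLoader current marker)
      (identical? marker (child-context-loader))
      (finally
        ;; Always restore the original context classloader.
        (.setContextClassLoader current original)))))
```

Calling `(inherited-loader-demo)` should return true. The same mechanism would apply when the creating thread belongs to the common fork join pool, which is how a pool thread could end up with a context classloader that cannot see the application's resources.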

Workaround(s)

There are at least two potential workarounds. Both ensure that the core.async dispatch thread pool is already fully populated before aws-api is used, so that no new threads (tainted or otherwise) are added to the pool in response to aws-api usage.

- Decrease the size of the core.async pool from its default of 8 to perhaps 4 or 5 (src): “value is set via clojure.core.async.pool-size system property”. The goal is a pool that is still large enough for the application’s needs but small enough that it is already full before the bug can occur.
- Force the core.async pool to be fully populated before any aws-api usage by running N meaningless go blocks, where N is the pool size (default of 8). Something like the following should work:

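(require '[clojure.core.async :as a])

;; Each go block is dispatched to the core.async pool; n of them should bring the pool to full size.
(defn prepopulate-core-async-thread-pool [n]
  (dotimes [_i n]
    (a/<!! (a/go (a/timeout 100)))))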
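
Both workarounds depend on knowing the pool size in effect. A minimal sketch, assuming only the property name and the default of 8 quoted above; `configured-pool-size` is a hypothetical helper, not part of core.async or aws-api:

```clojure
(defn configured-pool-size
  "Reads the clojure.core.async.pool-size system property as a long,
  falling back to 8 (the default cited above) when it is not set."
  []
  (Long/parseLong (System/getProperty "clojure.core.async.pool-size" "8")))
```

For the first workaround, the property could be set at JVM startup with a flag such as `-Dclojure.core.async.pool-size=4`. For the second, the result could be passed to `prepopulate-core-async-thread-pool` early in application startup, before any aws-api client is used.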

Necessary Preconditions

The symptom is extremely sporadic. It is not observed except when all of the following preconditions are true:

- The JDK-native http-client is being used, which was recently released and made the default http client in latest aws-api versions.
- The application is deployed within a “sandboxed” (post-delegating) classloader of a multi-classloader environment (e.g. Jetty’s WebAppClassLoader or AWS Lambda’s CustomClassLoader) such that the parent/system classloader cannot access the app’s resources.
- The core.async dispatch thread pool is not yet fully populated.
- At least one http request, using the aws-api JDK-native http client, completes prior to the attempt to load the endpoint resources file. (Processing the response of that request is the point at which the “tainted” pool thread is created.)
- An endpoint, which is not already memoized, is requested of the aws-api Endpoint provider, resulting in the attempt to load the endpoints resource file. This can happen in at least a couple of different scenarios:
  - Two (or more) aws-api clients with different regions or region providers are created. Operations are invoked on each. The first client operation may result in the creation of the tainted pool thread. The second client operation will result in attempting to load the endpoints resource file (because the second client is using a different endpoint which is not yet memoized).
  - The same symptom can occur with a single aws-api client if the region is retrieved via http from IMDS. In that case, the region fetch is the necessary http request that must precede the attempt to load the endpoints resource file.
- And finally, a "tainted" thread from the core.async pool must be the thread in which aws-api attempts to load the endpoints.edn resource.
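
Because the symptom depends on which thread performs the lookup, a small diagnostic can help confirm whether a given thread is affected. The sketch below is an illustration only, not part of aws-api: `resource-visibility` is a hypothetical helper, and using the loader that defined the function as the "fixed" reference loader is an assumption made for the example.

```clojure
(require '[clojure.java.io :as io])

(defn resource-visibility
  "Reports whether resource-name can be found through the current thread's
  context classloader and through a fixed reference loader (the loader that
  defined this function; an assumption made for illustration)."
  [resource-name]
  (let [thread         (Thread/currentThread)
        context-loader (.getContextClassLoader thread)
        fixed-loader   (.getClassLoader (class resource-visibility))]
    {:thread         (.getName thread)
     ;; The context classloader may be nil, so guard before looking up.
     :context-loader (boolean (and context-loader
                                   (io/resource resource-name context-loader)))
     :fixed-loader   (boolean (io/resource resource-name fixed-loader))}))
```

In a sandboxed deployment, evaluating `(resource-visibility "cognitect/aws/endpoints.edn")` from inside go blocks might show `:context-loader false` on a tainted thread while `:fixed-loader` remains true, which would be consistent with the root cause described above.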


scottbale · 2025-01-30T21:19:39Z

This should be resolved by beta release 0.8.730-beta01.

viesti · 2025-01-31T05:53:45Z

Thank you! :)

scottbale self-assigned this Jan 29, 2025

scottbale added the bug and http-client labels Jan 29, 2025

scottbale pushed a commit that referenced this issue Jan 30, 2025

Ensure all resources are loaded with the expected class loader (issue #…

e4b4001

…265)
