Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(gevent): fix gevent compatibility with "module cloning" approach #4863

Merged
merged 120 commits into from
Feb 13, 2023

Conversation

P403n1x87
Copy link
Contributor

@P403n1x87 P403n1x87 commented Jan 4, 2023

This pull request fixes incompatibility between ddtrace and gunicorn (and probably other frameworks that depend on gevent). It does this by making ddtrace hold copies of the modules it loads before gevent.monkey.patch_all() has been called, thus avoiding conflicts between greenlets and real threads subject to startup race conditions.

"Module cloning", as we're calling this approach, is configurable by the DD_UNLOAD_MODULES_FROM_SITECUSTOMIZE environment variable, documented in the release note.

The change accomplishes this by deleting the modules from sys.modules. This causes the Python interpreter to create new copies of the modules on import, not touching the copies originally imported before the deletion (as described in broad strokes here). The result after sitecustomize.py has executed is that ddtrace internal code holds references to modules that have not been monkeypatched by gevent, and application code holds references to modules that may have been monkeypatched by gevent. This means that ddtrace can maintain its real threads for things like remote config and profiling without needing to worry about whether they will turn into greenlets at some point.

The clearest proof that this change improves the gevent+ddtrace compatibility story is the fact that this test passes.

Tips for Reviewers

The changes in this pull request to files other than sitecustomize.py are mostly intended to make existing tests pass under this new paradigm. sitecustomize is the most important file in this diff to understand deeply for a review.

Risks

The main risk inherent to this approach is that some module behaves in an unexpected way when copied. We're mitigating this risk in two ways. The first is by making sure all existing tests pass (obviously), and the second is by putting this new functionality behind a feature flag environment variable that defaults to retaining the current behavior form 1.x.

Checklist

Reviewer Checklist

  • Title is accurate.
  • No unnecessary changes are introduced.
  • Description motivates each change.
  • Avoids breaking API changes unless absolutely necessary.
  • Testing strategy adequately addresses listed risk(s).
  • Change is maintainable (easy to change, telemetry, documentation).
  • Release note makes sense to a user of the library.

@P403n1x87 P403n1x87 added the changelog/no-changelog A changelog entry is not required for this PR. label Jan 4, 2023
@P403n1x87 P403n1x87 force-pushed the cleanup-gevent branch 23 times, most recently from fcb041b to c39d58f Compare January 10, 2023 17:52
@P403n1x87 P403n1x87 force-pushed the cleanup-gevent branch 5 times, most recently from f0158f0 to d47c047 Compare January 13, 2023 15:29
@emmettbutler emmettbutler enabled auto-merge (squash) February 13, 2023 21:19
@Yun-Kim Yun-Kim changed the title fix(gevent): allow using pre-gevent.monkey.patch_all modules fix(gevent): allow module cloning pre-gevent.monkey.patch_all Feb 13, 2023
@emmettbutler emmettbutler changed the title fix(gevent): allow module cloning pre-gevent.monkey.patch_all fix(gevent): fix gevent compatibility with "module cloning" approach Feb 13, 2023
Copy link
Member

@brettlangdon brettlangdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to generated docs for config page changes: https://output.circle-artifacts.com/output/job/d89b9a6d-a048-408a-9406-92bd20e098fb/artifacts/0/tmp/docs/configuration.html#DD_UNLOAD_MODULES_FROM_SITECUSTOMIZE

(I don't see issues, just sharing)

Same for release notes (you have to go to _release_notes_all.html to see the release note as part of open PR)

https://output.circle-artifacts.com/output/job/d89b9a6d-a048-408a-9406-92bd20e098fb/artifacts/0/tmp/docs/_release_notes_all.html#bug-fixes

(again, no issues, just showing)

ddtrace/bootstrap/sitecustomize.py Show resolved Hide resolved
ddtrace/bootstrap/sitecustomize.py Outdated Show resolved Hide resolved
tests/contrib/gunicorn/test_gunicorn.py Outdated Show resolved Hide resolved
@brettlangdon
Copy link
Member

The only real test changes we need here are that this feature when enabled works with gunicorn/gevent in some way because it is off by default in theory our other tests should cover that the existing behavior doesn't change.

This is correct?

I assume we may want/need more tests added before enabling by default (or "auto"), but I didn't miss any other coverage that this needs to have now, right?

@emmettbutler
Copy link
Collaborator

@brettlangdon that's right!

@brettlangdon
Copy link
Member

@brettlangdon the presence or absence of gevent doesn't affect how much work module cloning does. If you'd like, I'd be happy to do some performance testing. I do think this is mergeable without that because all of the new functionality is behind a default-off feature flag.

my only real argument to performance testing it now (even just locally with some basic examples) is we are documenting it as a feature for customers, so if the performance was really bad we cannot just yank the feature entirely, we'd have to deprecate it and wait for 2.0 to remove it.

yeah, that is the drastic scenario, but good to consider if it ever was the case.

although, if the performance was bad, we could keep the feature since it would solve some customer problems, but we could document the performance impact of it and not enable it by default.

so.... 🤷🏻

@emmettbutler
Copy link
Collaborator

@brettlangdon yup, I follow your thinking. I'm inclined to wait on performance tests until we're considering enabling module cloning by default.

@emmettbutler emmettbutler self-requested a review February 13, 2023 22:45
@emmettbutler emmettbutler merged commit 1aa31d1 into DataDog:1.x Feb 13, 2023
# than the first line of the program. since that's what we're doing here,
# we cleanup aggressively beforehand to replicate the conditions at program start
# as closely as possible.
cleanup_loaded_modules(aggressive=True)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we can be aggressive here because we still haven't loaded much?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's probably true.

P403n1x87 pushed a commit that referenced this pull request Feb 16, 2023
…5135)

This PR makes a fix to #4863 which inadvertently removed unloading the
`time` module as well as removing the `typing` module from the list of
standard library modules to not unload for module cloning. We found that
the `time` module is a special case (which is important to unload due to
interactions with gevent patching) as it is implicitly imported by
recent versions of CPython on interpreter startup, so it may be present
in `sys.modules` already.
The `typing` module is also problematic on being reloaded for older
versions of Python < 3.7, so we should not unload/reload this module as
well.
P403n1x87 added a commit that referenced this pull request Mar 28, 2023
This change ensures that the concurrent.futures submodule is unloaded
when doing module unloading (e.g. when gevent is detected). This ensures
that the module is reloaded when needed and uses the potential new state
of the cloned threading module.

## Issue insight

The captured stacks for all the processes during the thread pool
deadlock look as follows

~~~
💤🧒 Process 82315 🧵 Thread 4312286592

Thread._bootstrap
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:995)
Thread._bootstrap_inner
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:1038)
Thread.run
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:975)
_worker
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:83)
_WorkItem.run
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:58)
_wrap_execution
(/Users/gabriele.tornetta/p403n1x87/dd-trace-py/ddtrace/contrib/futures/threading.py:43)
Task.__call__
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:139)
Task._execute_main
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:162)
SubmissionTask._main
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:269)
UploadSubmissionTask._submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/upload.py:591)
UploadSubmissionTask._submit_upload_request
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/upload.py:626)
TransferCoordinator.submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/futures.py:323)
BoundedExecutor.submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/futures.py:474)
_wrap_submit
(/Users/gabriele.tornetta/p403n1x87/dd-trace-py/ddtrace/contrib/futures/threading.py:30)
ThreadPoolExecutor.submit
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:162)


💤 Process 81968 🧵 Thread 4312286592

<module>
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/bin/gunicorn:8)
run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/wsgiapp.py:67)
Application.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/base.py:231)
BaseApplication.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/base.py:72)
Arbiter.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/arbiter.py:209)
Arbiter.sleep
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/arbiter.py:357)
~~~

## Implementation details

The patching of `ThreadPoolExecutor.submit` has been refactored to be
performed with the internal wrapping utilities instead of `wrapt`. One
issue caused by the previous implementation was in the way `wrapt` used
module paths to locate and patch the designated callable. Since the
`concurrent` module is not being unloaded, any attempts to look up
`futures` from `concurrent` would lead to the original reference being
returned, even in cases where the reference to the cloned
`concurrent.futures` is desired. _En passant_, the refactor has also
brought to light a tolerance to passing a mandatory positional argument
to `ThreadPoolExecutor.submit` via keyword arguments.

## Testing strategy

The change includes a simple reproducing case that works with the fix in
place.

## Risks

The `concurrent` module has been added to the exclude list in the
initial implementation of #4863 because of issues observed during test
runs (in particular with framework tests). It is unclear whether the
same issues would actually arise outside the CI context and in actual
applications. However, because the `concurrent.futures` modules creates
lock objects on initialisation, and since it is imported indirectly by
`ddtrace` via its dependencies, the issue caused by the `futures`
integration requires the module to be reloaded after the gevent
monkey-patching. The compromise of this PR is to unload only the
`concurrent.futures` sub-module (and its sub-modules), which seems
enough to solve the original issue.

Fixes #5398
IlyaMichlin pushed a commit to IlyaMichlin/dd-trace-py that referenced this pull request Mar 30, 2023
…#5402)

This change ensures that the concurrent.futures submodule is unloaded
when doing module unloading (e.g. when gevent is detected). This ensures
that the module is reloaded when needed and uses the potential new state
of the cloned threading module.

## Issue insight

The captured stacks for all the processes during the thread pool
deadlock look as follows

~~~
💤🧒 Process 82315 🧵 Thread 4312286592

Thread._bootstrap
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:995)
Thread._bootstrap_inner
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:1038)
Thread.run
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:975)
_worker
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:83)
_WorkItem.run
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:58)
_wrap_execution
(/Users/gabriele.tornetta/p403n1x87/dd-trace-py/ddtrace/contrib/futures/threading.py:43)
Task.__call__
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:139)
Task._execute_main
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:162)
SubmissionTask._main
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:269)
UploadSubmissionTask._submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/upload.py:591)
UploadSubmissionTask._submit_upload_request
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/upload.py:626)
TransferCoordinator.submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/futures.py:323)
BoundedExecutor.submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/futures.py:474)
_wrap_submit
(/Users/gabriele.tornetta/p403n1x87/dd-trace-py/ddtrace/contrib/futures/threading.py:30)
ThreadPoolExecutor.submit
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:162)


💤 Process 81968 🧵 Thread 4312286592

<module>
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/bin/gunicorn:8)
run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/wsgiapp.py:67)
Application.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/base.py:231)
BaseApplication.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/base.py:72)
Arbiter.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/arbiter.py:209)
Arbiter.sleep
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/arbiter.py:357)
~~~

## Implementation details

The patching of `ThreadPoolExecutor.submit` has been refactored to be
performed with the internal wrapping utilities instead of `wrapt`. One
issue caused by the previous implementation was in the way `wrapt` used
module paths to locate and patch the designated callable. Since the
`concurrent` module is not being unloaded, any attempts to look up
`futures` from `concurrent` would lead to the original reference being
returned, even in cases where the reference to the cloned
`concurrent.futures` is desired. _En passant_, the refactor has also
brought to light a tolerance to passing a mandatory positional argument
to `ThreadPoolExecutor.submit` via keyword arguments.

## Testing strategy

The change includes a simple reproducing case that works with the fix in
place.

## Risks

The `concurrent` module has been added to the exclude list in the
initial implementation of DataDog#4863 because of issues observed during test
runs (in particular with framework tests). It is unclear whether the
same issues would actually arise outside the CI context and in actual
applications. However, because the `concurrent.futures` modules creates
lock objects on initialisation, and since it is imported indirectly by
`ddtrace` via its dependencies, the issue caused by the `futures`
integration requires the module to be reloaded after the gevent
monkey-patching. The compromise of this PR is to unload only the
`concurrent.futures` sub-module (and its sub-modules), which seems
enough to solve the original issue.

Fixes DataDog#5398
mabdinur pushed a commit that referenced this pull request Apr 4, 2023
This change ensures that the concurrent.futures submodule is unloaded
when doing module unloading (e.g. when gevent is detected). This ensures
that the module is reloaded when needed and uses the potential new state
of the cloned threading module.

## Issue insight

The captured stacks for all the processes during the thread pool
deadlock look as follows

~~~
💤🧒 Process 82315 🧵 Thread 4312286592

Thread._bootstrap
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:995)
Thread._bootstrap_inner
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:1038)
Thread.run
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:975)
_worker
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:83)
_WorkItem.run
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:58)
_wrap_execution
(/Users/gabriele.tornetta/p403n1x87/dd-trace-py/ddtrace/contrib/futures/threading.py:43)
Task.__call__
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:139)
Task._execute_main
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:162)
SubmissionTask._main
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:269)
UploadSubmissionTask._submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/upload.py:591)
UploadSubmissionTask._submit_upload_request
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/upload.py:626)
TransferCoordinator.submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/futures.py:323)
BoundedExecutor.submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/futures.py:474)
_wrap_submit
(/Users/gabriele.tornetta/p403n1x87/dd-trace-py/ddtrace/contrib/futures/threading.py:30)
ThreadPoolExecutor.submit
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:162)


💤 Process 81968 🧵 Thread 4312286592

<module>
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/bin/gunicorn:8)
run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/wsgiapp.py:67)
Application.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/base.py:231)
BaseApplication.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/base.py:72)
Arbiter.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/arbiter.py:209)
Arbiter.sleep
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/arbiter.py:357)
~~~

## Implementation details

The patching of `ThreadPoolExecutor.submit` has been refactored to be
performed with the internal wrapping utilities instead of `wrapt`. One
issue caused by the previous implementation was in the way `wrapt` used
module paths to locate and patch the designated callable. Since the
`concurrent` module is not being unloaded, any attempts to look up
`futures` from `concurrent` would lead to the original reference being
returned, even in cases where the reference to the cloned
`concurrent.futures` is desired. _En passant_, the refactor has also
brought to light a tolerance to passing a mandatory positional argument
to `ThreadPoolExecutor.submit` via keyword arguments.

## Testing strategy

The change includes a simple reproducing case that works with the fix in
place.

## Risks

The `concurrent` module has been added to the exclude list in the
initial implementation of #4863 because of issues observed during test
runs (in particular with framework tests). It is unclear whether the
same issues would actually arise outside the CI context and in actual
applications. However, because the `concurrent.futures` modules creates
lock objects on initialisation, and since it is imported indirectly by
`ddtrace` via its dependencies, the issue caused by the `futures`
integration requires the module to be reloaded after the gevent
monkey-patching. The compromise of this PR is to unload only the
`concurrent.futures` sub-module (and its sub-modules), which seems
enough to solve the original issue.

Fixes #5398
mabdinur pushed a commit that referenced this pull request Apr 4, 2023
This change ensures that the concurrent.futures submodule is unloaded
when doing module unloading (e.g. when gevent is detected). This ensures
that the module is reloaded when needed and uses the potential new state
of the cloned threading module.

## Issue insight

The captured stacks for all the processes during the thread pool
deadlock look as follows

~~~
💤🧒 Process 82315 🧵 Thread 4312286592

Thread._bootstrap
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:995)
Thread._bootstrap_inner
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:1038)
Thread.run
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/threading.py:975)
_worker
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:83)
_WorkItem.run
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:58)
_wrap_execution
(/Users/gabriele.tornetta/p403n1x87/dd-trace-py/ddtrace/contrib/futures/threading.py:43)
Task.__call__
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:139)
Task._execute_main
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:162)
SubmissionTask._main
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/tasks.py:269)
UploadSubmissionTask._submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/upload.py:591)
UploadSubmissionTask._submit_upload_request
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/upload.py:626)
TransferCoordinator.submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/futures.py:323)
BoundedExecutor.submit
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/s3transfer/futures.py:474)
_wrap_submit
(/Users/gabriele.tornetta/p403n1x87/dd-trace-py/ddtrace/contrib/futures/threading.py:30)
ThreadPoolExecutor.submit
(/Users/gabriele.tornetta/.pyenv/versions/3.11.2/lib/python3.11/concurrent/futures/thread.py:162)


💤 Process 81968 🧵 Thread 4312286592

<module>
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/bin/gunicorn:8)
run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/wsgiapp.py:67)
Application.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/base.py:231)
BaseApplication.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/app/base.py:72)
Arbiter.run
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/arbiter.py:209)
Arbiter.sleep
(/Users/gabriele.tornetta/p403n1x87/ddtrace-gevent-bug/.venv/lib/python3.11/site-packages/gunicorn/arbiter.py:357)
~~~

## Implementation details

The patching of `ThreadPoolExecutor.submit` has been refactored to be
performed with the internal wrapping utilities instead of `wrapt`. One
issue caused by the previous implementation was in the way `wrapt` used
module paths to locate and patch the designated callable. Since the
`concurrent` module is not being unloaded, any attempts to look up
`futures` from `concurrent` would lead to the original reference being
returned, even in cases where the reference to the cloned
`concurrent.futures` is desired. _En passant_, the refactor has also
brought to light a tolerance to passing a mandatory positional argument
to `ThreadPoolExecutor.submit` via keyword arguments.

## Testing strategy

The change includes a simple reproducing case that works with the fix in
place.

## Risks

The `concurrent` module has been added to the exclude list in the
initial implementation of #4863 because of issues observed during test
runs (in particular with framework tests). It is unclear whether the
same issues would actually arise outside the CI context and in actual
applications. However, because the `concurrent.futures` modules creates
lock objects on initialisation, and since it is imported indirectly by
`ddtrace` via its dependencies, the issue caused by the `futures`
integration requires the module to be reloaded after the gevent
monkey-patching. The compromise of this PR is to unload only the
`concurrent.futures` sub-module (and its sub-modules), which seems
enough to solve the original issue.

Fixes #5398
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants