Unable to start application with Python 3.11.9 + gevent + ddtrace #8903

Closed
fbexiga opened this issue Apr 8, 2024 · 17 comments

@fbexiga

fbexiga commented Apr 8, 2024

Summary of problem

When trying to start a Flask API using gunicorn + gevent + ddtrace + Python 3.11.9, the application crashes.
However, if I use Python 3.11.8 instead, or remove either gevent or ddtrace, it works.
Also, I can only reproduce this issue on a Linux system (such as Debian Bookworm), not on macOS.

Edit: even after downgrading to Python 3.11.8, the application doesn't start properly with ddtrace 2.7.x, although the error is different. With 2.6.x it works as expected.

Which version of dd-trace-py are you using?

Tested 2.8.0, 2.7.7 and a few more down to 2.6.3

Which version of pip are you using?

Python 3.11.9, pip 24.0

Which libraries and their versions are you using?

ddtrace==2.7.7
flask==3.0.2
gevent==24.2.1
greenlet==3.0.3
gunicorn==21.2.0
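
For anyone adding their own data point to this thread, the version list above can be collected programmatically. The following is a minimal sketch using only the standard library; the helper names `collect_versions` and `format_report` are hypothetical and not part of ddtrace or gevent.

```python
import platform
from importlib import metadata

# Distribution names taken from the list above.
PACKAGES = ("ddtrace", "flask", "gevent", "greenlet", "gunicorn")


def collect_versions(packages=PACKAGES):
    """Return {name: version or None} for each distribution name."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


def format_report(versions):
    """Render the versions in the same name==version style used in this issue."""
    lines = [f"python=={platform.python_version()}"]
    for name, version in versions.items():
        if version:
            lines.append(f"{name}=={version}")
        else:
            lines.append(f"{name} (not installed)")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(collect_versions()))
```

Running it inside the same virtualenv or container as the failing worker should give a list that can be pasted into a comment as-is.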

How can we reproduce your problem?

If I try to start a Flask API using gunicorn with gevent workers + ddtrace + Python 3.11.9, I get the following error as soon as the worker boots:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 989, in _bootstrap
    # Wrapper around the real bootstrap code that ignores
  File "ddtrace/profiling/_threading.pyx", line 38, in ddtrace.profiling._threading.native_id_hook.bootstrap_wrapper
  File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
    self._bootstrap_inner()
  File "/usr/local/lib/python3.11/threading.py", line 1049, in _bootstrap_inner
    self._delete()
  File "/usr/local/lib/python3.11/threading.py", line 1081, in _delete
    del _active[get_ident()]
        ~~~~~~~^^^^^^^^^^^^^
KeyError: 139743514440832

What is the result that you get?

I am unable to start the application, getting the error mentioned above.

What is the result that you expected?

I expected the application to start and work just like it does with an older version of Python.

@emmettbutler
Collaborator

Thanks for reporting this, @fbexiga. If turning off the Profiling functionality is an option for your use case, it's the first thing I'd recommend. Does the error still occur when you set DD_PROFILING_ENABLED=0?
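
One way to try that suggestion from a launcher script is to copy the environment and set the variable before starting the worker. This is only a sketch: the helper name `profiling_disabled_env` is hypothetical, and the command line (including the `app:app` placeholder and the use of `ddtrace-run`) is an assumption about how the application is started, not something stated in the report.

```python
import os


def profiling_disabled_env(base_env=None):
    """Copy an environment mapping and set DD_PROFILING_ENABLED=0 in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["DD_PROFILING_ENABLED"] = "0"
    return env


# Hypothetical launch command; "app:app" is a placeholder module:callable.
COMMAND = ["ddtrace-run", "gunicorn", "-k", "gevent", "app:app"]

if __name__ == "__main__":
    env = profiling_disabled_env()
    print("DD_PROFILING_ENABLED =", env["DD_PROFILING_ENABLED"])
    print("would run:", " ".join(COMMAND))
```

The returned mapping can be passed as `env=` to `subprocess.run(COMMAND, env=env)`; the sketch only prints the command so that it stays safe to run anywhere.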

@emmettbutler
Collaborator

cc @sanchda

@sanchda sanchda self-assigned this Apr 9, 2024
@sanchda
Contributor

sanchda commented Apr 9, 2024

@fbexiga, thank you so much for the thorough and insightful report. Unfortunately, I don't think we have a short-term workaround, but we'll try to get this resolved promptly.

@sanchda sanchda added the Profiling Continous Profling label Apr 9, 2024
@fbexiga
Author

fbexiga commented Apr 9, 2024

That's ok, for now we just downgraded back to 3.11.8. No rush or anything, but I thought it was worth reporting.

I tried disabling profiling, but I still get the same result.

@kc-experian

kc-experian commented Apr 9, 2024

I have the same error in a Celery application using Python 3.11.9 + gevent + ddtrace

Traceback (most recent call last):
  File "src/gevent/_abstract_linkable.py", line 287, in gevent._gevent_c_abstract_linkable.AbstractLinkable._notify_links
  File "src/gevent/_abstract_linkable.py", line 333, in gevent._gevent_c_abstract_linkable.AbstractLinkable._notify_links
AssertionError: (None, <callback at 0x7fe8acaaa4c0 args=([],)>)
2024-04-09T21:13:55Z <callback at 0x7fe8acaaa4c0 args=([],)> failed with AssertionError

@iherasymenko

iherasymenko commented Apr 10, 2024

Also affects Python 3.12.3; used to work just fine with 3.12.2.

@askidelskiy

Encountered a similar exception, but we don't have profiling enabled. It also appeared when moving from 3.11.8 to 3.11.9. Rolling back the Python version resolved the error.

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap
    self._bootstrap_inner()
  File "/usr/local/lib/python3.11/threading.py", line 1049, in _bootstrap_inner
    self._delete()
  File "/usr/local/lib/python3.11/threading.py", line 1081, in _delete
    del _active[get_ident()]
        ~~~~~~~^^^^^^^^^^^^^
KeyError: 139737853141056

ddtrace==2.7.6
django==4.2.11
gevent==23.9.1
greenlet==3.0.3
gunicorn==21.2.0

@P403n1x87
Contributor

There is no clear link between this issue and #8870, but it might be worth testing it once it's released 🤞. Meanwhile, we'll see if we can reproduce this issue.

@lawrenceong

While testing this, we found that the crash does not happen on an Intel processor but does happen on AMD EPYC. Disabling ddtrace prevents the crash on AMD EPYC.

Intel processor: Intel(R) Xeon(R) CPU @ 2.20GHz
AMD EPYC processor: AMD EPYC 7B12

Docker image = python:3.11.9-slim

ddtrace==2.8.2
flask==3.0.3
gevent==24.2.1
greenlet==3.0.3
gunicorn==22.0.0

Downgrading to Python 3.11.8 stops the crash on AMD EPYC.
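
Since the behavior appears to depend on the CPU, it may help to include the processor model in reports. The following is a small sketch that reads it from `/proc/cpuinfo`, assuming a Linux x86 host where that file has a `model name` field; the function names are hypothetical.

```python
def cpu_model_from_cpuinfo(text):
    """Return the first 'model name' value from /proc/cpuinfo-style text, or None."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None


def read_cpu_model(path="/proc/cpuinfo"):
    """Read the CPU model from the given path, or return None if it is unreadable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return cpu_model_from_cpuinfo(handle.read())
    except OSError:
        return None  # e.g. macOS, where /proc/cpuinfo does not exist


if __name__ == "__main__":
    print(read_cpu_model())
```

Inside a container this reports the host CPU, which is the relevant value for comparing Intel and AMD hosts running the same image.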

@fbexiga
Author

fbexiga commented May 30, 2024

Any movement on this?

@iherasymenko

iherasymenko commented Jun 11, 2024

Reproducible with Python 3.12.4 + gevent 24.2.1 + greenlet 3.0.3 + ddtrace 2.9.0.

UPD 1: Only reproducible together with sentry-sdk.

UPD 2: Reproducible without sentry-sdk. It was a red herring.

@JASchilz

> Also affects Python 3.12.3; used to work just fine with 3.12.2.

I likewise encountered a similar issue when using 3.12.3. Downgrading to 3.12.2 fixed the issue.

@iherasymenko

iherasymenko commented Jun 13, 2024

I finally have a working reproducer: https://github.com/iherasymenko/ddtrace-8903-reproducer

Chasing it down required a machine with the AMD EPYC 7R13 processor (an AWS EC2 c6a.8xlarge VM) but it seems like the simplified version works fine both on my M3 MacBook Pro and my Intel Core i7 Linux machine.

ddtrace v2.10.0rc2 is still affected by the issue.

Also, in this particular example, disabling patching of mongoengine via DD_PATCH_MODULES="mongoengine:false" helps, but this is not really an option, as the other enabled integrations will cause a similar effect.
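
To experiment with switching off several integrations at once, the variable's value can be built from a mapping. This is a sketch under the assumption that DD_PATCH_MODULES accepts a comma-separated list of `module:boolean` pairs; the helper name `patch_modules_value` is hypothetical, and only `mongoengine` comes from this thread.

```python
def patch_modules_value(overrides):
    """Build a DD_PATCH_MODULES string such as 'mongoengine:false'.

    `overrides` maps an integration name to True (patch) or False (skip).
    Assumes a comma-separated module:boolean format.
    """
    pairs = []
    for name, enabled in overrides.items():
        pairs.append(f"{name}:{'true' if enabled else 'false'}")
    return ",".join(pairs)


if __name__ == "__main__":
    print(patch_modules_value({"mongoengine": False}))
```

The resulting string would be exported as DD_PATCH_MODULES before the process starts. As noted above, this only narrows down which integration triggers the failure and is not a general workaround.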

@ffernand

ffernand commented Jun 21, 2024

I've also been having these issues and noticed a gevent issue showing that it's not compatible with 3.11.9. It further points to a CPython issue about the import of the threading library that happens before gevent has a chance to patch it.

There's a PR open to address this. I tried the patch locally and was able to get ddtrace-run and gevent to play nicely on 3.11.9:
python/cpython#120233

This looks to be an issue strictly with CPython on the latest patch releases of 3.11 and 3.12.

EDIT: spelling
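
The import-order point above can be checked in a given process before gevent's monkey patching runs. The following is a minimal sketch with a hypothetical helper name, `already_imported`. It only reports whether `threading` is already present in `sys.modules`; on its own that does not prove the bug is being hit.

```python
import sys


def already_imported(names=("threading",), loaded=None):
    """Return the subset of `names` that are already present in `loaded`.

    `loaded` defaults to sys.modules, i.e. the modules imported so far.
    """
    loaded = sys.modules if loaded is None else loaded
    return [name for name in names if name in loaded]


if __name__ == "__main__":
    early = already_imported()
    if early:
        print("imported before patching:", ", ".join(early))
    else:
        print("threading has not been imported yet")
```

Calling it at the very top of the entry point, before any gevent import, shows whether something earlier in startup has already pulled in `threading`.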

@lawrenceong

Even though python/cpython#120233 is already merged, it looks like it will not be backported to 3.11 as it is not considered a security fix (python/cpython#120233 (comment)).

It is, however, backported to 3.12 and 3.13, so it looks like we will need to upgrade unless there is a plan for gevent to update their code.

@iherasymenko

The issue is fixed in 2.10.0 and 2.9.4 🎉
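
The reports in this thread can be summarized as a rough check. This sketch encodes only what commenters stated here: Python 3.11.9, 3.12.3 and 3.12.4 were reported as failing, and ddtrace 2.9.4 and 2.10.0 were reported as fixed. All function names are hypothetical, and treating later 2.9.x and later 2.10+ releases as fixed is an assumption.

```python
import re

# Python releases that commenters in this thread reported as failing.
REPORTED_PYTHON = {(3, 11, 9), (3, 12, 3), (3, 12, 4)}


def parse_version(text):
    """Split 'X.Y.Z[suffix]' into ((X, Y, Z), has_suffix)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(.*)$", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    release = tuple(int(part) for part in match.group(1, 2, 3))
    # Simplification: any suffix (rc2, .dev1, ...) is treated as a pre-release.
    return release, bool(match.group(4))


def at_least(version, floor):
    """True if `version` is the final release `floor` or anything later."""
    release, is_prerelease = parse_version(version)
    if release == floor:
        return not is_prerelease  # e.g. 2.10.0rc2 sorts before 2.10.0
    return release > floor


def ddtrace_has_reported_fix(version):
    """True for 2.9.x from 2.9.4 on, and for 2.10.0 or later (assumption)."""
    release, _ = parse_version(version)
    if release[:2] == (2, 9):
        return at_least(version, (2, 9, 4))
    return at_least(version, (2, 10, 0))


def matches_thread_reports(python_version, ddtrace_version):
    """True if this combination matches a failing report from this thread."""
    py_release, _ = parse_version(python_version)
    return py_release in REPORTED_PYTHON and not ddtrace_has_reported_fix(ddtrace_version)
```

For example, `matches_thread_reports("3.11.9", "2.10.0rc2")` is true, matching the earlier comment that the release candidate was still affected, while `matches_thread_reports("3.11.9", "2.10.0")` is false.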

@github-actions github-actions bot added the stale label Sep 28, 2024
Contributor

This issue has been automatically closed after a period of inactivity. If it's a
feature request, it has been added to the maintainers' internal backlog and will be
included in an upcoming round of feature prioritization. Please comment or reopen
if you think this issue was closed in error.

@github-actions github-actions bot closed this as not planned Dec 27, 2024

10 participants