-
Notifications
You must be signed in to change notification settings - Fork 624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sqlalchemy: make sqlalchemy instrumentation thread safe #315
Conversation
8d3a8fe
to
a71b838
Compare
@codeboten @lzchen Sorry for pushing this once. I think this is a serious issue with the current sqlalchemy instrumentation which occurs on all systems which use sqlalchemy in production and breaks functionality there occuring into traces not being sent to jaeger and dropped due to the UDP buffer being full. So it would be good to get this merged and soon into a new version. We currently also forked the project and host an internal version, but I would really like to see it also being back in the mainline. |
@cbrand not sure if this is something a test could be added to verify or not, any thoughts? |
I couldn't think of a way to write a test which would reliably test this behavior. That's why I didn't add one (See also the comment in the checklist) |
@codeboten |
Can we add integration/docker test? It might be easier to model this use case over there. |
@owais @lzchen @codeboten I don't think that will help either as it requires accidental timing on multple threads which isn't really controllable on an integration test level. I would recommend merging this with the existing unit tests, ensuring that during "normal" flow it runs as expected. |
I'm fine with merging this if this is a critical fix but I feel strongly that it should be followed up with some sort of tests to ensure the issue this PR fixed doesn't happen again. I understand it can be harder to reproduce but if we know the exact scenario it shouldn't be hard to write an example script that deliberately triggers the case and if we have that, we have an integration test. Original description above states:
With some thread coordination and/or time.sleep, I think this should be simple enough to reproduce even in a unit test. |
Sorry for not responding any more, I suffer under some very tight deadlines at the moment. I don't have capacity to do this now or in the near future. My recommendation are thus giving the time constraints I am currently under:
I think keeping the status quo is the worst possible option though, so I would definitely recommend to not do this. |
I say let's merge this and open an issue to add the test. |
@codeboten |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a followup item to add the tests here #383
Description
This change ensures the SQLAlchemy instrumentation to b thread safe.
In tests using Flask with gunicorn and uwsgi we were often getting a "Trying to end span which has already been ended" error. After checking we realized that this has to do with the SQLAlchemy
EngineTracer
using one local variable, namelycurrent_span
to try to track the span of the SQL request. If threading is involved and multiple SQL commands are running at the same time, this breaks ascurrent_span
will be overwritten before it has been ended. The new span will then be tried to be ended twice.To remedy this, the changes introduce a tracking mechanism over SQLAlchemy cursors, tracking via a weakref dictionary all spans which are currently existing on a cursor level.
To still support error handling, we also store the span in a thread local variable to esnure that
_handle_error
can be utilized. This will break if a query is executed in another thread then_before_cur_exec
has been called. I think that this is highly unlikely to actually outside of an artifical test written happen.Type of change
How Has This Been Tested?
ab
executing multiple sql statements in parallel and check jaeger if the instances are recorded as expected.Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.