-
Notifications
You must be signed in to change notification settings - Fork 651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WSGI fixes #148
WSGI fixes #148
Conversation
Oberon00
commented
Sep 18, 2019
•
edited
Loading
edited
- Fix http.url (WSGI integration does nonstandard split between http.host and http.url #143)
- Don't delay calling wrapped app (would be nicer if No way to use_span withouth ending it. #147 was fixed; see Implement WSGI middleware integration. #84 (comment) and following)
88f5e91
to
8aee33c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Oberon00 This looks great! I have some questions / suggestions, thanks for taking the time to improve this!
if environ.get("SERVER_PORT", "443") != "443": | ||
host += ":" + environ["SERVER_PORT"] | ||
elif environ.get("SERVER_PORT", "80") != "80": | ||
host += ":" + environ["SERVER_PORT"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I didn't realize that http.host
was nonstandard! Thanks for catching this!
Given that it's not standard, should we just remove this? It seems like a lot of overhead per request to compute the host string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the calculation for host is particularly expensive and it could probably even be optimized. I'd wait until the data formats are reworked (this is something that will most likely happen).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I shortened & hopefully optimized it.
ext/opentelemetry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py
Outdated
Show resolved
Hide resolved
ext/opentelemetry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py
Outdated
Show resolved
Hide resolved
scheme = environ["wsgi.url_scheme"] + "://" | ||
if not urlparts.netloc: | ||
url = host + url | ||
url = scheme + url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, are these cases that we have to cover?
self.environ["RAW_URI"] = "/?"
self.environ["RAW_URI"] = "http:///?"
self.environ["RAW_URI"] = "http://?"
These cases seem to add a lot of complexity. I'm curious to know which WSGI servers write these RAW_URI
strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking more at this, in order to be completely spec compliant it feels like calling url = wsgiref_util.request_uri(environ)
on every request would probably be easiest rather than playing with RAW_URI
and REQUEST_URI
as an optimization. What are your thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the one hand, I think using RAW_URI et al is valuable because it catches things like http://myhost/?
vs http://myhost/
, a difference that should not matter but could. And since one of the prime use cases of tracing is debugging, I think exactly these edge cases are interesting to handle. But on the other hand, I might have gone overboard with my handling of malformed/very unusual raw URIs. Maybe there is a middle ground to aim for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we can get something like http://foobar.org/abc#CrazyFragment
and how do we plan to handle it.
I swear that I've seen something like this from server side (which means both client and server are super buggy).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we get that all in RAW_URL, we'll set it to the http.url
attribute as-is. If we get invalid data, we are not in the business to normalize it, instead we should make it visible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I simplified it:
- If the raw URL starts with
/
we assume it is relative and prepend scheme://host - Otherwise, if the url starts with "scheme:", we use it as-is
- Otherwise we assume a bogus value and fall back to wsgiref.util.request_uri.
return iter_result() | ||
except: # noqa | ||
span.end() | ||
raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why raise again here instead of making this a finally
block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it was because the assumption that Span.end()
gets called in iter_result
when there is no exception.
This piece of code is a bit risky and hard to maintain IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we decided to go with this approach, I'll be okay if we can put a comment like this :)
# ATTENTION!!! HIGH VOLTAGE!!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can determine if a span has ended, then finally: if not ended: end()
would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we decided to go with this approach
I'm all for a better approach if somebody finds one that still doesn't delay the app invocation.
See also: #148 (comment)
I can't think of a way that doesn't have a slim chance of either ending the span twice or not ending it at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By making
__call__
a generator, we only start the span once the server starts iterating.
That's very subtle, good catch @Oberon00.
This LGTM, but I agree that playing weird URI whack-a-mole may not be worth the effort.
|
||
return iter_result() | ||
except: # noqa | ||
span.end() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be risky.
According to the API, implementation could raise exception if span.end()
is called twice.
If an exception happened right before iter_result
returns (but after span.end()
is called), we end up calling end()
twice and kill the user app due to unwanted exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to change the API so that we don't raise on duplicate calls to end
, which seems like a good idea regardless of this PR.
I've got an open PR in specs about suppressing errors in the API. The goal is to avoid exactly this kind of thing -- crashing the application because we're using the instrumentation layer wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems there is no way to do it right with the current API then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably leave a TODO comment for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, the current SDK only logs a warning, which is fine because having end called twice here should really be an edge case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For end
in particular the spec already says that calling it multiple times is fine, see #154 (comment)
ext/opentelemetry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Allan Feldman <6374032+a-feld@users.noreply.github.com>
@a-feld Is there still something you want to have changed? |
Sorry for the delay @Oberon00 , I'm giving this another once over! 🙏 |
and environ["wsgi.url_scheme"] == "http" | ||
or port != "443" | ||
): | ||
host += ":" + port |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic appears to always attach port 80 in the presence of http
. Is that intentional?
environ = {"SERVER_NAME": "example.com", "SERVER_PORT": "80", "wsgi.url_scheme": "http"}
host = environ.get("HTTP_HOST")
if not host:
host = environ["SERVER_NAME"]
port = environ["SERVER_PORT"]
if port != "80" and environ["wsgi.url_scheme"] == "http" or port != "443":
host += ":" + port
print(host)
example.com:80
The logic is equivalent to:
port != "443" or environ["wsgi.url_scheme"] == "http" and port != "80"
so all http ports will be reported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly not. I thought I had a unit test against exactly that. Will investigate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Oberon00 Thanks for putting this together, from what I can see this looks great! I'll approve this now assuming that the http port investigation will conclude!
@reyang I intented to fix the port issue before merging this, but oh well, I can also file a follow-up PR. |
for start_time and end_time Make lint happy Addressing comments Addressing comments Allowing 0 as start and end time Fix lint issues Metrics API RFC 0003 cont'd (open-telemetry#136) * Create functions Comments for Meter More comments Add more comments Fix typos * fix lint * Fix lint * fix typing * Remove options, constructors, seperate labels * Consistent naming for float and int * Abstract time series * Use ABC * Fix typo * Fix docs * seperate measure classes * Add examples * fix lint * Update to RFC 0003 * Add spancontext, measurebatch * Fix docs * Fix comments * fix lint * fix lint * fix lint * skip examples * white space * fix spacing * fix imports * fix imports * LabelValues to str * Black formatting * fix isort * Remove aggregation * Fix names * Remove aggregation from docs * Fix lint * metric changes * Typing * Fix lint * Fix lint * Add space * Fix lint * fix comments * address comments * fix comments Adding a working propagator, adding to integrations and example (open-telemetry#137) Adding a full, end-to-end example of propagation at work in the example application, including a test. Adding the use of propagators into the integrations. Metrics API RFC 0009 (open-telemetry#140) * Create functions Comments for Meter More comments Add more comments Fix typos * fix lint * Fix lint * fix typing * Remove options, constructors, seperate labels * Consistent naming for float and int * Abstract time series * Use ABC * Fix typo * Fix docs * seperate measure classes * Add examples * fix lint * Update to RFC 0003 * Add spancontext, measurebatch * Fix docs * Fix comments * fix lint * fix lint * fix lint * skip examples * white space * fix spacing * fix imports * fix imports * LabelValues to str * Black formatting * fix isort * Remove aggregation * Fix names * Remove aggregation from docs * Fix lint * metric changes * Typing * Fix lint * Fix lint * Add space * Fix lint * fix comments * handle, recordbatch * docs * Update recordbatch * black * Fix typo * remove ValueType * fix lint Console exporter (open-telemetry#156) Make use_span more flexible (closes open-telemetry#147). (open-telemetry#154) Co-Authored-By: Reiley Yang <reyang@microsoft.com> Co-Authored-By: Chris Kleinknecht <libc@google.com> WSGI fixes (open-telemetry#148) Fix http.url. Don't delay calling wrapped app. Skeleton for azure monitor exporters (open-telemetry#151) Add link to docs to README (open-telemetry#170) Move example app to the examples folder (open-telemetry#172) WSGI: Fix port 80 always appended in http.host (open-telemetry#173) Build and host docs via github action (open-telemetry#167) Add missing license boilerplate to a few files (open-telemetry#176) sdk/trace/exporters: add batch span processor exporter (open-telemetry#153) The exporters specification states that two built-in span processors should be implemented, the simple processor span and the batch processor span. This commit implements the latter, it is mainly based on the opentelemetry/java one. The algorithm implements the following logic: - a condition variable is used to notify the worker thread in case the queue is half full, so that exporting can start before the queue gets full and spans are dropped. - export is called each schedule_delay_millis if there is a least one new span to export. - when the processor is shutdown all remaining spans are exported. Implementing W3C TraceContext (fixes open-telemetry#116) (open-telemetry#180) * Implementing TraceContext (fixes open-telemetry#116) This introduces a w3c TraceContext propagator, primarily inspired by opencensus. fix time conversion bug (open-telemetry#182) Introduce Context.suppress_instrumentation (open-telemetry#181) Metrics Implementation (open-telemetry#160) * Create functions Comments for Meter More comments Add more comments Fix typos * fix lint * Fix lint * fix typing * Remove options, constructors, seperate labels * Consistent naming for float and int * Abstract time series * Use ABC * Fix typo * Fix docs * seperate measure classes * Add examples * fix lint * Update to RFC 0003 * Add spancontext, measurebatch * Fix docs * Fix comments * fix lint * fix lint * fix lint * skip examples * white space * fix spacing * fix imports * fix imports * LabelValues to str * Black formatting * fix isort * Remove aggregation * Fix names * Remove aggregation from docs * Fix lint * metric changes * Typing * Fix lint * Fix lint * Add space * Fix lint * fix comments * handle, recordbatch * docs * Update recordbatch * black * Fix typo * remove ValueType * fix lint * sdk * metrics * example * counter * Tests * Address comments * ADd tests * Fix typing and examples * black * fix lint * remove override * Fix tests * mypy * fix lint * fix type * fix typing * fix tests * isort * isort * isort * isort * noop * lint * lint * fix tuple typing * fix type * black * address comments * fix type * fix lint * remove imports * default tests * fix lint * usse sequence * remove ellipses * remove ellipses * black * Fix typo * fix example * fix type * fix type * address comments Implement Azure Monitor Exporter (open-telemetry#175) Span add override parameters for start_time and end_time (open-telemetry#179) CONTRIBUTING.md: Fix clone URL (open-telemetry#177) Add B3 exporter to alpha release table (open-telemetry#164) Update README for alpha release (open-telemetry#189) Update Contributing.md doc (open-telemetry#194) Add **simple** client/server examples (open-telemetry#191) Remove unused dev-requirements.txt (open-telemetry#200) The requirements are contained in tox.ini now. Fx bug in BoundedList for Python 3.4 and add tests (open-telemetry#199) * fix bug in BoundedList for python 3.4 and add tests collections.deque.copy() was introduced in python 3.5, this commit changes that by the deque constructor and adds some tests to BoundedList and BoundedDict to avoid similar problems in the future. Also, improve docstrings of BoundedList and BoundedDict classes Move util.time_ns to API. (open-telemetry#205) Add Jaeger exporter (open-telemetry#174) This adds a Jeager exporter for OpenTelemetry. This exporter is based on https://github.com/census-instrumentation/opencensus-python/tree/master/contrib/opencensus-ext-jaeger. The exporter uses thrift and can be configured to send data to the agent and also to a remote collector. There is a long discussion going on about how to include generated files in the repo, so for now just put them here. Add code coverage Revert latest commit Fix some "errors" found by mypy. (open-telemetry#204) Fix some errors found by mypy (split from open-telemetry#201). Update README for new milestones (open-telemetry#218) Refactor current span handling for newly created spans. (open-telemetry#198) 1. Make Tracer.start_span() simply create and start the Span, without setting it as the current instance. 2. Add an extra Tracer.start_as_current_span() to create the Span and set it as the current instance automatically. Co-Authored-By: Chris Kleinknecht <libc@google.com> Add set_status to Span (open-telemetry#213) Initial commit Initial version