Client signals #2313 #2429
Implementation of the client signals exposed by the `ClientSession` class. For the full list of implemented signals, please see the documentation.
List of signals implemented here: https://github.com/aio-libs/aiohttp/pull/2429/files#diff-7dd84b5ef8d5eea2de1dfc5329411dfcR695
Codecov Report
@@ Coverage Diff @@
## master #2429 +/- ##
==========================================
+ Coverage 97.13% 97.18% +0.05%
==========================================
Files 39 40 +1
Lines 8059 8203 +144
Branches 1411 1441 +30
==========================================
+ Hits 7828 7972 +144
Misses 99 99
Partials 132 132
I doubt the parser code needs to be changed.
I thought these signals should be sent by higher-level code.
`await client.request()` returns as soon as all response headers have been received and parsed, for example.
Working with the body is more complicated: the proper API is `resp.content.read()` and family.
The problem is that `resp.content` is a stream. The stream class is shared between client and server (BTW, the same is true for the parsers).
We could either add `on_content_received` to the stream or make a new stream just for the client (derived from the base stream, sure).
Both approaches have their own drawbacks: implementing a new class, or paying on the server for a call over an empty subscriber list.
I don't know which is better. On the server every Python function call matters; for the client we can relax our striving for performance.
On the other hand, the server code already has too many levels of indirection, so maybe the degradation is negligible.
Anyway, I'm pretty sure that the parsers should not know about signals.
I hope we'll add HTTP/2 eventually; that protocol has its own HTTP headers parser, and I don't like adding signals to it either.
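The second option mentioned in this comment, a client-only stream derived from the shared base class, might look roughly like the sketch below. `BaseStream`, `ClientStream` and the `on_content_received` callback list are hypothetical stand-ins written for illustration, not aiohttp classes; the point is only that the server keeps using the base class and never iterates a subscriber list.

```python
import asyncio


class BaseStream:
    """Hypothetical stand-in for the stream class shared by client and server."""

    def __init__(self):
        self._buffer = bytearray()

    def feed_data(self, data):
        self._buffer.extend(data)

    async def read(self):
        data = bytes(self._buffer)
        self._buffer.clear()
        return data


class ClientStream(BaseStream):
    """Client-only subclass: only this class knows about subscribers."""

    def __init__(self, on_content_received=()):
        super().__init__()
        self._on_content_received = list(on_content_received)

    async def read(self):
        data = await super().read()
        # Notify every subscriber about the chunk that was just read.
        for callback in self._on_content_received:
            await callback(data)
        return data


async def demo():
    seen = []

    async def record(chunk):
        seen.append(chunk)

    stream = ClientStream(on_content_received=[record])
    stream.feed_data(b"hello")
    body = await stream.read()
    return body, seen


body, seen = asyncio.run(demo())
```

The cost of this variant is the extra class to maintain; the alternative of adding the hook to the shared stream avoids the class but adds a loop over an (usually empty) list on the server path.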
@asvetlov let's summarize what should be done to move this PR forward
anything else?
I had my doubts about the parsers and about how to implement the signals triggered by sending or receiving pieces of the HTTP protocol. The current implementation is just a statement of intent, a POC of how other, more granular signals such as [...] But I would prefer to get rid of this variable and try to put all of us on the same page. My proposal here is to get rid of the [...]
Sounds good
@asvetlov I think that your requested changes are no longer valid and the branch is ready either to be merged or to move on to further review.
aiohttp/client.py
Outdated

    yield from self.on_request_start.send(
        trace_context,
        method,
        url.host,

Maybe the whole URL object? The query part might be interesting for a tracer.

Yes, I will do that.
aiohttp/client.py
Outdated

    @@ -291,6 +315,9 @@ def _request(self, method, url, *,
        # redirects
        if resp.status in (
                301, 302, 303, 307, 308) and allow_redirects:

            self.on_request_redirect.send(trace_context, resp)

What is the intended usage of the signal?
How can one figure out which initial request was redirected?
The same is true for all other signals.
Maybe we should always pass the URL, method and headers?

The idea was to give a way to trace when a redirect happens. I completely agree that the signal parameters are not consistent enough to give the user sufficient information; let's add the `method`, `URL` and `headers`.
What do you mean by "The same is true for all other signals"?

I will do the same with `on_request_end` and `on_request_exception`, allowing the user to know the source URL in case the request came from a redirect.
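To make the suggestion concrete, here is a minimal sketch of a redirect signal that always carries the method, URL and headers of the request being redirected. It uses only the standard library; `Signal`, `follow_redirects` and the canned response list are assumptions made for illustration and do not reproduce the code in this PR.

```python
import asyncio

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


class Signal(list):
    """Minimal stand-in for a signal: a list of coroutine functions."""

    async def send(self, *args, **kwargs):
        for receiver in self:
            await receiver(*args, **kwargs)


async def follow_redirects(on_request_redirect, trace_context,
                           method, url, headers, canned_responses):
    """Walk canned (status, location) pairs instead of doing real I/O."""
    for status, location in canned_responses:
        if status not in REDIRECT_STATUSES:
            return url
        # Describe the request being redirected, not only the response.
        await on_request_redirect.send(
            trace_context, method, url, headers, status)
        url = location
    return url


async def demo():
    events = []

    async def on_redirect(trace_context, method, url, headers, status):
        events.append((method, url, status))

    signal = Signal([on_redirect])
    final_url = await follow_redirects(
        signal, {}, "GET", "http://a.example/", {},
        [(302, "http://b.example/"), (200, None)])
    return final_url, events


final_url, events = asyncio.run(demo())
```

With these parameters a subscriber can tell which original request was redirected without keeping extra state of its own.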
aiohttp/client.py
Outdated

    @property
    def on_request_queued_start(self):
        return self._connector.on_queued_start

I think every session should have its own signals.
A connector might be shared between sessions via `connector_owner=False` or `session.detach()`.
Shared subscriptions make a mess: nobody knows when a recipient is dead.

OK, now I can see the idea behind the `connector_owner` parameter. My fault: I hadn't checked it and that led me to a wrong implementation. If the connector can be shared between sessions, it is absolutely necessary not to mix up the signals.
On the other hand, the `connector_owner` implementation worries me a bit. It leaves the choice of `True` or `False` in the user's hands, so if the user passes an alternative connector but forgets to set `connector_owner` to `False`, closing the session will automatically close the connections of a connector that might be shared with another `Session`.
Should the `connector_owner` param be handled internally by the `Session` object?
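A small sketch of the "every session owns its signals" idea, assuming a connector that stores no subscriptions and instead receives the session's signal on each call. `Signal`, `Connector` and `Session` here are hypothetical stand-ins, not the aiohttp classes.

```python
import asyncio


class Signal(list):
    """Minimal stand-in for a signal: a list of coroutine functions."""

    async def send(self, *args, **kwargs):
        for receiver in self:
            await receiver(*args, **kwargs)


class Connector:
    """Hypothetical connector that may be shared; it holds no subscriptions."""

    async def connect(self, host, on_connect_start):
        await on_connect_start.send(host)
        return "connection to " + host


class Session:
    """Hypothetical session: each instance owns its own signal."""

    def __init__(self, connector):
        self._connector = connector
        self.on_connect_start = Signal()

    async def get(self, host):
        return await self._connector.connect(host, self.on_connect_start)


async def demo():
    shared = Connector()
    first, second = Session(shared), Session(shared)
    seen_by_first, seen_by_second = [], []

    async def record_first(host):
        seen_by_first.append(host)

    async def record_second(host):
        seen_by_second.append(host)

    first.on_connect_start.append(record_first)
    second.on_connect_start.append(record_second)
    await first.get("a.example")
    await second.get("b.example")
    return seen_by_first, seen_by_second


seen_by_first, seen_by_second = asyncio.run(demo())
```

Because the subscriptions live on the session, dropping a session drops its subscribers too, even though the connector lives on.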
aiohttp/connector.py
Outdated

    @@ -347,8 +354,12 @@ def closed(self):
            """
            return self._closed

    -    async def connect(self, req):
    +    async def connect(self, req, trace_context=None):

Let's make `trace_context` a mandatory parameter.
I'm OK with breaking backward compatibility in the connectors API.

The same goes for the other places with `trace_context=None` in signatures.

OK.
aiohttp/connector.py
Outdated

    """Get from pool or create new connection."""

    if trace_context is None:
        trace_context = SimpleNamespace()

We should not create an empty context here; just pass the parameter as is.
aiohttp/connector.py
Outdated

    -_, proto = await self._create_proxy_connection(req)
    +_, proto = await self._create_proxy_connection(
    +    req,
    +    trace_context=None

`trace_context=trace_context`, I'm pretty sure.

OK.
aiohttp/connector.py
Outdated

    -_, proto = await self._create_direct_connection(req)
    +_, proto = await self._create_direct_connection(
    +    req,
    +    trace_context=None

`trace_context=trace_context`
aiohttp/client.py
Outdated

    @@ -218,6 +227,18 @@ def _request(self, method, url, *,
        handle = tm.start()

        url = URL(url)

        if trace_context is None:
            trace_context = SimpleNamespace()

I think the API is wrong here.
The user will never pass a `trace_context` into `async with session.get()` explicitly.
It's another level of abstraction.
What the user will do is set up the session properly at initialization, by subscribing to signals and (optionally) providing a trace context factory for creating a new container for user data.
I even doubt we need a factory parameter, at least at the current stage.

The rationale behind this implementation is the following:
Give the developer the freedom to have fine-grained control over their request calls by building a trace context for each request.
For example:

    async def on_request_start(trace_context, host, port, headers):
        trace_context['start'] = loop.time()

    async def on_request_end(trace_context, resp):
        await send_metrics(
            time=loop.time() - trace_context['start'],
            query=trace_context['query'],
        )

    session = ClientSession()
    session.add_on_requests_start(on_request_start)
    session.add_on_requests_end(on_request_end)

    resp = session.get("http://localhost?query=foo", trace={'query': 'foo'})
    resp = session.get("http://localhost?query=bar", trace={'query': 'bar'})

This example shows how the same ClientSession is used to make different queries that might have divergent traces.
If the user wants to share information between all requests that belong to the same ClientSession, they might use a closure pattern, for example:

    def make_on_request_end(query):
        # The outer function captures `query` once for the whole session.
        async def on_request_end(trace_context, resp):
            await send_metrics(
                time=loop.time() - trace_context['start'],
                query=query,
            )
        return on_request_end

    session = ClientSession()
    session.add_on_requests_start(on_request_start)
    session.add_on_requests_end(make_on_request_end(query='foo'))

I can see your point that forcing the user to populate each request call can be less friendly, but in my experience having a way to pass a context that carries information about the current execution is a must. Also, take into account that this granularity allows the user to implement either a session context or a request context.
Another solution would be to implement two different contexts, one for the session and another for the request. But IMHO that overcomplicates the implementation right now.
Thoughts?
New comments.

Something that I don't like about the current proposal after applying two comments regarding your concerns:

To fix the second issue, and to minimize the number of parameters needed, I would be keen on creating an object that stores all of the dependencies needed to send a signal, for example:

    class Trace:

        def __init__(self, session, trace_context):
            self._session = session
            self._trace_context = trace_context

        def on_connect_start(self, host):
            self._session.on_connect_start.send(self._trace_context, host)

    def connect(self, host, trace):
        trace.on_connect_start(host)

The first issue, which was another of the concerns that you expressed @asvetlov, related to make the [...]

    def connect(self, host, trace=None):
        if trace:
            trace.on_connect_start(host)

The [...]

I'm still wondering whether it's the best implementation to have the signals attached to [...]

    trace_request = TraceRequest()
    trace_request.on_request_start.append(my_function)
    session = ClientSession(..., trace_request=trace_request)

When the developer wants to trace a specific client session, they pass it as a parameter; the default behavior does not trace the requests. It will allow us to use the [...]

Worth mentioning: decoupling the request-tracing signals from the session forces us to pass the session instance as the first parameter to all of the signals. Thoughts?

I couldn't help it and wrote a POC of the last proposal's changes, without testing, here [1]. Two things that I don't like about it:
[1] pfreixes@4caddd9
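The `Trace` idea discussed in this comment can be exercised with a self-contained sketch. Everything here is an assumption made for illustration: `Signal` and `Session` are minimal stand-ins, and the method is named `send_connect_start` (a hypothetical name) so that it is clearly a coroutine that forwards to the session's signal.

```python
import asyncio
from types import SimpleNamespace


class Signal(list):
    """Minimal stand-in for a signal: a list of coroutine functions."""

    async def send(self, *args, **kwargs):
        for receiver in self:
            await receiver(*args, **kwargs)


class Session:
    """Hypothetical session exposing a single signal."""

    def __init__(self):
        self.on_connect_start = Signal()


class Trace:
    """Bundles the session and the context needed to send a signal."""

    def __init__(self, session, trace_context):
        self._session = session
        self._trace_context = trace_context

    async def send_connect_start(self, host):
        await self._session.on_connect_start.send(self._trace_context, host)


async def connect(host, trace=None):
    # Tracing is optional: without a Trace object nothing is sent.
    if trace is not None:
        await trace.send_connect_start(host)
    return host


async def demo():
    session = Session()
    calls = []

    async def on_connect_start(trace_context, host):
        trace_context.host = host
        calls.append(host)

    session.on_connect_start.append(on_connect_start)
    context = SimpleNamespace()
    await connect("a.example", Trace(session, context))
    await connect("b.example")  # untraced call
    return calls, context


calls, context = asyncio.run(demo())
```

The connector-level function only needs one extra argument, and callers that do not care about tracing pass nothing.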
A few comments here, plus the biggest request: provide some visualization of the signals over a request's lifetime. It will help a lot to quickly understand what is called when.
aiohttp/signals.py
Outdated

    """
    Sends data to all registered receivers.
    """
    yield from self._send(*args, **kwargs)

async/await
docs/client.rst
Outdated

        print("Starting request")

    async def on_request_end(trace_context, resp):
        print("Ending request")

Wording bit: does `on_request_end` happen just before the request ends or when it has actually ended? The -ing suffix is confusing here.
aiohttp/client.py
Outdated

    # cleanup timer
    tm.close()
    if handle:
        handle.cancel()
        handle = None

    yield from self.on_request_exception.send(

await
aiohttp/client.py
Outdated

    @@ -354,15 +386,29 @@ def _request(self, method, url, *,
        handle.cancel()

    resp._history = tuple(history)
    yield from self.on_request_end.send(

await
aiohttp/client.py
Outdated

    if trace_context is None:
        trace_context = SimpleNamespace()

    yield from self.on_request_start.send(

await
tests/test_client_session.py
Outdated

    def test_request_tracing_proxies_connector_signals(loop):
        connector = TCPConnector(loop=loop)
        session = aiohttp.ClientSession(connector=connector, loop=loop)
        assert id(session.on_request_queued_start) == id(connector.on_queued_start)

Why not do a simpler `is` check?
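For reference, the two forms of the identity check side by side. The `Connector` and `Session` classes below are tiny hypothetical stand-ins for the objects in the test, not the aiohttp classes.

```python
class Connector:
    """Hypothetical stand-in holding one subscriber list."""

    def __init__(self):
        self.on_queued_start = []


class Session:
    """Hypothetical stand-in that proxies the connector's list."""

    def __init__(self, connector):
        self._connector = connector

    @property
    def on_request_queued_start(self):
        return self._connector.on_queued_start


connector = Connector()
session = Session(connector)

# Both expressions test object identity; `is` states the intent directly.
same_by_id = id(session.on_request_queued_start) == id(connector.on_queued_start)
same_by_is = session.on_request_queued_start is connector.on_queued_start
```

The `is` form also avoids a subtle trap: comparing `id()` values of temporary objects can give a false positive, because an id may be reused once the first object has been freed.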
tests/test_client_session.py
Outdated

    @@ -474,3 +476,121 @@ def test_client_session_implicit_loop_warn():

        asyncio.set_event_loop(None)
        loop.close()


    @asyncio.coroutine

I guess you know what (:
@asvetlov and @kxepal the way the client signals are implemented has been rewritten, at least how they are exposed to the user and how the user enables them on the ClientSession object; it is based on the rationale given in this comment [1]. A good starting point is the documentation [2]. Disclaimer: I didn't take care of the awaitish work that has to be done in the [...] [1] #2429 (comment)
Readability beats code succinctness, at least for libraries.
I don't like a list of traceconfig objects but support your idea in general.
I believe the proposal keeps the internals simple while making it easy to specify multiple tracers.
The good thing about the array is that it follows the same idea of the [...] @asvetlov ^^
Do you want to support both single value and a list?
IMHO it doesn't matter. For the sake of API clarity we can require that a list is always given; if not, this is something that can be handled internally by the [...]

    self._trace_configs = trace_configs if isinstance(trace_configs, list) else [trace_configs]

Ok
@asvetlov and @kxepal more changes, now the [...] I think that the documentation reflects these changes pretty well [1]; please read and review it. Sorry for the many changes, but I am almost absolutely convinced that these changes are necessary for the user.
Thanks for the hard work @pfreixes. The PR is really huge; I'm inclined to merge it after fixing the very obvious issues and to continue the work on client tracing in subsequent PRs, including a discussion of the parameters of every supported signal.
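If both a single value and a list were accepted, the normalization could be isolated in a small helper. This is only a sketch: `normalize_trace_configs` is a hypothetical name, and accepting `None` and tuples goes slightly beyond the one-liner quoted in the thread.

```python
def normalize_trace_configs(trace_configs):
    """Return a list whether given None, one object, or a list/tuple."""
    if trace_configs is None:
        return []
    if isinstance(trace_configs, (list, tuple)):
        return list(trace_configs)
    return [trace_configs]


single = object()  # placeholder for one trace config object
as_list = normalize_trace_configs(single)
unchanged = normalize_trace_configs([single, single])
empty = normalize_trace_configs(None)
```

Requiring a list in the public API, as suggested above, would make this helper unnecessary and keep the signature easier to document.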
docs/client_reference.rst
Outdated

    @@ -294,6 +294,12 @@ The client session supports the context manager protocol for self closing.

       .. versionadded:: 2.3

    :param trace_context: Object used to give as a param for the all signals
       triggered by the ongoing request. Default uses the object returned
       by the :class:`TraceConfig.trace_context()` method.

:meth:`TraceConfig.trace_context` maybe. Parentheses are not required by Sphinx.

No longer required; this documentation has been deprecated.
docs/tracing_reference.rst
Outdated

    Property that gives access to the signals that will be executed when a
    request starts, based on the :class:`~signals.Signal` implementation.

    The coroutines listening will receive as a param the `session`,

Sphinx prefers double backticks, e.g. `session` should be replaced by ``session``.
On the other hand, the CPython documentation uses *session* for parameters, but it looks like in the aiohttp docs we have an unspoken agreement to use double backticks.

Fixed, using `` as you proposed.
tests/test_client_request.py
Outdated

    @@ -500,7 +500,8 @@ def test_gen_netloc_no_port(make_request):
            '012345678901234567890'


    -async def test_connection_header(loop, conn):
    +@asyncio.coroutine
    +def test_connection_header(loop, conn):

Why remove `async def`?

My fault, fixed.
Just to put all of us on the same page (the large MR and its many changes certainly do not help): the `trace_context` name is reserved for the TraceConfig objects, which use this param to give the different signals a place to store internal information related to the same request and the same TraceConfig. Meanwhile, `trace_request_context` is the param that the user can pass at the beginning of the request and that can be used in any signal. My proposal, for the sake of clarity, would be:
Or other alternatives?
Documentation adapted to the new architecture; basically, tracing has been added as a new section of the [...] More or less, the PR is ready to be merged or to move on with the review. Disclaimer: once the code is merged I will open a discussion about whether there is any chance of porting this feature to the 2.X series, to have a new 2.4 release later. The goal is to receive feedback from the community and improve the API for the 3.X version and, also important, to make it usable by developers right now.
🎉
@pfreixes the PR has been merged. Personally I doubt we need a 2.4 release:
I support @kxepal's emoji. Sorry, I'm not an emoji jedi, but I share the delight of merging the proposal into the master branch!
OK, January is not so far away. But I intend to start working on AWS X-Ray support for aiohttp and also to refactor one of the middlewares that instruments the calls made through ClientSession, using the new tracing system for both. I hope that this is going to give us a bit of feedback.
Checking a concept against real usage is always very useful.
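The distinction drawn in this comment, between a context private to each TraceConfig and a context supplied by the user per request, can be modelled with the standard library alone. The `TraceConfig` class, the `request` function and the callback signature below are assumptions made for illustration; they are not the API that was merged.

```python
import asyncio
from types import SimpleNamespace


class TraceConfig:
    """Hypothetical stand-in, not aiohttp's class."""

    def __init__(self, trace_context_factory=SimpleNamespace):
        self.on_request_start = []
        self.on_request_end = []
        self._trace_context_factory = trace_context_factory

    def trace_context(self):
        return self._trace_context_factory()


async def request(trace_configs, url, trace_request_context=None):
    # One fresh trace_context per TraceConfig for this request.
    pairs = [(config, config.trace_context()) for config in trace_configs]
    for config, trace_context in pairs:
        for callback in config.on_request_start:
            await callback(trace_context, trace_request_context, url)
    # A real client would perform the request here.
    for config, trace_context in pairs:
        for callback in config.on_request_end:
            await callback(trace_context, trace_request_context, url)


async def demo():
    finished = []

    async def on_start(trace_context, trace_request_context, url):
        trace_context.url_at_start = url  # private to this config and request

    async def on_end(trace_context, trace_request_context, url):
        finished.append((trace_context.url_at_start, trace_request_context))

    config = TraceConfig()
    config.on_request_start.append(on_start)
    config.on_request_end.append(on_end)
    await request([config], "http://localhost/?query=foo", {"query": "foo"})
    await request([config], "http://localhost/?query=bar", {"query": "bar"})
    return finished


finished = asyncio.run(demo())
```

Each request gets its own `trace_context`, so state written in one signal is visible in later signals of the same request only, while `trace_request_context` carries whatever the caller chose to attach.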
What do these changes do?
Implementation of the client signals exposed by the `ClientSession` class. For the full list of implemented signals, please see the documentation.
Are there changes in behavior for the user?
Yes
Related issue number
#2313