-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Middleware factory is run for every single request #2225
Comments
The behavior is intentional, it's here starting from very first middleware support implementation. |
I'm slightly shocked by this; until I read this issue I had always assumed factories where called once, not for every request. I never even thought to check. If they're called for every request, what's the point in having the outer coroutine? I've just looked now at a project I'm working on, almost every middleware has processing outside the "handler" which I had assumed would only be called once. None of this processing will take very long so when testing or deploying with small or moderate usage it wouldn't be obvious, but if performance becomes important this could have been a problem. Apparently this surprising behaviour isn't documented. (or am I being blind?) Solutions (in order of my preference):
I'm guessing 1) is too big a change to be accepted? (My guess would be, 90%+ of developers assume this behaviour anyway, so it shouldn't cause problems) Perhaps 2) would be possible if current behaviour was allowed but just raises a depreciation warning for a few releases? |
Looks like the point of running the middlewares each time to thread the actual request handler through the middleware chain, so number 1 can't be done. As for number 2, I'd imagine that's so any middleware setup that needs to do IO doesn't block. Number 3 is probably the best, unless the outer coroutine could be deprecated. |
There is closed PR for composing middlewares chain only once: #565 I'm not convinced if we should optimize this code unless benchmark will show significant performance degradation by chain creation. |
@asvetlov As for the measurements of performance and implementation suggestions, I'll try to prepare something today. Gotta look at the closed pr as well. |
I wouldn't expect it to make any difference under normal conditions (and it doesn't, see below), the point is that their's a risk developers assume the factories are only called once and therefore wrongly call setup code in the outer coroutine. This is a particularly big problem if people release helper packages implemented as middleware where some setup is required, because in this case there's no way at all to add initialisation logic when just adding middleware. I've tried removing the factory: if resp is None:
handler = match_info.handler
for app in match_info.apps[::-1]:
# for factory in app._middlewares:
# handler = yield from factory(app, handler)
for m in app._middlewares:
handler = partial(m, handler=handler)
resp = yield from handler(request) so middleware is defined as just a simple coroutine: async def my_middleware(request, handler):
return await handler(request) it seems to has no effect on performance. |
Initial idea for middleware was that middleware itself looks like regular handler: it accepts request and returns response instance (or raises http exception). Regarding to docs update for more explicit describing current behavior -- I very appreciate it. |
It would be pretty easy to allow "old style middleware" and just raise a warning but process it as we do now. I completely understand what you mean by "without any benefit" (no bugs fixed, no new feature), but I think there are some benefits:
Perhaps we should add a note to the docs and leave it for now. If other developers admit they fell into this trap, then perhaps there's sufficient reason to make the change? |
@samuelcolvin I'm +0 for proposed change. |
I like the simplicity of @samuelcolvin's solution, it's definitely a more straightforward way of getting the current functionality. But I also think, that there may be a benefit to middleware factories that are run just once when freezing the app. I know it's not an acceptable implementation, but my POC looks like this: diff --git a/aiohttp/web.py b/aiohttp/web.py
index 6caee4c..65f1cf4 100644
--- a/aiohttp/web.py
+++ b/aiohttp/web.py
@@ -301,8 +301,8 @@ class Application(MutableMapping):
if resp is None:
handler = match_info.handler
for app in match_info.apps[::-1]:
- for factory in app._middlewares:
- handler = yield from factory(app, handler)
+ # assert below needs to look at the middlewares of the last app...
+ pass
resp = yield from handler(request)
@@ -446,6 +446,16 @@ def run_app(app, *, host=None, port=None, path=None, sock=None,
# add_signal_handler is not implemented on Windows
pass
+ # I know I'm doing that for only one app, which won't fly with subapps.
+ # This is just a hack to show my thinking.
+ for resource in app.router.resources():
+ # another dirty hack; implementing this whole thing properly will probably take some
+ # thinking about interfaces of a few classes.
+ http_handler = resource._routes[0]._handler
+ for factory in app._middlewares:
+ http_handler = loop.run_until_complete(factory(app, http_handler))
+ resource._routes[0]._handler = http_handler
+
try:
print("======== Running on {} ========\n"
"(Press CTRL+C to quit)".format(', '.join(uris)))
Available here. For my application (and middleware factory), the app can respond with 401 about twice as fast: 2488 requests/second on commit d2236cfe (the last one from master) vs 4090 reqs/sec on 3ab39dd97 (mine) It'll matter when the middleware factory does a bit more complex setup, or calls something over the network, e.g. get some public keys to verify JWTs. Right now, with the current implementation of middlewares, I'll be doing that this one-time preparation outside of the framework, so I'll have a factory, that produces a factory, that produces the wrapped handler. Smells like corporate Java :) I know that changing the way it works right now can potentially break someones code, so we totally shouldn't do it. If we'd be going with my approach, we can introduce a new parameter - |
You should be doing this in startup coroutines and attaching the result to the app with |
Yeah, that's one way of doing it. The thing about being easy to understand depends on the specific cases, I think. For example, with the way I'm setting up my JWT parsing middleware right now I have a high cohesion and decoupling. The middleware's client (the code that set's up the application) doesn't worry about the needs of the middleware, the middleware code sets itself up. But right now I'll be doing more or less what you describe, because middlewares don't work the way I assumed :) To be precise it'll be a factory function creating the current standard middleware factory, and it will be called from the code that does the application setup (in a file called app_setup.py, in my case). |
Another thing occurred to me about validity of my approach. I need to exclude some endpoints from the authorization checks. I want implicit inclusion and explicit exclusion in authorization to avaid security issues by omission of including a handler in authorization. So I don't only need to have a startup code for the middleware - I need it to be able to go through the handlers and decide whether it will wrap them or not. And my case is a bit more complex than looking up a handler function in a set to see if it's excluded: async def token_reading_middleware(app, handler):
@wraps(handler)
async def middleware_handler(request):
# ...omitted...
# Getting to the original functions, so we can disregard the wrappers that AioHTTP creates
# and other potential middlewares on the processing chain.
handler_original_function = _get_original_function(handler)
non_secured_original_handlers = {_get_original_function(f)
for f in app.get(NON_SECURED_HANDLERS_KEY, [])}
if handler_original_function in non_secured_original_handlers:
return handler
return middleware_handler
def _get_original_function(function):
"""Assuming that `function` is wrapped in any level of decorators, this will return the original
function without any decorators.
"""
while hasattr(function, '__wrapped__'):
function = function.__wrapped__
return function True, that can be done during handling of every request, and the slowdown isn't significant compared to all the thing that the framework is doing, but by gradually cutting down these kinds of things we can get a faster framework. And if I'm faced with picking frameworks that have similar functionality, I will pick the faster one. |
I suspect you'd be better off using |
@samuelcolvin That's a good idea, I'll do that :) But my general case remains. @asvetlov You would consider my one time factories when I provide a proper implementation? |
@samuelcolvin Thanks for the work :) |
Yes. It's the same object. |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs. |
Long story short
Maybe that's just my unwaranted assumption, but I thought that a middleware factory should run once for every handler, creating the wrapped handlers. Turns out, it's run for every request. This seems to be terribly inefficient.
Is this an expected behaviour, something that I'm seeing for some strange reasons, or a regression in the library?
Expected behaviour
Run each middleware factory for each handler at one point, at the start of application's lifecycle. Let the new handlers handle requests from that point without being constantly recreated.
Actual behaviour
Middleware factory is run for every request to a handler.
Steps to reproduce
What I'm creating is a middleware that will parse JWT tokens. I need a mechanism to exculde this parsing on some handlers, since some will be accesses without authorization. I didn't use auihttp_security, since it imposes structure that I don't need (I don't have sessions, for example). I may open-source this code later.
The important bits of my code (with some alterations):
After starting the app (
python app.py
) I call http://localhost:8080 consecutively. I'm getting 401s because I don;t pass the authorization header.The problem is, that he application keeps printing 'XXX' for every request, to I know that it jumps through all the hoops of preparing the wrapper and wrapping the handler every time.
Your environment
AioHTTP 2.2.5
Python 3.6.2
Kubuntu 16.04.3 LTS x64
The text was updated successfully, but these errors were encountered: