Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added first version of normalize_path_middleware #1118

Merged

Conversation

argaen
Copy link
Member

@argaen argaen commented Aug 24, 2016

What do these changes do?

Adds a normalize_path_middleware middleware that deals with trailing slashes and double slashes in path.

As a side effect, also Request.clone() was implemented.

Are there changes in behavior for the user?

Users will now be able to do:

from aiohttp import web
from aiohttp.middlewares import normalize_path_middleware
app = web.Application(middlewares=[normalize_path_middleware])

Related issue number

#355

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes

Extra checklist

  • Tests for Request.clone
  • Add args to middleware for append_slash and merge_slashes

@argaen argaen changed the title WIP Added first version of normalize_path_middleware [WIP] Added first version of normalize_path_middleware Aug 24, 2016
@argaen
Copy link
Member Author

argaen commented Aug 24, 2016

I'm adding the changes to look for feedback. Right now the test for the append_slash is working. I'll start now with the double slashes one.

@asvetlov
Copy link
Member

No, request.clone() shouldn't accept RawRequestMessage -- the later lies on very low abstraction level, much lower than all aiohttp.web operations.
The signature should be like this:
request.clone(*, method=_sentinel, path=_sentinel, ...)
Not needed to provide changing all request's attrs, let's limit ourself to obvious and most useful ones.
Overrides should be keyword-only.

def middleware(request):

router = request.app.router
match_info = yield from router.resolve(request)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request already has match_info, no need to resolve it twice

@codecov-io
Copy link

codecov-io commented Aug 31, 2016

Codecov Report

Merging #1118 into master will increase coverage by -0.09%.

@@            Coverage Diff             @@
##           master    #1118      +/-   ##
==========================================
- Coverage   98.59%   98.51%   -0.09%     
==========================================
  Files          30       32       +2     
  Lines        7055     7250     +195     
  Branches     1176     1203      +27     
==========================================
+ Hits         6956     7142     +186     
- Misses         58       62       +4     
- Partials       41       46       +5
Impacted Files Coverage Δ
aiohttp/web_middlewares.py 100% <100%> (ø)
aiohttp/web.py 100% <100%> (ø)
aiohttp/web_ws.py 94.81% <ø> (-2.62%)
aiohttp/client_ws.py 98.37% <ø> (-1.63%)
aiohttp/client_reqrep.py 97.89% <ø> (-0.55%)
aiohttp/server.py 99.48% <ø> (-0.52%)
aiohttp/web_reqrep.py 99.45% <ø> (-0.01%)
aiohttp/client.py 100% <ø> (ø)
aiohttp/cookiejar.py 100% <ø> (ø)
aiohttp/test_utils.py 100% <ø> (ø)
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1811685...b5bde12. Read the comment docs.

setup.cfg Outdated
@@ -1,8 +1,12 @@
[pep8]
max-line-length=80
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asvetlov
Copy link
Member

Looks pretty good but please convert middlewares.normalize_path_middleware into a factory.

I have a guts feeling: sometimes we'll need to add some parameters to the middleware, e.g.:

app.middlewares.append(middlewares.normalize_path_middleware(add_trailing=True, remove_trailing=False))

All params should be keyword-only arguments.

Right now it's not an issue but please respect my guts :)

Also middlewares.normalize_path_middleware looks redundant, just middlewares.normalize_path should be ok.

@argaen
Copy link
Member Author

argaen commented Sep 1, 2016

Yeah! In my current branch I have it as a factory but haven't commited it yet until I have it working with both args (merge_slashes and append_slash).

Ok with keyword only and Ok with renaming :).

In this version both append_slash and merge_slash features are implemented.
The middleware returns as soon as it finds a path that resolves.
@argaen
Copy link
Member Author

argaen commented Sep 1, 2016

With this last commit functionality is covered. Anyway, in case it is ready to merge in your opinion, I would like to keep it until tomorrow night to see if I can find a more readable way to express it.

A bit off topic but that came while working with this merge request:

  • Would it make sense to open a new issue for implementing an alternative router.resolves? Current implementation passing the Request object and then when return having to check with isinstance IMHO is not clean. It would be better to just send method and path (as you already started in your pull request) and then agree on some syntax to make it easier to check if resolved or not.
  • Every time a request is resolved, the request._match_info has to be filled automatically. Would it make sense to do that inside the resolve method?

@argaen argaen changed the title [WIP] Added first version of normalize_path_middleware Added first version of normalize_path_middleware Sep 2, 2016
@argaen
Copy link
Member Author

argaen commented Sep 2, 2016

Finished. We still have a caveat that I don't know how (or if) you want to address it.

Suppose the user has the following route defined: /resource.

If the user tries to access to /resource/, it will return a 404 no matter if the middleware is active or not. Right now we only deal with cases where we want to append trailing slash.

@argaen argaen force-pushed the issue-355/normalize_path_middleware branch from 7c94a7a to 764781e Compare September 2, 2016 15:05
@argaen
Copy link
Member Author

argaen commented Sep 4, 2016

I think this would also solve #682

@@ -1,8 +1,12 @@
[pep8]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you add this option? 79 lines is default value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

People can have configurations at user home level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Ok, make sense.

@asvetlov
Copy link
Member

asvetlov commented Sep 9, 2016

Please drop autodoc markup from docstrings but explicitly document .clone() and middleware in web_reference.rst.

Also middleware should be described in web.rst.

@asvetlov
Copy link
Member

asvetlov commented Sep 9, 2016

Rename the module to web_middlewares, import it in web.py and make accessible in web.py's __all__.
The following should work:

from aiohttp import web
web.normalize_path_middleware

return False, request


def normalize_path(
Copy link
Member

@asvetlov asvetlov Sep 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to normalize_path_middleware. The name is long but it will be used only in one place of program.
Adding middleware suffix explicitly describes the function job.

@cynecx
Copy link

cynecx commented Oct 14, 2016

Is the clone function really necessary? As it may impose a performance issue if the server is under a huge amount of load. In my opinion a better approach would be to provide an overload function for UrlDispatcher::resolve, so it basically accepts (method, raw_path) which can be obtained by calling the appropriate member functions on a request object (Or we have to manually iterate over resources and call resolve which is again better than actually cloning the whole request object).

Also I am wondering if it's actually better to directly integrate this feature in aiohttp's core (One can enable this feature by a flag), so we can actually drop returning HTTPMovedPermanently.

For example the expected behavior for this would be:

A request something like this Req { method="GET", path="///account//info/" } would be normalized to Req { method="GET", path="/account/info/" }, then we will actually try to find an appropriate handler, this way we are avoiding returning a HTTPMovedPermanently.

Again, I think we we should do the normalization as early as possible to elimination possible performance (More like we are being repetitive).

@cynecx
Copy link

cynecx commented Oct 14, 2016

I was working on an temporary approach (More like a hack) where I utilize a middleware which does the normalization then crafts a temporary object where raw_path and method are stored. Instead of returning a HTTPMovedPermanently, the middleware is actually trying to call the appropriate handler. This is the time where I noticed something, this approach might not work with expect_handlers because middlewares aren't applied here (https://github.com/KeepSafe/aiohttp/blob/master/aiohttp/web.py#L81), so I guess this won't work without actually changing internal code.

@argaen
Copy link
Member Author

argaen commented Oct 14, 2016

@cynecx I have this MR a bit abandoned because Im spending my time with other stuff right now. About what you comment, I remember my first approach trying what you said about not using the HTTPMovedPermanently but I found out something (maybe same as you) that made me go with the redirect.

On the other hand, I totally agree that an alternative resolve method where only the method and path are sent should exist.

@asvetlov
Copy link
Member

asvetlov commented Nov 3, 2016

.clone() is implemented by #1361

@AraHaan
Copy link
Contributor

AraHaan commented Dec 24, 2016

Wow looks interresting be nice if this PR could be updated so it can be Merged.

@sloria
Copy link
Contributor

sloria commented Jan 21, 2017

This is a useful middleware. And if this gets integrated, I could get rid of my currently-broken implementation in aiohttp_utils. 👍

@argaen
Copy link
Member Author

argaen commented Jan 22, 2017

I've upgraded the branch to resolve conflicts and apply some comments from @asvetlov. I'm having some problems from 5 test cases that didn't have before:

self = <aiohttp.test_utils.TestServer object at 0x7f1108c474a8>, path = '//resource2//a//b/'

    def make_url(self, path):
        url = URL(path)
>       assert not url.is_absolute()
E       assert not True
E        +  where True = <bound method URL.is_absolute of URL('//resource2//a//b/')>()
E        +    where <bound method URL.is_absolute of URL('//resource2//a//b/')> = URL('//resource2//a//b/').is_absolute

I can change the test cases and instead of //resource2//a//b/ use ///resource2//a//b/ but feels like cheating.

Also, I'm missing the documentation in web_reference.rst. Do you want me to copy paste the docstring there or include it or?

paths_to_check.append(request.path + '/')
if merge_slashes and append_slash:
paths_to_check.append(
re.sub('//+', '/', request.path + '/'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think re.sub should be compiled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 2, 2017

Create new section in web_reference "Middlewares", and add you doctoring.

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 2, 2017

why '//resource2//a//b/' this test fails?

@argaen
Copy link
Member Author

argaen commented Feb 2, 2017

Because of the following:

tests/test_web_middleware.py:184: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
aiohttp/client.py:530: in __iter__
    resp = yield from self._coro
aiohttp/test_utils.py:249: in request
    method, self.make_url(path), *args, **kwargs
aiohttp/test_utils.py:237: in make_url
    return self._server.make_url(path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <aiohttp.test_utils.TestServer object at 0x7fdf41c7a630>, path = '//resource2//a//b/'

    def make_url(self, path):
        url = URL(path)
>       assert not url.is_absolute()
E       assert not True
E        +  where True = <bound method URL.is_absolute of URL('//resource2//a//b/')>()
E        +    where <bound method URL.is_absolute of URL('//resource2//a//b/')> = URL('//resource2//a//b/').is_absolute

aiohttp/test_utils.py:82: AssertionError

If the url is absolute its not a valid url to query I don't know why. This is a behavior that was introduced with yarl because I don't remember this happening before.

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 2, 2017

i see. is it possible to skip make_url call with some param and construct it manually?

@argaen
Copy link
Member Author

argaen commented Feb 3, 2017

I have no idea, the code for the make_url is

    def make_url(self, path):
        url = URL(path)
        assert not url.is_absolute()
        return self._root.join(url)

which doesn't accept extra parameter. I believe it could be a yarl bug (or feature). Out of scope of this MR IMO.

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 3, 2017

ok, i will fix that

@fafhrd91 fafhrd91 merged commit 62d7752 into aio-libs:master Feb 3, 2017
@argaen argaen deleted the issue-355/normalize_path_middleware branch February 3, 2017 16:12
@fafhrd91
Copy link
Member

fafhrd91 commented Feb 3, 2017

@argaen i added tests that you deleted. some of them actually failing.
https://github.com/KeepSafe/aiohttp/tree/normalize-middleware-fixes

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 3, 2017

@argaen i need you to review failing tests

@fafhrd91
Copy link
Member

fafhrd91 commented Feb 3, 2017

#1583

@lock
Copy link

lock bot commented Oct 29, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there's important points made in this discussion,
please include those exceprts into that new issue.

@lock lock bot added the outdated label Oct 29, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants