Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Endless loop in sessions.py #2231

Closed
fcosantos opened this issue Sep 18, 2014 · 6 comments · Fixed by #2244
Closed

Endless loop in sessions.py #2231

fcosantos opened this issue Sep 18, 2014 · 6 comments · Fixed by #2244

Comments

@fcosantos
Copy link

(gdb) py-list
544 # Guard against that specific failure case.
545 if not isinstance(request, PreparedRequest):
546 raise ValueError('You can only send PreparedRequests.')
547
548 while request.url in self.redirect_cache:
-549 request.url = self.redirect_cache.get(request.url)
550
551 # Set up variables needed for resolve_redirects and dispatching of hooks
552 allow_redirects = kwargs.pop('allow_redirects', True)
553 stream = kwargs.get('stream')
554 timeout = kwargs.get('timeout')

redirect_cache dict could have complementary key/values that forces and enless loop.

(gdb) py-down

2 Frame 0x7f645c00acd0, for file /test/libs/requests/sessions.py, line 549, in send (self=<Session(cookies=<RequestsCookieJar(_now=1410944130, _policy=<DefaultCookiePolicy(strict_rfc2965_unverifiable=True, strict_ns_domain=0, _allowed_domains=None, rfc2109_as_netscape=None, rfc2965=False, strict_domain=False, _now=1410944130, strict_ns_set_path=False, strict_ns_unverifiable=False, strict_ns_set_initial_dollar=False, hide_cookie2=False, _blocked_domains=(), netscape=True) at remote 0x7f651070c4d0>, _cookies={}, _cookies_lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f6530772dd0>, _RLock__count=0) at remote 0x7f65105a5450>) at remote 0x7f65105a5e10>, stream=False, hooks={'response': []}, redirect_cache={'http://www.hent aiwe blog.com/': 'http://hent aiwe blog.com/', 'http://hent aiwe blog.com/': 'http://www.hent aiwe blog.com/'}, auth=None, trust_env=True, headers=<CaseInsensitiveDict(_store={'accept-encoding': ('Accept-Encoding', 'gzip, deflate'), 'accep...(truncated)

request.url = self.redirect_cache.get(request.url)

Current request chain with wget:

$ wget www.hent aiwe blog.com
--2014-09-18 11:52:20-- http://www.hent aiwe blog.com/
Resolving www.hent aiwe blog.com (www.hent aiwe blog.com)... 109.72.81.172
Connecting to www.hent aiwe blog.com (www.hent aiwe blog.com)|109.72.81.172|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://hent aiwe blog.com/ [following]
--2014-09-18 11:52:20-- http://hent aiwe blog.com/
Resolving hent aiwe blog.com (hent aiwe blog.com)... 109.72.81.172
Reusing existing connection to www.hent aiwe blog.com:80.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.hent aiwe blog.com/ [following]
--2014-09-18 11:52:20-- http://www.hent aiwe blog.com/
Reusing existing connection to www.hent aiwe blog.com:80.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://hent aiwe blog.com/ [following]
...

Suggested resolution:

    redirect_count = 0
    while request.url in self.redirect_cache:
        redirect_count += 1
        if redirect_count > self.max_redirects:
            raise TooManyRedirects
        request.url = self.redirect_cache.get(request.url)
@Lukasa
Copy link
Member

Lukasa commented Sep 18, 2014

Please search for issues before raising new ones. This was raised in #2207 and fixed in #2210: the fix is already in master.

@Lukasa Lukasa closed this as completed Sep 18, 2014
@Lukasa Lukasa reopened this Sep 18, 2014
@Lukasa
Copy link
Member

Lukasa commented Sep 18, 2014

I apologise, we didn't quite fix this. We actually have a separate bug, in that we can have a redirect loop of size greater than 1.

Ideally we ought to be able to spot redirect loops, regardless of whether they use the redirect cache.

@sigmavirus24
Copy link
Contributor

So this is a pretty serious problem. We could do:

checked_urls = set()
while request.url in self.redirect_cache:
    checked_urls.add(request.url)
    new_url = self.redirect_cache.get(request.url)
    if new_url in checked_urls:
        break
    request.url = new_url

The break will mean that we use the existing request URL which should, ideally, trigger a TooManyRedirects exception, ideally this will prevent us from having to duplicate logic for that exception here. I have not tested this yet though.

The idea is that if we followed a redirect in the cache to something else, it will find that in checked_urls. This way if the redirect cache reads more like

A -> B -> C -> D -> E -> F -
^                            \
|                            /
-----------------------------

We will catch it as soon as F redirects to A. We could (subclassing TooManyRedirects) raise a RedirectLoopFound exception too from here as soon as we catch this instead of forcing the user to wait for 30 redirects to happen (or whatever I max currently is).

@RuudBurger
Copy link

I'm having the same issue, just wanted to chip in. For a faulty url that loops to itself, the max_redirects counter never goes beyond 1.
The self.redirect_cache looks like this:
{'http://www.hostname.com/': 'http://hostname.com/', 'http://hostname.com/': 'http://www.hostname.com/'}
The www redirects to the non-www, and visa versa.
Only thing that fixes it now is setting the max_redirects to 1.

@RuudBurger
Copy link

The patch mentioned above works for me btw. CouchPotato/CouchPotatoServer@3338b72

@sigmavirus24
Copy link
Contributor

PR #2244 has a fix for this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants