fix: HTTP 308 Permanent Redirect status code handling #2389
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary of changes
Change the handling of HTTP status code 308 to behave more like
urllib.request.HTTPRedirectHandler
, most critically, the new 308 handling will create a newurllib.request.Request
object with the new URL, which will prevent state from being carried over from the original request.One case where this is important is when the domain name changes, for example, when the original URL is
http://www.w3.org/ns/adms.ttl
and the redirect URL ishttps://uri.semic.eu/w3c/ns/adms.ttl
. With the previous behaviour, the redirect would contain aHost
header with the valuewww.w3.org
instead ofuri.semic.eu
, because theHost
header is placed inRequest.unredirected_hdrs
and takes precedence over theHost
header inRequest.headers
.Other changes:
Only handle HTTP status code 308 on Python versions before 3.11 as Python 3.11 will handle 308 by default [ref].
Move code which uses
http://www.w3.org/ns/adms.ttl
andhttp://www.w3.org/ns/adms.rdf
out oftest_guess_format_for_parse
into a separate parameterized test which instead uses the embedded http server.This allows the test to fully control the
Content-Type
header in the response instead of relying on the value that the server is sending.This is needed because the server is sending
Content-Type: text/plain
for theadms.ttl
file, which is not a valid RDF format, and the test is expectingContent-Type: text/turtle
.Fixes:
Checklist
the same change.
so maintainers can fix minor issues and keep your PR up to date.