@alexciechonski

This Pull Request introduces a targeted mitigation in Request.prepare_url to prevent URL corruption caused by a bug in the dependency, urllib3.util.parse_url.

The Problem

When a standards-compliant link-local IPv6 address with a Zone ID is passed to requests (e.g., http://[fe80::a%2553]/), the following sequence of failures occurs:

  1. Premature Decoding (in urllib3): urllib3.util.parse_url incorrectly decodes the Zone ID delimiter from the required URI format (%25) to a single percent sign (%). This leaves the host component in a corrupted state ([fe80::a%53]). Multiple calls to urllib3.quote and urllib3.unquote may also decode the hexadecimal escape that follows the percent sign into the character it represents (%53 -> S).
  2. Downstream Failure: This corrupted host string fails to be processed correctly by the rest of the HTTP stack, leading to critical errors:
    • Connection Error: OSError: [Errno 22] Invalid argument in urllib3's socket layer, as the host string is improperly formatted for the OS socket API.
    • Validation Error: In other requests components (like cookie handling), the parser may incorrectly decode the remaining %53 as the character 'S', leading to ValueError.
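The decoding sequence described above can be reproduced in isolation. The sketch below uses the standard library's urllib.parse.unquote as a stand-in for the decoding steps; it is an assumption that this mirrors what happens inside urllib3 and requests, and the variable names are illustrative only.

```python
from urllib.parse import unquote

# RFC 6874 form: "%25" is the percent-encoded Zone ID delimiter.
uri_host = "[fe80::a%2553]"

# First decode pass: "%25" becomes a bare "%", leaving "%53" behind.
once = unquote(uri_host)

# Second decode pass: the leftover "%53" is read as an escape for "S".
twice = unquote(once)

print(once)   # [fe80::a%53]
print(twice)  # [fe80::aS]
```

After one pass the host no longer carries the encoded delimiter, and after a second pass the zone ID itself is lost.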

The Solution

This patch adds logic immediately after the parse_url call to check for, and repair, the corrupted host component.

  1. It checks for bracketed IPv6 addresses containing exactly one single percent sign (%).
  2. If found, it reconstructs the host by restoring the standards-compliant, fully-encoded delimiter (%25).
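The two steps above could be sketched roughly as follows. This is not the code from the patch: the function name repair_zone_delimiter and the regular expression are hypothetical, and the sketch assumes it receives the host string as returned by parse_url.

```python
import re

# Hypothetical pattern: a bracketed host containing exactly one "%".
_BRACKETED_ONE_PERCENT = re.compile(r"^\[([^%\[\]]+)%([^%\[\]]+)\]$")


def repair_zone_delimiter(host):
    """Restore "%25" as the Zone ID delimiter in a bracketed IPv6 host."""
    if not host:
        return host
    match = _BRACKETED_ONE_PERCENT.match(host)
    if match is None:
        # Not bracketed, or zero or several percent signs: leave untouched.
        return host
    address, zone = match.groups()
    if ":" not in address:
        # Does not look like an IPv6 literal.
        return host
    return "[{}%25{}]".format(address, zone)


print(repair_zone_delimiter("[fe80::a%53]"))  # [fe80::a%2553]
print(repair_zone_delimiter("[::1]"))         # [::1]
```

Note that an already-encoded host such as [fe80::a%2553] also contains exactly one percent sign, so a check of this shape is only meaningful on the decoded host and would double-encode the delimiter if applied to the raw URI.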

This ensures that the final self.url is a canonical URI per RFC 6874, allowing the request to proceed successfully once the underlying urllib3 connection logic is fixed to correctly handle the Zone ID. This fix prevents immediate internal corruption errors within requests itself.
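One way to sanity-check that a host is in the RFC 6874 form is sketched below. The helper name is_rfc6874_host is hypothetical, and the check is deliberately loose: it only verifies the brackets, a valid IPv6 address part, and a non-empty zone ID after the "%25" delimiter.

```python
import ipaddress


def is_rfc6874_host(host):
    """Return True if host looks like "[<IPv6 address>%25<zone ID>]"."""
    if not (host.startswith("[") and host.endswith("]")):
        return False
    # Split on the encoded delimiter; sep is empty if "%25" is absent.
    address, sep, zone = host[1:-1].partition("%25")
    if not sep or not zone:
        return False
    try:
        ipaddress.IPv6Address(address)
    except ValueError:
        return False
    return True


print(is_rfc6874_host("[fe80::a%2553]"))  # True
print(is_rfc6874_host("[fe80::a%53]"))    # False
```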

Corrects host parsing to prevent downstream errors caused by a bug in urllib3.util.parse_url.

This mitigation ensures the URI is canonical per RFC 6874.
@alexciechonski
Author

As this is a core connection bug in the urllib3 library, I am raising an issue there and am looking to submit a corresponding PR to fix the underlying socket handling.
