Description:
While working on the unified header validation component (#20261), we found that Envoy does not decode percent-encoded UTF-8 characters in the Host and :authority headers, even though the RFCs allow such characters in the host.
Although the fix could be targeted for UHV, I wanted to register this issue with the community to get consensus on how percent-encoded characters should be handled within the H1 Host and H2 :authority headers. For now, we are only looking at the Host and :authority headers and not talking about URI or path normalization.
Some initial options after reading the RFCs, which could be implemented as new configuration settings:
1. Keep the current behavior and verify that Envoy users can register services that match on a percent-encoded host/authority.
2. Decode all percent-encoded characters from Host and :authority, verify that they are valid UTF-8 code points, and re-encode them in the upstream request (where appropriate).
   - The URI RFC says that applications producing URIs should percent-encode only non-ASCII characters in this way. Envoy could enforce this by also verifying that each decoded UTF-8 code point is outside the ASCII range.
   - This could also be configured on a per-service basis (e.g., decode_authority = [true|false]).
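To make the decode-and-validate option above concrete, here is a minimal Python sketch. It is not Envoy code: the function name `decode_authority_host` and the `require_non_ascii` flag are hypothetical, and the sketch assumes the host has already been separated from any port.

```python
import re

# Matches one well-formed percent-encoded octet, e.g. b"%C3".
_PCT = re.compile(rb"%([0-9A-Fa-f]{2})")


def decode_authority_host(host: str, require_non_ascii: bool = True) -> str:
    """Decode percent-encoded octets in a host and validate them as UTF-8.

    Hypothetical helper for illustration only. Raises ValueError (or a
    subclass such as UnicodeDecodeError) when the input is rejected.
    """
    raw = host.encode("ascii")  # the wire form must be pure ASCII

    # A '%' that is not followed by two hex digits is malformed.
    if re.search(rb"%(?![0-9A-Fa-f]{2})", raw):
        raise ValueError("malformed percent-encoding")

    # Optional stricter check: escapes may only carry non-ASCII octets.
    if require_non_ascii:
        for match in _PCT.finditer(raw):
            if int(match.group(1), 16) < 0x80:
                raise ValueError("percent-encoded ASCII octet")

    decoded = _PCT.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return decoded.decode("utf-8")  # strict decoding rejects invalid UTF-8
```

With these assumptions, `decode_authority_host("b%C3%BCcher.example")` yields `"bücher.example"`, while `"%FF.example"` (invalid UTF-8) and, in strict mode, `"%41.example"` (an escaped ASCII octet) are rejected.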
Relevant Links:
RFC 3986, Section 3.2.2 (Host):
> The reg-name syntax allows percent-encoded octets in order to represent non-ASCII registered names in a uniform way that is independent of the underlying name resolution technology. Non-ASCII characters must first be encoded according to UTF-8 (STD 63), and then each octet of the corresponding UTF-8 sequence must be percent-encoded to be represented as URI characters.

> URI producing applications must not use percent-encoding in host unless it is used to represent a UTF-8 character sequence.
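The encoding rule quoted above (UTF-8 first, then percent-encode each octet) can be sketched in a few lines of Python. The helper name `percent_encode_reg_name` is hypothetical, and the sketch leaves every ASCII character untouched rather than checking it against the full reg-name grammar.

```python
def percent_encode_reg_name(name: str) -> str:
    """Percent-encode only the non-ASCII characters of a registered name.

    Illustrative helper: each non-ASCII character is encoded as UTF-8 and
    every resulting octet is written as %XX. ASCII characters pass through
    unchanged and are not validated here.
    """
    parts = []
    for ch in name:
        if ord(ch) < 0x80:
            parts.append(ch)
        else:
            parts.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(parts)
```

For example, `percent_encode_reg_name("bücher.example")` produces `"b%C3%BCcher.example"`, because `ü` (U+00FC) is the two-octet UTF-8 sequence `C3 BC`.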
The authority component of the URI is used by both the H1 Host header and the H2 :authority header:
RFC 7230, Section 5.4 (Host):
> A client MUST send a Host header field in all HTTP/1.1 request messages. If the target URI includes an authority component, then a client MUST send a field-value for Host that is identical to that authority component, excluding any userinfo subcomponent and its @ delimiter.

RFC 7540, Section 8.1.2.3 (Request Pseudo-Header Fields):
> The :authority pseudo-header field includes the authority portion of the target URI (RFC 3986, Section 3.2). The authority MUST NOT include the deprecated userinfo subcomponent for http or https schemed URIs.
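As a rough illustration of how a client derives the Host or :authority value from a target URI, the Python sketch below strips the userinfo subcomponent and leaves any percent-encoded octets in the host as they are. The function name `host_field_from_uri` is hypothetical; `urlsplit` comes from the standard library and does not decode the authority.

```python
from urllib.parse import urlsplit


def host_field_from_uri(uri: str) -> str:
    """Return the URI's authority without userinfo and its '@' delimiter."""
    authority = urlsplit(uri).netloc
    # rpartition returns the whole string as the last element when no '@'
    # is present, so URIs without userinfo pass through unchanged.
    return authority.rpartition("@")[2]
```

Under these assumptions, `host_field_from_uri("http://user:pw@example.com:8080/path")` gives `"example.com:8080"`, and a percent-encoded host such as `b%C3%BCcher.example` is passed through undecoded, which is the form a proxy would then see in Host or :authority.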
Title: Host and Authority Headers RFC Compliance: Decode Percent-encoded UTF8 Characters