Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What characters are permitted #846

Merged
merged 7 commits into from
Jun 3, 2021
Merged
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 47 additions & 14 deletions draft-ietf-httpbis-http2bis.xml
Original file line number Diff line number Diff line change
Expand Up @@ -2847,10 +2847,42 @@
registered HTTP fields, see the "Hypertext Transfer Protocol (HTTP) Field Name Registry" registry maintained at <eref target="https://www.iana.org/assignments/http-fields/"/>.
</t>
<t>
Just as in HTTP/1.x, field names are strings of ASCII characters that are
compared in a case-insensitive fashion. Field names MUST be converted
to lowercase prior to their encoding in HTTP/2. A request or response containing
uppercase header field names MUST be treated as <xref target="malformed">malformed</xref>.
Field names are strings of ASCII characters that are compared in a case-insensitive
fashion. Field names MUST be converted to lowercase when constructing a HTTP/2
message. A request or response containing an uppercase character ('A' to 'Z', ASCII 0x41
to 0x5a) in a field name MUST be treated as <xref target="malformed">malformed</xref>.
</t>
<t>
HPACK is capable of carrying field names or values that are not valid in HTTP. Though
HPACK can carry any octet, fields are not valid in the following cases:
</t>
Comment on lines +2850 to +2858
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Field names are strings of ASCII characters that are compared in a case-insensitive
fashion. Field names MUST be converted to lowercase when constructing a HTTP/2
message. A request or response containing an uppercase character ('A' to 'Z', ASCII 0x41
to 0x5a) in a field name MUST be treated as <xref target="malformed">malformed</xref>.
</t>
<t>
HPACK is capable of carrying field names or values that are not valid in HTTP. Though
HPACK can carry any octet, fields are not valid in the following cases:
</t>
Field names are strings of ASCII characters that are compared in a case-insensitive
fashion. Field names MUST be converted to lowercase when constructing a HTTP/2
message. A request or response containing an uppercase character ('A' to 'Z', ASCII 0x41
to 0x5a) in a field name MUST be treated as <xref target="malformed">malformed</xref>.
</t>
<t>
HPACK is capable of carrying field names or values that are not valid in HTTP, thus endpoints
MUST perform some additional validation and treat as <xref target="malformed">malformed</xref>
that are not <xref target="PseudoHeaderFields">pseudo-header fields</xref> and that do not comply
with the validation. An endpoint SHOULD, in addition to the lowercase condition above, validate
fields against the HTTP ABNF grammar from <xref target="HTTP" section="5"/>. Any endpoint that
does not validate against the HTTP ABNF grammar MUST at least validate against the following cases:
</t>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving aside the grammar of the text for a moment (I only want to discuss that if we decide to go ahead with adding it), I think the core question is whether or not this is a SHOULD. If -semantics does not make this a SHOULD, why should we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validating the HTTP fields in -semantics is a MUST, but I'm not proposing that here. This change works equally as well with MAY instead of SHOULD. The point being making it clear that the conditions listed below are a minimum validation and that more strict validation may be done instead.

Of course it is yet clearer with ABNF:

Suggested change
Field names are strings of ASCII characters that are compared in a case-insensitive
fashion. Field names MUST be converted to lowercase when constructing a HTTP/2
message. A request or response containing an uppercase character ('A' to 'Z', ASCII 0x41
to 0x5a) in a field name MUST be treated as <xref target="malformed">malformed</xref>.
</t>
<t>
HPACK is capable of carrying field names or values that are not valid in HTTP. Though
HPACK can carry any octet, fields are not valid in the following cases:
</t>
Field names are strings of ASCII characters that are compared in a case-insensitive
fashion. Field names MUST be converted to lowercase when constructing a HTTP/2
message. A request or response containing an uppercase character ('A' to 'Z', ASCII 0x41
to 0x5a) in a field name MUST be treated as <xref target="malformed">malformed</xref>.
</t>
<t>
Since HPACK is capable of carrying field names or values that are neither valid in HTTP
nor secure to transmit as HTTP, fields MUST at least be minimally validated by treating as
<xref target="malformed">malformed</xref> any field that does not comply with the ABNF:
<pre>
field-name = h2-pseudo-field-name / 1*fn-char
pseudo-field-name = ":" 1*fn-char
fn-char = %x21-39 / %x3b-40 / %x5B-7E ; lowercase visible symbols except ':'
field-value = 1*fv-char
fv-char = %x01-09 / %x0b-0c / %x0e-ff ; octets excluding null, CR and LF
</pre>
Alternately, fields MAY be more strictly validated and treated as
<xref target="malformed">malformed</xref> if they do not comply with the
HTTP ABNF grammar from <xref target="HTTP" section="5"/> modified for
lowercase field names:
<pre>
field-name = h2-pseudo-field-name / 1*fn-char
pseudo-field-name = ":" 1*fn-char
fn-char = "!" / "#" / "$" / "%" / "&amp;" / "'" / "*" / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~" / DIGIT / %x61-7A
field-value = *field-content
field-content = field-vchar [ 1*( SP / HTAB / field-vchar ) field-vchar ]
field-vchar = VCHAR / %x80-FF
</pre>
</t>

then delete all the bullet points below.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validating the HTTP fields in -semantics is a MUST, but I'm not proposing that here.

Is it? I can’t find that text in the document, can you link me to it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In section 2.4 it says:

A recipient MUST interpret a received protocol element according to the semantics defined for it by this specification

The ABNF is given for the fields in section 5., so by my reading that would be a protocol element that MUST be interpreted according to the ABNF defined by the specification.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree with the reading, but section 2.4 does then caveat that MUST pretty substantially. For example, the sentence you quotes reads, in its entirety:

A recipient MUST interpret a received protocol element according to the semantics defined for it by this specification, including extensions to this specification, unless the recipient has determined (through experience or configuration) that the sender incorrectly implements what is implied by those semantics.

On top of that unless, the next paragraph is:

Unless noted otherwise, a recipient MAY attempt to recover a usable protocol element from an invalid construct.

I don't think this drastically changes the MUST, but I do think that we potentially want to walk the SHOULD back to a MAY here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa, which "SHOULD" are you referring to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinthomson I believe @Lukasa is referring to the "SHOULD" in the first version of suggested text above:

An endpoint SHOULD, in addition to the lowercase condition above, validate fields against the HTTP ABNF grammar ...
If text like that is adopted, I don't particularly care if that is a SHOULD or MAY. The key thing I suggestion is that this document explicitly calls out the possibility of validating against the full HTTP grammer in this "HttpFields" section and indicates that the minimal validation described is just that - a minimal validation.

However, I prefer the form of my second suggested text above:

Since HPACK is capable of carrying field names or values that are neither valid in HTTP nor secure to transmit as HTTP, fields MUST at least be minimally validated ...
Alternately, fields MAY be more strictly validated ... with the HTTP ABNF grammar ...

Finally, althought I prefer the ABNF usage in my second suggested text (specially as it clarifies the lowercase situation), but even if a text description is used I think that casn work with the form: "MUST minimally validate... Alternatively MAY more strictly validate...."

<ul>
<li>
A field name MUST NOT contain characters in the range 0x00-0x20 (inclusive) or 0x7F
(ASCII DEL). This includes all ASCII control characters, plus ASCII SP (0x20).
</li>
<li>
With the exception of <xref target="PseudoHeaderFields">pseudo-header fields</xref>,
which have a name that starts with a single colon, field names MUST NOT include a
colon (ASCII COLON, 0x3a).
</li>
<li>
A field value MUST NOT contain the zero value (ASCII NUL, 0x0), line feed (ASCII LF,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to repeat a requirement from above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it's also not clear to me why we would allow whitespace inside the field name...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does it say those things?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not. I was confused because of the statement about field values. Sorry.

0xa), or carriage return (ASCII CR, 0xd) at any position.
</li>
<li>
A field value MUST NOT start or end with an ASCII whitespace character (ASCII SP or
HTAB, 0x20 or 0x9).
</li>
</ul>
mnot marked this conversation as resolved.
Show resolved Hide resolved
<t>
A request or response that contains a field that violates any of these conditions MUST
be treated as <xref target="malformed">malformed</xref>.
</t>
<t>
Field values that are not valid according to the definition of the corresponding field
do not cause a request to be <xref target="malformed" format="none">malformed</xref>
except as defined by the processing rules for the field.
</t>
<section anchor="PseudoHeaderFields">
<name>Pseudo-Header Fields</name>
Expand Down Expand Up @@ -3707,18 +3739,19 @@
<section>
<name>Intermediary Encapsulation Attacks</name>
<t>
The HTTP/2 field block encoding allows the expression of names that are not valid field
names in HTTP. Requests or responses containing
invalid field names MUST be treated as <xref target="malformed">malformed</xref>.
An intermediary therefore cannot translate an HTTP/2 request or response containing an
invalid field name into an HTTP/1.1 message.
HPACK permits encoding of field names and values that might be treated as delimiters in
other HTTP versions. An intermediary that translates an HTTP/2 request or response MUST
validate fields according to the rules in <xref target="HttpHeaders"/> roles before
translating a message to another HTTP version. Translating a field that includes invalid
delimiters could be used to cause recipients to incorrectly interpret a message, which
could be exploited by an attacker.
</t>
<t>
Similarly, HTTP/2 allows field values that are not valid. While most of the values
that can be encoded will not alter header field parsing, carriage return (CR, ASCII 0xd),
line feed (LF, ASCII 0xa), and the zero character (NUL, ASCII 0x0) might be exploited by
an attacker if they are translated verbatim. Any request or response that contains a
character not permitted in a field value MUST be treated as <xref target="malformed">malformed</xref>. Valid characters are defined by the <tt>field-content</tt> ABNF rule in <xref target="HTTP" section="5.5"/>.
Lukasa marked this conversation as resolved.
Show resolved Hide resolved
An intermediary can reject fields that contain invalid field names or values for other
reasons, in particular those that do not conform to the HTTP ABNF grammar from <xref
target="HTTP" section="5"/>. Intermediaries that do not perform any validation of fields
other than the minimum required by <xref target="HttpHeaders"/> could forward messages
that contain invalid field names or values.
</t>
</section>
<section>
Expand Down