Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow ignoring invalid header lines (fixes #61, #83) #114

Merged
merged 2 commits into from
May 19, 2022

Conversation

nox
Copy link
Contributor

@nox nox commented Apr 22, 2022

No description provided.

@seanmonstar
Copy link
Owner

It'd probably be better to have the separate commits as different PRs, as that would make the review panel a bit clear-er... But I looked at each commit individually, and I think it looks right.

@acfoltzer wanna take a look?

@acfoltzer
Copy link
Contributor

@seanmonstar sure thing, can you tag me as a reviewer so I don't lose track? Should be able to dig in early next week if that's alright

@nox
Copy link
Contributor Author

nox commented Apr 25, 2022

There is an actual bug now in httparse which makes invalid HTTP requests be accepted by the crate, could we hurry to get a new release?

@Coder-256
Copy link

I wonder if the existing behavior is vulnerable to response splitting, e.g. GET / HTTP/1.1\n\n: Foo\n\nBody; if config.allow_spaces_after_header_name_in_responses is enabled, would : Foo be parsed as a header instead of part of the body?

@acfoltzer
Copy link
Contributor

There is an actual bug now in httparse which makes invalid HTTP requests be accepted by the crate, could we hurry to get a new release?

@nox would it be possible to split the invalid HTTP request fix from the new option? That would make it easier to review quickly. I will start taking a look at dcb18ab.

@nox
Copy link
Contributor Author

nox commented Apr 25, 2022

Will do, but there isn't anything in this PR that should be long to review 🤷🏻‍♂️

@nox
Copy link
Contributor Author

nox commented Apr 25, 2022

#115

@nox nox changed the title Fix a bug and add an option to ignore invalid header lines (fixes #61, #83) Allow ignoring invalid header lines (fixes #61, #83) Apr 25, 2022
@nox
Copy link
Contributor Author

nox commented Apr 25, 2022

I wonder if the existing behavior is vulnerable to response splitting, e.g. GET / HTTP/1.1\n\n: Foo\n\nBody; if config.allow_spaces_after_header_name_in_responses is enabled, would : Foo be parsed as a header instead of part of the body?

No, because the check for the first char being a valid header name char is separate, but now that you say this, I just realised I forgot to handle the first char being invalid, in this PR. I guess there was some nasty stuff to review after all hah!

@acfoltzer
Copy link
Contributor

Thank you!

Will do, but there isn't anything in this PR that should be long to review 🤷🏻‍♂️

I want to make sure we have time to carefully understand the changes being made to the parser, and give some of my fuzzers time to chug on the new branch. After all, the impact if we get something wrong is pretty significant.

Copy link
Contributor

@acfoltzer acfoltzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again for spinning off #115, I've left my review over there.

As for the new proposed feature ignore_invalid_header_lines_in_responses in 7244d92, I have some concerns before we move ahead.

Overall thoughts

Beside the two specific areas of concern described below, I'm also wondering whether this level of permissiveness belongs in this API. The interaction with obs-fold was one that happened to occur to me as I read the patch, but it would not surprise me if there are other potential interactions that could lead to undesirable behavior just within the parser, let alone downstream effects from clients or servers observing incomplete sets of headers without awareness of the ignored field(s).

The other options we support for parsers at least have precedent in current or past versions of the HTTP/1 specs, but the proposed behavior steps entirely outside of those. Can you share more about the use case you have in mind for this? Are there clients and servers emitting such messages in practice today? Is this a hedge against data corruption?

Even with a compelling use case, this flag should be very heavily caveated in documentation along with specific descriptions of the cases where its use is required.

Scope of the new option

If we move ahead with adding this flag, the name should be changed and documentation added to more narrowly describe the behavior, or the implementation should expand to also allow ignoring errors that arise during parsing of the header value. In the current implementation, parse failures only lead to an ignored line if they occur at or before the colon separating the name from the value. Other errors that can arise after this point are not ignored. For example, modifying one of the new tests in 7244d92 to put the invalid character in the value does not yield the same behavior:

static RESPONSE_WITH_INVALID_CHAR_IN_HEADER_VALUE: &'static [u8] =
    b"HTTP/1.1 200 OK\r\nAccess-Control-Allow-Credentials: tr\x0Bue\r\nBread: baguette\r\n\r\n";

#[test]
fn test_ignore_header_line_with_invalid_char_in_value() {
    let mut headers = [EMPTY_HEADER; 2];
    let mut response = Response::new(&mut headers[..]);
    let result = ::ParserConfig::default()
        .allow_spaces_after_header_name_in_responses(true)
        .ignore_invalid_header_lines_in_responses(true)
        .parse_response(&mut response, RESPONSE_WITH_INVALID_CHAR_IN_HEADER_VALUE);

    // This assertion fails:
    assert_eq!(result, Ok(Status::Complete(81)));
    assert_eq!(response.version.unwrap(), 1);
    assert_eq!(response.code.unwrap(), 200);
    assert_eq!(response.reason.unwrap(), "OK");
    assert_eq!(response.headers.len(), 1);
    assert_eq!(response.headers[0].name, "Bread");
    assert_eq!(response.headers[0].value, &b"baguette"[..]);
}

Interaction with obs-fold

Ignoring up to the following newline does not mix well with allow_obsolete_multiline_headers_in_responses:

static RESPONSE_WITH_OBSOLETE_LINE_FOLDING_AND_INVALID_NAME: &'static [u8] =
    b"HTTP/1.1 200 OK\r\nLine-Folded-Header\xFF: hello  \r\n \r\n there\r\nBread: baguette\r\n";

/// This test passes; obs-fold is accepted but the invalid header name is not
#[test]
fn test_forbid_response_with_obsolete_line_folding_in_middle_and_invalid_name() {
    let mut headers = [EMPTY_HEADER; 1];
    let mut response = Response::new(&mut headers[..]);
    let result = ::ParserConfig::default()
        .allow_obsolete_multiline_headers_in_responses(true)
        .parse_response(&mut response, RESPONSE_WITH_OBSOLETE_LINE_FOLDING_AND_INVALID_NAME);

    assert_eq!(result, Err(::Error::HeaderName));
}

/// This should pass if we're ignoring the invalid folded line, but does not
#[test]
fn test_ignore_folded_line_with_invalid_name() {
    let mut headers = [EMPTY_HEADER; 1];
    let mut response = Response::new(&mut headers[..]);
    let result = ::ParserConfig::default()
        .allow_obsolete_multiline_headers_in_responses(true)
        .ignore_invalid_header_lines_in_responses(true)
        .parse_response(&mut response, RESPONSE_WITH_OBSOLETE_LINE_FOLDING_AND_INVALID_NAME);

    assert_eq!(
        result,
        Ok(Status::Complete(
            RESPONSE_WITH_OBSOLETE_LINE_FOLDING_IN_MIDDLE.len()
        ))
    );
    assert_eq!(response.version.unwrap(), 1);
    assert_eq!(response.code.unwrap(), 200);
    assert_eq!(response.reason.unwrap(), "OK");
    assert_eq!(response.headers.len(), 1);
    assert_eq!(response.headers[0].name, "Bread");
    assert_eq!(response.headers[0].value, &b"baguette"[..]);
}

I don't believe this would lead to a situation where the second line of a folded line could be confused for a legit field-line all on its own, as a valid folded line must be preceded by RWS whitespace. But it's close enough that I would want to heavily scrutinize any logic that determines when the ignored header line is truly over.

@nox
Copy link
Contributor Author

nox commented Apr 26, 2022

Beside the two specific areas of concern described below, I'm also wondering whether this level of permissiveness belongs in this API.

Yes it does, browsers do that so httparse should to.


Will modify my patch to take care of errors in values.


The interaction with obsolete line folding is a non-issue because additional lines must start with whitespace, and that's never considered a header name. I'll add a test which involves an invalid header that uses obsolete line folding though.

@nox
Copy link
Contributor Author

nox commented Apr 26, 2022

I also noticed yesterday that I didn't take care of header names whose very first char is invalid.

@nox
Copy link
Contributor Author

nox commented Apr 26, 2022

I've studied Chromium's code a bit more and the one thing it rejects is NUL chars. Everything else is ok.

@nox nox force-pushed the ignore-invalid-headers branch 2 times, most recently from cbc6f1f to 67a2898 Compare April 26, 2022 13:54
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Show resolved Hide resolved
@acfoltzer
Copy link
Contributor

acfoltzer commented Apr 26, 2022

Yes it does, browsers do that so httparse should to.

D'oh, of course browsers allow this 😞

The interaction with obsolete line folding is a non-issue because additional lines must start with whitespace, and that's never considered a header name. I'll add a test which involves an invalid header that uses obsolete line folding though.

It looks like the new tests with folding address the problems I had in the original patch, nice!

I've studied Chromium's code a bit more and the one thing it rejects is NUL chars. Everything else is ok.

I see that there's now more of a description of behavior in the doc comment, thank you! I think it'd be worth enumerating a few more details there, like how it determines the end of the line to ignore, and how it interacts with folding and possibly the other options. Not quite to the level of an RFC with ABNF etc, but it would be nice to be very specific about what this allows.

@nox nox force-pushed the ignore-invalid-headers branch 2 times, most recently from 81cf158 to e19ef40 Compare April 27, 2022 08:56
I added a macro `maybe_continue_after_obsolete_line_folding`, I simplified
the check macro used with `next_8`, and I reduced nesting in the
`'whitespace_after_colon` loop.
@nox nox force-pushed the ignore-invalid-headers branch from e19ef40 to a3bd3ae Compare April 27, 2022 08:58
Copy link
Contributor

@acfoltzer acfoltzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple small documentation tweaks, but then the remaining issue is whether to be more defensive about making this only apply to responses as described in #114 (comment)

src/lib.rs Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@nox nox force-pushed the ignore-invalid-headers branch from a3bd3ae to 413475b Compare May 6, 2022 12:47
@nox
Copy link
Contributor Author

nox commented May 6, 2022

@acfoltzer Improved the docs and added a test for request parsing.

@nox nox force-pushed the ignore-invalid-headers branch from 413475b to 808963b Compare May 6, 2022 12:48
Copy link
Contributor

@acfoltzer acfoltzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good to me now, thank you! As a followup, I will try to think about how best to make requests fully configurable without breaking the current response-only flags.

@seanmonstar
Copy link
Owner

Thanks to both of you for working through this! <3

@seanmonstar seanmonstar merged commit 0e7d95c into seanmonstar:master May 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants