Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from http-parser to llhttp #5155

Closed
PiotrSikora opened this issue Nov 28, 2018 · 68 comments
Closed

Switch from http-parser to llhttp #5155

PiotrSikora opened this issue Nov 28, 2018 · 68 comments
Assignees
Labels
area/security no stalebot Disables stalebot from closing an issue priority/high tech debt

Comments

@PiotrSikora
Copy link
Contributor

Apparently, http_parser is abandonware and node.js started work on a switching to llhttp.

cc @alyssawilk @htuch @mattklein123

@htuch
Copy link
Member

htuch commented Nov 29, 2018

One concern around llhttp is it uses https://github.com/indutny/llparse and a bunch of TypeScript. I don't want to go down the language war path, but it's probably worth considering the increased build complexity (Bazelifying llparser and dependencies) and developer impedance matching when making the call. All things being equal, I'm a fan of higher level solutions to parsing, but switching to this is not totally obvious.

@PiotrSikora
Copy link
Contributor Author

Well, the auto-generated llhttp was just added to node.js as the experimental-http-parser, and node.js is/was the only thing keeping the hand-written http-parser on a life-support, so the decision might be forced upon us sooner rather than later... I don't have a strong opinion until then.

As for the extra build-time dependencies, we just need the generated C code, see: nodejs/llhttp#10.

@PiotrSikora PiotrSikora changed the title Consider switching from http_parser to llhttp Consider switching from http-parser to llhttp Nov 29, 2018
@indutny
Copy link

indutny commented Nov 29, 2018

If I could be of any help, please do not hesitate to ask me directly.

@alyssawilk
Copy link
Contributor

@indutny would you mind filing a tracking bug for the migration over on the node.js so interested envoy developers can follow it? If we opt to migrate we'd probably swap shortly after you, and if we don't it's still important to know when node.js stops using http_parser from a security perspective, so it's good to know either way :-)

@mattklein123
Copy link
Member

My opinion on this is that we should take a wait and see approach. TBH, HTTP/1.1 at this layer is basically done. I don't really see what changes we would need to make.

@indutny
Copy link

indutny commented Nov 29, 2018

@alyssawilk nodejs/node#24730 - This might do.

@mattklein123 mattklein123 removed help wanted Needs help! labels Jan 11, 2019
@stale
Copy link

stale bot commented Feb 10, 2019

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Feb 10, 2019
@stale
Copy link

stale bot commented Feb 17, 2019

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted". Thank you for your contributions.

@stale stale bot closed this as completed Feb 17, 2019
@htuch htuch reopened this Apr 10, 2019
@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Apr 10, 2019
@htuch htuch added the help wanted Needs help! label Apr 10, 2019
@htuch
Copy link
Member

htuch commented Apr 10, 2019

Reopening in light of #6434. Since we didn't get any response from Node.JS security WG when reporting, it seems that http-parser should be considered unsupported. The community did drive a fix when we went public, see nodejs/http-parser#468.

Since http-parser is a codec and hence exposed to untrusted traffic from clients and backends on the data plane, it has a higher standard to live up to than other external dependencies. I would like us to consider only including dependencies that have active security teams that are at least doing CVE management. This is further motivation for moving away from http-parser.

@indutny
Copy link

indutny commented Apr 10, 2019

@htuch I'm sorry about the negative experience that you had. I wish we would have seen your security report on our channels before you submitted that github issue. What channel did you use? Was it HackerOne or the mailing list (security@nodejs.org)? I'm having a hard time finding your report...

I am part of Node.js community and one of the maintainers of http-parser. I believe that in our shared effort we managed to fix the issue very promptly after it was disclosed. Sorry again if this caused you any frustration!

@indutny
Copy link

indutny commented Apr 10, 2019

As a matter of fact, I'm a maintainer of llhttp too 😂

@htuch
Copy link
Member

htuch commented Apr 10, 2019

@indutny we sent two e-mails, one pre and one post disclosure to security@nodejs.org. Happy to forward them again. Is HackerOne preferred? The title was "Vulnerability in http-parser & embedded NULL header handling".

Thanks again for the prompt fix once this went public, the comment above reflects a concern that we didn't have any effective method to coordinate prior to public disclosure.

@indutny
Copy link

indutny commented Apr 10, 2019

Oh no! security@nodejs.org should be forwarding reports to HackerOne. Apparently, something is not working as expected. Again, I'm terribly sorry for this. Will look into it momentarily.

@moderation
Copy link
Contributor

moderation commented Apr 10, 2019

@indutny Will https://hackerone.com/nodejs/hacktivity and https://groups.google.com/group/nodejs-sec be updated retroactively with the CVE?

@htuch it seems like the reporting system would have been or is broken for both http-parser and lhttp (independent of their health status as sub-projects).

@indutny
Copy link

indutny commented Apr 10, 2019

@moderation that's a good question. We can definitely request a CVE for the issue. Let me see what I can do.

@indutny
Copy link

indutny commented Apr 10, 2019

cc @rvagg @jasnell, perhaps?

@reedloden
Copy link

@indutny we sent two e-mails, one pre and one post disclosure to security@nodejs.org. Happy to forward them again. Is HackerOne preferred? The title was "Vulnerability in http-parser & embedded NULL header handling".

Hi, I work for @Hacker0x01 and act as our main liaison to the open source community. Hoping I can assist you.

When you e-mailed security@nodejs.org, you should have received an e-mail back from HackerOne stating "ACTION REQUIRED: Submit Vulnerability Report". Can you check to see if you received that? If so, you'll need to click that link in order to finalize your report submission.

In any case, you can always submit directly to https://hackerone.com/nodejs.

@htuch
Copy link
Member

htuch commented Apr 10, 2019

@reedloden as discussed in nodejs/security-wg#454, this was not received. I'm thinking that this is a process flaw if folks can opt to use e-mail or HackerOne and the e-mail path is not as reliable or requires steps that might result in a dropped report. Would it be better to just drop the e-mail option entirely and have folks use HackerOne directly to avoid potential lost reports?

@jasnell
Copy link

jasnell commented Apr 10, 2019

I haven't been as active in the security triage recently but I don't recall this coming through. My apologies for having missed it.

@htuch
Copy link
Member

htuch commented Nov 9, 2020

I've tagged this as both priority and security in response to the above comment.

@derekargueta
Copy link
Member

I have a "drop-in replacement" branch with everything passing except the new functionality that was introduced via http-parser for allowing transfer-encoding: chunked + content-length: N. There's an llhttp issue for this where I've started a small investigation and the fix seems pretty straightforward: nodejs/llhttp#69

@pallas
Copy link

pallas commented Nov 30, 2020

FYI, nodejs/llhttp#69 was just closed with a fix in v3.0.0

@moderation
Copy link
Contributor

I added a link to the release at the now stale WIP PR at #9033 (comment)

@derekargueta
Copy link
Member

Confirmed all tests pass with the new llhttp update 🎉 reviving the abstractions I initially had for supporting both parsers through a command line flag

@moderation
Copy link
Contributor

Woohoo

@asraa
Copy link
Contributor

asraa commented Jan 19, 2021

I ran some perf tests and got CPU profiles from Derek's branches (direct swap in with llhttp, and an added parser factory and parser class) here: https://gist.github.com/asraa/8babd3516acd447aa0f2ff5aa61d2e19

It doesn't seem to me that there's some overhead in adding the factory apparent above 100k RPS, and the llhttp swap in results in some CPU time differences that IMO is insignificant.

If there's anything I can do to make some of these benchmarks more informative please let me know!

@htuch
Copy link
Member

htuch commented Jan 20, 2021

@asraa looks compelling, i.e. there is little difference. I think the 1M QPS example probably has a ton of error rate reflecting open loop request drops; it might be a good idea when measuring max QPS achievable to switch NH to closed loop mode (at which point latency is not very interesting). I'd also be curious to see if there is any difference for large request, e.g. 1MB body.

@asraa
Copy link
Contributor

asraa commented Jan 22, 2021

Thanks! More benchmarks added.
Those benchmarks were actually closed loop, that appears the default in the nighthawk test client. I ran with --open-loop again though, and added those benchmarks. Again, didn't see a major difference in open loop latency.

You do see a larger error rate (OR llhttp improvement) in the open loop 500k QPS (1M was enough to break nighthawks stats gathering).

@asraa
Copy link
Contributor

asraa commented Jan 29, 2021

BTW -- added an adaptive load test (with nighthawk) to find optimal RPS (based on mean latency <= 5ms and 95%+ success rate)
@derekargueta

- BASELINE http_parser llhttp Factory with http_parser Factory with llhttp
Final RPS 20125 20156 19531 19843

the llhttp boost somewhat counteracts the factory, there's a 1% degradation.

@moderation
Copy link
Contributor

@moderation
Copy link
Contributor

https://nodejs.org/en/blog/release/v12.22.0/

The legacy HTTP parser is runtime deprecated

The legacy HTTP parser, selected by the --http-parser=legacy command line option, is deprecated with the pending End-of-Life of Node.js 10.x (where it is the only HTTP parser implementation provided) at the end of April 2021. It will now warn on use but otherwise continue to function and may be removed in a future Node.js 12.x release.

The default HTTP parser based on llhttp is not affected. By default it is stricter than the now deprecated legacy HTTP parser. If interoperability with HTTP implementations that send invalid HTTP headers is required, the HTTP parser can be started in a less secure mode with the --insecure-http-parser command line option.

@klarose
Copy link
Contributor

klarose commented Jun 23, 2021

@asraa Do you know offhand whether or not this new parser implementation will allow Envoy to support custom methods for http 1.1? I'm having to bypass the http layer of envoy in my cluster for a service whose protocol (over which I have no control) uses non-standard http methods. That is quite unfortunate.

I took a quick gander at the llhttp code, and it looked like it may have a similar restriction to the old parser (i.e. a see explicit lists of http methods e.g. https://github.com/nodejs/llhttp/blob/master/src/llhttp/c-headers.ts#L46), which worries me. I could be wrong, though.

@moderation
Copy link
Contributor

@kanongil
Copy link
Contributor

kanongil commented Nov 2, 2021

@klarose Last I checked llhttp hardcodes the same whitelisted HTTP methods.

@klarose
Copy link
Contributor

klarose commented Nov 2, 2021

@klarose Last I checked llhttp hardcodes the same whitelisted HTTP methods.

Thanks for looking in to it. That's unfortunate. Do you know if there is any way to change that behaviour? It's quite awful for us. We don't have control over the protocol involved here. Having to treat the request as TCP because the envoy http parser cannot handle unknown methods really removes a bunch of its benefits -- we lose the logging, filters, etc that all come with http. All of those things support arbitrary methods. Why not the parser?

There is some discussion here about the meaning of the methods, as well as a registry: https://datatracker.ietf.org/doc/html/rfc7231#section-4. It does say that a method which cannot be handled by a server should be rejected. However, envoy is not an http server; it is a proxy.

It seems to me that the server to which envoy is proxying should utltimately decide whether a method is unsupported.

Perhaps this isn't the proper forum for discussing this, but I feel like it should be taken into consideration when decision to use this parser.

Edit: It also looks like an earlier issue was submitted to envoy, and possibly left to close because of the thought that llhttp would fix the problem: #8247

@alyssawilk
Copy link
Contributor

Google has a similar problem where we need to support methods which http_parser does not handle. cc @yanavlasov maybe link your doc on this thread?

@narendrapatel
Copy link

narendrapatel commented Jan 31, 2022

We also have a requirement to support custom http methods. Support for this would be really helpful.

@alyssawilk
Copy link
Contributor

Closing in favor of #21245

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/security no stalebot Disables stalebot from closing an issue priority/high tech debt
Projects
None yet