Skip to content
This repository has been archived by the owner on Nov 6, 2022. It is now read-only.

http_parser_parse_url crash when random string passed #209

Closed
mobiphil opened this issue Jan 6, 2015 · 9 comments
Closed

http_parser_parse_url crash when random string passed #209

mobiphil opened this issue Jan 6, 2015 · 9 comments

Comments

@mobiphil
Copy link

mobiphil commented Jan 6, 2015

was expecting to get only path UF_PATH retrieved when calling http_parser_parse_url without any schema or "//", instead the function crashes the host app.

it should either fail to fill any of the fields or just fill in path with full string if no schema "//.." specified

@indutny
Copy link
Member

indutny commented Jan 7, 2015

@mobiphil thanks for sharing this with us. May I kindly ask you to provide a test case for our convenience?

@mobiphil
Copy link
Author

mobiphil commented Jan 7, 2015

hi, happy to help, just try:

struct http_parser_url u;
char *url = "test";

http_parser_parse_url(url, strlen(url), 0, &u) ;

the problem is that the processor assumes that there is still input available after the state machine changes from "schema" to next one

jay added a commit to jay/http-parser that referenced this issue Feb 23, 2015
(http_parse_host)
- Check for invalid parameters.
- Cast field offset assignment to proper type.

(http_parser_parse_url)
- Check for invalid parameters.
- Cast field offset assignment to proper type.
- Fail if the schema field was found but the host field was not found.
- Call http_parse_host only if the host field was found.

Prior to this change http_parse_host, which expects the UF_HOST field to
be filled, could be called even when that field was not filled resulting
in a crash. Refer to issue nodejs#209.
@jay
Copy link
Contributor

jay commented Feb 23, 2015

What is happening here is http_parse_host expects the UF_HOST field to be filled but it isn't. Refer to my PR for a fix.

In the case of test you'll see a parsing error and in the case of something like example.com/test you'll also see a parsing error. To parse those would mean to allow no schema. I know libcurl does this in its code but I'm not sure how it plays here. What the OP described will work though because a slash designates the path:

Parsing //, connect 0
Parse ok, result :
        field_set: 0x8, port: 0
        field_data[0]: unset
        field_data[1]: unset
        field_data[2]: unset
        field_data[3]: off: 0, len: 2, part: //
        field_data[4]: unset
        field_data[5]: unset
        field_data[6]: unset

Personally I want to be able to handle something like example.com/test so I will probably do something in my own fork to add a best-guess for schema if the url is missing it.

@mobiphil
Copy link
Author

Thanks for your time for dealing with the issue. Among others I was intending to use the url parser for parsing of the GET command from a http header that, as you may know, does not contain the schema or server. So for a request like http://gliaeuoaeu.com it would issue a http header with "GET / HTTP/1.1." For http://gliaoeuo.com///// will issue a http header with "GET ///// HTTP/1.1" which I think is valid http. I think the url parser should interpret thus the slashes as virtual path elements rather than the double slash from the schema.

@craig65535
Copy link

@mobiphil Why not build a real URL using the value of the Host header before passing it into http_parser_parse_url?

@mobiphil
Copy link
Author

Well, that was what I done, you can imagine. The bug was against the crash. The question is if doing what you advice is hackish or not. Without being perfectionist for sure it is not efficient to just concatenate strings because a library may not implement sthg. correctly. The question is what is the standard/advice described in protocol related documents.

@indutny
Copy link
Member

indutny commented Feb 24, 2015

Wait, hostname is not required. The crash is happening because the http_parser_url should be memset to all zeroes before the process. We require this for every other thing with various _init() functions, but forgot to document this.

Going to send a PR in a bit.

indutny added a commit to indutny/http-parser that referenced this issue Feb 24, 2015
The struct must be zero-initialized, but this wasn't explicitly stated
anywhere in headers. Introduce `http_parser_url_init` API method that
will do it.

Fix nodejs#209
@indutny
Copy link
Member

indutny commented Feb 24, 2015

Should be fixed by #225

@jasnell
Copy link
Member

jasnell commented Oct 26, 2015

Closing. Can reopen if #225 didn't actually solve it

@jasnell jasnell closed this as completed Oct 26, 2015
indutny added a commit that referenced this issue Oct 27, 2015
The struct must be zero-initialized, but this wasn't explicitly stated
anywhere in headers. Introduce `http_parser_url_init` API method that
will do it.

Fixes: #209
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Brian White <mscdex@mscdex.net>
PR-URL: #225
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants