Segfault when parsing with TOML_EXCEPTIONS=0
#65
Comments
Interesting. The test suite does run builds with I'll investigate on my machine all the same, but every bit helps.
Oh, you know what, scratch that! I see what the issue is immediately. The core parse loop doesn't account for the possibility of a comment or malformed UTF-8 right at EOF. Should be easy to fix, thanks.
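To make the failure mode described above concrete, here is a minimal, self-contained sketch of a parse loop that stops as soon as the reader fails inside a comment. It is not the toml++ implementation: `Reader`, its `advance()` member, and this `parse_document` are hypothetical stand-ins, and the UTF-8 check only looks at sequence length and continuation bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a UTF-8 reader (not the toml++ one).
struct Reader {
    std::string_view src;
    std::size_t pos = 0;
    bool error = false;

    bool at_eof() const { return pos >= src.size(); }

    char current() const {
        assert(!at_eof()); // callers must check at_eof() first
        return src[pos];
    }

    // Advances past one code point. Returns false at EOF or on a malformed
    // or truncated sequence; `error` is set only in the malformed case.
    bool advance() {
        if (error || at_eof()) return false;
        const auto lead = static_cast<unsigned char>(src[pos]);
        std::size_t len = 0;
        if (lead < 0x80) len = 1;
        else if ((lead & 0xE0) == 0xC0) len = 2;
        else if ((lead & 0xF0) == 0xE0) len = 3;
        else if ((lead & 0xF8) == 0xF0) len = 4;
        else { error = true; return false; }
        if (pos + len > src.size()) { error = true; return false; }
        for (std::size_t i = 1; i < len; ++i) {
            const auto c = static_cast<unsigned char>(src[pos + i]);
            if ((c & 0xC0) != 0x80) { error = true; return false; }
        }
        pos += len;
        return true;
    }
};

// Skips a '#' comment up to (not including) the newline. Returns false if
// the reader failed part-way through, so the caller can stop immediately.
bool consume_comment(Reader& r) {
    if (r.at_eof() || r.current() != '#') return true; // no comment here
    while (!r.at_eof() && r.current() != '\n') {
        if (!r.advance()) return false; // malformed UTF-8 inside the comment
    }
    return true;
}

// Returns true if the whole input was consumed without a reader error.
bool parse_document(std::string_view text) {
    Reader r{text};
    while (!r.at_eof()) {
        if (r.current() == '#') {
            if (!consume_comment(r)) return false; // propagate, do not keep parsing
        } else if (!r.advance()) {
            return false;
        }
    }
    return !r.error;
}
```

In this sketch, `parse_document("#\xf1\x63")` returns `false` rather than continuing past the failed read, because the `0xF1` lead byte announces a four-byte sequence that the input cannot supply. Without exceptions, every call that can fail has to be checked explicitly in this way.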
Another input resulting in a null pointer dereference: I should mention that this happens after applying the 6255dd7 fix.
Fixed, thanks @sneves! Since you seem to enjoy digging a little bit, I'll explain what was actually happening here.
Adding a single Worth noting that all this complexity evaporates if the library is built with exceptions, so if you can use them I highly recommend it (especially since all the explicit error-handling code disappearing means the parser is a decent bit faster). There's now a unit test to catch this, too: tomlplusplus/tests/user_feedback.cpp, lines 91 to 102 in 6255dd7.
Yup, that will be the same issue.
Oh 😓
These are pretty weird inputs @sneves; out of curiosity, where are you getting them from? Are you just making them up to stress-test the library (which I appreciate!), or is there some example garbage TOML you're plucking it from?
Alright, should be good now. I also audited the rest of the This is good stuff, keep it coming!
These are inputs from fuzzing; they're not "natural" inputs.
Ah, makes sense. Fuzzing has been on my to-do list for a while; next time I get a batch of more serious
Against the latest I managed to fix it by adding
Ah, thanks. I'm heading out for the night but will have a look into this tomorrow (that and any other broken inputs you find).
A couple more cases:
- assert_not_error();
+ return_if_error({});
@@ -9911,10 +9912,6 @@ TOML_IMPL_NAMESPACE_START
else if (!is_match(*prev, U'e', U'E'))
set_error_and_return_default("expected exponent digit, saw '"sv, to_sv(*cp), "'"sv);
}
- else if (length == sizeof(chars))
- set_error_and_return_default(
- "exceeds maximum length of "sv, static_cast<uint64_t>(sizeof(chars)), " characters"sv
- );
else if (is_decimal_digit(*cp))
{
if (!seen_decimal)
@@ -9928,6 +9925,11 @@ TOML_IMPL_NAMESPACE_START
else
set_error_and_return_default("expected decimal digit, saw '"sv, to_sv(*cp), "'"sv);
+ if (length == sizeof(chars))
+ set_error_and_return_default(
+ "exceeds maximum length of "sv, static_cast<uint64_t>(sizeof(chars)), " characters"sv
+ );
+
chars[length++] = static_cast<char>(cp->bytes[0]);
prev = cp;
advance_and_return_if_error({});
Other than this, nothing's popped up for quite a while, so I think the low-hanging fruit may be exhausted.
PS: As before, I can't swear by the correctness of these fixes, only that they made the immediate problem go away.
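As far as can be read from the second hunk of the diff above, the length check moves out of the `else if` chain and sits directly before the write into `chars`, so no branch can reach the write without passing it. The sketch below shows that pattern in isolation. It is an illustration only: `collect_digits` and `max_chars` are hypothetical names, and the digit rules are far simpler than those of a real number parser.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

constexpr std::size_t max_chars = 8; // hypothetical buffer size

// Copies the decimal digits of `input` into a fixed-size buffer, skipping
// '_' separators. Returns nullopt on a non-digit or when the buffer is full.
std::optional<std::string> collect_digits(std::string_view input) {
    char chars[max_chars];
    std::size_t length = 0;

    for (char c : input) {
        if (c == '_') continue;                      // separators are not stored
        if (c < '0' || c > '9') return std::nullopt; // not a decimal digit

        // The capacity check sits immediately before the write, so every
        // path that stores a character is guarded by it.
        if (length == sizeof(chars)) return std::nullopt;

        chars[length++] = c;
    }
    return std::string(chars, length);
}
```

With `max_chars` set to 8, an input of eight digits fits and a ninth digit is rejected instead of being written past the end of the buffer.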
Fantastic, thanks @sneves! I've applied fixes for these in master.
Your fixes are sound. The parser is deliberately procedural and "C-like" so there's not a lot of hidden voodoo that goes into fixes and modifications.
A very simple case, once again: parsing "#\xf1\x63" when disabling exceptions. The crash occurs in parser::parse_document when trying to continue parsing after reader.read_next() returns nullptr.
I'm not entirely sure why it happens; it looks like consume_comment should return false on advance_and_return_if_error({}), but somehow it does return true instead.