Stack-overflow (OSS-Fuzz 4234) #832
I analyzed the testcase. It more or less consists of a lot of nested arrays. On my machine, there is a stack overflow at depth 1450. I am not sure how to cope with such issues. We would have the same problem if we parsed a JSON file like [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[ ... At some point, the parser would explode. Any ideas on this? |
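For reference, a minimal sketch that reproduces the report, assuming the library is consumed as `<nlohmann/json.hpp>`; the exact crash depth depends on platform and stack size:

```cpp
#include <nlohmann/json.hpp>
#include <cstddef>
#include <string>

int main() {
    const std::size_t depth = 1450;        // depth reported above
    std::string input(depth, '[');
    input += std::string(depth, ']');      // "[[[ ... ]]]"
    auto j = nlohmann::json::parse(input); // may overflow the stack
}
```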
The only ways to "fix" this are to impose an arbitrary limit on the level of nesting, or change whatever it is that is overflowing the stack from a recursive algorithm to an iterative algorithm. |
Yes, and imposing such a limit would be OK with RFC 7159 (Sect. 9):

> An implementation may set limits on the maximum depth of nesting.
But I would not like it... The recursion is: parse_cbor_internal -> get_cbor_array -> parse_cbor_internal -> get_cbor_array -> ... |
According to the stack above, it's actually a 4 function recursion, which I think would be extremely difficult to break:
|
Sorry, I simplified it a bit. And it all makes sense when you try to parse a recursive structure with a recursive descent parser... |
I think there is nothing we can do about this. JSON and CBOR are based on recursion, and adding arbitrary limits is not a solution. |
Strangely, I got this message yesterday:
|
Hi,

Could I ask you to reconsider making changes to address this issue? My arguments are below :) While I agree with you that an arbitrary limit is not a proper solution, leaving it as is requires the library user to look for workarounds like: |
A potential solution (which I think might be acceptable) is actually giving the library user a choice, instead of setting an arbitrary limit. For example, there could be a way for the library user to say "I want at most 10 levels of recursion (since I know that in my system there will never be any more than that)". In this case the default could be "no limit". Or, to make things stable/secure by default, it can probably be assumed (citation needed) that in the wild 99.9% of packets have a recursion level of e.g. 50 (which is probably a huge exaggeration already, while still being sane with regards to stack usage), and if the library user needs more, they could specify a different limit (or no limit). Comments / thoughts are welcome :) Cheers, |
You mean something like a preprocessor macro |
Hi Niels,

Originally I was thinking of a preprocessor macro for the default and a run-time parameter to overwrite it, e.g.: |
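A hypothetical sketch of that macro-plus-parameter setup; the names `JSON_DEFAULT_MAX_DEPTH` and `parse` below are illustrative, not part of the library:

```cpp
#include <cstddef>
#include <iostream>

#ifndef JSON_DEFAULT_MAX_DEPTH
#define JSON_DEFAULT_MAX_DEPTH 50  // compile-time default, overridable with -D
#endif

// run-time parameter that overwrites the compile-time default
void parse(const char* input, std::size_t max_depth = JSON_DEFAULT_MAX_DEPTH) {
    std::cout << "parsing with depth limit " << max_depth << '\n';
    // ... a real parser would bump a depth counter on every '[' / '{'
    //     and fail once the counter exceeds max_depth
}

int main() {
    parse("[1,2,3]");      // uses JSON_DEFAULT_MAX_DEPTH
    parse("[1,2,3]", 10);  // caller overwrites the default
}
```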
I'm not fully convinced of adding another default parameter to … That said, on one hand, having a runtime method to overwrite the recursion limit might make some users happy (ones who, like me, like to set the limit as low as possible, probably on a per-payload-scheme basis). On the other hand, I recognize that probably just having the compile-time setting might be enough. Cheers, |
Hello! One idea: get the stack limits from the platform APIs:

```c
getrlimit(RLIMIT_STACK, struct rlimit *);                    /* POSIX */
pthread_getattr_np(pthread_t thread, pthread_attr_t *attr);  /* on Linux */
void WINAPI GetCurrentThreadStackLimits(
    _Out_ PULONG_PTR LowLimit,
    _Out_ PULONG_PTR HighLimit
);                                                           /* on Windows */
```

then regularly measure the remaining stack size (declare a variable on the stack, take its address and subtract its cast to (void *) from the value of the top stack address), and stop parsing when the stack goes under some threshold? |
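A C++ sketch of that measurement trick, assuming POSIX and a downward-growing stack; pointer arithmetic between distinct stack objects is formally unspecified, and every name below is made up for the example:

```cpp
#include <sys/resource.h>
#include <cstddef>

static const char* stack_base = nullptr;  // address captured near the stack top

static bool stack_headroom_ok(std::size_t reserve) {
    char probe;  // its address marks the current stack depth
    std::size_t used = static_cast<std::size_t>(stack_base - &probe);
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return true;                       // limit unknown: don't block parsing
    return used + reserve < rl.rlim_cur;   // keep `reserve` bytes spare
}

int main() {
    char anchor;
    stack_base = &anchor;  // approximate top of the main thread's stack
    // a recursive parser would check stack_headroom_ok(64 * 1024)
    // before each descent and stop parsing once it returns false
}
```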
@JoelStienlet Also, doing it dynamically adds some latency and allows untrusted payloads to be processed longer than required. Still, I like the idea from a technical perspective - a pretty neat trick :) |
Indeed, limiting the maximum number of recursions is simpler, more portable, and dismisses bad packets faster. |
To progress the discussion on this issue a little: are there any arguments against introducing the limits as described? |
I currently have not found the time to propose an implementation. That said, PRs are welcome :) |
Ah, sorry Niels, didn't mean to pressure you :). I was under the impression you wanted to discuss the issue more ("state: please discuss"), so wanted to make sure we progress the discussion. |
Of course I would like to have more discussion on this - but it seems no one else wants to join in. |
Well, if you wanted it to be 'unlimited depth' you could always throw to a trampoline on occasion anyway; if it is not used then it would have no running cost, and if it is used, well, it's better than stack overflowing... |
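For what it's worth, a minimal sketch of the trampoline pattern in C++ (unrelated to the library's actual code): each step returns the next step instead of recursing, so the native stack stays flat:

```cpp
#include <functional>
#include <iostream>

// A step either carries the final result or the next step to run.
struct Step {
    bool done;
    long long result;            // valid when done
    std::function<Step()> next;  // valid when !done
};

// trampolined sum of 1..n: deep "recursion" without native stack growth
Step sum_step(long long n, long long acc) {
    if (n == 0) return {true, acc, {}};
    return {false, 0, [n, acc] { return sum_step(n - 1, acc + n); }};
}

long long trampoline(Step s) {
    while (!s.done) s = s.next();  // each "call" returns before the next starts
    return s.result;
}

int main() {
    std::cout << trampoline(sum_step(1000000, 0)) << '\n';  // 500000500000
}
```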
I think this would be much easier combined with some of the previous discussions about creating a policy class to hold the templates rather than having all of the template parameters. The only other ways I see to do it are adding another parameter to |
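A hypothetical sketch of such a policy class; all names here are illustrative:

```cpp
#include <cstddef>

// bundle the growing set of parser options into one object
struct parser_policy {
    bool        allow_exceptions = true;
    std::size_t max_depth        = 0;  // 0 = unlimited
    // callback, input format, ... would live here too
};

// the parser would then take just an input adapter and a policy:
// template <typename InputAdapter>
// json parse(InputAdapter input, const parser_policy& policy = {});
```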
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Not stale, just awaiting its turn :) |
FYI: The JSON parser in #971 is now non-recursive. It uses a |
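Not the code from #971, but a sketch of the general technique: keeping the nesting state in a heap-allocated container, so depth is bounded by available memory rather than by stack size:

```cpp
#include <string>
#include <vector>

// iteratively checks bracket nesting of a (simplified) JSON text
bool well_nested(const std::string& input) {
    std::vector<char> open;  // explicit stack on the heap
    for (char c : input) {
        if (c == '[' || c == '{') {
            open.push_back(c);
        } else if (c == ']' || c == '}') {
            char expected = (c == ']') ? '[' : '{';
            if (open.empty() || open.back() != expected) return false;
            open.pop_back();
        }
    }
    return open.empty();  // no call-stack frame was ever consumed per level
}
```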
So basically there are some options on this, once we have the SAX parser merged:

1. Leave the core parser unchanged and let users enforce a depth limit themselves via the SAX interface.
2. Let the parser enforce a maximum nesting depth, configured either (i) at runtime via a parser parameter or (ii) at compile time via a preprocessor macro. |
Option 1 is already possible. Option 2(i) is another argument to rework the parser function interface (see also #971), because the parameters are piling up there. Maybe it is really time to put all parsing parameters into a policy class and only pass an input adapter and the policy to the parser. I think this is what @gregmarr proposed earlier. Option 2(ii) is much simpler, but would then require re-compilation if the depth should be adjusted. I would be happy to discuss this. I am removing the "wontfix" badge, because this should be addressed in the next feature release. |
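To illustrate option 1, a sketch of a depth-limiting SAX handler; it follows the SAX interface as documented for later releases, so treat the exact method names and signatures as version-dependent assumptions:

```cpp
#include <nlohmann/json.hpp>
#include <cstddef>
#include <string>

using json = nlohmann::json;

struct depth_limited_sax {
    std::size_t depth = 0;
    std::size_t max_depth;  // the user-chosen limit

    explicit depth_limited_sax(std::size_t limit) : max_depth(limit) {}

    // returning false from any event aborts parsing
    bool descend() { return ++depth <= max_depth; }

    bool start_object(std::size_t) { return descend(); }
    bool start_array(std::size_t)  { return descend(); }
    bool end_object() { --depth; return true; }
    bool end_array()  { --depth; return true; }

    // all other events just continue
    bool null() { return true; }
    bool boolean(bool) { return true; }
    bool number_integer(json::number_integer_t) { return true; }
    bool number_unsigned(json::number_unsigned_t) { return true; }
    bool number_float(json::number_float_t, const std::string&) { return true; }
    bool string(std::string&) { return true; }
    bool binary(json::binary_t&) { return true; }  // required by newer releases
    bool key(std::string&) { return true; }
    bool parse_error(std::size_t, const std::string&, const json::exception&) { return false; }
};

int main() {
    depth_limited_sax handler{50};
    // returns false when the nesting exceeds the limit
    bool ok = json::sax_parse("[[[[1]]]]", &handler);
    (void)ok;
}
```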
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This concrete issue is fixed with #971. |
As discussed in #1419 (comment), the recursive model of JSON values can lead to stack overflow during destruction of deeply nested objects. These are the options: |
Personally, I prefer option (4), which doesn't impose any limitations on the user and doesn't introduce any settings (e.g. max_depth) that the user should be concerned about. Regarding the stack overflow during destruction of deeply nested objects (#1419), I believe we can iteratively free up our JSON object tree and not rely on automatic destruction. I have done some experiments as a proof of concept and the results seem promising. To continue, I wanted to know if the |
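A sketch of that iterative teardown idea (not the actual experiment): children are moved into a flat worklist before each node is destroyed, so no destructor recurses, however deep the tree:

```cpp
#include <memory>
#include <utility>
#include <vector>

struct node {
    std::vector<std::unique_ptr<node>> children;
};

void destroy_iteratively(std::unique_ptr<node> root) {
    std::vector<std::unique_ptr<node>> worklist;
    worklist.push_back(std::move(root));
    while (!worklist.empty()) {
        std::unique_ptr<node> current = std::move(worklist.back());
        worklist.pop_back();
        for (auto& child : current->children)  // flatten one level
            worklist.push_back(std::move(child));
        current->children.clear();
        // `current` is destroyed here with no nested children left
    }
}

int main() {
    // a 100000-level chain: plain unique_ptr destruction would
    // recurse once per level and could overflow the stack
    auto root = std::make_unique<node>();
    node* tip = root.get();
    for (int i = 0; i < 100000; ++i) {
        tip->children.push_back(std::make_unique<node>());
        tip = tip->children.back().get();
    }
    destroy_iteratively(std::move(root));
}
```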
To quickly answer your question: yes. There have been some discussions earlier about separating the implementation of the values (using |
clusterfuzz-testcase-minimized-5854339613589504.dms.zip
Stack trace: