-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
draft: feat: Resilient parsing #60
base: master
Are you sure you want to change the base?
Conversation
929a426
to
a5d1ba8
Compare
|
||
data Utf8Result | ||
= Utf8Success !TE.StrictBuilder | ||
| Utf8Failure !TE.UnicodeException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Could it be useful for Utf8Failure
to hold information about what has been successfully parsed so far?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, Utf8Result
isn't exported. It, and parseUtf8Incremental'
, are just used to implement parseUtf8Incremental
.
I don't currently see a use for the parsed prefix in parseUtf8Incremental
, but maybe you've seen one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the details well enough. Was just a thought that maybe including what was successfully parsed so far could be useful to help identify the method name.
Thanks for the PR, @414owen. I'm supportive of this change (although I haven't taken a review pass over it yet). |
This is an attempt to make pinch resilient against bad packets. To do this, I made parsing messages incremental, meaning that the parse can fail before all the packets of the message have been read into memory. I've made decoding the method name incremental, and added the option to decode via the spec verbatim, or to only accept method names that can be produced by the thrift IDL compiler.
This adds protocol limits as optional parameters to the binary and compact protocol smart constructors.
7e69ff1
to
1ac16a7
Compare
This is an attempt to make pinch resilient against bad packets.
To do this, I made parsing messages incremental, meaning that the parse
can fail before all the packets of the message have been read into
memory. This depends on a PR to cereal, which hasn't been merged yet.
I've made decoding the method name incremental. It will now short-circuit the parsing of a message, if the name isn't valid utf-8.
I've added the option to only accept method names that can be produced by the thrift IDL compiler. This is a much more restrictive requirement than we're given by the binary protocol spec, but what you have to remember is that the spec is largely reverse-engineered from implementations anyway:
From the spec:
I've added more comprehensive guards to the sizes of binary, map, set, and list sizes.