-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trial parsing is not the most efficient #474
Comments
MoQ's message encoding is similar to QUIC's, but the difference is that QUIC datagrams have a finite size and are not susceptible to a trickle attack. I've implemented two types of MoQ parsers:
The async parser is easy in some languages (JavaScript, Go). There could be excessive wake ups if the stream is written one byte at a time but I that's a fundamental QUIC problem (userspace wakeup). My trial parser (Rust) returns a You could implement a state machine manually too, but a trial parser will be fine if you don't allocate on the heap. I'm on the fence if we should length prefix or not. I've implemented both and think streaming is easier in general across multiple languages, but I do think this decoding gotcha is real. |
I wrote a trial parser in C that also returns the min number of additional bytes needed, like your Rust parser, but there can be a big difference between the min number and the max number: for a varint, 1 byte versus 8 bytes; for a TLV, 2 bytes for 2 varints versus 8+8+length. That's how I came to the estimate O(number_of_components^2). I am also on the fence, as I said in the top post. The simplest mitigation would be to limit the number of parameters allowed in Setup or Subscribe parameters. Currently, the lists contain one or two values, but given greasing the max is effectively unlimited. |
This sounds like what HTTP/3 frame and stream parsing already has to deal with, or am I overlooking something unique to MoQT? |
I would like to put a limit on number of parameters per version. For the draft-04 it is 2. If in the future drafts we add more, we can increase the max as needed. Any one using more than max for a given version will result in protocol error. |
H3 has an upfront frame length, so you can avoid dribbling byte/reparsing attacks |
OK, I think this is a real problem then. +1 to fixing it |
+1 on we should fix this. My proposal would be a that message other than object have a one new length field at start of message and object have a length of all the headers not including the object data plus just before the object data the length of the object payload |
A size prefix can be annoying for encoding because you need* to buffer messages to compute their total size. In TLS it's okay since the size is a uint32 and you can populate it afterwards. For MoQ it's more annoying since the size itself is a VarInt and takes up 1-8 bytes. I think a size prefix for each message is fine so long as we don't nest messages like TLS. That means no upfront size for params, although a count is fine. Or we can save a byte and have params continue until the end of a message. |
For control messages on bidi streams, we could just use fixed-size 32-bit length prefix, e.g.:
Object messages shouldn't really be called messages anyways, those are effectively data stream headers. |
I think we keep it a VarInt. If an encoder wants to use a fixed size length, it can use a 30-bit varint (first two bits are I would instead allocate/reuse a buffer for each message with 8 bytes at the start, skipping the first N bytes based on the final varint size. It's a little more complicated but worth the savings since most of these messages will need a 1 byte size anyway. I just really want to avoid nested messages like MP4 and TLS, where the majority of the bytes used are 32-bit id/size pairs. It's both a waste and annoying to encode |
+1 luke |
I believe this was fixed by #520 |
The MoQ transport messages are encoded as a combination of varints, bit strings, and t-l-v objects. In the absence of a "header length" field, applications have to resort to trial parsing: accumulate some incoming bytes from the stream, try to parse those bytes as a message, and if that fail accumulate more bytes and retry. This may not be a huge problem in normal operation. The peer endpoint will compose a message and typically send those bytes all at once; the QUIC stack will deliver them all at once, but even so there may be cases where the QUIC stack cannot send them in a single STREAM_DATA messages, so the logic for "accumulate and try" has to be present.
Suppose a peer sending its data one byte at a time, because it is either malicious or buggy. The length of the message cannot be computed before all components are parsed: the first byte of each varint defines its length, the length field of bits or bytes strings need to be parsed in full, then number of parameters or versions needs to be known. To obtain the length of a component, all components before it need to be parsed, which means that the number of trial parsing is typically a function of the number of components -- maybe an O(N2) function since the cost of each trial depends on the number of components before it. That number can be very large for messages including parameters, because the number of parameters cannot be known in advance -- the peer could send an arbitrary number of them in an attempt to "greasing". Parsing is a somewhat CPU intensive operation, which implies that trial parsing can be exploited for a mild DOS attack.
Prepend a length field to messages would mitigate trial parsing, but I am not sure that we need to fix this. Having limits on message size would impose a de-facto limit on the processing cost of each message. A malicious attackers could send series of short messages such as "subscribe updates" to trigger the same CPU consumption. In typical QUIC stacks, the main CPU load comes form passing messages through the OS/app boundary in socket calls, and an attacker could DOS the target by sending series of short messages. The next component is the cost of encryption and decryption, compared to which the cost of trial decryption is probably small. We may decide to do nothing, but it would be nice in that case to have a documentation of the reasoning (as in this issue), and maybe a discussion of why the attack does not matter in the security section.
Addressing the "limits" issue (#473) would also somewhat mitigate the problem, as limiting the "number of parameters" appears to be the main way to increase the number of components in messages.
The text was updated successfully, but these errors were encountered: