Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend ExtractUVarInt to support customizable kMaxVarintLen64 #1757

Conversation

ChinmayaSharma-hue
Copy link
Contributor

@ChinmayaSharma-hue ChinmayaSharma-hue commented Oct 31, 2023

ExtractUVarInt in Binary Decoder currently uses a definite value for kMaxVarintLen64, which can be a problem while parsing the variable encoding in protocols such as MQTT. which has the maximum limit set as 4 bytes. This PR makes a slight modification to support this functionality.

… Decoder

Signed-off-by: Chinmay <chinmaysharma1020@gmail.com>
@ddelnano
Copy link
Member

@ChinmayaSharma-hue thanks for opening this up. After thinking about this a second time, the UVarInt encoding is self delimiting. Meaning that the kMaxVarintLen64 is there to prevent erroneous, larger values. However, that condition isn't relevant for decoding smaller ones.

Rather than changing the implementation of this, I think it would be better if we added constants for the maximum value of a X bit encoded int to detect if we read too much data for the integer size we are interested in (since decoding could consume 10 bytes). If the MQTT parser consumed 5 to 10 bytes, it would indicate the MQTT packet was malformed or wasn't MQTT data at all.

Therefore, I'm thinking that we should define the following constants and that we can have it live within the binary_decoder.h file. We could probably do that all in the #1756 or we can update this to have that constant.

// This is the constant that the MQTT parser would check the UVarInt decoded value against.
// MQTT uses all 7 bits of each of the 4 bytes it encodes, which results in a total of 28 bits
uint64_t kMaxVarint28 = 268435455 // 2^28 - 1;


// mqtt/parser.cc code

// Need to verify length is at least 4 prior to this because too few bytes could be kNeedsMoreData
// (if buf.size() < 4) but would be kInvalid if the UVarInt was actually 5-10 bytes (bogus value)

PX_ASSIGN_OR(auto var_length_header,  binary_decoder->ExtractUVarInt(), return kInvalid);
if (var_length_header > kMaxVarint28) {
  return kInvalid; // MQTT variable length header too large. Frame is malformed or not MQTT
}

Sorry for going back and forth on this, but I had to refresh my memory of the encoding details. I still went through and reviewed the code since I figured those comments would be applicable for your upcoming code changes.

@ChinmayaSharma-hue
Copy link
Contributor Author

ChinmayaSharma-hue commented Oct 31, 2023

This means that I can close this PR and make all the changes in #1756 right? The implementation of ExtractUVarInt remains the same and I should just add a check in the parser to make sure ExtractUVarInt did not consume more than 4 bytes. Am I getting this right?

@ddelnano
Copy link
Member

@ChinmayaSharma-hue yep.

@ChinmayaSharma-hue ChinmayaSharma-hue deleted the update-extractuvarint branch October 31, 2023 16:07
@@ -75,7 +74,7 @@ class BinaryDecoder {
// Extract UVarInt encoded value and return result as uint64_t. The details of this encoding's
// specification can be see in the following link:
// https://cs.opensource.google/go/go/+/refs/tags/go1.20.5:src/encoding/binary/varint.go;l=7-25
StatusOr<uint64_t> ExtractUVarInt() {
StatusOr<uint64_t> ExtractUVarInt(int kMaxVarintLen64 = 10, bool cap_final_byte_val = true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k prefixed variables with camelCase are constants. Since this is a function parameter, it should be snake case and not have a prefix.

We largely follow Google's C++ style guide as mentioned here.

@@ -92,6 +92,43 @@ TEST(BinaryDecoderTest, ExtractUVarInt) {
}
}

TEST(BinaryDecoderTest, ExtractUVarIntLimit) {
std::vector<std::tuple<std::vector<uint8_t>, uint64_t, uint64_t>> uVarIntsWithLimits = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uVarIntsWithLimits should be snake case according to our style guide.

ASSERT_OK_AND_EQ(bin_decoder.ExtractUVarInt(limit, false), expected_val);
}

std::vector<std::tuple<std::vector<uint8_t>, uint64_t>> uVarIntsWithLimitsOverflow = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same casing change as the earlier comment.

@ddelnano
Copy link
Member

Sorry I thought I submitted those comments with my initial comment!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants