refactor(streaming): use key encoding in SerializedKey #5596
Codecov Report
@@ Coverage Diff @@
## main #5596 +/- ##
==========================================
+ Coverage 74.24% 74.32% +0.08%
==========================================
Files 915 907 -8
Lines 143288 143033 -255
==========================================
- Hits 106379 106308 -71
+ Misses 36909 36725 -184
I'm afraid that requires key encoding, a.k.a. memcomparable (I wrote this in #5526). Key encoding ensures the equality checking (…
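As a general illustration of why a hash key may want a key (memcomparable) encoding rather than a raw value encoding, consider `f64`: `0.0` and `-0.0` compare equal but have different bit patterns, so a byte-wise comparison of the raw bits treats them as different keys. The sketch below is an assumption-laden toy, not RisingWave's actual encoder: the function names are hypothetical, and NaN handling is deliberately left out.

```rust
/// Raw "value"-style encoding: just the IEEE 754 bits, little-endian.
/// Hypothetical helper for illustration only.
fn encode_f64_value(v: f64) -> [u8; 8] {
    v.to_le_bytes()
}

/// Key-style (memcomparable) encoding sketch. Hypothetical helper.
/// NaN is not handled here; a real encoder would canonicalize it too.
fn encode_f64_key(v: f64) -> [u8; 8] {
    // Normalize so that values which compare equal share one bit pattern:
    // `-0.0 == 0.0` is true, so negative zero is replaced by positive zero.
    let v = if v == 0.0 { 0.0 } else { v };
    let bits = v.to_bits();
    // Order-preserving transform: flip every bit of a negative number,
    // and only the sign bit of a non-negative one.
    let ordered = if bits >> 63 == 1 { !bits } else { bits ^ (1 << 63) };
    // Big-endian layout makes lexicographic byte order match numeric order.
    ordered.to_be_bytes()
}
```

With this sketch, `encode_f64_key(0.0)` and `encode_f64_key(-0.0)` produce identical bytes, while `encode_f64_value` does not, and the key bytes of ordinary finite values sort in numeric order.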
src/common/src/hash/key.rs
Outdated
@@ -443,12 +456,12 @@ impl<'a> HashKeySerDe<'a> for StructRef<'a> {
     type S = Vec<u8>;

     /// This should never be called
-    fn serialize(self) -> Self::S {
+    fn fixed_size_serialize(self) -> Self::S {
         todo!()
Maybe out of the scope of this PR, but this code is a real anti-pattern 😇 If, as the comment says, "this should never be called", then StructRef should not implement HashKeySerDe.
This may need further investigation on how to refactor.
What's more, the trait bound D: HashKeySerDe in HashKeySerializer::append and HashKeyDeserializer::deserialize makes it feel as if HashKeySerDe is the only encoding that should be used, hence the comment on HashKeySerDe.
Let's open another issue after this PR is merged.
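To make the concern in this thread concrete, here is one possible shape such a refactor could take, offered only as a sketch and not as the project's plan. The trait `FixedSizeSerDe`, the type `FixedSizeKeySerializer`, the stand-in `StructValue`, and the helper `read_at` are all hypothetical names. The idea is that a composite type simply has no impl, so misuse becomes a compile error instead of a runtime `todo!()`.

```rust
/// Hypothetical trait, implemented only by types with a fixed-width encoding.
trait FixedSizeSerDe: Sized {
    /// Number of bytes the encoding always occupies.
    const SIZE: usize;
    fn fixed_size_serialize(&self) -> Vec<u8>;
    fn fixed_size_deserialize(bytes: &[u8]) -> Self;
}

impl FixedSizeSerDe for i32 {
    const SIZE: usize = 4;

    fn fixed_size_serialize(&self) -> Vec<u8> {
        self.to_be_bytes().to_vec()
    }

    fn fixed_size_deserialize(bytes: &[u8]) -> Self {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&bytes[..4]);
        i32::from_be_bytes(buf)
    }
}

/// Stand-in for a composite value such as a struct. It deliberately has no
/// `FixedSizeSerDe` impl, so `append(&StructValue(..))` would not compile.
#[allow(dead_code)]
struct StructValue(Vec<i32>);

/// Hypothetical serializer whose `append` only accepts fixed-size types.
#[derive(Default)]
struct FixedSizeKeySerializer {
    buf: Vec<u8>,
}

impl FixedSizeKeySerializer {
    fn append<D: FixedSizeSerDe>(&mut self, v: &D) {
        self.buf.extend_from_slice(&v.fixed_size_serialize());
    }
}

/// Read the `index`-th value, assuming every value in `buf` has type `D`.
fn read_at<D: FixedSizeSerDe>(buf: &[u8], index: usize) -> D {
    let start = index * D::SIZE;
    D::fixed_size_deserialize(&buf[start..start + D::SIZE])
}
```

Variable-size types would then need a separate path, which is the part the reviewers suggest investigating in a follow-up issue.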
Didn't realize that. I was just thinking about performance 😢
LGTM
LGTM
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
As title.
Because value encoding requires DataType in deserialization, the types of the hash key need to be known in deserialize_to_builders. I have two choices: pass in &[DataType] from the caller, or infer the type from ArrayBuilderImpl's variant. I chose the former, because:
- The Vec<DataType> stored in the caller can be reused instead of creating new instances.
- Mapping ArrayBuilderImpl's variant to DataType involves more code change, including the for_all_variant macro and all related macros, plus one more enum dispatch via macro, which makes the code harder to read.
- DataType has some fields in its variants (such as list or struct, although they don't seem to be used), so it is very troublesome to derive them from ArrayBuilderImpl. We would need to pass the complete DataType into the builder only for the purpose of deserialization, which is not really necessary.
Checklist
./risedev check (or alias, ./risedev c)
Documentation
If your pull request contains user-facing changes, please specify the types of the changes, and create a release note. Otherwise, please feel free to remove this section.
Types of user-facing changes
Please keep the types that apply to your changes, and remove those that do not apply.
Release note
Please create a release note for your changes. In the release note, focus on the impact on users, and mention the environment or conditions where the impact may occur.
Refer to a related PR or issue link (optional)
Closes #5526
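The design choice in the description (passing `&[DataType]` from the caller into `deserialize_to_builders`) can be sketched roughly as follows. This is a simplified stand-in, not the code in this PR: `DataType` and `ArrayBuilderImpl` are reduced to two variants each, the byte layout is a toy value encoding (4-byte little-endian integers, length-prefixed strings), nulls are ignored, and the helpers `take` and `take_u32` are hypothetical. The point it shows is that the bytes are not self-describing, so the decoder needs the type list to know how to read each field.

```rust
/// Simplified stand-in for the real `DataType`.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Varchar,
}

/// Simplified stand-in for the real `ArrayBuilderImpl`.
#[derive(Debug, PartialEq)]
enum ArrayBuilderImpl {
    Int32(Vec<i32>),
    Utf8(Vec<String>),
}

/// Split `n` bytes off the front of `key` and advance it (hypothetical helper).
fn take<'a>(key: &mut &'a [u8], n: usize) -> &'a [u8] {
    let (head, rest) = key.split_at(n);
    *key = rest;
    head
}

/// Read a little-endian `u32` from the front of `key` (hypothetical helper).
fn take_u32(key: &mut &[u8]) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(take(key, 4));
    u32::from_le_bytes(buf)
}

/// Decode one serialized key into the column builders. The caller supplies
/// `data_types`, so nothing has to be inferred from the builder variants.
fn deserialize_to_builders(
    mut key: &[u8],
    data_types: &[DataType],
    builders: &mut [ArrayBuilderImpl],
) {
    for (ty, builder) in data_types.iter().zip(builders.iter_mut()) {
        match (ty, builder) {
            (DataType::Int32, ArrayBuilderImpl::Int32(col)) => {
                col.push(take_u32(&mut key) as i32);
            }
            (DataType::Varchar, ArrayBuilderImpl::Utf8(col)) => {
                // Toy layout: 4-byte length prefix, then UTF-8 bytes.
                let len = take_u32(&mut key) as usize;
                let text = take(&mut key, len).to_vec();
                col.push(String::from_utf8(text).expect("valid UTF-8"));
            }
            (ty, builder) => panic!("type {:?} does not match builder {:?}", ty, builder),
        }
    }
}
```

A caller that already holds a `Vec<DataType>` for the key columns can pass it as a slice on every call, which matches the first reason given in the description.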