Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement serialization for ScalarValue::FixedSizeBinary #3943

Merged
merged 8 commits into from
Oct 26, 2022
Merged

Implement serialization for ScalarValue::FixedSizeBinary #3943

merged 8 commits into from
Oct 26, 2022

Conversation

retikulum
Copy link
Contributor

Which issue does this PR close?

Closes #3928.

Rationale for this change

Implement serialization for ScalarValue::FixedSizeBinary for communicating with protobuf.

I added ScalarFixedSizeBinary as message in .proto file. I am not sure about if it is the correct way.

My other concern is one of the test is ScalarValue::FixedSizeBinary(0, None) but what is the length of None?

@@ -760,6 +760,11 @@ message StructValue {
repeated Field fields = 3;
}

message ScalarFixedSizeBinary{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: should we follow the actual definition and put the length as the first field (it shouldn't matter much, except aesthetically but I think it keeps the declaration in here and the struct in scalar value consistent).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for feedback. While I was looking for similar definitons, I came acrros to this and implement it in this way. https://github.com/apache/arrow-datafusion/blob/48f73c6af3b0cc747c38b4a9c7a610f4630e8736/datafusion/proto/proto/datafusion.proto#L711-L714

However, I am open to fix in a way to comply with preferred code style. If this is the only thing that needs to be changed, I will create a commit.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @retikulum and @isidentical
I think this PR is looking good -- it just needs a rebase or merge and resolve the conflicts and it should be good to go

@@ -834,6 +840,7 @@ enum PrimitiveScalarType{

BINARY = 25;
LARGE_BINARY = 26;
FIXED_SIZE_BINARY = 29;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the duplication with PrimitiveScalarType and ScalarValue in 54d2870 so when you are resolving the merge conflict you should be able to just remove this code

datafusion/proto/src/lib.rs Show resolved Hide resolved
return Err(Error::General(
"FixedSizeBinary is not yet implemented".to_owned(),
));
scalar::ScalarValue::FixedSizeBinary(length, val) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@retikulum
Copy link
Contributor Author

Thanks @alamb for review and sorry for lots of commits. These conflicts are great for improving my git/rust skills :)

@alamb
Copy link
Contributor

alamb commented Oct 26, 2022

Thanks @alamb for review and sorry for lots of commits. These conflicts are great for improving my git/rust skills :)

Thank you for putting up with it. It takes some getting used to but clearly you are figuring it out.

@alamb alamb merged commit 4e29835 into apache:master Oct 26, 2022
@retikulum retikulum deleted the 3928_serialization_fixedsizebinary branch October 26, 2022 16:43
@ursabot
Copy link

ursabot commented Oct 26, 2022

Benchmark runs are scheduled for baseline = 97ff345 and contender = 4e29835. 4e29835 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Dandandan pushed a commit to yuuch/arrow-datafusion that referenced this pull request Nov 5, 2022
* Implement serialization for ScalarValue::FixedSizeBinary

* correct test

* fix formatting

* add none test case for non zero length

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* resolve conflict

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement serialization for ScalarValue::FixedSizeBinary
4 participants