Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: construct StructArray w/ FieldRef #4116

Merged
merged 1 commit into from
Apr 26, 2023

Conversation

crepererum
Copy link
Contributor

Which issue does this PR close?

-

Rationale for this change

DataType uses Fields/FieldRef internally. Accepting Field just to wrap it into an Arc is unnecessary expensive, esp. when the Field was cloned from an FieldRef (happens in some non-test code).

I've decided to NOT allow the construction from Field anymore because in prod code this is most likely a performance bug.

What changes are included in this PR?

Change From impls for StructArray to use FieldRef instead of Field.

Are there any user-facing changes?

Breaking: Constructor changes.

@github-actions github-actions bot added arrow Changes to the arrow crate parquet Changes to the parquet crate labels Apr 24, 2023
@crepererum crepererum force-pushed the crepererum/structarray_fieldref branch 4 times, most recently from a21a67b to 182506c Compare April 24, 2023 13:36
@tustvold
Copy link
Contributor

FYI - #4064

@crepererum
Copy link
Contributor Author

FYI - #4064

I would take that one as well, I don't really care.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. Thank you @crepererum

@alamb
Copy link
Contributor

alamb commented Apr 24, 2023

This PR appears to need a cargo fmt to fix the CI

@alamb alamb added the api-change Changes to the arrow API label Apr 24, 2023
@crepererum crepererum force-pushed the crepererum/structarray_fieldref branch from 182506c to 6dd2d27 Compare April 24, 2023 15:10
@alamb
Copy link
Contributor

alamb commented Apr 24, 2023

I think #4064 just needs some tests -- shall we combine efforts?

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to confess to being a bit torn, on the one hand this PR improves the existing StructArray::from API, but on the other-hand these APIs are rather cumbersome and part of me sort of hoped to remove them. I personally think the approach in #4064 is cleaner, more easily discoverable / consistent and is also faster, but I appreciate I am not impartial here.

I'm therefore left wondering if it is better to just encourage people to use the constructor approach, instead of refining what is a somewhat unfortunate API? Appreciate other thoughts on this...

@alamb
Copy link
Contributor

alamb commented Apr 24, 2023

I'm therefore left wondering if it is better to just encourage people to use the constructor approach, instead of refining what is a somewhat unfortunate API? Appreciate other thoughts on this...

If we are going for this I recommend we deprecate the unfortunate API

@alamb
Copy link
Contributor

alamb commented Apr 24, 2023

And we could always do both 🤔

@crepererum
Copy link
Contributor Author

I can also push #4064 over the finish line later this week and close this PR. :)

`DataType` uses `Fields`/`FieldRef` internally. Accepting `Field` just
to wrap it into an `Arc` is unnecessary expensive, esp. when the `Field`
was cloned from an `FieldRef` (happens in some non-test code).

I've decided to NOT allow the construction from `Field` anymore because
in prod code this is most likely a performance bug.
@crepererum crepererum force-pushed the crepererum/structarray_fieldref branch from 6dd2d27 to fa3d4f8 Compare April 26, 2023 09:08
@crepererum
Copy link
Contributor Author

I think this is still worth merging (or we delete the From constructors) since it actually fixes some non-test code.

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My vote would be for removing them, but I guess this is also fine, it is a shame there is no way to deprecate a trait implementation

@alamb alamb merged commit 9cf48c1 into apache:master Apr 26, 2023
@alamb
Copy link
Contributor

alamb commented Apr 26, 2023

How about we propose removing them in a follow on PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change Changes to the arrow API arrow Changes to the arrow crate parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants