Consolidate udf examples (#18142) #18493
High-Level Overview
This PR consolidates all the `udf` examples into a single example binary, run as:
cargo run --example udf -- [adv_udaf|adv_udf|adv_udwf|async_udf|udaf|udf|udtf|udwf]
xudong963
left a comment
Thank you, I checked the original issue and the PR implements the proposal.
Could you please do some verification to see if the PR can reduce the binary size?
Thanks @cj-zhukov
@xudong963 per testing:
On this branch, the example binary is 174M:
cargo run --profile=ci --example udf -- adv_udaf
$ du -h target/ci/examples/udf
174M target/ci/examples/udf
On main, the equivalent 6 binaries are:
cargo run --profile=ci --example advanced_udaf
cargo run --profile=ci --example simple_udaf
cargo run --profile=ci --example advanced_udf
cargo run --profile=ci --example simple_udf
cargo run --profile=ci --example advanced_udwf
cargo run --profile=ci --example simple_udwf
And they each take 173M:
$ du -s -h target/ci/examples/*
173M target/ci/examples/advanced_udaf
173M target/ci/examples/advanced_udf
173M target/ci/examples/advanced_udwf
173M target/ci/examples/simple_udaf
173M target/ci/examples/simple_udf
173M target/ci/examples/simple_udwf
So by my calculations this PR saves 173*6 - 174 = 864MB!
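As a quick check of that estimate, here is a small sketch that plugs in the sizes measured above (six 173M binaries on main, one 174M consolidated binary on this branch). The function name `savings_mb` is only illustrative and not part of the PR.

```rust
/// Disk space saved (in MB) when `count` separate binaries of `each_mb`
/// are replaced by a single consolidated binary of `combined_mb`.
fn savings_mb(each_mb: u64, count: u64, combined_mb: u64) -> u64 {
    (each_mb * count).saturating_sub(combined_mb)
}

fn main() {
    // Sizes reported in this thread for the `ci` profile.
    let saved = savings_mb(173, 6, 174);
    println!("saved: {saved} MB"); // prints "saved: 864 MB"
}
```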
@cj-zhukov -- one thing I noticed is that by consolidating the examples, we are likely no longer actually executing them as part of CI:
datafusion/ci/scripts/rust_example.sh, lines 31 to 35 in 32d2618:
example_name=`basename $filename ".rs"`
# Skip tests that rely on external storage and flight
if [ ! -d $filename ]; then
  cargo run --profile ci --example $example_name
fi
Could you make a new PR that also runs them all? Maybe something like adding an `ExampleKind::All` variant to the examples and then passing in `all` as an argument 🤔
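One way the `ExampleKind::All` idea could look is sketched below. This is a hypothetical illustration, not the code in the PR: the variant names are derived from the subcommand list in the overview, while `expand`, `CONCRETE`, `run`, and the choice to default to `all` when no argument is given are assumptions made for the sketch.

```rust
use std::str::FromStr;

/// Hypothetical selector for the consolidated `udf` example binary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExampleKind {
    AdvUdaf,
    AdvUdf,
    AdvUdwf,
    AsyncUdf,
    Udaf,
    Udf,
    Udtf,
    Udwf,
    /// Runs every concrete example in turn (the suggestion above).
    All,
}

impl ExampleKind {
    /// Every runnable example, i.e. everything except `All`.
    const CONCRETE: [ExampleKind; 8] = [
        ExampleKind::AdvUdaf,
        ExampleKind::AdvUdf,
        ExampleKind::AdvUdwf,
        ExampleKind::AsyncUdf,
        ExampleKind::Udaf,
        ExampleKind::Udf,
        ExampleKind::Udtf,
        ExampleKind::Udwf,
    ];

    /// Expands `All` into the concrete examples; other kinds map to themselves.
    fn expand(self) -> Vec<ExampleKind> {
        match self {
            ExampleKind::All => Self::CONCRETE.to_vec(),
            other => vec![other],
        }
    }
}

impl FromStr for ExampleKind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "adv_udaf" => Ok(Self::AdvUdaf),
            "adv_udf" => Ok(Self::AdvUdf),
            "adv_udwf" => Ok(Self::AdvUdwf),
            "async_udf" => Ok(Self::AsyncUdf),
            "udaf" => Ok(Self::Udaf),
            "udf" => Ok(Self::Udf),
            "udtf" => Ok(Self::Udtf),
            "udwf" => Ok(Self::Udwf),
            "all" => Ok(Self::All),
            other => Err(format!("unknown example: {other}")),
        }
    }
}

/// Placeholder: a real binary would call into the matching example module.
fn run(kind: ExampleKind) {
    println!("running {kind:?}");
}

fn main() {
    // Assumption for this sketch: default to `all` when no argument is given.
    let arg = std::env::args().nth(1).unwrap_or_else(|| "all".to_string());
    let kind: ExampleKind = arg.parse().unwrap_or_else(|e| {
        eprintln!("{e}");
        std::process::exit(1)
    });
    for k in kind.expand() {
        run(k);
    }
}
```

With something along these lines, a CI script could invoke the binary once with `all` instead of needing to know each subcommand name.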
@alamb @xudong963 Thanks a lot for the review and helpful feedback! I’m also planning to open a follow-up PR to update part of the CI in …
martin-g
left a comment
LGTM!
Got it -- thank you. I filed a ticket to track that work: …
## Which issue does this PR close?

- Part of #18142.

## Rationale for this change

This PR consolidates all the `udf` examples into a single example binary. We have agreed on the pattern, and we can apply it to the remaining examples.

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?

---------

Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>