@cj-zhukov
Contributor

Which issue does this PR close?

Rationale for this change

This PR consolidates all the udf examples into a single example binary. We have agreed on the pattern and can apply it to the remaining examples.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@cj-zhukov
Contributor Author

High-Level Overview

This PR consolidates all udf (simple, advanced, async) examples into a single example binary.
Previously, each example had its own file, but now they can be executed via subcommands using:

cargo run --example udf -- [adv_udaf|adv_udf|adv_udwf|async_udf|udaf|udf|udtf|udwf]
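
For readers unfamiliar with the pattern, here is a minimal sketch of how a single example binary might map the subcommand names above to individual examples. It is an illustration only, not the code in this PR: the `run` function is a hypothetical placeholder, and the variant names are assumed from the CLI names in the command above.

```rust
use std::env;
use std::str::FromStr;

/// Hypothetical enum of subcommands; the variants mirror the CLI names listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExampleKind {
    AdvUdaf,
    AdvUdf,
    AdvUdwf,
    AsyncUdf,
    Udaf,
    Udf,
    Udtf,
    Udwf,
}

impl FromStr for ExampleKind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "adv_udaf" => Ok(Self::AdvUdaf),
            "adv_udf" => Ok(Self::AdvUdf),
            "adv_udwf" => Ok(Self::AdvUdwf),
            "async_udf" => Ok(Self::AsyncUdf),
            "udaf" => Ok(Self::Udaf),
            "udf" => Ok(Self::Udf),
            "udtf" => Ok(Self::Udtf),
            "udwf" => Ok(Self::Udwf),
            other => Err(format!("unknown example: {other}")),
        }
    }
}

/// Placeholder for the real example bodies (assumption: each would live in its own module).
fn run(kind: ExampleKind) -> String {
    format!("running example {kind:?}")
}

fn main() {
    // The first argument after `--` selects the example to run.
    match env::args().nth(1) {
        Some(arg) => match arg.parse::<ExampleKind>() {
            Ok(kind) => println!("{}", run(kind)),
            Err(e) => eprintln!("{e}"),
        },
        None => eprintln!("usage: udf <example name>"),
    }
}
```

Because every example is compiled into one binary, the shared dependencies are linked once instead of once per example file.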

Member

@xudong963 xudong963 left a comment

Thank you, I checked the original issue and the PR implements the proposal.

Could you please do some verification to see if the PR can reduce the binary size?

Contributor

@alamb alamb left a comment

Thanks @cj-zhukov

@xudong963 per testing:

on this branch, the example binary is 174M

cargo run --profile=ci --example udf -- adv_udaf
$ du -h target/ci/examples/udf
174M	target/ci/examples/udf

On main, the equivalent 6 binaries are:

cargo run --profile=ci --example advanced_udaf 
cargo run --profile=ci --example simple_udaf 
cargo run --profile=ci --example advanced_udf 
cargo run --profile=ci --example simple_udf
cargo run --profile=ci --example advanced_udwf 
cargo run --profile=ci --example simple_udwf

And they each take 173M

$ du -s -h target/ci/examples/*
173M	target/ci/examples/advanced_udaf
173M	target/ci/examples/advanced_udf
173M	target/ci/examples/advanced_udwf
173M	target/ci/examples/simple_udaf
173M	target/ci/examples/simple_udf
173M	target/ci/examples/simple_udwf

So by my calculations this PR saves 173*6 - 174 = 864MB!

@cj-zhukov -- one thing I noticed is that by consolidating the examples, we likely no longer execute them as part of CI:

example_name=`basename $filename ".rs"`
# Skip tests that rely on external storage and flight
if [ ! -d $filename ]; then
cargo run --profile ci --example $example_name
fi

Could you make a new PR that also runs them all? Maybe something like adding an ExampleKind::All variant to the examples and then passing in all as an argument 🤔
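
A rough sketch of what that suggestion could look like, under stated assumptions: the variant set is trimmed to three for brevity, and `parse`, `expand`, and `run_one` are hypothetical helpers, not names from the DataFusion codebase. The idea is that `all` expands to every concrete example, so a single CI invocation exercises each one.

```rust
/// Hypothetical, trimmed-down version of the example selector with an `All` variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExampleKind {
    Udf,
    Udaf,
    Udwf,
    All,
}

impl ExampleKind {
    /// Every concrete example, in the order `All` should run them.
    const CONCRETE: [ExampleKind; 3] = [Self::Udf, Self::Udaf, Self::Udwf];

    /// Map a CLI argument to a variant; returns `None` for unknown names.
    fn parse(arg: &str) -> Option<Self> {
        match arg {
            "udf" => Some(Self::Udf),
            "udaf" => Some(Self::Udaf),
            "udwf" => Some(Self::Udwf),
            "all" => Some(Self::All),
            _ => None,
        }
    }

    /// Expand `All` into the concrete examples; any other variant yields itself.
    fn expand(self) -> Vec<ExampleKind> {
        match self {
            Self::All => Self::CONCRETE.to_vec(),
            one => vec![one],
        }
    }
}

/// Placeholder for running a single example (assumption: the real ones are fallible).
fn run_one(kind: ExampleKind) -> Result<(), String> {
    println!("running example {kind:?}");
    Ok(())
}

fn main() -> Result<(), String> {
    // Defaulting to "all" when no argument is given is a choice made for this sketch only.
    let arg = std::env::args().nth(1).unwrap_or_else(|| "all".to_string());
    let kind = ExampleKind::parse(&arg).ok_or_else(|| format!("unknown example: {arg}"))?;
    for k in kind.expand() {
        // Stop at the first failure so CI reports a non-zero exit code.
        run_one(k)?;
    }
    Ok(())
}
```

With something along these lines, the CI script could invoke the consolidated binary once with `all` instead of once per example file.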

@cj-zhukov
Contributor Author

@alamb @xudong963 Thanks a lot for the review and helpful feedback!
I’ll continue with consolidating the remaining examples next.

I’m also planning to open a follow-up PR to update part of the CI in rust_example.sh - this was originally pointed out by @Jefffrey in this comment

Member

@martin-g martin-g left a comment

LGTM!

@alamb
Contributor

alamb commented Nov 6, 2025

> @alamb @xudong963 Thanks a lot for the review and helpful feedback! I’ll continue with consolidating the remaining examples next.
>
> I’m also planning to open a follow-up PR to update part of the CI in rust_example.sh - this was originally pointed out by @Jefffrey in this comment

Got it -- thank you. I filed a ticket to track that work:

@alamb alamb added this pull request to the merge queue Nov 6, 2025
Merged via the queue into apache:main with commit a5eb912 Nov 6, 2025
28 checks passed
@cj-zhukov cj-zhukov deleted the cj-zhukov/consolidate-examples-udf branch November 6, 2025 12:38
codetyri0n pushed a commit to codetyri0n/datafusion that referenced this pull request Nov 11, 2025
## Which issue does this PR close?

- Part of apache#18142.

## Rationale for this change
This PR consolidates all the `udf` examples into a single example
binary. We have agreed on the pattern and can apply it to the
remaining examples.

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?

---------

Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>