forked from apache/datafusion
Update fork #12
Merged
…sed (apache#17958) * fix: Prune partitions when no filters are defined * fix: Formatting * chore: Cargo fmt * chore: Clippy
…pache#17906) * Check-in NestedLoopJoinProjectionPushDown * Update Cargo.lock * Add some comments * Update slts that are affected by the nl-join-projection-push-down * please lints * Move code into projection_pushdown.rs * Remove explicit coalesce batches * Docs
* chore: Extend backtrace coverage * fmt * part2 * feedback * clippy * feat: support Spark `concat` * clippy * comments * test * doc
* Add independent configs for topk/join dynamic filter * fix ci * update doc * fix typo
- Adds the ability for a user to choose a summary-only output for an instrumented object store when using the CLI
- The existing "enabled" setting, which displays both a summary and detailed usage for each object store call, has been renamed to `Trace` to improve clarity
- Adds additional test cases for summary-only mode and modifies existing tests to use trace
- Updates user guide docs to reflect the CLI flag and command line changes
* fix: Improve null handling in array_to_string function * chore
## Which issue does this PR close?

- Closes apache#18084

## Rationale for this change

Some of the extended tests are failing because we have fixed case conditional evaluation, and queries that previously (incorrectly) did not pass now do.

## What changes are included in this PR?

Update the datafusion-testing pin.

## Are these changes tested?

I tested locally with:

```shell
INCLUDE_SQLITE=true cargo test --profile release-nonlto --test sqllogictests
```

## Are there any user-facing changes?

No
…che#18094) Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.29 to 2.62.31. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's releases</a>.</em></p> <blockquote> <h2>2.62.31</h2> <ul> <li> <p>Update <code>protoc@latest</code> to 3.33.0.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.3.</p> </li> <li> <p>Update <code>syft@latest</code> to 1.34.1.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.9.</p> </li> <li> <p>Update <code>cargo-shear@latest</code> to 1.6.0.</p> </li> </ul> <h2>2.62.30</h2> <ul> <li> <p>Update <code>vacuum@latest</code> to 0.18.6.</p> </li> <li> <p>Update <code>zizmor@latest</code> to 1.15.2.</p> </li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <p>All notable changes to this project will be documented in this file.</p> <p>This project adheres to <a href="https://semver.org">Semantic Versioning</a>.</p> <!-- raw HTML omitted --> <h2>[Unreleased]</h2> <h2>[2.62.31] - 2025-10-16</h2> <ul> <li> <p>Update <code>protoc@latest</code> to 3.33.0.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.3.</p> </li> <li> <p>Update <code>syft@latest</code> to 1.34.1.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.9.</p> </li> <li> <p>Update <code>cargo-shear@latest</code> to 1.6.0.</p> </li> </ul> <h2>[2.62.30] - 2025-10-15</h2> <ul> <li> <p>Update <code>vacuum@latest</code> to 0.18.6.</p> </li> <li> <p>Update <code>zizmor@latest</code> to 1.15.2.</p> </li> </ul> <h2>[2.62.29] - 2025-10-14</h2> <ul> <li> <p>Update <code>zizmor@latest</code> to 1.15.1.</p> </li> <li> <p>Update <code>cargo-nextest@latest</code> to 0.9.106.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.8.</p> </li> 
<li> <p>Update <code>ubi@latest</code> to 0.8.1.</p> </li> </ul> <h2>[2.62.28] - 2025-10-11</h2> <ul> <li> <p>Update <code>release-plz@latest</code> to 0.3.148.</p> </li> <li> <p>Update <code>cargo-sort@latest</code> to 2.0.2.</p> </li> <li> <p>Update <code>cargo-binstall@latest</code> to 1.15.7.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.2.</p> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/taiki-e/install-action/commit/0005e0116e92d8489d8d96fbff83f061c79ba95a"><code>0005e01</code></a> Release 2.62.31</li> <li><a href="https://github.com/taiki-e/install-action/commit/6936d999d90424ed013e4f325d91e14d7ddba27f"><code>6936d99</code></a> Update <code>protoc@latest</code> to 3.33.0</li> <li><a href="https://github.com/taiki-e/install-action/commit/ac7ad6efa1b1bb919bcaa357eb1873f328ee07f7"><code>ac7ad6e</code></a> Update <code>uv@latest</code> to 0.9.3</li> <li><a href="https://github.com/taiki-e/install-action/commit/005833aaf18c1621513995406c3bc0397747afc2"><code>005833a</code></a> Update <code>syft@latest</code> to 1.34.1</li> <li><a href="https://github.com/taiki-e/install-action/commit/2b32ff6f3dc99bc9fa6647cbc9f7da71cf979b65"><code>2b32ff6</code></a> Update <code>mise@latest</code> to 2025.10.9</li> <li><a href="https://github.com/taiki-e/install-action/commit/74c0274864f156f487aee04623a20b315fb2125a"><code>74c0274</code></a> Update <code>cargo-shear@latest</code> to 1.6.0</li> <li><a href="https://github.com/taiki-e/install-action/commit/f13d8e15c52b25c79b608d399cc802adc73d83da"><code>f13d8e1</code></a> Release 2.62.30</li> <li><a href="https://github.com/taiki-e/install-action/commit/1034dc55996706645239db97d3ea04f42a708f22"><code>1034dc5</code></a> Update <code>vacuum@latest</code> to 0.18.6</li> <li><a href="https://github.com/taiki-e/install-action/commit/55b5d509b8761e9696e1cfec0d6f66f0655e8fff"><code>55b5d50</code></a> Update 
<code>zizmor@latest</code> to 1.15.2</li> <li>See full diff in <a href="https://github.com/taiki-e/install-action/compare/5b5de1b4da26ad411330c0454bdd72929bfcbeb2...0005e0116e92d8489d8d96fbff83f061c79ba95a">compare view</a></li> </ul> </details> <br /> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Which issue does this PR close?

- Closes #. Related to apache#18084

## Rationale for this change

Run the extended test suite on PRs that touch critical areas, to avoid post-merge bug fixing.

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
## Which issue does this PR close?

- Closes apache#18042

## Rationale for this change

This PR introduces a new dialect enum to improve type safety and code maintainability when handling different SQL dialects in DataFusion:

1. Provide compile-time guarantees for dialect handling
2. Improve code readability and self-documentation
3. Enable better IDE support and autocomplete

## What changes are included in this PR?

- Added a new `Dialect` enum to represent supported SQL dialects
- Refactored existing code to use the new enum instead of previous representations
- Modified tests to work with the new enum-based approach

## Are these changes tested?

Yes

## Are there any user-facing changes?

Yes, this is an API change: the type of the `dialect` field changed from `String` to `Dialect`
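To illustrate the kind of change described above, here is a minimal sketch of replacing a free-form `String` with an enum that parses from text. The variant names and accepted spellings are assumptions chosen for illustration; they are not taken from the DataFusion source.

```rust
use std::fmt;
use std::str::FromStr;

/// Illustrative dialect enum. The variant list is hypothetical and is not
/// DataFusion's actual definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Dialect {
    #[default]
    Generic,
    PostgreSql,
    MySql,
}

impl FromStr for Dialect {
    type Err = String;

    /// Parse a user-supplied name case-insensitively, rejecting unknown
    /// names up front instead of failing later at the point of use.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "generic" => Ok(Dialect::Generic),
            "postgresql" | "postgres" => Ok(Dialect::PostgreSql),
            "mysql" => Ok(Dialect::MySql),
            other => Err(format!("unsupported SQL dialect: {other}")),
        }
    }
}

impl fmt::Display for Dialect {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Dialect::Generic => "generic",
            Dialect::PostgreSql => "postgresql",
            Dialect::MySql => "mysql",
        };
        f.write_str(name)
    }
}
```

With a type like this, a `match` over the dialect is checked for exhaustiveness by the compiler, which is the compile-time guarantee the rationale refers to.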
## Which issue does this PR close?

- Closes apache#17982

## Rationale for this change

By making `NVLFunc` a wrapper for `CoalesceFunc` with a more restrictive signature, the implementation automatically benefits from any optimisation work related to `coalesce`.

## What changes are included in this PR?

- Make `NVLFunc` a thin wrapper of `CoalesceFunc`. This seemed like the simplest way to reuse the coalesce logic, but keep the stricter signature of `nvl`.
- Add `ScalarUDF::conditional_arguments` as a more precise complement to `ScalarUDF::short_circuits`. By letting each function expose which arguments are eager and which are lazy, we provide more precise information to the optimizer, which may enable better optimisation.

## Are these changes tested?

Assumed to be covered by sql logic tests. Unit tests for the custom implementation were removed since those are no longer relevant.

## Are there any user-facing changes?

The rewriting of `nvl` to `case when ... then ... else ... end` is visible in the physical query plan.

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
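The wrapper idea can be sketched in plain Rust, with `Option` standing in for SQL NULL and closures standing in for lazily evaluated arguments. This is only a model of the semantics under those assumptions; the function names and signatures below are hypothetical and do not mirror `NVLFunc`, `CoalesceFunc`, or `ScalarUDF::conditional_arguments`.

```rust
/// Model of `coalesce`: evaluate arguments left to right and stop at the
/// first non-null result. Later arguments are never evaluated once a value
/// has been found.
fn coalesce<T>(args: &[&dyn Fn() -> Option<T>]) -> Option<T> {
    args.iter().find_map(|arg| arg())
}

/// Model of `nvl`: the same logic as `coalesce`, restricted to exactly two
/// arguments. The fallback is only evaluated when `value` is null.
fn nvl<T>(value: &dyn Fn() -> Option<T>, fallback: &dyn Fn() -> Option<T>) -> Option<T> {
    coalesce(&[value, fallback])
}
```

In this model the second argument of `nvl` is the lazy one, which is the kind of per-argument information the commit message says is exposed to the optimizer.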
## Which issue does this PR close?

- Closes #. Followup on apache#18063 (review)

## Rationale for this change

Use the cheaper `NullBuffer::union` to apply the null mask instead of the iterator approach.

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?
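As a rough model of why a bitmap union is cheaper than iterating row by row: a row is valid in the result only if it is valid in both inputs, so the combined validity bitmap is a word-wise AND. The sketch below uses plain `u64` words and a hypothetical `union_nulls` helper; it is not the Arrow implementation, only an illustration of the semantics.

```rust
/// Combine two optional validity bitmaps (bit set = row is valid).
/// `None` means "no nulls at all", so it acts as the identity.
/// Assumes both bitmaps cover the same number of rows.
fn union_nulls(lhs: Option<&[u64]>, rhs: Option<&[u64]>) -> Option<Vec<u64>> {
    match (lhs, rhs) {
        (Some(l), Some(r)) => {
            assert_eq!(l.len(), r.len(), "bitmaps must have equal length");
            // One AND per 64 rows instead of one branch per row.
            Some(l.iter().zip(r).map(|(a, b)| a & b).collect())
        }
        (Some(n), None) | (None, Some(n)) => Some(n.to_vec()),
        (None, None) => None,
    }
}
```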
…unctions in proto (apache#18024)

## Which issue does this PR close?

- Closes apache#17417.

## Rationale for this change

- Support `null_treatment`, `distinct`, and `filter` for window functions in proto.
- Support `null_treatment` for aggregate UDFs in proto.

## What changes are included in this PR?

- [x] Add `null_treatment`, `distinct`, `filter` fields to the `WindowExprNode` message and handle them in `to/from_proto.rs`.
- [x] Add a `null_treatment` field to the `AggregateUDFExprNode` message and handle it in `to/from_proto.rs`.
- [ ] Docs update: I'm not sure where to add docs as declared in the issue description.

## Are these changes tested?

- Add tests to `roundtrip_window` for respectnulls, ignorenulls, distinct, filter.
- Add tests to `roundtrip_aggregate_udf` for respectnulls, ignorenulls.

## Are there any user-facing changes?

N/A

---------

Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
## Summary

Adds an exact `percentile_cont` aggregate function as the counterpart to the existing `approx_percentile_cont` function.

## What changes were made?

### New Implementation

- Created `percentile_cont.rs` with full implementation
- `PercentileCont` struct implementing `AggregateUDFImpl`
- `PercentileContAccumulator` for standard aggregation
- `DistinctPercentileContAccumulator` for DISTINCT mode
- `PercentileContGroupsAccumulator` for efficient grouped aggregation
- `calculate_percentile` function with linear interpolation

### Features

- **Exact calculation**: Stores all values in memory for precise results
- **WITHIN GROUP syntax**: Supports `WITHIN GROUP (ORDER BY ...)`
- **Interpolation**: Uses linear interpolation between values
- **All numeric types**: Works with integers, floats, and decimals
- **Ordered-set aggregate**: Properly marked as `is_ordered_set_aggregate()`
- **GROUP BY support**: Efficient grouped aggregation via GroupsAccumulator

### Tests

Added comprehensive tests in `aggregate.slt`:

- Error conditions validation
- Basic percentile calculations (0.0, 0.25, 0.5, 0.75, 1.0)
- Comparison with `median` function
- Ascending and descending order
- GROUP BY aggregation
- NULL handling
- Edge cases (empty sets, single values)
- Float interpolation
- Various numeric data types

## Example Usage

```sql
-- Basic usage with WITHIN GROUP syntax
SELECT percentile_cont(0.75) WITHIN GROUP (ORDER BY column_name)
FROM table_name;

-- With GROUP BY
SELECT category, percentile_cont(0.95) WITHIN GROUP (ORDER BY value)
FROM sales
GROUP BY category;

-- Compare with median (percentile_cont(0.5) == median)
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY price)
FROM products;
```

## Performance Considerations

Like `median`, this function stores all values in memory before computing results. For large datasets or when approximation is acceptable, use `approx_percentile_cont` instead.

## Related Issues

Closes apache#6714

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
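For readers unfamiliar with how an exact continuous percentile is computed, the following is a minimal sketch of linear interpolation over sorted values, using the conventional SQL definition (rank = p × (n − 1), interpolate between the neighbouring values). It assumes `f64` inputs and is not the `calculate_percentile` function from the PR; the name and signature here are illustrative only.

```rust
/// Exact continuous percentile with linear interpolation.
/// Returns `None` for empty input or a percentile outside [0, 1].
fn percentile_cont(values: &mut [f64], percentile: f64) -> Option<f64> {
    if values.is_empty() || !(0.0..=1.0).contains(&percentile) {
        return None;
    }
    // All values are kept in memory and sorted, which is why the exact
    // variant costs more than an approximate sketch-based one.
    values.sort_by(|a, b| a.total_cmp(b));

    // Fractional zero-based rank of the requested percentile.
    let rank = percentile * (values.len() - 1) as f64;
    let lower = rank.floor() as usize;
    let upper = rank.ceil() as usize;
    let fraction = rank - lower as f64;

    // Interpolate between the two neighbouring values.
    Some(values[lower] + (values[upper] - values[lower]) * fraction)
}
```

For the values 1, 2, 3, 4 and a percentile of 0.5, the rank is 1.5, so the result lies halfway between 2 and 3, which matches the median.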
…ed (apache#18110) Looks like apache#17988 accidentally reverted the bump from apache#18096
…che#18113) Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.31 to 2.62.33. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's releases</a>.</em></p> <blockquote> <h2>2.62.33</h2> <ul> <li>Update <code>mise@latest</code> to 2025.10.10.</li> </ul> <h2>2.62.32</h2> <ul> <li> <p>Update <code>syft@latest</code> to 1.34.2.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.18.7.</p> </li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <p>All notable changes to this project will be documented in this file.</p> <p>This project adheres to <a href="https://semver.org">Semantic Versioning</a>.</p> <!-- raw HTML omitted --> <h2>[Unreleased]</h2> <h2>[2.62.33] - 2025-10-17</h2> <ul> <li>Update <code>mise@latest</code> to 2025.10.10.</li> </ul> <h2>[2.62.32] - 2025-10-16</h2> <ul> <li> <p>Update <code>syft@latest</code> to 1.34.2.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.18.7.</p> </li> </ul> <h2>[2.62.31] - 2025-10-16</h2> <ul> <li> <p>Update <code>protoc@latest</code> to 3.33.0.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.3.</p> </li> <li> <p>Update <code>syft@latest</code> to 1.34.1.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.9.</p> </li> <li> <p>Update <code>cargo-shear@latest</code> to 1.6.0.</p> </li> </ul> <h2>[2.62.30] - 2025-10-15</h2> <ul> <li> <p>Update <code>vacuum@latest</code> to 0.18.6.</p> </li> <li> <p>Update <code>zizmor@latest</code> to 1.15.2.</p> </li> </ul> <h2>[2.62.29] - 2025-10-14</h2> <ul> <li> <p>Update <code>zizmor@latest</code> to 1.15.1.</p> </li> <li> <p>Update <code>cargo-nextest@latest</code> to 0.9.106.</p> </li> <li> <p>Update <code>mise@latest</code> to 
2025.10.8.</p> </li> <li> <p>Update <code>ubi@latest</code> to 0.8.1.</p> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/taiki-e/install-action/commit/e43a5023a747770bfcb71ae048541a681714b951"><code>e43a502</code></a> Release 2.62.33</li> <li><a href="https://github.com/taiki-e/install-action/commit/2ae4258c3daeaf460c202b95aa4272c1f594d78e"><code>2ae4258</code></a> Update <code>mise@latest</code> to 2025.10.10</li> <li><a href="https://github.com/taiki-e/install-action/commit/e79914c740f0acf092c59adfa2a61d3d2266b6bf"><code>e79914c</code></a> Release 2.62.32</li> <li><a href="https://github.com/taiki-e/install-action/commit/40168eab5f259c94f094865825dbdefd1cf31bbf"><code>40168ea</code></a> Update <code>syft@latest</code> to 1.34.2</li> <li><a href="https://github.com/taiki-e/install-action/commit/6d89b16c494331f0cdbca002e68ea5ab4fa8e3f6"><code>6d89b16</code></a> Update <code>vacuum@latest</code> to 0.18.7</li> <li>See full diff in <a href="https://github.com/taiki-e/install-action/compare/0005e0116e92d8489d8d96fbff83f061c79ba95a...e43a5023a747770bfcb71ae048541a681714b951">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Which issue does this PR close?

- Doesn't close an issue.

## Rationale for this change

Hi, we are Hiop, a Serverless Data Logistic Platform. We use DataFusion as a core part of our backend engine, and it plays a crucial role in our data infrastructure. Our team members are passionate about the project and actively try to contribute to its development (@dariocurr). We'd love to have Hiop listed among the Known Users to show our support and help the DataFusion community continue to grow.

## What changes are included in this PR?

Just adding Hiop as a known user.

## Are these changes tested?

## Are there any user-facing changes?
…8117)

## Which issue does this PR close?

- Closes apache#3695
- Closes apache#3797

## Rationale for this change

I was looking at the issues above, and I don't believe we skip failed rules in any tests anymore, apart from the usages this PR cleans up (the default for the config is also `false`), so I am filing this PR so we can close the issues.

After this fix, the only remaining use appears to be in this `window.slt` test, which seems intentional:

https://github.com/apache/datafusion/blob/621a24978a7a9c6d2b27973d1853dbc8776a56b5/datafusion/sqllogictest/test_files/window.slt#L2587-L2611

## What changes are included in this PR?

Remove the unnecessary `skip_failed_rules` config.

## Are these changes tested?

Existing tests.

## Are there any user-facing changes?

No.
Related apache#16324 apache#16617 almost there!
…pache#18080)

## Which issue does this PR close?

- This addresses part of apache#17713
- Closes apache#14462

## Rationale for this change

In order to remove the `datafusion` core crate from `proto` as a dependency, we need to access `ListingTable`, but it is within the `core` crate. There already exists a `datafusion-catalog-listing` crate, which is bare and appears to be the place this should exist.

## What changes are included in this PR?

Move `ListingTable` and some of its dependent structs over to the `datafusion-catalog-listing` crate.

There is one dependency I wasn't able to remove from the `core` crate, which is inferring the listing table configuration options. That is because within this method it downcasts `Session` to `SessionState`. If a downstream user ever attempts to implement `Session` themselves, these methods also would not work. Because it would cause a circular dependency, we cannot also lift the method we need out of `SessionState` to `Session`. Instead I took the approach of splitting off the two methods that require `SessionState` as an extension trait for the listing table config.

From the git diff this appears to be a large change (+1637/-1519); however, the *vast* majority of that is copying the code from one file into another. I have added a comment on the significant change.

## Are these changes tested?

Existing unit tests show no regression. This is just a code refactor.

## Are there any user-facing changes?

Users may need to update their use paths.
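The extension-trait approach described above can be sketched as follows. All type, trait, and method names here are simplified stand-ins invented for illustration (they are not the real DataFusion definitions): the config struct would live in the lower-level crate, while the trait that needs the concrete session type would live in the higher-level crate.

```rust
use std::any::Any;

/// Stand-in for the abstract session trait (hypothetical, simplified).
trait Session {
    fn as_any(&self) -> &dyn Any;
}

/// Stand-in for the concrete session type that only the "core" crate knows.
struct SessionState {
    default_file_extension: String,
}

impl Session for SessionState {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// A user-provided session that is not a `SessionState`.
struct CustomSession;

impl Session for CustomSession {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Stand-in for the listing table config, defined in the lower-level crate.
struct ListingConfig {
    file_extension: Option<String>,
}

/// Extension trait defined next to `SessionState`, so the lower-level crate
/// never has to depend on it.
trait ListingConfigExt: Sized {
    fn infer_options(self, session: &dyn Session) -> Result<Self, String>;
}

impl ListingConfigExt for ListingConfig {
    fn infer_options(mut self, session: &dyn Session) -> Result<Self, String> {
        // The downcast is the reason this method cannot move down a crate.
        let state = session
            .as_any()
            .downcast_ref::<SessionState>()
            .ok_or_else(|| "session is not a SessionState".to_string())?;
        self.file_extension = Some(state.default_file_extension.clone());
        Ok(self)
    }
}
```

The sketch also shows the limitation the author mentions: calling `infer_options` with a custom `Session` implementation returns an error, because the downcast fails.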
…crate (apache#18082)

## Which issue does this PR close?

- This addresses part of apache#17713 but it does not close it.

## Rationale for this change

In order to remove `core` from the `proto` crate, we need `ArrowFormat` to be available. Similar to the other datasource types (csv, avro, json, parquet), this splits the Arrow IPC file format into its own crate.

## What changes are included in this PR?

This is a straight refactor. Code is merely moved around. The size of the diff is due to the additional files that are required (cargo.toml, readme.md, etc.).

## Are these changes tested?

Existing unit tests.

## Are there any user-facing changes?

Users that include `ArrowSource` may need to update their include paths. For most, the reexports will cover this need.
## Which issue does this PR close?

This does not fully close, but is an incremental building block component for:

- apache#17207

The full context of how this code is likely to progress can be seen in the POC for this effort:

- apache#17266

## Rationale for this change

Continued progress filling out the methods that are instrumented for the instrumented object store.

## What changes are included in this PR?

- Adds instrumentation around basic list operations into the instrumented object store
- Adds test cases for new code

## Are these changes tested?

Yes. Example output:

```sql
DataFusion CLI v50.2.0
> \object_store_profiling trace
ObjectStore Profile mode set to Trace
> CREATE EXTERNAL TABLE nyc_taxi_rides
STORED AS PARQUET LOCATION 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet';
0 row(s) fetched.
Elapsed 2.679 seconds.

Object Store Profiling
Instrumented Object Store: instrument_mode: Trace, inner: AmazonS3(altinity-clickhouse-data)
2025-10-16T18:53:09.512970085+00:00 operation=List path=nyc_taxi_rides/data/tripdata_parquet

Summaries:
List
count: 1

Instrumented Object Store: instrument_mode: Trace, inner: AmazonS3(altinity-clickhouse-data)
2025-10-16T18:53:09.929709943+00:00 operation=List path=nyc_taxi_rides/data/tripdata_parquet
2025-10-16T18:53:10.106757629+00:00 operation=List path=nyc_taxi_rides/data/tripdata_parquet
2025-10-16T18:53:10.220555058+00:00 operation=Get duration=0.230604s size=8 range: bytes=222192975-222192982 path=nyc_taxi_rides/data/tripdata_parquet/data-200901.parquet
2025-10-16T18:53:10.226399832+00:00 operation=Get duration=0.263826s size=8 range: bytes=233123927-233123934 path=nyc_taxi_rides/data/tripdata_parquet/data-201104.parquet
2025-10-16T18:53:10.226194195+00:00 operation=Get duration=0.269754s size=8 range: bytes=252843253-252843260 path=nyc_taxi_rides/data/tripdata_parquet/data-201103.parquet
. . .
2025-10-16T18:53:11.928787014+00:00 operation=Get duration=0.072248s size=18278 range: bytes=201384109-201402386 path=nyc_taxi_rides/data/tripdata_parquet/data-201509.parquet
2025-10-16T18:53:11.933475464+00:00 operation=Get duration=0.068880s size=17175 range: bytes=195411804-195428978 path=nyc_taxi_rides/data/tripdata_parquet/data-201601.parquet
2025-10-16T18:53:11.949629591+00:00 operation=Get duration=0.065645s size=19872 range: bytes=214807880-214827751 path=nyc_taxi_rides/data/tripdata_parquet/data-201603.parquet

Summaries:
List
count: 2
Get
count: 288
duration min: 0.060930s
duration max: 0.444601s
duration avg: 0.133339s
size min: 8 B
size max: 44247 B
size avg: 18870 B
size sum: 5434702 B
>
```

## Are there any user-facing changes?

No-ish

## cc

@alamb
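The `Summaries` section of the output above reports a count plus min, max, and average per operation. A minimal sketch of that kind of aggregation over recorded durations might look like the following; the struct and function names are hypothetical and this is not the code used by the instrumented object store.

```rust
/// Aggregated view of one operation type (hypothetical shape).
#[derive(Debug, PartialEq)]
struct DurationSummary {
    count: usize,
    min: f64,
    max: f64,
    avg: f64,
}

/// Summarize recorded request durations, given in seconds.
/// Returns `None` when nothing was recorded.
fn summarize(durations_secs: &[f64]) -> Option<DurationSummary> {
    if durations_secs.is_empty() {
        return None;
    }
    let count = durations_secs.len();
    let min = durations_secs.iter().copied().fold(f64::INFINITY, f64::min);
    let max = durations_secs.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let sum: f64 = durations_secs.iter().sum();
    Some(DurationSummary {
        count,
        min,
        max,
        avg: sum / count as f64,
    })
}
```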
## Which issue does this PR close?

## Rationale for this change

Support the `shuffle` UDF.

## What changes are included in this PR?

Support the `shuffle` UDF.

## Are these changes tested?

Unit tests.

## Are there any user-facing changes?

No
## Which issue does this PR close? - Closes apache#18126. ## Rationale for this change It's a useful constructor for users manipulating logical plans where they know the schemas will match exactly. We already expose other constructors for Union and constructors for logical plans. ## What changes are included in this PR? Makes `Union::try_new` a public function. ## Are these changes tested? Seems unnecessary. ## Are there any user-facing changes? The function is now public. Not a breaking change, but going forward, changes to it would be breaking changes for users of the logical plan API.
## Which issue does this PR close? - Closes apache#17360. ## Rationale for this change When unparsing a `LogicalPlan::Filter` that contains a window expression, the filter should be converted to `QUALIFY`. Postgres also requires an alias for a derived table; otherwise it complains: ``` ERROR: subquery in FROM must have an alias. ``` This PR fixes that issue at the same time. ## What changes are included in this PR? If a window expression is found, the filter is converted to `QUALIFY`. ## Are these changes tested? UT ## Are there any user-facing changes? No --------- Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
## Which issue does this PR close? - Closes #. ## Rationale for this change `EXPLAIN ANALYZE` can be used for profiling and displays the results alongside the EXPLAIN plan. The issue is that it currently shows too many low-level details. It would provide a better user experience if only the most commonly used metrics were shown by default, with more detailed metrics available through specific configuration options. ### Example In `datafusion-cli`: ``` > CREATE EXTERNAL TABLE IF NOT EXISTS lineitem STORED AS parquet LOCATION '/Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem'; 0 row(s) fetched. Elapsed 0.000 seconds. 
explain analyze select * from lineitem where l_orderkey = 3000000; ``` The parquet reader includes a large number of low-level details: ``` metrics=[output_rows=19813, elapsed_compute=14ns, batches_split=0, bytes_scanned=2147308, file_open_errors=0, file_scan_errors=0, files_ranges_pruned_statistics=18, num_predicate_creation_errors=0, page_index_rows_matched=19813, page_index_rows_pruned=729088, predicate_cache_inner_records=0, predicate_cache_records=0, predicate_evaluation_errors=0, pushdown_rows_matched=0, pushdown_rows_pruned=0, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=1, row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=0, bloom_filter_eval_time=21.997µs, metadata_load_time=273.83µs, page_index_eval_time=29.915µs, row_pushdown_eval_time=42ns, statistics_eval_time=76.248µs, time_elapsed_opening=4.02146ms, time_elapsed_processing=24.787461ms, time_elapsed_scanning_total=24.17671ms, time_elapsed_scanning_until_data=23.103665ms] ``` I believe only a subset of these is commonly used, for example `output_rows`, `metadata_load_time`, and how many files/row groups/pages are pruned, and it would be better to display only the most common ones by default. ### Existing `VERBOSE` keyword There is an existing verbose keyword in `EXPLAIN ANALYZE VERBOSE`; however, it turns on per-partition metrics instead of controlling the detail level. I think it would be hard to mix this partition control with the detail level introduced in this PR, so they are kept separate: the following config is used for the detail level, and the semantics of `EXPLAIN ANALYZE VERBOSE` remain unchanged. ### This PR: configurable explain analyze level 1. Introduced a new config option `datafusion.explain.analyze_level`. When set to `dev` (default value), all existing metrics will be shown. If set to `summary`, only `BaselineMetrics` will be displayed (i.e. `output_rows` and `elapsed_compute`). 
Note now we only include `BaselineMetrics` for simplicity, in the follow-up PRs we can figure out what's the commonly used metrics for each operator, and add them to `summary` analyze level, finally set the `summary` analyze level to default. 2. Add a `MetricType` field associated with `Metric` for detail level or potentially category in the future. For different configurations, a certain `MetricType` set will be shown accordingly. #### Demo ``` -- continuing the above example > set datafusion.explain.analyze_level = summary; 0 row(s) fetched. Elapsed 0.000 seconds. > explain analyze select * from lineitem where l_orderkey| plan_type | plan | +-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Plan with Metrics | CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=5, 
elapsed_compute=25.339µs] | | | FilterExec: l_orderkey@0 = 3000000, metrics=[output_rows=5, elapsed_compute=81.221µs] | | | DataSourceExec: file_groups={14 groups: [[Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-0.parquet:0..11525426], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-0.parquet:11525426..20311205, Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:0..2739647], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:2739647..14265073], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-1.parquet:14265073..20193593, Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-2.parquet:0..5596906], [Users/yongting/Code/datafusion/benchmarks/data/tpch_sf1/lineitem/part-2.parquet:5596906..17122332], ...]}, projection=[l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment], file_type=parquet, predicate=l_orderkey@0 = 3000000, pruning_predicate=l_orderkey_null_count@2 != row_count@3 AND l_orderkey_min@0 <= 3000000 AND 3000000 <= l_orderkey_max@1, required_guarantees=[l_orderkey in (3000000)], metrics=[output_rows=19813, elapsed_compute=14ns] | | | | 
+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row(s) fetched. Elapsed 0.025 seconds. ``` Only `BaselineMetrics` are shown. ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 4. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> UT ## Are there any user-facing changes? 
No --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
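The relationship between the analyze level and the `MetricType` attached to each metric can be pictured as a simple filter. The sketch below is only an illustration under stated assumptions: the `AnalyzeLevel`, `MetricType`, and `Metric` types and the `render` function are simplified, hypothetical stand-ins and do not reproduce DataFusion's actual definitions. It assumes that `summary` shows only metrics tagged as summary metrics, while `dev` shows everything.

```rust
/// Detail level a metric belongs to (hypothetical, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MetricType {
    Summary,
    Dev,
}

/// Value of the analyze level setting (hypothetical, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnalyzeLevel {
    Summary,
    Dev,
}

/// A named metric with a pre-formatted value and its detail level.
struct Metric {
    name: &'static str,
    value: &'static str,
    metric_type: MetricType,
}

/// Returns true if a metric of `metric_type` should be displayed at `level`.
fn is_visible(level: AnalyzeLevel, metric_type: MetricType) -> bool {
    match level {
        // `dev` shows every metric.
        AnalyzeLevel::Dev => true,
        // `summary` shows only metrics tagged as summary metrics.
        AnalyzeLevel::Summary => metric_type == MetricType::Summary,
    }
}

/// Render the visible metrics in the `metrics=[a=1, b=2]` style used by the plan output.
fn render(metrics: &[Metric], level: AnalyzeLevel) -> String {
    let shown: Vec<String> = metrics
        .iter()
        .filter(|m| is_visible(level, m.metric_type))
        .map(|m| format!("{}={}", m.name, m.value))
        .collect();
    format!("metrics=[{}]", shown.join(", "))
}

fn main() {
    // A small subset of the parquet reader metrics from the example above.
    let metrics = [
        Metric { name: "output_rows", value: "19813", metric_type: MetricType::Summary },
        Metric { name: "elapsed_compute", value: "14ns", metric_type: MetricType::Summary },
        Metric { name: "bytes_scanned", value: "2147308", metric_type: MetricType::Dev },
    ];
    println!("{}", render(&metrics, AnalyzeLevel::Summary));
    println!("{}", render(&metrics, AnalyzeLevel::Dev));
}
```

Tagging each metric rather than hard-coding names in the renderer means later PRs could move more per-operator metrics into the summary set without changing the filtering logic.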
…e#18091) ## Which issue does this PR close? N/A ## Rationale for this change There are a few functions in `datafusion/expr-common/src/type_coercion/aggregates.rs` that are unused elsewhere in the codebase, likely a remnant from before the refactor to UDFs, so this PR removes them. Some are still used (`coerce_avg_type()` and `avg_return_type()`), so these are inlined into the Avg aggregate function (similar to Sum). Also refactor some window functions to use already available macros. ## What changes are included in this PR? - Remove some unused functions - Inline avg coerce & return type logic - Refactor Spark Avg a bit to remove unnecessary code - Refactor ntile & nth window functions to use available macros ## Are these changes tested? Existing tests. ## Are there any user-facing changes? Yes, as these functions were publicly exported; however, I'm not sure they were meant to be used by users anyway, given what they do. 
apache#18273) ## Which issue does this PR close? Closes apache#18058 ## Rationale for this change When adding the bitmap_count function to Comet, we get the following error - org.apache.comet.CometNativeException: Error from DataFusion: bitmap_count expects Binary/BinaryView/FixedSizeBinary/LargeBinary as argument, got Dictionary(Int32, Binary). ## Are these changes tested? Added new UT --------- Co-authored-by: Kazantsev Maksim <mn.kazantsev@gmail.com>
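To make the error above concrete: a dictionary-encoded column stores each distinct binary value once and refers to it by an integer key. The sketch below shows one way a bit count could be evaluated over such a column, by counting set bits once per distinct value and then looking the result up per key. It is an assumption-laden illustration using plain Rust vectors rather than Arrow arrays, and it is not necessarily how this PR handles `Dictionary(Int32, Binary)`; the function names `count_set_bits` and `bitmap_count_dictionary` are hypothetical.

```rust
/// Count the set bits in one bitmap value.
fn count_set_bits(bitmap: &[u8]) -> i64 {
    bitmap.iter().map(|byte| i64::from(byte.count_ones())).sum()
}

/// Evaluate the bit count over a dictionary-encoded column.
///
/// `keys` holds one optional index per row (None models a null row) and
/// `values` holds the distinct bitmaps. Both are plain-Rust stand-ins for
/// a dictionary array and are assumptions made for this sketch.
fn bitmap_count_dictionary(keys: &[Option<i32>], values: &[Vec<u8>]) -> Vec<Option<i64>> {
    // Count once per distinct dictionary value.
    let counts: Vec<i64> = values.iter().map(|v| count_set_bits(v)).collect();

    // Map every row through its key; null, negative, or out-of-range keys give None.
    keys.iter()
        .map(|&key| {
            key.and_then(|k| usize::try_from(k).ok())
                .and_then(|index| counts.get(index).copied())
        })
        .collect()
}

fn main() {
    let values = vec![vec![0b0000_0111], vec![0xFF, 0x01], vec![]];
    let keys = vec![Some(1), None, Some(0), Some(1), Some(2)];
    println!("{:?}", bitmap_count_dictionary(&keys, &values));
}
```

Counting per distinct value rather than per row avoids repeating the work for rows that share a bitmap, which is the usual motivation for operating on the dictionary directly.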
…he#18287) ## Which issue does this PR close? - Closes apache#18288 ## Rationale for this change `cargo audit` says that the current version of `half` in our Cargo.lock file was yanked: ``` Crate: half Version: 2.7.0 Warning: yanked Dependency tree: half 2.7.0 ``` And indeed it is: https://crates.io/crates/half/versions <img width="1193" height="830" alt="Screenshot 2025-10-26 at 7 20 54 AM" src="https://github.com/user-attachments/assets/ad6944c6-912c-4c56-9d1d-efe760ae85ee" /> So let's update to a non-yanked version. ## What changes are included in this PR? Run `cargo update -p half` and check in the result. ## Are these changes tested? ## Are there any user-facing changes?
…che#18293) Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.36 to 2.62.38. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's releases</a>.</em></p> <blockquote> <h2>2.62.38</h2> <ul> <li> <p>Update <code>coreutils@latest</code> to 0.3.0.</p> </li> <li> <p>Update <code>wasmtime@latest</code> to 38.0.3.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.17.</p> </li> <li> <p>Update <code>cargo-tarpaulin@latest</code> to 0.34.1.</p> </li> </ul> <h2>2.62.37</h2> <ul> <li> <p>Update <code>cargo-binstall@latest</code> to 1.15.8.</p> </li> <li> <p>Update <code>zizmor@latest</code> to 1.16.0.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.16.</p> </li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <p>All notable changes to this project will be documented in this file.</p> <p>This project adheres to <a href="https://semver.org">Semantic Versioning</a>.</p> <!-- raw HTML omitted --> <h2>[Unreleased]</h2> <ul> <li>Update <code>mise@latest</code> to 2025.10.18.</li> </ul> <h2>[2.62.38] - 2025-10-25</h2> <ul> <li> <p>Update <code>coreutils@latest</code> to 0.3.0.</p> </li> <li> <p>Update <code>wasmtime@latest</code> to 38.0.3.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.17.</p> </li> <li> <p>Update <code>cargo-tarpaulin@latest</code> to 0.34.1.</p> </li> </ul> <h2>[2.62.37] - 2025-10-24</h2> <ul> <li> <p>Update <code>cargo-binstall@latest</code> to 1.15.8.</p> </li> <li> <p>Update <code>zizmor@latest</code> to 1.16.0.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.16.</p> </li> </ul> <h2>[2.62.36] - 2025-10-23</h2> <ul> <li> <p>Update <code>syft@latest</code> to 1.36.0.</p> </li> <li> <p>Update 
<code>vacuum@latest</code> to 0.19.0.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.15.</p> </li> </ul> <h2>[2.62.35] - 2025-10-22</h2> <ul> <li> <p>Update <code>wasmtime@latest</code> to 38.0.2.</p> </li> <li> <p>Update <code>cargo-nextest@latest</code> to 0.9.108.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.10.14.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.18.9.</p> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/taiki-e/install-action/commit/c5b1b6f479c32f356cc6f4ba672a47f63853b13b"><code>c5b1b6f</code></a> Release 2.62.38</li> <li><a href="https://github.com/taiki-e/install-action/commit/7cd74f6aac6a2a6c13632c29a30ffc0ef8053cf2"><code>7cd74f6</code></a> Update <code>coreutils@latest</code> to 0.3.0</li> <li><a href="https://github.com/taiki-e/install-action/commit/def9901333773abdceeb414c2c2a68cc4276eea9"><code>def9901</code></a> Update <code>wasmtime@latest</code> to 38.0.3</li> <li><a href="https://github.com/taiki-e/install-action/commit/a9d3853729137d6a76fdb344e3fdba064bb51dd5"><code>a9d3853</code></a> Update coreutils manifest</li> <li><a href="https://github.com/taiki-e/install-action/commit/958d48b0c9eb6cf8c0edca899e787eb73a91794c"><code>958d48b</code></a> Update <code>mise@latest</code> to 2025.10.17</li> <li><a href="https://github.com/taiki-e/install-action/commit/fb485991fd79e393a6a4e3715369bdd7a96fc12d"><code>fb48599</code></a> Update <code>cargo-tarpaulin@latest</code> to 0.34.1</li> <li><a href="https://github.com/taiki-e/install-action/commit/1c7b1d35fcc8f6525be0cbdacbf5977079a3f94c"><code>1c7b1d3</code></a> Release 2.62.37</li> <li><a href="https://github.com/taiki-e/install-action/commit/18cba62798fa05dd5849e62a3759a8ef249feefc"><code>18cba62</code></a> Update <code>cargo-binstall@latest</code> to 1.15.8</li> <li><a 
href="https://github.com/taiki-e/install-action/commit/f3c0c6962aed40004323e265015332d9d9cf90f9"><code>f3c0c69</code></a> Update <code>zizmor@latest</code> to 1.16.0</li> <li><a href="https://github.com/taiki-e/install-action/commit/99fc3e5b1e80c12d05e5bff5af81a035ab4e98b5"><code>99fc3e5</code></a> Update <code>mise@latest</code> to 2025.10.16</li> <li>See full diff in <a href="https://github.com/taiki-e/install-action/compare/ebb229c6baa68383264f2822689b07b4916d9177...c5b1b6f479c32f356cc6f4ba672a47f63853b13b">compare view</a></li> </ul> </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…18051) ## Which issue does this PR close? - Closes apache#11336 Since this is my first contribution, I suppose I should mention @alamb, author of issue apache#11336. Could you please trigger the CI? Thanks! ## Rationale for this change The Arrow introduction guide (apache#11336) needed improvements to make it more accessible for newcomers while providing better navigation to advanced topics. ## What changes are included in this PR? Issue apache#11336 requested a gentle introduction to Apache Arrow and RecordBatches to help DataFusion users understand the foundational concepts. This PR enhances the existing Arrow introduction guide with clearer explanations, practical examples, visual aids, and comprehensive navigation links to make it more accessible for newcomers while providing pathways to advanced topics. I was unsure whether this fits in `docs/source/user-guide/dataframe.md`. ## Are these changes tested? Applied prettier, as described. ## Are there any user-facing changes? 
Yes - improved documentation for the Arrow introduction guide at `docs/source/user-guide/arrow-introduction.md` --------- Co-authored-by: Martin <your.email@example.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Bumps [regex](https://github.com/rust-lang/regex) from 1.11.3 to 1.12.2. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/rust-lang/regex/blob/master/CHANGELOG.md">regex's changelog</a>.</em></p> <blockquote> <h1>1.12.2 (2025-10-13)</h1> <p>This release fixes a <code>cargo doc</code> breakage on nightly when <code>--cfg docsrs</code> is enabled. This caused documentation to fail to build on docs.rs.</p> <p>Bug fixes:</p> <ul> <li>[BUG <a href="https://redirect.github.com/rust-lang/regex/issues/1305">#1305</a>](<a href="https://redirect.github.com/rust-lang/regex/issues/1305">rust-lang/regex#1305</a>): Switches the <code>doc_auto_cfg</code> feature to <code>doc_cfg</code> on nightly for docs.rs builds.</li> </ul> <h1>1.12.1 (2025-10-10)</h1> <p>This release makes a bug fix in the new <code>regex::Captures::get_match</code> API introduced in <code>1.12.0</code>. There was an oversight with the lifetime parameter for the <code>Match</code> returned. This is technically a breaking change, but given that it was caught almost immediately and I've yanked the <code>1.12.0</code> release, I think this is fine.</p> <h1>1.12.0 (2025-10-10)</h1> <p>This release contains a smattering of bug fixes, a fix for excessive memory consumption in some cases and a new <code>regex::Captures::get_match</code> API.</p> <p>Improvements:</p> <ul> <li>[FEATURE <a href="https://redirect.github.com/rust-lang/regex/issues/1146">#1146</a>](<a href="https://redirect.github.com/rust-lang/regex/issues/1146">rust-lang/regex#1146</a>): Add <code>Capture::get_match</code> for returning the overall match without <code>unwrap()</code>.</li> </ul> <p>Bug fixes:</p> <ul> <li>[BUG <a href="https://redirect.github.com/rust-lang/regex/issues/1083">#1083</a>](<a href="https://redirect.github.com/rust-lang/regex/issues/1083">rust-lang/regex#1083</a>): Fixes a panic in the lazy DFA (can only occur for especially large regexes).</li> <li>[BUG <a 
href="https://redirect.github.com/rust-lang/regex/issues/1116">#1116</a>](<a href="https://redirect.github.com/rust-lang/regex/issues/1116">rust-lang/regex#1116</a>): Fixes a memory usage regression for large regexes (introduced in <code>regex 1.9</code>).</li> <li>[BUG <a href="https://redirect.github.com/rust-lang/regex/issues/1195">#1195</a>](<a href="https://redirect.github.com/rust-lang/regex/issues/1195">rust-lang/regex#1195</a>): Fix universal start states in sparse DFA.</li> <li>[BUG <a href="https://redirect.github.com/rust-lang/regex/issues/1295">#1295</a>](<a href="https://redirect.github.com/rust-lang/regex/pull/1295">rust-lang/regex#1295</a>): Fixes a panic when deserializing a corrupted dense DFA.</li> <li><a href="https://github.com/rust-lang/regex/commit/8f5d9479d0f1da5726488a530d7fd66a73d05b80">BUG 8f5d9479</a>: Make <code>regex_automata::meta::Regex::find</code> consistently return <code>None</code> when <code>WhichCaptures::None</code> is used.</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/rust-lang/regex/commit/5ea3eb1e95f0338e283f5f0b4681f0891a1cd836"><code>5ea3eb1</code></a> 1.12.2</li> <li><a href="https://github.com/rust-lang/regex/commit/ab0b07171b82d1d4fdc8359505d12b2e818514d4"><code>ab0b071</code></a> regex-automata-0.4.13</li> <li><a href="https://github.com/rust-lang/regex/commit/691d51457db276bbdf9ca3de2cafe285c662c59f"><code>691d514</code></a> regex-syntax-0.8.8</li> <li><a href="https://github.com/rust-lang/regex/commit/1dd90777791dbc6bbf389157d05ac8176c6ad051"><code>1dd9077</code></a> docs: swap <code>doc_auto_cfg</code> with <code>doc_cfg</code></li> <li><a href="https://github.com/rust-lang/regex/commit/0089034cb37b0bf3785f2e0211f7eca74033f4d1"><code>0089034</code></a> regex-cli-0.2.3</li> <li><a href="https://github.com/rust-lang/regex/commit/140f8949da3f575490bac80ff23dfc29458b82c7"><code>140f894</code></a> regex-lite-0.1.8</li> <li><a 
href="https://github.com/rust-lang/regex/commit/27d6d65263cb80266a62e3189408a44f201a0975"><code>27d6d65</code></a> 1.12.1</li> <li><a href="https://github.com/rust-lang/regex/commit/85398ad5002048bbeaa90f1fe37fbb31df2bc0d6"><code>85398ad</code></a> changelog: 1.12.1</li> <li><a href="https://github.com/rust-lang/regex/commit/764efbd305d3a7b817ec8892ff0a656ec657d660"><code>764efbd</code></a> api: tweak the lifetime of <code>Captures::get_match</code></li> <li><a href="https://github.com/rust-lang/regex/commit/ee6aa55e01786e4d2c11eb1be805835bbb3bfa99"><code>ee6aa55</code></a> rure-0.2.4</li> <li>Additional commits viewable in <a href="https://github.com/rust-lang/regex/compare/1.11.3...1.12.2">compare view</a></li> </ul> </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [clap](https://github.com/clap-rs/clap) from 4.5.48 to 4.5.50. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/clap-rs/clap/releases">clap's releases</a>.</em></p> <blockquote> <h2>v4.5.50</h2> <h2>[4.5.50] - 2025-10-20</h2> <h3>Features</h3> <ul> <li>Accept <code>Cow</code> where <code>String</code> and <code>&str</code> are accepted</li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/clap-rs/clap/blob/master/CHANGELOG.md">clap's changelog</a>.</em></p> <blockquote> <h2>[4.5.50] - 2025-10-20</h2> <h3>Features</h3> <ul> <li>Accept <code>Cow</code> where <code>String</code> and <code>&str</code> are accepted</li> </ul> <h2>[4.5.49] - 2025-10-13</h2> <h3>Fixes</h3> <ul> <li><em>(help)</em> Correctly wrap when ANSI escape codes are present</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/clap-rs/clap/commit/d8acd4729878ca72d305e6cf7adf7acc0da36738"><code>d8acd47</code></a> chore: Release</li> <li><a href="https://github.com/clap-rs/clap/commit/7c2b8d9ad4d22650f969bd455d80b4181a7e25ff"><code>7c2b8d9</code></a> docs: Update changelog</li> <li><a href="https://github.com/clap-rs/clap/commit/e69a2ea55bc9076d95caf60d79e481581f688724"><code>e69a2ea</code></a> Merge pull request <a href="https://redirect.github.com/clap-rs/clap/issues/5987">#5987</a> from mernen/fix-bash-comp-words-loop</li> <li><a href="https://github.com/clap-rs/clap/commit/e03cc2e798183e9528f53d42d8b2699f034fc667"><code>e03cc2e</code></a> Merge pull request <a href="https://redirect.github.com/clap-rs/clap/issues/5988">#5988</a> from cordx56/fix-builder-custom-version-docs</li> <li><a href="https://github.com/clap-rs/clap/commit/5ab2579844a47a26b4567f77a7b9d198be006f0a"><code>5ab2579</code></a> fix: Minor fix for builder docs about version</li> <li><a 
href="https://github.com/clap-rs/clap/commit/2f66432721bd24602455dc3e31765195c6107c34"><code>2f66432</code></a> fix(complete): Only parse arguments before current</li> <li><a href="https://github.com/clap-rs/clap/commit/4d9d2100f75693645ea68180ed4b6b3ecacb9923"><code>4d9d210</code></a> test(complete): Illustrate current behavior in Bash</li> <li><a href="https://github.com/clap-rs/clap/commit/6abe2f8c61e31d8d43fee42c18414926c60893be"><code>6abe2f8</code></a> chore: Release</li> <li><a href="https://github.com/clap-rs/clap/commit/d5c74542ce628b57424caec88efee1a231c436a0"><code>d5c7454</code></a> docs: Update changelog</li> <li><a href="https://github.com/clap-rs/clap/commit/5b2e960267b94d4811c9c3b99c62899a87505413"><code>5b2e960</code></a> Merge pull request <a href="https://redirect.github.com/clap-rs/clap/issues/5985">#5985</a> from mernen/bash-cur</li> <li>Additional commits viewable in <a href="https://github.com/clap-rs/clap/compare/clap_complete-v4.5.48...clap_complete-v4.5.50">compare view</a></li> </ul> </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Which issue does this PR close?

- Related to apache/arrow-rs#7835
- Closes apache#3666

Note: while this PR looks massive, a large portion is display updates due to better display of Fields and DataTypes.

## Rationale for this change

Upgrade to the latest arrow.

Also, there are several new features in arrow-57 that I want to be able to test, including Variant, arrow-avro, and a new parquet metadata reader.

## What changes are included in this PR?

1. Update arrow/parquet
2. Update prost
3. Update substrait
4. Update pbjson
5. Make API changes to avoid deprecated APIs

## Are these changes tested?

By CI

## Are there any user-facing changes?

New arrow
Bumps [syn](https://github.com/dtolnay/syn) from 2.0.106 to 2.0.108.

<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/dtolnay/syn/releases">syn's releases</a>.</em></p>
<blockquote>
<h2>2.0.108</h2>
<ul>
<li>Parse unrecognized or invalid literals as Lit::Verbatim (<a href="https://redirect.github.com/dtolnay/syn/issues/1925">#1925</a>)</li>
</ul>
<h2>2.0.107</h2>
<ul>
<li>Improve panic message when constructing a LitInt, LitFloat, or Lit from invalid syntax (<a href="https://redirect.github.com/dtolnay/syn/issues/1917">#1917</a>)</li>
<li>Improve panic message on Punctuated index out of bounds (<a href="https://redirect.github.com/dtolnay/syn/issues/1922">#1922</a>)</li>
</ul>
</blockquote>
</details>

<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/dtolnay/syn/commit/7a7e331255822d49bea01e29c326ee7a5cd5415c"><code>7a7e331</code></a> Release 2.0.108</li>
<li><a href="https://github.com/dtolnay/syn/commit/30463afa201abc30e086bd1fb1deb714eb8910f4"><code>30463af</code></a> Merge pull request <a href="https://redirect.github.com/dtolnay/syn/issues/1926">#1926</a> from dtolnay/litfuzz</li>
<li><a href="https://github.com/dtolnay/syn/commit/1cc9167f60d209865e91bf73a949d25914e6bf18"><code>1cc9167</code></a> Add fuzzer for literal parsing</li>
<li><a href="https://github.com/dtolnay/syn/commit/c49e1d3a65ab423beee54ed730ea3f849ec49e0b"><code>c49e1d3</code></a> Merge pull request <a href="https://redirect.github.com/dtolnay/syn/issues/1925">#1925</a> from dtolnay/litparse</li>
<li><a href="https://github.com/dtolnay/syn/commit/d047536103b7edfb0408dab8ec65cde19e73a88f"><code>d047536</code></a> Report unexpected verbatim literals in test</li>
<li><a href="https://github.com/dtolnay/syn/commit/ce9776747974555e30cd890b9e1d3030e02efc13"><code>ce97767</code></a> Parse unrecognized or invalid literals as Lit::Verbatim</li>
<li><a href="https://github.com/dtolnay/syn/commit/e4a8957feb1b86e6da4309c9886ca15ddfd7b7ad"><code>e4a8957</code></a> Release 2.0.107</li>
<li><a href="https://github.com/dtolnay/syn/commit/1792e83acfcc4810ccca70c22952986a6ea09d7e"><code>1792e83</code></a> Merge pull request <a href="https://redirect.github.com/dtolnay/syn/issues/1922">#1922</a> from dtolnay/outofbounds</li>
<li><a href="https://github.com/dtolnay/syn/commit/532e4af53355f8c4585251e1507336bed8d39f14"><code>532e4af</code></a> Improve panic message on Punctuated index out of bounds</li>
<li><a href="https://github.com/dtolnay/syn/commit/909c2221dd582e18f748988384e8ec4edd7544cf"><code>909c222</code></a> Add test of Punctuated indexing</li>
<li>Additional commits viewable in <a href="https://github.com/dtolnay/syn/compare/2.0.106...2.0.108">compare view</a></li>
</ul>
</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
reggieross approved these changes Oct 28, 2025
nathaniel-d-ef added a commit that referenced this pull request Oct 28, 2025
This reverts commit 838222c.
Closed
Labels: catalog, common, core, datasource, development-process, documentation, execution, functions, logical-expr, optimizer, physical-expr, physical-plan, proto, spark, sql, sqllogictest, substrait
Which issue does this PR close?
This brings in the upgrade to Arrow 57 which includes our changes in arrow-avro.