@adriangb
Contributor

@adriangb adriangb commented Aug 4, 2025

Fixes #16998. Closes #17016.

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) physical-plan Changes to the physical-plan crate labels Aug 4, 2025
@adriangb adriangb force-pushed the fix-bug branch 3 times, most recently from dbcd08c to f26b319 Compare August 4, 2025 05:06
@adriangb adriangb requested a review from alamb August 4, 2025 05:35
Contributor

@alamb alamb left a comment

Thank you @adriangb -- this PR makes sense to me except for the expect / panic -- otherwise I think it is good to go

"implementations have drifted and this is no longer safe even if `new()` still works, ",
"for example if `new()` now does something different than just calling `compute_properties(...).unwrap()`",
"\n",
"This is clearly a bug, please report it!"
Contributor

I think we should return DataFusionError::Internal here rather than panicking, as it is a better UX (if you want to fail fast in debug builds, perhaps you could add a debug_assert)

I also recommend converting the explanation into comments and leaving the panic message like "Internal inconsistency in SortExec"

The rationale is that if a user sees this message it is not going to mean anything to them and they can't fix it, and this text will obscure the conclusion (this is a bug they can not do anything to fix). A developer will come to the source location and can read the comment.
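
A minimal sketch of the suggestion above, under stated assumptions: `PlanError`, `compute_properties`, and `rebuild` are hypothetical stand-ins written for illustration and are not DataFusion's actual types or signatures. It shows the failure being mapped to a short internal error while the developer-facing explanation stays in a comment.

```rust
use std::fmt;

// Hypothetical stand-in for an engine error type; only the internal-error
// variant discussed in this review is modeled.
#[derive(Debug, PartialEq)]
enum PlanError {
    Internal(String),
}

impl fmt::Display for PlanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PlanError::Internal(msg) => write!(f, "Internal error: {}", msg),
        }
    }
}

// Hypothetical stand-in for a property computation that can fail.
fn compute_properties(valid: bool) -> Result<u32, PlanError> {
    if valid {
        Ok(42)
    } else {
        Err(PlanError::Internal("ordering is invalid for input".to_string()))
    }
}

// Rebuilds the cached properties without panicking.
fn rebuild(valid: bool) -> Result<u32, PlanError> {
    // Developer note (kept in a comment, not in the user-facing message):
    // reaching the error branch means the constructor and this code path have
    // drifted apart, which is a bug in the operator and not a user error.
    // A `debug_assert!` could be added here to fail fast in debug builds.
    compute_properties(valid)
        .map_err(|_| PlanError::Internal("Internal inconsistency in SortExec".to_string()))
}
```

With this shape the caller sees only the short message, and a developer who follows the error to its source finds the full explanation next to the code.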

Contributor Author

@adriangb adriangb Aug 4, 2025

@alamb this already had the possibility to panic in it because it called SortExec::new():

impl SortExec {
    /// Create a new sort execution plan that produces a single,
    /// sorted output partition.
    pub fn new(expr: LexOrdering, input: Arc<dyn ExecutionPlan>) -> Self {
        let preserve_partitioning = false;
        let (cache, sort_prefix) =
            Self::compute_properties(&input, expr.clone(), preserve_partitioning)
                .unwrap();

    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn ExecutionPlan>>,
    ) -> Result<Arc<dyn ExecutionPlan>> {
        let mut new_sort = SortExec::new(self.expr.clone(), Arc::clone(&children[0]))

Returning a Result would be a major breaking change to the ExecutionPlan::with_new_children API which I don't think we should do in this PR.

I do think ExecutionPlan::with_new_children returning a Result would be a good thing. In general I think trait methods should err on the side of returning a Result in case some implementation needs to. If none of them do, I'd expect compiler optimizations to make it pretty much a non-issue for performance. But maybe let's do that as its own PR if we really want to.
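
To illustrate the general point about fallible trait methods, here is a rough sketch. The `Plan` trait and the `Leaf` and `Sort` types are hypothetical simplifications made up for this example; they are not DataFusion's real `ExecutionPlan` API.

```rust
use std::sync::Arc;

// Hypothetical, simplified plan trait. The method returns a Result even
// though not every implementation can fail.
trait Plan {
    fn name(&self) -> String;
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String>;
}

// An infallible implementation: it has no children and just returns itself.
struct Leaf;

impl Plan for Leaf {
    fn name(&self) -> String {
        "Leaf".to_string()
    }

    fn with_new_children(
        self: Arc<Self>,
        _children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String> {
        Ok(self)
    }
}

// A fallible implementation: it needs exactly one child and reports an
// error instead of indexing into an empty vector and panicking.
struct Sort {
    input: Arc<dyn Plan>,
}

impl Plan for Sort {
    fn name(&self) -> String {
        format!("Sort({})", self.input.name())
    }

    fn with_new_children(
        self: Arc<Self>,
        mut children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String> {
        if children.len() != 1 {
            return Err(format!("Sort expects exactly 1 child, got {}", children.len()));
        }
        Ok(Arc::new(Sort { input: children.remove(0) }))
    }
}
```

The infallible implementation pays only the cost of wrapping its value in `Ok`, while the fallible one gets a way to report a problem without panicking.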

/// Note to implementers: unlike [`ExecutionPlan::with_new_children`] this method does not accept new children as an argument,
/// thus it is expected that any cached plan properties will remain valid after the reset.
///
/// [`DynamicFilterPhysicalExpr`]: datafusion_physical_expr::expressions::DynamicFilterPhysicalExpr
Contributor

I wonder if we should also add a note to DynamicFilterPhysicalExpr saying any ExecutionPlan that uses them should also implement reset_state

@github-actions github-actions bot added the physical-expr Changes to the physical-expr crates label Aug 4, 2025
Comment on lines 1137 to 1141
let (cache, sort_prefix) = Self::compute_properties(
    &new_sort.input,
    new_sort.expr.clone(),
    new_sort.preserve_partitioning,
)
Member

Why don't we put the logic into reset_state if the changes are due to the reset_state?

Contributor Author

The logic for this change is:

  1. We need to do similar work to with_new_children (i.e. clone SortExec), but each method has slightly different requirements (with_new_children needs to reset the cache while reset_state needs to reset the filter).
  2. To solve this I created the new SortExec::cloned, which does neither of those two things, and moved the resetting of the cache into with_new_children and the resetting of the filter into reset_state.

In other words, it doesn't make sense to put Self::compute_properties(...) in reset_state.
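
The split described above can be sketched roughly as follows. This is an illustration under assumptions and not the code in this PR: `SortSketch` and its fields are hypothetical stand-ins, with a `String` in place of the child plan and the cached properties, and a `Vec<i64>` in place of the dynamic filter state.

```rust
// Hypothetical, heavily simplified model of the operator.
#[derive(Debug, Clone, PartialEq)]
struct SortSketch {
    input: String,    // stand-in for the child plan
    cache: String,    // stand-in for cached properties derived from `input`
    filter: Vec<i64>, // stand-in for state accumulated by a dynamic filter
}

impl SortSketch {
    // Stand-in for a fallible property computation.
    fn compute_properties(input: &str) -> Result<String, String> {
        if input.is_empty() {
            Err("cannot compute properties for an empty input".to_string())
        } else {
            Ok(format!("props({})", input))
        }
    }

    fn new(input: &str) -> Result<Self, String> {
        Ok(Self {
            input: input.to_string(),
            cache: Self::compute_properties(input)?,
            filter: Vec::new(),
        })
    }

    // Copies every field as-is: resets neither the cache nor the filter.
    fn cloned(&self) -> Self {
        self.clone()
    }

    // The child changes, so the cache is recomputed; the filter is kept.
    fn with_new_children(&self, new_input: &str) -> Result<Self, String> {
        let mut new_sort = self.cloned();
        new_sort.input = new_input.to_string();
        new_sort.cache = Self::compute_properties(&new_sort.input)?;
        Ok(new_sort)
    }

    // The child is unchanged, so the cache stays valid; only the filter is reset.
    fn reset_state(&self) -> Self {
        let mut new_sort = self.cloned();
        new_sort.filter = Vec::new();
        new_sort
    }
}
```

Because `reset_state` never touches the input, it has nothing to recompute, which is why the `compute_properties` call belongs only in `with_new_children`.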

Contributor

this function returns a Result though, right?

Contributor Author

Ah good point yep I'll do that!

Contributor Author

Sorry I missed your original point

Member

@xudong963 xudong963 left a comment

I recommend putting the PR into the upgrading doc.

@adriangb
Contributor Author

adriangb commented Aug 5, 2025

I recommend putting the PR into the upgrading doc.

Good suggestion, added!

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Aug 5, 2025
@adriangb
Contributor Author

adriangb commented Aug 5, 2025

Are we concerned that reset_state will disable the dynamic filter optimization? I wasn't able to cook up an example query (I don't write many recursive queries myself), but I imagine that unless the optimizer runs again after reset_state to re-connect dynamic filters to a data source they can't possibly work. Right?

@alamb
Contributor

alamb commented Aug 6, 2025

Are we concerned that reset_state will disable the dynamic filter optimization? I wasn't able to cook up an example query (I don't write many recursive queries myself), but I imagine that unless the optimizer runs again after reset_state to re-connect dynamic filters to a data source they can't possibly work. Right?

I suggest we file a ticket to look into this in the future for anyone who might be interested in that use case, and not worry about it in this PR

@adriangb
Contributor Author

adriangb commented Aug 6, 2025

done! #17060

should we wait for anything else before merging this?

@alamb
Contributor

alamb commented Aug 7, 2025

done! #17060

should we wait for anything else before merging this?

Nope, I think we are good -- let's merge it!

@alamb alamb merged commit f9efba0 into apache:main Aug 7, 2025
28 checks passed
@alamb
Contributor

alamb commented Aug 8, 2025

It seems like we should probably backport this to branch-49 for the 49.0.1 release

@alamb alamb mentioned this pull request Aug 8, 2025
5 tasks
adriangb added a commit to pydantic/datafusion that referenced this pull request Aug 8, 2025
* Add ExecutionPlan::reset_state

Co-authored-by: Robert Ream <robert@stably.io>

* Update datafusion/sqllogictest/test_files/cte.slt

* Add reference

* fmt

* add to upgrade guide

* add explain plan, implement in more plans

* fmt

* only explain

---------

Co-authored-by: Robert Ream <robert@stably.io>
alamb pushed a commit that referenced this pull request Aug 8, 2025
* Add ExecutionPlan::reset_state



* Update datafusion/sqllogictest/test_files/cte.slt

* Add reference

* fmt

* add to upgrade guide

* add explain plan, implement in more plans

* fmt

* only explain

---------

Co-authored-by: Robert Ream <robert@stably.io>
@adriangb adriangb deleted the fix-bug branch August 8, 2025 20:27
LiaCastaneda pushed a commit to DataDog/datafusion that referenced this pull request Sep 2, 2025
* Add ExecutionPlan::reset_state

Co-authored-by: Robert Ream <robert@stably.io>

* Update datafusion/sqllogictest/test_files/cte.slt

* Add reference

* fmt

* add to upgrade guide

* add explain plan, implement in more plans

* fmt

* only explain

---------

Co-authored-by: Robert Ream <robert@stably.io>
LiaCastaneda added a commit to DataDog/datafusion that referenced this pull request Sep 9, 2025
* Enable physical filter pushdown for hash joins (apache#16954)

(cherry picked from commit b10f453)

* Add ExecutionPlan::reset_state (apache#17028)

* Add ExecutionPlan::reset_state

Co-authored-by: Robert Ream <robert@stably.io>

* Update datafusion/sqllogictest/test_files/cte.slt

* Add reference

* fmt

* add to upgrade guide

* add explain plan, implement in more plans

* fmt

* only explain

---------

Co-authored-by: Robert Ream <robert@stably.io>

* Add dynamic filter (bounds) pushdown to HashJoinExec (apache#16445)

(cherry picked from commit ff77b70)

* Push dynamic pushdown through CooperativeExec and ProjectionExec (apache#17238)

(cherry picked from commit 4bc0696)

* Fix dynamic filter pushdown in HashJoinExec (apache#17201)

(cherry picked from commit 1d4d74b)

* Fix HashJoinExec sideways information passing for partitioned queries (apache#17197)

(cherry picked from commit 64bc58d)

* disallow pushdown of volatile functions (apache#16861)

* dissallow pushdown of volatile PhysicalExprs

* fix

* add FilteredVec helper to handle filter / remap pattern (#34)

* checkpoint: Address PR feedback in https://github.com/apach...

* add FilteredVec to consolidate handling of filter / remap pattern

* lint

* Add slt test for pushing volatile predicates down (#35)

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
(cherry picked from commit 94e8548)

* fix bounds accumulator reset in HashJoinExec dynamic filter pushdown (apache#17371)

---------

Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
Co-authored-by: Robert Ream <robert@stably.io>
Co-authored-by: Jack Kleeman <jackkleeman@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Labels

documentation Improvements or additions to documentation physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

Shared DynamicFilterPhysicalExpr causes recursive queries to fail

4 participants