fix: synchronize partition bounds reporting in HashJoin #17452
Conversation
    _file_meta: FileMeta,
    _file: PartitionedFile,
) -> Result<FileOpenFuture> {
    if let Some(predicate) = &self.predicate {
Will take this out, just leaving here for now for visualization.
I wonder what are good ways to regression test this 🤔
The best I can think of would be to make a custom ExecutionPlan that delays partition 0 for 1s or something and check that we still read the expected number of rows on the probe side
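As a rough illustration of the suggestion above (not code from this PR; the helper name, values, and crate usage are made up), one way to stall a single partition is to wrap its stream so the first poll waits before yielding anything, e.g. inside a test operator's `execute()`:

```rust
use std::time::Duration;
use futures::{stream, Stream, StreamExt};

/// Minimal sketch (not this PR's code): wrap a stream so its first poll waits
/// for `delay`. Inside a test operator's `execute()`, this could stall
/// partition 0 so the other partitions report their join bounds first.
fn delay_stream<S: Stream>(inner: S, delay: Duration) -> impl Stream<Item = S::Item> {
    // Sleep once, then yield every item of the wrapped stream unchanged.
    stream::once(async move {
        tokio::time::sleep(delay).await;
        inner
    })
    .flatten()
}

#[tokio::main]
async fn main() {
    let delayed = delay_stream(stream::iter(vec![1, 2, 3]), Duration::from_millis(100));
    let items: Vec<_> = delayed.collect().await;
    assert_eq!(items, vec![1, 2, 3]);
}
```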
Hey @adriangb, I modified the existing partitioned test to inspect metrics from the probe-side scan and verify that rows are correctly being filtered out there. I think we can get away without the delay: applying the test changes to main (without this fix) made the test flaky, as expected. Does that sound good to you, or did you have something else in mind?
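The kind of metrics assertion being described would look roughly like this (a hypothetical helper, not the actual test code; `probe_scan` and `expected` are placeholders):

```rust
use std::sync::Arc;
use datafusion::physical_plan::ExecutionPlan;

/// Hypothetical helper (not this PR's test code): after running the query to
/// completion, read the probe-side scan's `output_rows` metric and check that
/// the dynamic filter pruned rows as expected.
fn assert_probe_output_rows(probe_scan: &Arc<dyn ExecutionPlan>, expected: usize) {
    let metrics = probe_scan
        .metrics()
        .expect("metrics should be collected for the probe-side scan");
    assert_eq!(
        metrics.output_rows(),
        Some(expected),
        "dynamic filter should have pruned probe-side rows"
    );
}
```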
@rkrishn7 I've looked over the implementation - super clean, very nice stuff! I think we just need some regression tests. Can you take a look at my suggestion in #17452 (comment) and see if that works?
- DataSourceExec: file_groups={1 group: [[test.parquet]]}, projection=[a, x], file_type=test, pushdown_supported=true
- HashJoinExec: mode=Partitioned, join_type=Inner, on=[(c@1, d@0)]
- DataSourceExec: file_groups={1 group: [[test.parquet]]}, projection=[b, c, y], file_type=test, pushdown_supported=true, predicate=DynamicFilterPhysicalExpr [ b@0 >= aa AND b@0 <= ab ]
- DataSourceExec: file_groups={1 group: [[test.parquet]]}, projection=[d, z], file_type=test, pushdown_supported=true, predicate=DynamicFilterPhysicalExpr [ d@0 >= ca AND d@0 <= ce ]
@adriangb This update seems correct to me
Hmm, the filter is more selective (generally a good thing), but I'm curious why it changed; it's not immediately obvious to me why waiting would change the value of the filter. Could you help me understand why that is?
Ah, I believe this is simply because the filter is now applied in `TestOpener`. The bounds on the left side have changed due to rows being filtered out, so we're seeing the correct filter now.
I'll give this a final review tomorrow (Monday)!
@rkrishn7 this PR looks excellent to me
My only concern is about performance in edge cases. I do think for a lot of queries this will be an improvement, but I fear that in some cases (e.g. if the filter was not selective and the distribution amongst partitions is very uneven) the synchronization might hurt performance.
I propose that @alamb kick off a set of benchmarks and if we see no regressions we move forward with this. It's an internal revertible change so if down the road we find that the cons outweigh the pros (I doubt it) we can always roll it back.
let mut new_batches = Vec::new();
for batch in batches {
    let batch = if let Some(predicate) = &self.predicate {
        batch_filter(&batch, predicate)?
Nice, I didn't know this little function existed!
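For anyone else who hasn't run into it, here is a small standalone sketch of how `batch_filter` can be used, assuming it is exported from `datafusion::physical_plan::filter` (the column, values, and predicate are made up for illustration):

```rust
use std::sync::Arc;
use datafusion::arrow::array::Int32Array;
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::common::Result;
use datafusion::logical_expr::Operator;
use datafusion::physical_expr::expressions::{col, lit, BinaryExpr};
use datafusion::physical_expr::PhysicalExpr;
use datafusion::physical_plan::filter::batch_filter;

fn main() -> Result<()> {
    let schema = Arc::new(Schema::new(vec![Field::new("a", DataType::Int32, false)]));
    let batch = RecordBatch::try_new(
        Arc::clone(&schema),
        vec![Arc::new(Int32Array::from(vec![1, 5, 10]))],
    )?;
    // Predicate: a >= 5
    let predicate: Arc<dyn PhysicalExpr> = Arc::new(BinaryExpr::new(
        col("a", &schema)?,
        Operator::GtEq,
        lit(5i32),
    ));
    // batch_filter evaluates the predicate and keeps only the matching rows
    let filtered = batch_filter(&batch, &predicate)?;
    assert_eq!(filtered.num_rows(), 2);
    Ok(())
}
```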
    }
}

/// Utility future to wait until all partitions have reported completion
This looks very similar to a `Barrier` -- https://docs.rs/tokio/latest/tokio/sync/struct.Barrier.html
Maybe as a follow-on PR we could simplify the code a bit by using that pre-defined structure
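For reference, a minimal sketch of the `tokio::sync::Barrier` pattern being suggested (the partition count, function name, and shared-state handling are illustrative, not the PR's actual code): each build-side task reports its bounds, waits at the barrier, and the single "leader" task then assembles the dynamic filter.

```rust
use std::sync::Arc;
use tokio::sync::Barrier;

/// Illustrative only: each build-side partition would push its (min, max)
/// into some shared state, then wait for every other partition to do the same.
async fn report_bounds_and_wait(barrier: Arc<Barrier>, partition: usize) {
    // ... record this partition's bounds in shared state here ...
    println!("partition {partition} reported its bounds");
    let result = barrier.wait().await;
    if result.is_leader() {
        // Exactly one waiter observes is_leader() == true; it can now build
        // the dynamic filter from the bounds reported by all partitions.
        println!("all partitions reported; building dynamic filter");
    }
}

#[tokio::main]
async fn main() {
    let partitions = 4;
    let barrier = Arc::new(Barrier::new(partitions));
    let tasks: Vec<_> = (0..partitions)
        .map(|p| tokio::spawn(report_bounds_and_wait(Arc::clone(&barrier), p)))
        .collect();
    for task in tasks {
        task.await.unwrap();
    }
}
```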
Agreed - @rkrishn7 could you refactor this to use `Barrier`?
Definitely, I was reaching for `Barrier` at first but I didn't see many tokio sync primitives being used in the crate, so I decided against it. Agreed that using `Barrier` is preferable.
Would you rather push it to this PR or do it as a follow-up? Either works for me.
I'll handle it in this PR! Should get it done shortly.
I think it is looking pretty sweet now that we have the barrier in there
Yeah this is some very nice code @rkrishn7 !
🤖: Benchmark completed
@adriangb What are your thoughts on the benchmark run? I see some regressions 🤔
I think it's noise. For example, let's look at:
Which does not obviously have a join. And I can confirm with the query plan that there are no joins:
So a regression in this query could not possibly be caused by the changes in this PR, which are narrowly scoped to HashJoinExec.
@adriangb I think there is opportunity to simplify the bounds collection for each partition. That is, we can probably just track the min/max across all partitions and build a single merged filter. Aside from one less mutex, I think it'll help reduce output in the plan display.
I think that will regress performance: imagine partition 1 has bounds (0, 1) and partition 2 has bounds (999998, 999999). With bounds per partition the value 1234 is filtered out. The merged bounds of (0, 999999) would include that value.
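To spell that example out as a toy check (a hypothetical helper, not DataFusion code):

```rust
/// Hypothetical helper: with per-partition bounds, a probe value survives the
/// filter only if it falls inside at least one partition's (min, max) range.
fn passes_per_partition(bounds: &[(i64, i64)], value: i64) -> bool {
    bounds.iter().any(|(min, max)| value >= *min && value <= *max)
}

fn main() {
    let per_partition = [(0, 1), (999_998, 999_999)];
    // Per-partition bounds reject 1234 ...
    assert!(!passes_per_partition(&per_partition, 1234));
    // ... but a single merged range (0, 999999) would keep it.
    let merged = (0_i64, 999_999_i64);
    assert!(1234 >= merged.0 && 1234 <= merged.1);
    println!("per-partition bounds are strictly more selective here");
}
```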
Ah yes 🤦🏾, definitely. Good catch!
This is the fundamental limitation of a min/max bounds approach. For some queries / datasets it's going to be very effective, for others not at all. Hence why we are discussing pushing down bloom filters, etc. But keeping the min/max per partition is at least a good compromise for now / seems to be working well.
I plan to merge this PR once CI is green 🚀
Which issue does this PR close?
What changes are included in this PR?
Adds a simple synchronization mechanism (`BoundsWaiter`) for tasks to wait on complete bounds for the left side and for the dynamic filter expression to be built.
Are these changes tested?
Added a regression check for the number of output rows from the probe-side scan (see #17452 (comment)).
Are there any user-facing changes?
N/A