@crepererum crepererum commented Jun 24, 2025

Patches

Patches map to commits 1:1 (i.e. every patch is exactly one commit) and are listed in commit order, so each description is easy to correlate with its commit. They are grouped into 3 stages.

A: Dummy

No actual patches, can be dropped at any point:

  1. a dummy patch just to get "a diff" to the base branch

B: CI Fixes

Need to get CI up and running before picking any actual patches:

  1. Disable sccache action to fix gh cache issue apache/datafusion#15536:
    can be dropped when we upgrade to DF version 47

All commits afterwards should build cleanly!

C: Patches

These are the actual relevant patches:

  1. chore: default=true for skip_physical_aggregate_schema_check, and add warn logging:
    until we chase down all warnings in our iox logs (see https://github.com/influxdata/influxdb_iox/issues/12404 )
  2. (New) Test + workaround for SanityCheck plan:
    according to this slack thread, we can drop this with DataFusion version 49.
  3. chore: skip order calculation / exponential planning:
    workaround for Exponential planning time (100s of seconds) with UNION and ORDER BY queries apache/datafusion#13748 -- which should be fixed in DataFusion version 49
  4. fix: temporary fix to handle incorrect coalesce (inserted during EnforceDistribution) which later causes an error during EnforceSort (without our patch). The next DataFusion version 46 upgrade does the proper fix, which is to not insert the coalesce in the first place.:
    See EAR-5822 (also see https://github.com/influxdata/influxdb_iox/issues/13310 ). Despite what the notes in "Patched DataFusion version 45.0.0 #54" and "ParallelizeSorts, a subrule of EnforceSorting optimizer, should not remove necessary coalesce. apache/datafusion#14691 (comment)" say, this is still required for DF version 46; otherwise the regression test fails. Also see this slack thread.

@crepererum crepererum marked this pull request as draft June 24, 2025 08:07
@crepererum crepererum changed the title from "ci: test CI" to "Patched DF 46.0.1 (take 2)" Jun 24, 2025
* Bump sccache version to latest to fix gh cache issue.

* version blocked, trying with a hash

* disable sccache.
@github-actions github-actions bot added the documentation, sqllogictest, core, and common labels Jun 24, 2025
@crepererum crepererum force-pushed the upgrade-df-ver4601-b branch from 96e7154 to caed48d Compare June 24, 2025 12:54
…rceDistribution) which later causes an error during EnforceSort (without our patch). The next DataFusion version 46 upgrade does the proper fix, which is to not insert the coalesce in the first place.

test: recreating the iox plan:
* demonstrate the insertion of coalesce after the use of column estimates, and the removal of the test scenario's forcing of rr repartitioning

test: reproducer of SanityCheck failure after EnforceSorting removes the coalesce added in the EnforceDistribution

fix: special case to not remove the needed coalesce
@crepererum
Author

Ok, the MSRV is failing. We don't care about that one for our fork.

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 25, 2025
@crepererum
Author

no longer needed.

@crepererum crepererum closed this Aug 25, 2025