Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: df patched upgrade to 2024-03-05 #1

Closed

Conversation

wiedld
Copy link
Collaborator

@wiedld wiedld commented Apr 1, 2024

⚠️ WIP: will not be merged. ⚠️

Below is edited, since a patch was merged into main

What's in this branch:

When testing against iox, we have been finding patches needed in DF. This is a branch for datafusion through EOD 2024-03-05, and then layering on patches needed.

  1. Starting at datafusion main branch commit from March 5th 2024:

    commit ea01e56c3341dd4308a24e94091b86ee475ce224
    Author: Marko Milenković <milenkovicm@users.noreply.github.com>
    Date:   Tue Mar 5 23:05:19 2024 +0000
    
        Add plugable handler for `CREATE FUNCTION` (#9333)
    
  2. Then we added these commits:

  3. And a new patch, based upon a newly found bug:

    commit 4a387c8cb53994ae4a41afc14c7a73fafb7ffcf6
    Author: wiedld <wiedld@users.noreply.github.com>
    Date:   Sun Mar 31 05:09:06 2024 -0700
    
       fix(9870): common expression elimination optimization, should always re-find the correct expression during re-write. (#9871)
    
  4. This new patch^^, merged into DF main on 2024-03-31, no longer had 2 methods which existed at 2024-03-05. Therefore, those two methods were patched (just for this 2024-03-05 branch):

    commit 9af8ac27e1567ea6e7bf76643d398d71979f56a4
    Author: wiedld <dlw405@gmail.com>
    Date:   Mon Apr 1 08:53:20 2024 -0700
    
        refactor: incorporate ExprSet changes into the methods that no longer exist on main, but do exist at 2024-03-05
    
  5. The new DF patch (merged into main) also included a test using coalesce. This test relies upon a bug fix merged into main on March 7th (and not available on this 2024-03-05 branch). Added that patch too:

    commit 581e74785b876615d6a63db8c2e5ba372bf78828 (HEAD -> iox-10349/df-upgrade-with-patches, origin/iox-10349/df-upgrade-with-patches)
    Author: comphead <comphead@users.noreply.github.com>
    Date:   Thu Mar 21 11:07:40 2024 -0700
    
        build: modify code to comply with latest clippy requirement (#9725)
    
  6. add the clippy build fix

    commit 581e74785b876615d6a63db8c2e5ba372bf78828 (HEAD -> iox-10349/df-upgrade-with-patches, origin/iox-10349/df-upgrade-with-patches)
    Author: comphead <comphead@users.noreply.github.com>
    Date:   Thu Mar 21 11:07:40 2024 -0700
    
        build: modify code to comply with latest clippy requirement (#9725)
    

alamb and others added 10 commits March 28, 2024 12:29
… dictionaries (apache#9679)

* Add test for multiple count distincts on a dictionary

* Fix accumulator merge bug

* Fix cleanup code
…r common subexpr elimination optimization (apache#9685)

* test(9678): reproducer of short-circuiting causing expr elimination to error

* fix(9678): populate visited stack for short-circuited expressions, during the common-expr elimination optimization

* test(9678): reproducer for optimizer error (in common_subexpr_eliminate), as seen in other test case

* chore: extract id_array into abstraction, to make it more clear the relationship between the two visitors

* refactor: tweak the fix and make code more explicit (JumpMark, node_to_identifier)

* fix: get the series_number and curr_id with the correct self.current_idx, before the various incr/decr

* chore: remove unneeded conditional check (already done earlier), and add code comments

* Refine documentation in common_subexpr_eliminate.rs

* chore: cleanup -- fix 1 doc comment and consolidate common-expr-elimination test with other expr test

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
… not always stay in sync with the updated TreeNode traversal
…, while keeping the (stack-popped) symbol used for alias.
…expr_set, while keeping the (stack-popped) symbol used for alias."

This reverts commit 049bf09.
…ing does not always stay in sync with the updated TreeNode traversal"

This reverts commit d59a8de.
…re-find the correct expression during re-write. (apache#9871)

* test(9870): reproducer of error with jumping traversal patterns in common-expr-elimination traversals

* refactor: remove the IdArray ordered idx, since the idx ordering does not always stay in sync with the updated TreeNode traversal

* refactor: use the only reproducible key (expr_identifer) for expr_set, while keeping the (stack-popped) symbol used for alias.

* refactor: encapsulate most of the logic within ExprSet, and delineate the expr_identifier from the alias symbol

* test(9870): demonstrate that the sqllogictests are now passing
* fix: Remove supported coalesce types

* Use comparison_coercion

* Fix test

* Fix

* Add comment

* More

* fix
)

* fix CI clippy

* fix scalar size test

* fix tests

* fix tests
@wiedld wiedld changed the title Iox 10349/df upgrade with patches WIP: df patched upgrade to 2024-03-05 Apr 1, 2024
@wiedld wiedld marked this pull request as draft April 1, 2024 18:30
@wiedld
Copy link
Collaborator Author

wiedld commented Apr 10, 2024

Already have advanced to a stable 2024-03-12 branch. Closing.

@wiedld wiedld closed this Apr 10, 2024
@wiedld wiedld deleted the iox-10349/df-upgrade-with-patches branch June 21, 2024 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants