I have searched in the issues and found no similar issues.
What would you like to be improved?
Currently, the optimizing plan evaluates all files first, then splits tasks, and performs file filtering after task splitting. This is not intuitive and can cause several issues, such as:
The evaluation result differs from the result after task splitting, so the reported pending data does not match the data that is actually optimized.
Bin-packing during task splitting is inaccurate, because some of the packed files are filtered out afterwards.
Determining which DataFile files were involved in optimizing requires a recalculation after task splitting, which hurts performance.
How should we improve?
We should filter files as early as possible: filtering should be done before splitting tasks and not be done again after.
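To illustrate the difference between the two orders, here is a minimal Python sketch. It is not the project's actual planner code: the `DataFile` dataclass, the `needs_rewrite` flag, the greedy `bin_pack` helper, and the file names and sizes are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    path: str
    size: int            # size in arbitrary units
    needs_rewrite: bool  # hypothetical outcome of the file filter


def bin_pack(files, target_size):
    """Greedy first-fit: put each file into the first task that still has room."""
    tasks = []
    for f in files:
        for task in tasks:
            if sum(x.size for x in task) + f.size <= target_size:
                task.append(f)
                break
        else:
            tasks.append([f])
    return tasks


def split_then_filter(files, target_size):
    """Current order: evaluate everything, split, then filter inside each task."""
    pending_size = sum(f.size for f in files)
    tasks = [[f for f in t if f.needs_rewrite] for t in bin_pack(files, target_size)]
    return pending_size, [t for t in tasks if t]


def filter_then_split(files, target_size):
    """Proposed order: filter once up front, then evaluate and split the survivors."""
    selected = [f for f in files if f.needs_rewrite]
    pending_size = sum(f.size for f in selected)
    return pending_size, bin_pack(selected, target_size)


files = [
    DataFile("a.parquet", 50, True),
    DataFile("b.parquet", 50, False),
    DataFile("c.parquet", 50, True),
    DataFile("d.parquet", 50, False),
]

old_pending, old_tasks = split_then_filter(files, target_size=100)
new_pending, new_tasks = filter_then_split(files, target_size=100)
```

In this toy input, splitting first reports 200 units as pending and ends up with two half-empty tasks of 50 units each, while filtering first reports 100 units and packs both remaining files into a single full task. The filtered list also doubles as the set of files involved in the plan, so nothing has to be recomputed after the split.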
Search before asking
Are you willing to submit PR?
Subtasks
No response
Code of Conduct