Allow configuring parquet filter pushdown dynamically #3821

Closed
Tracked by #4349
alamb opened this issue Oct 13, 2022 · 1 comment · Fixed by #4427

alamb commented Oct 13, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I want to test out the parquet filter pushdown on real datasets using datafusion-cli so we can enable it by default -- #3463

To do so, I want to both see what the current settings of the pushdown flags are and change them statement by statement.

Describe the solution you'd like
Use ConfigOptions to control the parquet scanning options rather than another structure.

Among other things, that will allow the parquet settings to appear here as well as be controlled by environment variables in datafusion-cli.

It will also allow this feature to be turned off if we find an issue.

❯ show all;
+-------------------------------------------------+---------+
| name                                            | setting |
+-------------------------------------------------+---------+
| datafusion.execution.time_zone                  | UTC     |
| datafusion.optimizer.skip_failed_rules          | true    |
| datafusion.explain.logical_plan_only            | false   |
| datafusion.optimizer.filter_null_join_keys      | false   |
| datafusion.explain.physical_plan_only           | false   |
| datafusion.execution.batch_size                 | 8192    |
| datafusion.execution.coalesce_batches           | true    |
| datafusion.execution.coalesce_target_batch_size | 4096    |
+-------------------------------------------------+---------+
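
To make the proposal concrete, here is a minimal, std-only sketch of a string-keyed option registry that can be listed (as `show all` does), overridden from environment variables, and changed at runtime. It is not DataFusion's actual `ConfigOptions` implementation: the `Options` type, its methods, the key `datafusion.execution.parquet.pushdown_filters`, and the rule that maps a key to an environment variable name (uppercase, dots to underscores) are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

// Assumed key name for the parquet pushdown flag (hypothetical here).
const PUSHDOWN_KEY: &str = "datafusion.execution.parquet.pushdown_filters";
// Key taken from the `show all` output above.
const BATCH_SIZE_KEY: &str = "datafusion.execution.batch_size";

/// Hypothetical string-keyed option registry; a stand-in, not the real ConfigOptions.
#[derive(Debug, Clone)]
struct Options {
    values: BTreeMap<String, String>,
}

impl Options {
    /// Built-in defaults. Pushdown starts disabled in this sketch.
    fn new() -> Self {
        let mut values = BTreeMap::new();
        values.insert(BATCH_SIZE_KEY.to_string(), "8192".to_string());
        values.insert(PUSHDOWN_KEY.to_string(), "false".to_string());
        Self { values }
    }

    /// Assumed mapping from option key to environment variable name.
    fn env_name(key: &str) -> String {
        key.to_uppercase().replace('.', "_")
    }

    /// Apply overrides from any lookup function (the real environment or a test stub).
    fn with_overrides<F: Fn(&str) -> Option<String>>(mut self, lookup: F) -> Self {
        for (key, value) in self.values.iter_mut() {
            if let Some(v) = lookup(&Self::env_name(key)) {
                *value = v;
            }
        }
        self
    }

    /// Defaults overridden by process environment variables.
    fn from_env() -> Self {
        Self::new().with_overrides(|name| std::env::var(name).ok())
    }

    /// Change a known option at runtime; unknown keys are rejected.
    fn set(&mut self, key: &str, value: &str) -> Result<(), String> {
        match self.values.get_mut(key) {
            Some(slot) => {
                *slot = value.to_string();
                Ok(())
            }
            None => Err(format!("unknown option: {key}")),
        }
    }

    /// Read an option as a bool; None if the key is missing or not a bool.
    fn get_bool(&self, key: &str) -> Option<bool> {
        self.values.get(key)?.parse().ok()
    }

    /// All options in key order, analogous to listing every setting.
    fn show_all(&self) -> Vec<(String, String)> {
        self.values
            .iter()
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

fn main() {
    let mut opts = Options::from_env();
    // Turn pushdown on for the next statement, then inspect every setting.
    opts.set(PUSHDOWN_KEY, "true").expect("known key");
    for (name, setting) in opts.show_all() {
        println!("{name} = {setting}");
    }
    println!("pushdown enabled: {:?}", opts.get_bool(PUSHDOWN_KEY));
}
```

Keeping every setting in one keyed map is what lets a single mechanism cover listing, environment overrides, and per-statement changes, including turning the feature back off if a problem shows up.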

Describe alternatives you've considered
This can already be done programmatically.

Additional context
See #3463

@alamb alamb added the enhancement New feature or request label Oct 13, 2022
@alamb alamb self-assigned this Oct 13, 2022

alamb commented Oct 13, 2022

FYI @Ted-Jiang @thinkharderdev and @tustvold -- I should have a PR up with this shortly (it rejiggers the config stuff to be more dynamic)
