Aggregation pushdown #1
base: release-331
…ode and InternalPlanVisitor
Since the introduction of the applyXXX methods, a TableScan represents a subquery, not just a raw table. It is possible that such a subquery might have no columns due to a projection being applied by the optimizer.
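A minimal sketch of the situation described above, not the actual Presto code: the class `EmptyProjectionSketch`, the type `ScanResult`, and the method `scan` are hypothetical names introduced only for illustration. The sketch assumes a three-row table and shows that a scan asked for zero columns (as a count-style query might need) still has to carry a row count, so consumers should not treat an empty column list as an empty result.

```java
import java.util.List;

public class EmptyProjectionSketch {
    // Hypothetical stand-in for the output of a table scan:
    // the projected column names plus the number of rows produced.
    static final class ScanResult {
        final List<String> columns;
        final long rowCount;

        ScanResult(List<String> columns, long rowCount) {
            this.columns = columns;
            this.rowCount = rowCount;
        }
    }

    // Hypothetical scan over an assumed three-row table.
    // Only the requested columns are returned; the row count is
    // independent of how many columns were requested.
    static ScanResult scan(List<String> requestedColumns) {
        long rowsInTable = 3; // assumption for the example
        return new ScanResult(List.copyOf(requestedColumns), rowsInTable);
    }

    public static void main(String[] args) {
        // The projection has removed every column from the scan.
        ScanResult result = scan(List.of());
        System.out.println(result.columns.size() + " columns, " + result.rowCount + " rows");
    }
}
```

The point of the sketch is only that "no columns" and "no rows" are different conditions; code that visits a scan node should handle the first without assuming the second.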
|
This is amazing work. I am looking forward to it being merged into trunk; we need this feature. |
|
@skyahead, great job. Has this been fully tested? Can I integrate it into the 332 release? |
|
@RugratsJ @gaojun2048 Thanks, guys. The code in its current form cannot be merged into the master branch because it is not complete. It is my first attempt to implement the proposal here: https://github.com/prestosql/presto/wiki/Pushdown-of-complex-operations#aggregation-pushdown. At best, it covers 1/3 of the total work needed. In my opinion, the other 2/3 involves changes in too many places in the PrestoSQL internals and cannot be done in a short time, but I could be wrong. So I have been using this 1/3 implementation in our environment, where most of our queries hit S3 files, and I am adding Druid support. If you are interested in trying the code in your work, here are the steps:
Note: using this code means that when upstream moves to version 333, 334, etc., you have to redo all the code changes and resolve all the conflicts, so it is not a good idea. I maintain our Presto, which has already diverged from upstream, so I can live with this 1/3 code for a while, and I hope the community can move faster on aggregation pushdown. To be honest, I am also seriously considering switching our cluster back to the PrestoDB distribution, which already has this feature. For JDBC, adding aggregation pushdown in the PrestoDB distribution could be at least 10 times easier than in the PrestoSQL distribution. Again, I might be wrong though. |
|
@skyahead, thank you for the detailed instructions. Since you are using AWS, do you think Glue + S3 would be a better choice than Druid? Has the PrestoDB distribution already implemented aggregation pushdown? prestodb/presto#4839 is still open. |
We are not using Druid to replace anything. Druid is one underlying storage for Presto and S3 is another. We do not use Glue, as we run our own Hive metastore. PrestoDB does have working aggregation pushdown; I run it in our staging environment but not in production, and I can push down aggregations to Druid. However, PrestoDB's aggregation pushdown has NOT been implemented for any JDBC storage yet. If you want, we can work together to write one. |