Performance Enhancement for information_schema."tables" #134
Comments
Thanks for raising the issue! We will look into it and get back to you if we have any questions. Thanks once again!
Hi @Alain-Barrette, @ArgusLi, Dremio is case-insensitive with metadata (table and column names) but case-sensitive with data (equality, like, ...). On top of that, dbt can issue warnings like: "When searching for a relation, dbt found an approximate match. Instead of guessing which relation to use, dbt will move on. Please delete database.schema.model, or rename it to be less ambiguous." So my two cents: be very careful with catalog macros. I would be very pleased to help if needed. Best regards,
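To make the case-sensitivity concern concrete, here is a minimal sketch in plain Python. It is not Dremio or dbt code: the lowercase `requested` name is a made-up example, and `casefold()` only stands in roughly for a wildcard-free ilike. It shows how an exact comparison can miss a schema that a case-insensitive comparison would find.

```python
# Schema name as stored in the catalog (taken from the issue text).
catalog_schemas = ["Preparation.B2BV.Curated"]

# Hypothetical: the same schema written in lowercase, e.g. in a project config.
requested = "preparation.b2bv.curated"


def match_exact(schemas, wanted):
    """Case-sensitive comparison, analogous to `table_schema = 'wanted'`."""
    return [s for s in schemas if s == wanted]


def match_case_insensitive(schemas, wanted):
    """Case-insensitive comparison, a rough stand-in for ilike without wildcards."""
    return [s for s in schemas if s.casefold() == wanted.casefold()]


exact = match_exact(catalog_schemas, requested)
loose = match_case_insensitive(catalog_schemas, requested)
print(exact)
print(loose)
```

If names are always written exactly as they are stored, both comparisons agree; the difference only appears when the casing drifts.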
@Alain-Barrette I've started work on this new change. If you look at the associated branch, I have created a new variable
Enabling the variable changes the ilike to a plain equality in the dremio__list_relations_without_caching macro when reflections are disabled (which is the default). I've done some basic testing but have not run a full test suite or investigated the consequences of enabling this variable. Please feel free to try it out and let me know if there are any bugs or anything else that needs to be changed. Note: Using this branch requires dbt-core 1.4.1.
Thank you @ArgusLi, I have created a ticket on my side to have the team validate this. We will keep you informed of the result.
@ArgusLi: This is not working when installing dbt-core 1.4.1 with dbt-dremio 1.3.2 on the CLI with Python 3.10. Could you update PyPI?
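The switch described above can be pictured with a small Python sketch. This is not the macro from the branch: the function name `schema_filter` and the flag name `exact_match` are hypothetical, and the real change lives in Jinja inside dremio__list_relations_without_caching. It only illustrates the idea of rendering either predicate from one flag.

```python
def schema_filter(schema: str, exact_match: bool = False) -> str:
    """Render the WHERE predicate used to look up relations in a schema.

    `exact_match` is a hypothetical flag standing in for the new variable.
    """
    # Escape single quotes so the value is a valid SQL string literal.
    literal = schema.replace("'", "''")
    if exact_match:
        # Plain equality: case-sensitive, no function applied to the column.
        return f"table_schema = '{literal}'"
    # Default behaviour: case-insensitive match through ilike.
    return f"ilike(table_schema, '{literal}')"


print(schema_filter("Preparation.B2BV.Curated"))
print(schema_filter("Preparation.B2BV.Curated", exact_match=True))
```

Keeping ilike as the default means existing projects behave as before, and the faster equality is opt-in.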
@alexcotecbq In order to test this new change, you'll have to clone the repo and switch to the branch. We only want to update PyPI for each release, and this change will most likely go into the 1.4 release, which should happen soon, most likely within the month. More thorough testing is still needed before this change can be merged into the 1.4 release, though.
@ArgusLi: Okay, this is working.
@fabrice-etanchaud Could you please give more information or examples of situations where ilike should be used? That way we can come up with tests and solutions that could make the connector more optimised.
Performance without the change (149 seconds):
Performance with the new change (27 seconds):
Another stat from a colleague: the total runtime of one SQL query went from 30 s to 222.30 μs.
@ArgusLi, this is much, much faster now. @alexcotecbq, thanks a lot for your involvement.
When we run the code to implement our Dremio objects, the following code gets run
The issue is with the where ilike(table_schema, 'Preparation.B2BV.Curated'). Applying a function (ilike) to the column makes the performance really bad, as you can see in the image:
[image]
Replacing this ilike with an equality would resolve this issue:
where table_schema = 'Preparation.B2BV.Curated'
This would make Dremio lightning fast, as it should be.
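The general principle can be demonstrated with an analogy in SQLite through Python's standard library. This is not Dremio: the table `catalog_tables`, its index, and the sample rows are invented for illustration, `lower()` stands in for ilike, and Dremio's planner for information_schema may behave differently. The sketch only shows that wrapping a filtered column in a function can stop a planner from using a direct lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for information_schema."tables".
conn.execute(
    "CREATE TABLE catalog_tables (table_schema TEXT, table_name TEXT, table_type TEXT)"
)
conn.execute("CREATE INDEX idx_schema ON catalog_tables (table_schema)")
conn.executemany(
    "INSERT INTO catalog_tables VALUES (?, ?, ?)",
    [(f"Space.Folder{i % 50}", f"t{i}", "TABLE") for i in range(1000)],
)


def plan(where_clause: str) -> str:
    """Return SQLite's query plan description for the given WHERE clause."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT table_name, table_type "
        f"FROM catalog_tables WHERE {where_clause}"
    ).fetchall()
    # The fourth column of each plan row holds the human-readable detail.
    return " | ".join(row[3] for row in rows)


# Plain equality on the column: the planner can search the index.
equality_plan = plan("table_schema = 'Space.Folder7'")
# Function applied to the column: the planner falls back to a full scan.
function_plan = plan("lower(table_schema) = 'space.folder7'")

print(equality_plan)
print(function_plan)
```

The exact wording of the plan text varies between SQLite versions, so the output is best read for the SEARCH versus SCAN distinction rather than compared literally.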
Thanks