SUPER SLOW dbt run -m model_name #556
Comments
This looks like the culprit. Thanks for the report.
@ashwintastic can you try latest main and let me know if it improves the situation?
Curious whether you see any improvement here @ashwintastic; I've noticed a similar problem where it takes around 3 minutes to run a 4-second model. I've also tried 1.7.7 and the issue is sadly much the same.
One core issue is that dbt enumerates every table in the schema up front, just in case. On Databricks, metadata operations like this are fairly slow. The change I made only improves performance if you are on Hive rather than UC.
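As a toy illustration (not dbt's actual code) of why this enumeration dominates start-up time, compare one metadata call per relation against a single bulk query; the 300 ms per-call cost is an assumed placeholder, not a measured number:

```python
# Toy model (NOT dbt internals): start-up cost scales with schema size
# when the adapter cache is populated via one metadata call per relation.
# A bulk query (e.g. against information_schema) would be one round trip.

def per_relation_calls(tables, call_cost_ms=300):
    """Simulate one SHOW TABLE EXTENDED-style call per table."""
    return len(tables) * call_cost_ms

def bulk_call(tables, call_cost_ms=300):
    """Simulate a single bulk metadata query for the whole schema."""
    return call_cost_ms

schema = [f"table_{i}" for i in range(400)]
print(per_relation_calls(schema))  # 120000 ms of metadata round trips
print(bulk_call(schema))           # 300 ms
```

With 400 relations in the schema, the per-relation approach spends two minutes on metadata before a single model runs, which matches the symptom reported here.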
Interesting. Yeah, I am also using Hive, not UC, but sadly didn't see any improvement.
Sorry to hear. We have an ongoing conversation with dbt about how we might fix this in the future. On a large project this start-up cost may not be noticeable, but it can be very annoying if you want to update a single model :/.
Hello guys! I'm very interested in this topic. We have a use case where we run every dbt model in an Airflow task, so for every run we need to wait 1-2 minutes for catalog discovery. I assume there is a reason for it, but it makes our runs very slow and expensive. Do you have any news on how this behavior is going to improve? Is this better with Unity Catalog? There are some other related discussions about it in dbt-spark dbt-labs/dbt-spark#228 (comment)
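One mitigation that may be worth testing for the per-model-per-task pattern (not confirmed by the maintainers in this thread) is dbt-core's `cache_selected_only` flag, which restricts start-up relation caching to the schemas of the selected resources. The exact spelling and placement depend on your dbt-core version, so verify against the docs for the version you run:

```yaml
# dbt_project.yml (dbt-core 1.8+ flag syntax; older versions use the
# DBT_CACHE_SELECTED_ONLY environment variable or the
# --cache-selected-only CLI flag instead)
flags:
  cache_selected_only: true
```

This doesn't make the metadata calls themselves faster, but it can shrink how many of them a targeted `dbt run -m model_name` issues.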
I can see performance improvement using
What is your
It depends on whether you are using UC or HMS, and how many tables you have in your schemas. As mentioned above, I'm in ongoing conversations with dbt on how to improve targeted model performance. @jtcohen6 for viz.
Hey! Any luck improving this, especially for UC users? If the metadata calls are expensive, could we hit something like information_schema to get this information instead?
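For reference, the kind of single bulk query being suggested would look roughly like this; the catalog and schema names are placeholders, and whether this covers everything the adapter cache needs (views, metadata flags, etc.) is exactly what would have to be verified:

```sql
-- Hypothetical bulk lookup: one query for the whole schema instead of
-- one SHOW TABLE EXTENDED / DESCRIBE round trip per relation.
-- Unity Catalog exposes information_schema; HMS does not.
SELECT table_name, table_type
FROM some_catalog.information_schema.tables
WHERE table_schema = 'some_schema';
```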
Describe the bug
I'm running `dbt run -m model_name`, which is super slow. When I checked the logs, dbt is checking each and every table and view in that schema by running `show table extended`.
Earlier, in version 1.6, it responded faster.
Steps To Reproduce
dbt run -m model_name
Expected behavior
It should pick up the model right away, but instead it checks every table and view in that schema with `describe table`.
Screenshots and log output
System information
The output of `dbt --version`:
The operating system you're using: Windows 11
The output of `python --version`: Python 3.10.6
Additional context