Evaluate schema discovery queries that happen at startup #2127
drewbanin changed the title from "Deeply evaluate schema discovery queries that happen at startup, decide which ones don't need to happen" to "Evaluate schema discovery queries that happen at startup" on Feb 20, 2020
There are really only two metadata queries that occur before a run starts on Snowflake:
dbt also makes a number of unnecessary |
beckjake added a commit that referenced this issue on Feb 27, 2020: …tartup Use threadpools for filling the cache and listing schemas (#2127)
dbt runs a bunch of introspective queries at startup:
On some databases (Postgres, Redshift), these information schema queries are pretty quick. On others (Snowflake, Spark), they can (and do) run for tens of seconds! This means that the typical flow for running dbt looks like:
These queries are unavoidable. We experimented with faster alternatives to the information schema (#1877), but ultimately we must reckon with the reality that hitting the information schema can be slow. The next best thing we can do is understand and optimize when these queries happen, in order to provide the best possible experience for users running dbt.
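One way to reduce the wall-clock cost of slow metadata queries, in the spirit of the thread-pool commit referenced in this thread, is to issue the per-schema queries concurrently instead of one after another. The following is only a rough sketch under stated assumptions, not dbt's actual implementation: `list_relations`, `fill_cache`, and the schema names are hypothetical, and the slow information schema query is simulated with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def list_relations(schema: str) -> List[str]:
    """Hypothetical stand-in for one slow information schema query."""
    time.sleep(0.2)  # simulate metadata query latency (assumed value)
    return [f"{schema}.table_{i}" for i in range(2)]


def fill_cache(schemas: List[str], max_workers: int = 4) -> Dict[str, List[str]]:
    """Run one metadata query per schema concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input schemas.
        results = pool.map(list_relations, schemas)
        return dict(zip(schemas, results))


if __name__ == "__main__":
    schemas = ["analytics", "staging", "raw", "snapshots"]  # hypothetical names
    start = time.perf_counter()
    cache = fill_cache(schemas)
    elapsed = time.perf_counter() - start
    print(f"cached {len(cache)} schemas in {elapsed:.2f}s")
```

Because these queries are I/O-bound (the client mostly waits on the warehouse), threads can overlap the waiting time even under Python's GIL. The queries themselves are no faster; only the total startup latency shrinks.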