[ADAP-569] [Enhancement] only wait for Dataproc job to finish, not for the Dataproc serverless cluster to spin down #734
Comments
Thanks for reporting this @wazi55! Are you interested in contributing a pull request for this, by any chance?
I re-labeled this as an enhancement.
@wazi55 thanks so much for opening! This was on my to-do list to write up after an email thread last month with BigQuery engineers. Their recommendation was to use get_batch. Afaict, this is not a drop-in replacement, as some polling would have to be implemented to ensure that the response's state is terminal. Perusing the docs, I also noticed another relevant API. I'm going to ask the GCP team to validate my thinking.
Hello 👋 get_batch is exactly the method I was going to suggest; checking the returned Batch.state should be the way to go.
Resolved by #929.
Is this a new bug in dbt-bigquery?
Current Behavior
When working with dbt and Dataproc, the current behavior is that dbt waits not only for the batch job to complete but also for the Dataproc serverless cluster to be torn down, which adds an additional 1 or 2 minutes to the computation time. Instead of waiting for the tear-down, the code should only wait until the job itself is finished, by polling with the get_batch() method to check whether the job has succeeded or not.
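The polling approach described above could look roughly like the sketch below. This is illustrative, not dbt-bigquery's actual implementation: the `wait_for_batch` helper, the `TERMINAL_STATES` set, and the default intervals are hypothetical names chosen here. The terminal-state names mirror the `google.cloud.dataproc_v1.Batch.State` enum, and a real `BatchControllerClient` exposes a compatible `get_batch(name=...)` method, so such a client could be passed in directly.

```python
import time

# Batch states after which no further change is possible. The names are
# assumed to match google.cloud.dataproc_v1.Batch.State enum members.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_batch(client, batch_name, poll_interval=5.0, timeout=1800.0):
    """Poll client.get_batch() until the batch reaches a terminal state.

    `client` only needs a get_batch(name=...) method returning an object
    whose .state has a .name attribute (as dataproc_v1 enum values do),
    so a BatchControllerClient would fit. Returns the final batch object,
    or raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        batch = client.get_batch(name=batch_name)
        if batch.state.name in TERMINAL_STATES:
            return batch  # job finished; no need to wait for cluster tear-down
        time.sleep(poll_interval)
    raise TimeoutError(f"batch {batch_name} did not finish within {timeout}s")
```

The key point is that the loop returns as soon as the batch state is terminal, rather than blocking on the long-running operation that only resolves once the serverless cluster has spun down.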
Expected Behavior
The expected behavior is that dbt should continue on with the SQL models as soon as the Spark code succeeds, instead of waiting for the cluster to be torn down.
Steps To Reproduce
Relevant log output
No response
Environment
Additional Context
No response