
[ADAP-569] [Enhancement] only wait for Dataproc job to finish, not for the Dataproc serverless cluster to spin down #734

Closed
2 tasks done
wazi55 opened this issue May 22, 2023 · 5 comments · Fixed by #929
Labels
enhancement New feature or request

Comments

@wazi55
Contributor

wazi55 commented May 22, 2023

Is this a new bug in dbt-bigquery?

  • I believe this is a new bug in dbt-bigquery
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When working with dbt and Dataproc, the current behavior is that dbt waits for the batch job to complete the teardown of the cluster here, which adds an extra 1–2 minutes to the computation time. Instead of waiting on the full operation, the code should only wait until the job itself is finished, by polling with the get_batch() method to check whether the job has succeeded or not.

Expected Behavior

The expected behavior is that dbt should continue on with the SQL models as soon as the Spark code succeeds, instead of waiting for the cluster to be torn down.

Steps To Reproduce

    def _submit_dataproc_job(self) -> dataproc_v1.types.jobs.Job:
        batch = self._configure_batch()
        parent = f"projects/{self.credential.execution_project}/locations/{self.credential.dataproc_region}"

        request = dataproc_v1.CreateBatchRequest(
            parent=parent,
            batch=batch,
        )
        # make the request
        operation = self.job_client.create_batch(request=request)  # type: ignore
        # this takes quite a while, waiting on GCP response to resolve
        # (not a google-api-core issue, more likely a dataproc serverless issue)
        response = operation.result(retry=self.retry)
        return response

Relevant log output

No response

Environment

- OS:
- Python:
- dbt-core:
- dbt-bigquery:

Additional Context

No response

@wazi55 wazi55 added bug Something isn't working triage labels May 22, 2023
@github-actions github-actions bot changed the title [Bug] DBT Bigquery adaptor for spark connector waits for the cluster torn down instead of job finishes, adding a few extra minutes to the computation [ADAP-569] [Bug] DBT Bigquery adaptor for spark connector waits for the cluster torn down instead of job finishes, adding a few extra minutes to the computation May 22, 2023
@dbeatty10
Contributor

Thanks for reporting this @wazi55!

Are you interested in contributing a pull request for this, by any chance?

@dbeatty10 dbeatty10 added enhancement New feature or request awaiting_response and removed bug Something isn't working triage labels May 22, 2023
@dbeatty10 dbeatty10 changed the title [ADAP-569] [Bug] DBT Bigquery adaptor for spark connector waits for the cluster torn down instead of job finishes, adding a few extra minutes to the computation [ADAP-569] [Feature] DBT Bigquery adaptor for spark connector waits for the cluster torn down instead of job finishes, adding a few extra minutes to the computation May 22, 2023
@dbeatty10
Contributor

I re-labeled this as an enhancement since I don't perceive there to be an error, flaw, failure or fault here -- this is more of an efficiency / optimization thing.

@dataders
Contributor

dataders commented Jun 8, 2023

@wazi55 thanks so much for opening! This was on my to-do list to write up after an email thread last month with BigQuery engineers.

Their recommendation was to use BatchControllerClient's .get_batch() instead of create_batch() to create DataProc jobs (relevant Google Python SDK docs).

afaict, this is not a drop-in replacement as some polling would have to be implemented to ensure that the response's state attribute (docs) is one of: SUCCEEDED, CANCELLED, or FAILED.
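A minimal sketch of what that polling might look like. This is an illustration, not the eventual implementation: the wait_for_batch helper, the state names, and the interval/timeout defaults are all my assumptions. The get_batch callable stands in for BatchControllerClient.get_batch.

```python
import time

# Terminal states for a Dataproc serverless batch, per the Batch.State
# enum mentioned above. String names are used here so the sketch works
# with either enum members (via .name) or plain strings.
TERMINAL_STATES = {"SUCCEEDED", "CANCELLED", "FAILED"}


def wait_for_batch(get_batch, name, poll_interval=10.0, timeout=1800.0):
    """Poll get_batch(name=...) until the batch reaches a terminal state,
    instead of blocking on the long-running operation's result() (which
    also waits for the cluster to spin down)."""
    deadline = time.monotonic() + timeout
    while True:
        batch = get_batch(name=name)
        # batch.state may be an enum member (with .name) or a plain string
        state = getattr(batch.state, "name", batch.state)
        if state in TERMINAL_STATES:
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Batch {name} still {state} after {timeout}s")
        time.sleep(poll_interval)
```

The key design point is that the loop returns as soon as the job is terminal, rather than waiting for the whole CreateBatch operation to resolve.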

Perusing the docs, I also notice that there is a BatchControllerAsyncClient (docs), which might obviate the need for polling? I'm no expert here.

I'm going to ask the GCP team to validate my thinking.

@dataders dataders changed the title [ADAP-569] [Feature] DBT Bigquery adaptor for spark connector waits for the cluster torn down instead of job finishes, adding a few extra minutes to the computation [ADAP-569] [Enhancement] only wait for Dataproc job to finish, not for the Dataproc serverless cluster to spin down Jun 8, 2023
@wazi55
Contributor Author

wazi55 commented Jul 10, 2023

Hello 👋 get_batch is exactly the method I was going to suggest; basically, checking the returned Batch.state should be the way to go.

@dataders
Contributor

dataders commented Oct 9, 2023

resolved by #929
