Add a check in BigQueryInsertJobOperator to verify the Job state before marking it as success #40863

Closed
wants to merge 2 commits into from

Conversation

kandharvishnu
Contributor

@kandharvishnu kandharvishnu commented Jul 18, 2024

Closes: 40839

Occasionally, Airflow marks the BigQueryInsertJobOperator task as successful before it has actually completed in BigQuery. To resolve this issue, we will add a condition to check the job's state and wait until it is successfully completed in BigQuery.


@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Jul 18, 2024
@RNHTTR RNHTTR requested a review from potiuk July 22, 2024 14:19
Comment on lines 3025 to 3028
if job.state in ("PENDING", "RUNNING"):
    import time

    time.sleep(5)
Contributor

I don't think we need to add a sleep here. Why not do...

while job.state in ("PENDING", "RUNNING"):
    ...

Contributor

@shahar1 shahar1 Nov 23, 2024

@RNHTTR @kandharvishnu On second thought, maybe it's better to keep a time.sleep(). Otherwise, if the job stays pending or running for a long time, requests will be sent immediately one after another, which might exceed the API rate limit.
Example from another Google provider:

What do you guys think?

Collaborator

@pankajastro , fyi for #44279

Member

Yes, if we are looping, then sleep makes sense.
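
For reference, the loop-plus-sleep idea discussed in this thread could look roughly like the sketch below. It is not the code from this PR or from #44279. `wait_for_job`, `FakeJob`, `poll_interval`, and `timeout` are hypothetical names used only for illustration, and the sketch assumes the job object exposes a `state` attribute and a `reload()` method that refreshes it, as the `google-cloud-bigquery` job classes do.

```python
import time


def wait_for_job(job, poll_interval=5.0, timeout=None, sleep=time.sleep):
    """Block until job.state leaves PENDING/RUNNING, sleeping between polls.

    Hypothetical helper: `job` only needs a `state` attribute and a
    `reload()` method that refreshes it.
    """
    waited = 0.0
    while job.state in ("PENDING", "RUNNING"):
        if timeout is not None and waited >= timeout:
            raise TimeoutError(f"job still {job.state} after {waited:.0f}s")
        sleep(poll_interval)  # throttle polling to stay under API rate limits
        waited += poll_interval
        job.reload()  # refresh the state before the next check
    return job.state


class FakeJob:
    """Stand-in for a BigQuery job that steps through the given states."""

    def __init__(self, states):
        self._states = list(states)
        self.state = self._states.pop(0)
        self.reloads = 0

    def reload(self):
        self.reloads += 1
        if self._states:
            self.state = self._states.pop(0)
```

Passing `sleep` as a parameter keeps the helper testable without real delays. The optional `timeout` is an addition of this sketch, not something proposed in the thread, so that a job stuck in `RUNNING` cannot block a worker forever.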

@eladkal eladkal requested a review from shahar1 July 28, 2024 05:35

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Sep 16, 2024
@github-actions github-actions bot closed this Sep 24, 2024
@rawwar rawwar reopened this Nov 20, 2024
@kandharvishnu kandharvishnu requested a review from RNHTTR November 22, 2024 06:50
@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 26, 2024
@eladkal
Contributor

eladkal commented Dec 1, 2024

Superseded by #44279

@eladkal eladkal closed this Dec 1, 2024
Development

Successfully merging this pull request may close these issues.

Add a check in BigQueryInsertJobOperator to verify the Job state before marking it as success in Airflow
6 participants