BigQueryInsertJobOperator broken for some kinds of query jobs #24535

Closed
2 tasks done
quentin-sommer opened this issue Jun 17, 2022 · 7 comments
Labels
area:providers kind:bug This is a clearly a bug

Comments

@quentin-sommer
quentin-sommer commented Jun 17, 2022

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

I'm using a Google Cloud Composer managed version, apache-airflow-providers-google==2022.5.18+composer, but I see the issue in the code on master as well.

Apache Airflow version

2.2.5

Operating System

cloud composer

Deployment

Composer

What happened

Some of my BigQueryInsertJobOperator query jobs started failing after a Composer update that changed the google provider from
apache-airflow-providers-google==6.3.0
to
apache-airflow-providers-google==2022.5.18+composer

I'm not sure exactly which version they are using, since it is now hidden.

However, I was able to find a code path producing the bug in master.
It happens when BigQueryInsertJobOperator is used to build a query job and no destination table is given. This is allowed according to the documentation:

object (TableReference)
Optional. Describes the table where the query results should be stored. This property must be set for large results that exceed the maximum response size. For queries that produce anonymous (cached) results, this field will be populated by BigQuery.

But the code in current master, added by this PR, always expects destinationTable to be filled for a query job. The PR didn't introduce the bug, though; it was trying to fix the same bug from a previous PR. It just wrongly assumed that the field is always present.
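
To illustrate the failure mode, here is a minimal sketch using a plain dict shaped like a query job configuration with no destination table. It is not the provider's actual code: the function name `naive_destination_table` is hypothetical and only stands in for any lookup that assumes the field is present.

```python
# Hypothetical illustration only; this is NOT the provider's implementation.
def naive_destination_table(job_configuration: dict) -> dict:
    # Assumes destinationTable is always set for a query job.
    return job_configuration["query"]["destinationTable"]


# Query job configuration without a destination table (allowed by BigQuery).
config = {"query": {"query": "SELECT 1", "useLegacySql": False}}

try:
    naive_destination_table(config)
    failed = False
except KeyError as exc:
    # The unconditional lookup fails because the optional key is absent.
    failed = True
    missing_key = exc.args[0]
```

Any code path that indexes the optional field unconditionally will fail in this way for queries that only produce anonymous (cached) results.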

Essentially #23826 has not been completely fixed.

I'm willing to send a PR, but I basically discovered the concept of links by reading the code while debugging this, and I'm not sure how to disable the link for query jobs without destination tables. I guess it could also link to the temporary table, but that's not very useful as the temporary table gets cleaned up quickly.

What you think should happen instead

The BigQueryInsertJobOperator should not crash when destinationTable is not filled for a query job.

How to reproduce

from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

# Query job with no destinationTable in its configuration
BigQueryInsertJobOperator(
    task_id='id',
    configuration={
        "query": {
            "query": "SELECT 1",
            "useLegacySql": False,
        }
    }
)

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

@quentin-sommer quentin-sommer added area:providers kind:bug This is a clearly a bug labels Jun 17, 2022
@boring-cyborg
boring-cyborg bot commented Jun 17, 2022

Thanks for opening your first issue here! Be sure to follow the issue template!

@raphaelauv
Contributor

the fix is in

apache-airflow-providers-google 8.0.0

@potiuk
Member

potiuk commented Jun 19, 2022

CC: @lwyszomi (not sure @raphaelauv if it's fixed)

@raphaelauv
Contributor

@potiuk -> #24289 (comment)

(I'm only speaking about 8.0.0, not about the current main branch.)

@lwyszomi
Contributor

This problem should be fixed in my PR; please check https://github.com/apache/airflow/pull/24416/files#diff-529929b4ca60ce73b8da0f45d8a5c43c2d4e391b913fe78b39892899f812951eR2171. We are checking whether destinationTable exists in Job.configuration.query.
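
A rough sketch of what such a check can look like, written against plain dicts rather than the real job object. The helper name `destination_table_or_none` is hypothetical and this is not the code from the PR; it only shows the idea of treating destinationTable as optional.

```python
from typing import Optional


# Hypothetical helper; the actual fix in the provider may be structured differently.
def destination_table_or_none(job_configuration: dict) -> Optional[dict]:
    """Return the destinationTable reference of a query job, or None if absent."""
    query_config = job_configuration.get("query", {})
    return query_config.get("destinationTable")


without_table = {"query": {"query": "SELECT 1", "useLegacySql": False}}
with_table = {
    "query": {
        "query": "SELECT 1",
        "destinationTable": {"projectId": "p", "datasetId": "d", "tableId": "t"},
    }
}

# Only build a table link when a destination table is actually present.
link_needed_without = destination_table_or_none(without_table) is not None
link_needed_with = destination_table_or_none(with_table) is not None
```

With a lookup like this, a query job without a destination table simply skips the table link instead of failing.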

@eladkal
Contributor

eladkal commented Jun 20, 2022

As explained, the issue is fixed. Please bump the provider version to 8.0.0.

@eladkal eladkal closed this as completed Jun 20, 2022
@quentin-sommer
Author

Great, thank you!
