use insert_job to execute gcs_to_bq_operator #24578

Closed
2 tasks done
TY-chang opened this issue Jun 21, 2022 · 3 comments
Labels: duplicate, kind:feature

Comments

@TY-chang

Description

  • After upgrading from Airflow 2.2.5 to 2.3.2, I ran into a bug in BigQueryInsertJobOperator when running a LoadJob, so I tried using GCSToBigQueryOperator instead.

Use case/motivation

  • Most BigQuery-related operators recommend using insert_job(), but gcs_to_bq_operator still uses run_load().
  • GCSToBigQueryOperator still takes many separate parameters to build its job config, unlike BigQueryInsertJobOperator, which takes a single configuration that is more straightforward and easier to follow.
  • I would like to know whether we should keep GCSToBigQueryOperator and replace run_load with insert_job, or use BigQueryInsertJobOperator instead. If run_load were replaced with insert_job in GCSToBigQueryOperator, all BigQuery-related operators would be more consistent.
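
To illustrate the contrast described above, here is a minimal sketch of a load job expressed as a single `configuration` dictionary, the form that BigQueryInsertJobOperator accepts. The helper `build_load_configuration` and the bucket, project, dataset, and table names are hypothetical; the key names follow the BigQuery REST API's load job configuration. The operator call appears only in a comment so that the snippet runs without Airflow installed.

```python
from typing import Any, Dict, List


def build_load_configuration(
    source_uris: List[str],
    project_id: str,
    dataset_id: str,
    table_id: str,
    source_format: str = "CSV",
    write_disposition: str = "WRITE_EMPTY",
) -> Dict[str, Any]:
    """Build a BigQuery load-job configuration dict (hypothetical helper)."""
    return {
        "load": {
            "sourceUris": source_uris,
            "destinationTable": {
                "projectId": project_id,
                "datasetId": dataset_id,
                "tableId": table_id,
            },
            "sourceFormat": source_format,
            "writeDisposition": write_disposition,
        }
    }


# Placeholder values, assumed for illustration only.
configuration = build_load_configuration(
    source_uris=["gs://my-bucket/data/*.csv"],
    project_id="my-project",
    dataset_id="my_dataset",
    table_id="my_table",
)

# With the Google provider installed, the dict could be passed along like this:
# from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
# load_task = BigQueryInsertJobOperator(
#     task_id="load_from_gcs",
#     configuration=configuration,
# )
```

The whole job is described by one nested dictionary, which is the "clean configuration" style the request refers to, as opposed to spreading the same settings across many operator arguments.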

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@TY-chang TY-chang added the kind:feature Feature Requests label Jun 21, 2022
@boring-cyborg

boring-cyborg bot commented Jun 21, 2022

Thanks for opening your first issue here! Be sure to follow the issue template!

@TY-chang
Author

Here is the error I encountered after we upgraded to version 2.3.2:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/google/cloud/operators/bigquery.py", line 2170, in execute
    table = job.to_api_repr()["configuration"]["query"]["destinationTable"]
KeyError: 'query'
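
The traceback suggests that the operator reads the `query` section of the job's API representation, which a load job does not have. The sketch below reproduces that lookup on plain dictionaries. `destination_table` is a hypothetical helper and the sample job dictionary is made up for illustration; neither is taken from the provider's code or from its actual fix.

```python
from typing import Any, Dict, Optional


def destination_table(job_repr: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Return the destination table of a job representation, if it has one.

    Hypothetical helper: checks the job types assumed here to carry a
    "destinationTable" entry instead of assuming the job is a query.
    """
    configuration = job_repr.get("configuration", {})
    for job_type in ("query", "load", "copy"):
        section = configuration.get(job_type)
        if section and "destinationTable" in section:
            return section["destinationTable"]
    return None


# Made-up API representation of a load job (no "query" section).
load_job = {
    "configuration": {
        "load": {
            "sourceUris": ["gs://my-bucket/data/*.csv"],
            "destinationTable": {
                "projectId": "my-project",
                "datasetId": "my_dataset",
                "tableId": "my_table",
            },
        }
    }
}

# The direct lookup from the traceback fails for a load job.
try:
    load_job["configuration"]["query"]["destinationTable"]
    error_key = None
except KeyError as exc:
    error_key = exc.args[0]

# The defensive lookup finds the table under the "load" section instead.
table = destination_table(load_job)
```

Here `error_key` ends up holding the missing key from the failed lookup, while `table` holds the destination found under `load`.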

@eladkal
Contributor

eladkal commented Jun 21, 2022

duplicate: #23826
Please update your google provider.

@eladkal closed this as not planned Jun 21, 2022
@eladkal added the duplicate label Jun 21, 2022