Add support for parquetOptions in GCSToBigQueryOperator #60876
Merged
shahar1 merged 5 commits into apache:main on Jan 31, 2026
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
mlauter
commented
Jan 21, 2026
providers/google/src/airflow/providers/google/cloud/transfers/gcs_to_bigquery.py
Contributor
shahar1
approved these changes
Jan 24, 2026
Overall LGTM. One small comment regarding the newsfragment file; once you fix it, I'll wait a couple more days to let the Google team review it before merging.
As you tested the changes on a real BigQuery instance and changes are non-breaking, I feel comfortable merging it.
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.
Description
Add support for parquetOptions in GCSToBigQueryOperator src_fmt_configs.
My team has a lot of workflows that load Parquet files from GCS into BigQuery using Airflow. By default (without parquetOptions.enable_list_inference), Parquet lists are loaded into BigQuery as STRUCT<list ARRAY<STRUCT<element $TYPE>>>. This nested struct is cumbersome to query and analyze.
With the enable_list_inference flag, the same Parquet list is loaded simply as ARRAY<$TYPE>, which is much easier to work with.
This PR adds support for passing enableListInference as one of the options in src_fmt_configs when the source format is PARQUET. This works for both the external table code path and the bq load code path.
Testing
providers/google/tests/system/google/cloud/gcs/example_gcs_to_bigquery.py (haven't managed to get this running yet)
Without enableListInference, produces: [output not preserved]
With enableListInference, produces: [output not preserved]
And likewise for the external table case, which I also tested.
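To make the difference described above concrete, here is a minimal sketch in plain Python with no Airflow or BigQuery client imports. It builds the load-job configuration body that ultimately needs to carry the option, using the BigQuery REST API field names (load.parquetOptions.enableListInference), and shows the two row shapes a Parquet list column can take. The helper names, the bucket path, and the sample row are hypothetical. The sketch does not show how the operator maps src_fmt_configs onto this body, and a real load job would also need a destination table.

```python
from typing import Any


def parquet_load_config(source_uris: list[str], enable_list_inference: bool) -> dict[str, Any]:
    """Build a (partial) BigQuery load-job configuration body for Parquet sources.

    Field names follow the BigQuery REST API. This helper is hypothetical and
    is not part of the operator; a real job also needs a destinationTable.
    """
    load: dict[str, Any] = {"sourceFormat": "PARQUET", "sourceUris": source_uris}
    if enable_list_inference:
        # Ask BigQuery to load Parquet LIST columns as ARRAY<$TYPE>.
        load["parquetOptions"] = {"enableListInference": True}
    return {"load": load}


def flatten_parquet_list(value: dict[str, Any]) -> list[Any]:
    """Turn the default nested form {"list": [{"element": x}, ...]} into [x, ...]."""
    return [item["element"] for item in value["list"]]


# Hypothetical bucket and object names, for illustration only.
config = parquet_load_config(["gs://example-bucket/data/*.parquet"], enable_list_inference=True)
print(config["load"]["parquetOptions"])  # {'enableListInference': True}

# One row's list column as it appears without list inference
# (STRUCT<list ARRAY<STRUCT<element $TYPE>>>) ...
nested = {"list": [{"element": 1}, {"element": 2}, {"element": 3}]}
# ... and the flat ARRAY<$TYPE> shape that list inference yields directly.
print(flatten_parquet_list(nested))  # [1, 2, 3]
```

Without list inference, every consumer has to apply something like flatten_parquet_list (or its SQL equivalent) before the column is usable, which is the overhead the option removes.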
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Sonnet 4.5 following the guidelines
GenAI tooling was used only for code review and discussion, no lines of code in the PR were written or directly copied from Claude.
{pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.