Fix MaxID logic for GCSToBigQueryOperator #26768
Conversation
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
With the caveat that I have absolutely no idea whether the old or new logic is correct, lacking knowledge on the API.
@uranusjr If you have access to GCP/BigQuery, this was the quick/simple script I used to test out all the logic first: https://cloud.google.com/bigquery/docs/samples/bigquery-query-results-dataframe (I just removed the DataFrame part and played around with the iterator/row access). A rough sketch of that kind of check is below.
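A rough sketch of that kind of throwaway script, assuming the `google-cloud-bigquery` client with default credentials; the query and table name are placeholders, not taken from the linked sample:

```python
# Illustrative check of how query results come back from BigQuery.
# The project/dataset/column names here are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

query = """
    SELECT name
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    LIMIT 5
"""
rows = client.query(query).result()  # a RowIterator, not a plain list

for row in rows:
    # Each item is a Row; row[0] (or row["name"]) yields the bare column value.
    print(row[0], row["name"])
```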
Cool. You can also apply the suggestion from @bhirsz.
I am going to merge that one after we release the other providers (and prepare rc2 of the google provider).
related: #26283
closes: #26767
The `max_id_key` parameter, when used, causes an XCom serialization failure when trying to retrieve the value back out of XCom. This is because instead of storing a single column value in XCom, we were accidentally storing the entire Row.

The unit test was updated to reflect the return type of `get_job().result()`. This operation actually returns a Row iterator, but returning an array of `Row` works well for the test.
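A minimal sketch of the single-column extraction described above (illustrative names, not the operator's actual internals): take the first Row from the iterator and return only the keyed column, so the value stored in XCom is a plain, serializable scalar. The test-style usage at the end shows why a plain list of `Row` objects is an adequate stand-in for the real iterator.

```python
from google.cloud.bigquery.table import Row


def extract_max_id(rows):
    """Return the single MAX(id) column value from a Row iterator (or list of Rows)."""
    first_row = next(iter(rows), None)
    # A Row is not XCom-serializable; indexing [0] yields the bare column value.
    return first_row[0] if first_row is not None else None


# In a test, a list of Row objects stands in for the real RowIterator:
fake_result = [Row((123,), {"max_id": 0})]
assert extract_max_id(fake_result) == 123
```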