bigquery: remove automatic `insertId` generation & allow specifying raw format #1068
We found a Contributor License Agreement for you (the sender of this pull request) and all commit authors, but as best as we can tell these commits were authored by someone else. If that's the case, please add them to this pull request and have them confirm that they're okay with these commits being contributed to Google. If we're mistaken and you did author these commits, just reply here to confirm. |
@stephenplusplus I think we have to leave the autogenerated `insertId` untouched so that old apps won't break. |
Okay Google, contribute my commits. |
Okay Google, I agree to contribute my commits. |
Google set cla yes |
We will just bump the minor version to follow semver rules for the breaking change (pre-1.0, a minor bump is equivalent to a post-1.0 major bump). |
Thanks @vladmiller! |
@stephenplusplus any estimate on when the package will be available on npm? |
We're due for a release soon. Maybe I can get one out this week. In the meantime, you can use master: $ npm install --save googlecloudplatform/gcloud-node |
@stephenplusplus Thank you! |
@stephenplusplus do you know when this PR will be merged? |
Just needs a review from @callmehiphop. This should work if you want to install from my branch for now: $ npm install --save stephenplusplus/gcloud-node#vlad--patch-1 |
@stephenplusplus Looks good to me! |
@stephenplusplus I'm confused by this one. Why can't we allow control of the insert ID via overriding without keeping the auto-generation? The request in #1066 was about being able to manually specify an insert ID, so adding the The request in #1041 is about adding multiple of the same rows in parallel (aka, same timestamp, same data), so I can see how a hash wouldn't make sense there. What we're trying to control for is:
I'd argue that this is not for de-duping your data (i.e., only inserting unique rows), and it is not for recovering after a client-side failure; it is only about server errors and auto-retries. If you want uniqueness constraints or transactions to avoid client-side problems, you should use a different storage system, or de-duplicate, save to GCS, and bulk load the data afterwards. Can we re-open the discussion about using a UUID-1 (or similar) to generate insert ID values when none are provided (values that are only used when we automatically retry a failed request)? |
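A minimal sketch of the retry-only use of insert IDs described above, not the library's implementation. It assumes Node's built-in `crypto.randomUUID()`, which produces a random version 4 UUID, swapped in here for the UUID-1 mentioned in the comment. The helper names `assignInsertIds` and `insertWithRetry`, and the `send` function standing in for the actual request, are hypothetical. Rows are assumed to already be in `{ insertId, json }` form. IDs are assigned once, before the first attempt, so every retry carries the same values:

```javascript
const { randomUUID } = require('crypto');

// Hypothetical helper: keep a caller-supplied insertId, otherwise generate one.
function assignInsertIds(rows) {
  return rows.map((row) => ({
    insertId: row.insertId || randomUUID(),
    json: row.json,
  }));
}

// Hypothetical retry wrapper. `send` is an assumed function that takes the
// prepared rows and returns a promise; it is not a gcloud-node API.
async function insertWithRetry(rows, send, maxAttempts = 3) {
  const prepared = assignInsertIds(rows); // generated once, reused on every retry
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await send(prepared);
    } catch (err) {
      lastError = err; // remember the failure and try again with the same IDs
    }
  }
  throw lastError;
}
```

Because the IDs are fixed before the loop, a request that succeeded on the server but failed on the way back can be retried without the server seeing the rows as new.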
Sure, let's move the convo back to the now re-opened #1041. |
Closes #1066
Breaking change included! `insertId` is no longer defaulted to a value.
@vladmiller - Please take a look. I just wanted to change a few things (mostly style things) and also remove the default `insertId` generation. The CLA bot will say something about confirming you are the original author of this code. If you're okay with it, just leave a note that it's okay.
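To illustrate what a raw row with a manually chosen insert ID might look like, here is a small sketch. It assumes the raw format mirrors the row shape of BigQuery's `tabledata.insertAll` request body, where each row is an object with a `json` payload and an optional `insertId`. The helper `toRawRows`, its `getInsertId` callback, and the `eventId` field are hypothetical and only for illustration. In line with the breaking change, no ID is invented when the caller does not supply one:

```javascript
// Hypothetical helper: wrap plain row objects in the { insertId, json } shape.
// `getInsertId` is an optional caller-supplied function that picks the ID.
function toRawRows(rows, getInsertId) {
  return rows.map((row) => {
    const raw = { json: row };
    const id = getInsertId ? getInsertId(row) : undefined;
    if (id !== undefined && id !== null) {
      raw.insertId = String(id); // set only when the caller provides a value
    }
    return raw;
  });
}

// Example: use an assumed `eventId` field as the insert ID when present.
const rawRows = toRawRows(
  [{ name: 'a', eventId: 1 }, { name: 'b' }],
  (row) => row.eventId
);
```

The second row has no `eventId`, so it is emitted without an `insertId` at all rather than with a generated default.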