-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Breaking: Rename old InsertMany, add new version with return values #589
Conversation
// Job uniqueness is not respected when using InsertMany due to unique inserts | ||
// using an internal transaction and advisory lock that might lead to | ||
// significant lock contention. Insert unique jobs using Insert instead. | ||
func (c *Client[TTx]) InsertMany(ctx context.Context, params []InsertManyParams) ([]*rivertype.JobInsertResult, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a thought that maybe instead of JobInsertResult
being a plain struct we could consider using an interface. That would allow for expandability, particularly if we override these on the Pro side to return additional stuff at some point.
Need to think more about concrete use cases for that though.
client.go
Outdated
} | ||
|
||
if param.InsertOpts != nil { | ||
// UniqueOpts aren't support for batch inserts because they use PG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some ideas on making this work for multi insert but it'd probably be better for a video call brainstorming session.
Just a couple points on this one:
river/riverdriver/riverpgxv5/internal/dbsqlc/river_job.sql Lines 226 to 251 in 7d98386
|
Yes, it's to make it so that jobs added with There are lots of use cases where people want to log or record the job ID after it's been inserted so that they can track it, monitor it, or otherwise interact with the job programmatically after insertion.
I gave this The workaround is to pass an array of strings, and then cast it to I also needed to keep the workaround for
Maybe I'm missing something, but I'm not seeing the security liability aspect of this. This code The code maintenance thing is a much bigger concern, so I'm more than willing to try the unnest approach for now to see if we run into problems. We can always introduce more complex code as an optimization later on if needed. ~Here is that branch if you want to compare` The PR has been updated to use this approach.
I did consider this, and landed on this approach for these reasons:
I'm still open to changing the approach, but felt that the benefits pretty strongly outweighed the downsides here. |
ba2bccd
to
772eba5
Compare
This new query allows many rows to be inserted (up to ~7280 as of now) while also allowing the new records to be returned. While this is not quite as fast as the `COPY FROM` option when loading many rows, it provides a better UX for the vast majority of use cases. It _does_ require that we ditch sqlc for this one query, because sqlc does not support the multirow values insert syntax due to the dynamic nature of the param placeholders.
b79871e
to
b5764d0
Compare
b5764d0
to
73264a9
Compare
This adds new implementations of `InsertMany` / `InsertManyTx` that use the multirow `VALUES` syntax to allow the new rows to be returned upon insert. The alternative `COPY FROM ` implementation has been renamed to `InsertManyFast` / `InsertManyFastTx`. The expectation is that these forms will only be needed in cases where an extremely large number of records is being inserted simultaneously, whereas the new form is more user-friendly for the vast majority of other cases.
73264a9
to
6214038
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Thanks for those changes!
|
||
defaultObject := "{}" | ||
|
||
insertJobsParams.Args[i] = valutil.ValOrDefault(string(params.EncodedArgs), defaultObject) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should maybe pull this trick into the other JobInsertMany
in here too.
@@ -118,6 +118,7 @@ type Executor interface { | |||
JobInsertFast(ctx context.Context, params *JobInsertFastParams) (*rivertype.JobRow, error) | |||
JobInsertFastMany(ctx context.Context, params []*JobInsertFastParams) (int, error) | |||
JobInsertFull(ctx context.Context, params *JobInsertFullParams) (*rivertype.JobRow, error) | |||
JobInsertManyReturning(ctx context.Context, params []*JobInsertFastParams) ([]*rivertype.JobRow, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one's still kind of a "job insert fast" variant, so that should probably still be in there.
I wonder too if we should try to match up which one gets the suffix between the client and driver a little bit more. In the client, the "fast" version gets a suffix, but in the driver here the "returning" variant gets the suffix (i.e. so they're flipped).
Maybe:
JobInsertFastMany
(called fromInsertManyFast
) ->JobInsertFastManyNoReturning
JobInsertManyReturning
(called fromInsertMany
) ->JobInsertFastMany
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, renamed them as suggested ✌️
…iverqueue#589) * add JobInsertManyReturning to drivers This new query allows many rows to be inserted (up to ~7280 as of now) while also allowing the new records to be returned. While this is not quite as fast as the `COPY FROM` option when loading many rows, it provides a better UX for the vast majority of use cases. It _does_ require that we ditch sqlc for this one query, because sqlc does not support the multirow values insert syntax due to the dynamic nature of the param placeholders. * rename InsertMany to InsertManyFast * new InsertMany/InsertManyTx with return values This adds new implementations of `InsertMany` / `InsertManyTx` that use the multirow `VALUES` syntax to allow the new rows to be returned upon insert. The alternative `COPY FROM ` implementation has been renamed to `InsertManyFast` / `InsertManyFastTx`. The expectation is that these forms will only be needed in cases where an extremely large number of records is being inserted simultaneously, whereas the new form is more user-friendly for the vast majority of other cases. * renames
Responding to some customer feedback, this PR reimplements
InsertMany
/InsertManyTx
using a new query withRETURNING
so the inserted rows are returned. The oldCOPY FROM
methods are retained for extremely high-volume use cases which do not care about return values, but are renamed toInsertManyFast
andInsertManyFastTx
. While these new methods are likely to be slower, for most use cases it won't matter and the added usability-by-default of this method is worthwhile IMO.While this was initially implemented using a multirow values syntax and associated query builder, the current approach uses
unnest
along with array parameters to enable usingRETURNING
with an insert query nearly the same as what we already had for thedatabase/sql
driver.This is a breaking change. However, my expectation is that most users were not checking the
count
value and as such will not have to change anything when they upgrade.Pinging @krhubert in case he wants to give some feedback on this
TODO: