Use EF 8 openjson for inserts #34512
What kind of usage of OPENJSON are you thinking about for inserts? Can you provide a sample? Why do you think that would be more efficient? Note that in any case, #27333 tracks introducing an efficient bulk insert API to EF, which would use each database driver's efficient bulk import mechanism. That's likely to be far more efficient than any JSON-based solution.
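For context, the driver-level bulk import mechanism referred to here is, on SQL Server, SqlBulkCopy. A minimal sketch of what that looks like outside EF; `rows`, the connection string, and the table/column names are assumptions for illustration, not part of any EF API:

```csharp
// Sketch of SQL Server's driver-level bulk import (SqlBulkCopy), the kind of
// mechanism a future EF bulk insert API would likely wrap.
// 'rows', 'connectionString' and the table/column names are assumptions.
using Microsoft.Data.SqlClient;
using System.Data;

var table = new DataTable();
table.Columns.Add("PaymentTransactionId", typeof(long));
table.Columns.Add("Checksum", typeof(byte[]));
foreach (var row in rows)
    table.Rows.Add(row.Id, row.RowVersion);

using var connection = new SqlConnection(connectionString);
await connection.OpenAsync();

using var bulkCopy = new SqlBulkCopy(connection)
{
    DestinationTableName = "PaymentTransactionSnapshotPaymentTransaction"
};
await bulkCopy.WriteToServerAsync(table);
```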
With a JSON parameter and OPENJSON in the SQL, there would be a single parameter and a single statement sent to SQL Server. Today, one INSERT is sent with a separate set of parameters for each element being inserted.
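As a rough illustration of the proposed shape (the table and column names are borrowed from the example later in this thread; this is not SQL that EF generates today), the whole batch could travel as one JSON parameter:

```csharp
// Sketch only: a single JSON parameter plus one INSERT ... SELECT ... FROM OPENJSON.
// 'rows', '_ctx' and the table/column names are assumptions from the thread.
var json = System.Text.Json.JsonSerializer.Serialize(
    rows.Select(r => new { r.PaymentTransactionId }));

var sql = @"
INSERT INTO PaymentTransactionSnapshotPaymentTransaction (PaymentTransactionId)
SELECT PaymentTransactionId
FROM OPENJSON({0}) WITH (PaymentTransactionId bigint '$.PaymentTransactionId');";

await _ctx.Database.ExecuteSqlRawAsync(sql, json);
```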
Do you have any benchmark or resource showing that to be faster?
I did a quick test. Inserting 65k entities using EF took 40-60 seconds. That code looks like

```csharp
var includedPayments = await _ctx.Set<PaymentTransaction>()
    .Where(p => p.Balanced == false
        && p.PaymentState == PaymentState.PendingApproval
        && p.PaymentDirection == PaymentDirection.Outgoing)
    .ToListAsync();

await _ctx.AddAsync(new PaymentTransactionSnapshot
{
    Snapshots = includedPayments.Select(p => new Snapshot { Checksum = p.RowVersion, PaymentTransaction = p }).ToList()
});
```

For reference, I tried to move the entire thing to an SP and did an INSERT INTO from a SELECT on the same data, which took 200 ms. I then did a quick POC along the lines of the code in the edit below.

Again with the same data, it took 5 seconds. So a lot slower than an SP, since in the SP the data is available with a query. But 5 seconds is really good, about 8 percent of the time of the EF Core insert. Also, a lot of the time the data can't be queried from within an SP and has to be provided by the domain.

edit: If I optimize the populating query it goes down to sub-second timing:

```csharp
var includedPayments = await _ctx.Set<PaymentTransaction>()
    .Where(p => p.Balanced == false
        && p.PaymentState == PaymentState.PendingApproval
        && p.PaymentDirection == PaymentDirection.Outgoing)
    .Select(p => new { p.Id, p.RowVersion })
    .ToListAsync();

var json = System.Text.Json.JsonSerializer.Serialize(
    includedPayments.Select(p => new { PaymentTransactionId = p.Id, p.RowVersion }));

var sql = @"INSERT INTO PaymentTransactionSnapshot (CreatedBy, CreatedUTC, Sent) VALUES ({0}, GETUTCDATE(), 0)

INSERT INTO PaymentTransactionSnapshotPaymentTransaction (Id, PaymentTransactionId, Checksum)
SELECT SCOPE_IDENTITY(), JSON_VALUE(value, '$.PaymentTransactionId'), CAST(JSON_VALUE(value, '$.RowVersion') as binary(8))
FROM OPENJSON({1}, '$')";

await _ctx.ExecuteRaw(sql, _ctx.Username, json);
```
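As an aside on the POC's SQL shape, OPENJSON also accepts a WITH clause that projects typed columns directly, avoiding per-column JSON_VALUE calls. A sketch under the same (assumed) table and column names:

```csharp
// Sketch only: the same projection using OPENJSON's WITH clause instead of
// per-column JSON_VALUE calls. Names follow the POC above; the varbinary
// mapping assumes the JSON value is Base64-encoded (as System.Text.Json
// emits for byte[]), which is worth verifying.
var sqlWithSchema = @"
INSERT INTO PaymentTransactionSnapshotPaymentTransaction (Id, PaymentTransactionId, Checksum)
SELECT SCOPE_IDENTITY(), j.PaymentTransactionId, j.RowVersion
FROM OPENJSON({1}) WITH (
    PaymentTransactionId bigint       '$.PaymentTransactionId',
    RowVersion           varbinary(8) '$.RowVersion'
) AS j;";
```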
@AndersMalmgren I'm quite confused by the above - you're comparing different things at different levels.

First, for the EF code which takes 40-60 seconds for 65k entities, please submit an actual minimal repro, or at least a full sample. It's not clear what exactly you're measuring, whether you're calling SaveChanges() after each AddAsync or at the end, etc. It's simply impossible to know what exactly it is that you're measuring.

Then, if you'd like to show that inserting something using OPENJSON is somehow faster than inserting without OPENJSON, you need to compare two comparable SQL insertions, and not higher-level EF Add/SaveChanges code with a lower-level raw SQL API. You also need to do this in a more serious way, by using BenchmarkDotNet (again, it's not clear what exactly you're measuring and how).

In any case, I'm not sure why it makes sense to pack all your data into JSON and then do INSERT ... SELECT FROM OPENJSON() over that data; what exactly is the advantage compared to a simpler INSERT with multiple VALUES:

INSERT INTO PaymentTransactionSnapshotPaymentTransaction (Id, PaymentTransactionId, Checksum) VALUES (v1, v2, v3), (v4, v5, v6)

This is the traditional - and also more portable - way to do what you're doing here; you'd need to at least provide a reliable benchmark between the two for this to make any sense...
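For the comparison being asked for, a minimal parameterized version of that multi-row VALUES baseline might look like the following; `rows`, the connection string, and the table/column names are assumptions taken from the thread, and this is not code EF generates:

```csharp
// Sketch of the multi-row VALUES insert suggested above, parameterized.
using Microsoft.Data.SqlClient;
using System.Text;

using var connection = new SqlConnection(connectionString);
await connection.OpenAsync();

using var command = connection.CreateCommand();
var sql = new StringBuilder(
    "INSERT INTO PaymentTransactionSnapshotPaymentTransaction (PaymentTransactionId, Checksum) VALUES ");

for (var i = 0; i < rows.Count; i++)
{
    if (i > 0) sql.Append(", ");
    sql.Append($"(@id{i}, @checksum{i})");
    command.Parameters.AddWithValue($"@id{i}", rows[i].Id);
    command.Parameters.AddWithValue($"@checksum{i}", rows[i].RowVersion);
}

command.CommandText = sql.ToString();
await command.ExecuteNonQueryAsync();
```

Note that SQL Server caps a command at roughly 2,100 parameters, so a 65k-row batch has to be split across multiple commands with this approach, which is part of why a single JSON parameter looks appealing in the first place.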
With EF 8 we can do really fast queries with large datasets, like the example sketched below. This is super helpful for many use cases.
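One example of the EF 8 behavior being referred to (a sketch; the entity, `_ctx`, and `ids` are illustrative) is a Contains filter over a large parameter collection, which the SQL Server provider translates using OPENJSON so the whole list is sent as a single JSON parameter:

```csharp
// Illustrative sketch: Contains over a parameter collection, which EF 8's
// SQL Server provider translates to IN (SELECT value FROM OPENJSON(@ids)).
var ids = new List<long> { /* potentially thousands of ids */ };

var payments = await _ctx.Set<PaymentTransaction>()
    .Where(p => ids.Contains(p.Id))
    .ToListAsync();
```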
But couldn't this also be used to pass a lot of data for inserts? EF inserts are crazy slow compared to an SP with table-valued parameters. If you could pass the insert data as JSON, it would almost defeat the need for stored procedures.
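For reference, the TVP-based stored procedure approach being compared against looks roughly like this; the procedure name, table type, and columns are hypothetical:

```csharp
// Sketch of a table-valued parameter call to a stored procedure.
// 'dbo.InsertSnapshotRows', 'dbo.SnapshotRowType', 'rows' and the columns
// are hypothetical names for illustration.
using Microsoft.Data.SqlClient;
using System.Data;

var table = new DataTable();
table.Columns.Add("PaymentTransactionId", typeof(long));
table.Columns.Add("Checksum", typeof(byte[]));
foreach (var row in rows)
    table.Rows.Add(row.Id, row.RowVersion);

using var connection = new SqlConnection(connectionString);
await connection.OpenAsync();

using var command = new SqlCommand("dbo.InsertSnapshotRows", connection)
{
    CommandType = CommandType.StoredProcedure
};
var tvp = command.Parameters.AddWithValue("@Rows", table);
tvp.SqlDbType = SqlDbType.Structured;
tvp.TypeName = "dbo.SnapshotRowType";

await command.ExecuteNonQueryAsync();
```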