-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use INSERT w/OUTPUT Instead of INSERT+SELECT #7188
Comments
We used to take this approach, but it prevents you from using triggers (see #1441 for a complete history). We've profiled many approaches to this, and the performance difference between them all is negligible (a couple of percent either way). |
Thanks, @rowanmiller . This makes sense. I forgot about triggers and OUTPUT being mutually incompatible. :-( |
I am getting heavy deadlocking issues from the 2 step approach, and this is when I am only doing 2 simultaneous inserts into a table. If EF were using the output clause to get the identity, instead of a separate query, I do not think I would have the deadlock issues. Would it make sense to add an option for developers to choose how the insert is handled for folks that are not concerned about triggers? |
Opening to revisit. For tables without identity, we currently do Assuming we switch to using regular
BenchmarkDotNet=v0.13.0, OS=ubuntu 21.10
Benchmark codeBenchmarkRunner.Run<IdentityBenchmark>();
public class IdentityBenchmark
{
const string ConnectionString = "Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false";
private SqlConnection _connection;
[GlobalSetup]
public async Task Setup()
{
_connection = new SqlConnection(ConnectionString);
await _connection.OpenAsync();
await using var cmd = new SqlCommand(@"
DROP TABLE IF EXISTS [Foo];
CREATE TABLE [Foo] ([Id] INT IDENTITY PRIMARY KEY, [Bar] INT)", _connection);
await cmd.ExecuteNonQueryAsync();
}
[Benchmark]
public async Task NoOutput()
{
await using var cmd = new SqlCommand("INSERT INTO [Foo] ([Bar]) VALUES (8)", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[Benchmark]
public async Task Output()
{
await using var cmd = new SqlCommand("INSERT INTO [Foo] ([Bar]) OUTPUT INSERTED.[Id] VALUES (8)", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[Benchmark]
public async Task OutputInto()
{
await using var cmd = new SqlCommand(@"DECLARE @inserted TABLE ([Id] int);
INSERT INTO [Foo] ([Bar]) OUTPUT INSERTED.[Id] INTO @inserted VALUES (8);
SELECT [i].[Id] FROM @inserted i;", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[Benchmark(Baseline = true)]
public async Task AdditionalQuery()
{
await using var cmd = new SqlCommand(@"INSERT INTO [Foo] ([Bar]) VALUES (8);
SELECT [Id] FROM [Foo] WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity()", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[GlobalCleanup]
public ValueTask Cleanup()
=> _connection.DisposeAsync();
} |
If I remember correctly, the
|
@GSPP yeah, we already use MERGE in many situations when inserting (specifically when we batch multiple insertions). I'm investigating switching to it entirely. |
@ErikEJ I just read that post a couple days ago, when I found out that MERGE indeed isn't atomic... That's indeed a problem for using it to implement UPSERT. However, in this context, note that EF Core already uses MERGE for insertions; I'd only be proposing changing when we use it (i.e. always instead of over 4), and our use of OUTPUT vs. OUTPUT INTO clauses. Will post a summary soon on #27372 and would very much appreciate any feedback! |
@roji I didn't realize that With respect to atomicity: Merge is no different than any other SQL Server DML statement. Merge is implemented in the following way: For each target row affected, a modification is computed. That modification is then applied, be it insert, update or delete. There really is no substantial difference to, say, an insert statement. The updates are computed by first full-joining source and target, then by computing (using normal SQL logic) the modification type and parameters. All this is visible in the plan. That full join and the logic are subject to normal optimizer behavior. This is why that full join can collapse to a simpler join type and then be optimized further. I believe that not all of this machinery can "collapse". This might cause merge to remain a bit slower. But there should be no fundamental difference in behavior or performance.
Could you elaborate on where the problem lies? |
@GSPP note that MERGE is already used today when inserting (and has been for a long time); this proposal (tracked mostly in #27372) simply extends using it in additonal scenarios. See #27372 (comment) for what we plan to change for 7.0. I've also done some extensive benchmarking comparing different techniques for inserting rows (including several variations with merge). Any further input you have here would be appreciated.
That is not my understanding... See this comment with links on the UPSERT issue - I'd be happy to continue the conversation there. But as far as I'm aware, the atomicity issue isn't really relevant for our usage of MERGE for pure insertion (as opposed to UPSERT). |
Closing this issue in favor of #27372, which tracks all the changes we intend to do for SQL Server insertion for 7.0. |
Entity Framework Core's SQL Server provider (at least, as of v. 1.1.0) follows each INSERT statement with a SELECT statement that fetches database-generated values and verifies the inserted row count.
Proposal
Eliminate the SELECT statement. Instead:
Examples
Currently (EF Core 1.1.0)
Proposed Simplification
Example Code
Example.zip
Further technical details
EF Core version: 1.1.0
Operating system: Windows 10
Visual Studio version: VS 2015
The text was updated successfully, but these errors were encountered: