Use INSERT w/OUTPUT Instead of INSERT+SELECT #7188

bgribaudo · 2016-12-05T15:26:15Z

Entity Framework Core's SQL Server provider (at least, as of v. 1.1.0) follows each INSERT statement with a SELECT statement that fetches database-generated values and verifies the inserted row count.

Proposal

Eliminate the SELECT statement. Instead:

Fetch database-generated values using an OUTPUT clause on the INSERT statement.
Rely on the database provider to throw an exception if the INSERT fails instead of verifying the inserted row count (it looks like this was supposed to have been implemented per Update: SQL Server: Don't Select @@ROWCOUNT after INSERT #2131).

Examples

Currently (EF Core 1.1.0)

-- Table w/identity column:
SET NOCOUNT ON;
INSERT INTO [TestTable1] ([Data])
VALUES (@p0);
SELECT [Id]
FROM [TestTable1]
WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();

-- Table w/identity and default value (`DEFAULT(GETDATE())`) columns:
SET NOCOUNT ON;
INSERT INTO [TestTable2] ([Data])
VALUES (@p0);
SELECT [Id], [HasDefaultValue]
FROM [TestTable2]
WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();

Proposed Simplification

-- Table w/identity column:
SET NOCOUNT ON;
INSERT INTO [TestTable1] ([Data])
    OUTPUT INSERTED.[Id]
VALUES (@p0);

-- Table w/identity and default value (`DEFAULT(GETDATE())`) columns:
SET NOCOUNT ON;
INSERT INTO [TestTable2] ([Data])
    OUTPUT INSERTED.[Id], INSERTED.[HasDefaultValue]
VALUES (@p0);

Example Code

Example.zip

Further technical details

EF Core version: 1.1.0
Operating system: Windows 10
Visual Studio version: VS 2015

rowanmiller · 2016-12-12T21:29:07Z

We used to take this approach, but it prevents you from using triggers (see #1441 for a complete history). We've profiled many approaches to this, and the performance difference between them all is negligible (a couple of percent either way).
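For readers who have not run into the restriction mentioned above, here is a minimal T-SQL sketch of it. The table and trigger names (`TriggerDemo`, `TR_TriggerDemo_Insert`) are made up for illustration, and `GO` is the batch separator understood by sqlcmd and SSMS rather than a T-SQL statement.

```sql
-- Hypothetical table and trigger, for illustration only.
CREATE TABLE [TriggerDemo] ([Id] INT IDENTITY PRIMARY KEY, [Data] NVARCHAR(100));
GO

CREATE TRIGGER [TR_TriggerDemo_Insert] ON [TriggerDemo]
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON; -- the body does not matter; an enabled trigger is enough
END;
GO

-- Rejected by SQL Server: the target table has an enabled trigger
-- and the OUTPUT clause has no INTO.
INSERT INTO [TriggerDemo] ([Data])
    OUTPUT INSERTED.[Id]
VALUES (N'a');
GO

-- Accepted: the INSERT+SELECT shape from the issue description
-- is not affected by the trigger.
INSERT INTO [TriggerDemo] ([Data])
VALUES (N'a');
SELECT [Id]
FROM [TriggerDemo]
WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();
```

The third batch fails at compile time, so no row is inserted by it; the last batch inserts one row and returns its identity value.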

bgribaudo · 2016-12-13T18:41:41Z

Thanks, @rowanmiller. This makes sense. I forgot about triggers and OUTPUT being mutually incompatible. :-(

Coder3333 · 2019-08-29T14:02:47Z

I am getting heavy deadlocking issues from the 2 step approach, and this is when I am only doing 2 simultaneous inserts into a table. If EF were using the output clause to get the identity, instead of a separate query, I do not think I would have the deadlock issues. Would it make sense to add an option for developers to choose how the insert is handled for folks that are not concerned about triggers?

roji · 2022-02-04T12:57:44Z

Opening to revisit.

For tables without identity, we currently do INSERT ... OUTPUT INTO @tvp; SELECT ... FROM @tvp, since OUTPUT INTO doesn't have the restriction on triggers. However, it seems that this is quite bad for perf (see #27372).
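As a rough sketch of the OUTPUT INTO shape described above (not necessarily the exact SQL EF Core generates), assuming a hypothetical table `GuidDemo` whose key is generated by a default rather than by IDENTITY:

```sql
-- Hypothetical non-IDENTITY table, for illustration only.
CREATE TABLE [GuidDemo] (
    [Id] UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
    [Data] NVARCHAR(100));
GO

-- The generated key is first captured in a table variable...
DECLARE @inserted TABLE ([Id] UNIQUEIDENTIFIER);

INSERT INTO [GuidDemo] ([Data])
    OUTPUT INSERTED.[Id] INTO @inserted
VALUES (N'a');

-- ...and then read back with a second statement.
SELECT [Id] FROM @inserted;
```

Because the OUTPUT clause has an INTO target, this form is accepted even when the table has enabled triggers; the cost is the extra table variable and the extra SELECT.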

Assuming we switch to using regular OUTPUT (without INTO) for non-IDENTITY tables, it may make sense to do the same here:

It would probably make the deadlock issues go away (though I haven't researched this)
It would simplify our implementation - we'd use the same logic regardless of IDENTITY.
It's about 4% faster:

BenchmarkDotNet=v0.13.0, OS=ubuntu 21.10
Intel Xeon W-2133 CPU 3.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK=6.0.101
[Host] : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT
DefaultJob : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT

Method           Mean      Error      StdDev     Ratio  RatioSD
NoOutput         2.085 ms  0.0413 ms  0.0863 ms  1.04   0.06
Output           1.912 ms  0.0378 ms  0.0641 ms  0.96   0.04
OutputInto       5.337 ms  0.1041 ms  0.1651 ms  2.67   0.11
AdditionalQuery  2.002 ms  0.0391 ms  0.0561 ms  1.00   0.00

Benchmark code

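using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.Data.SqlClient; // assumed package; the snippet only needs SqlConnection and SqlCommand

BenchmarkRunner.Run<IdentityBenchmark>();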

public class IdentityBenchmark
{
    const string ConnectionString = "Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false";
    private SqlConnection _connection;

    [GlobalSetup]
    public async Task Setup()
    {
        _connection = new SqlConnection(ConnectionString);
        await _connection.OpenAsync();

        await using var cmd = new SqlCommand(@"
DROP TABLE IF EXISTS [Foo];
CREATE TABLE [Foo] ([Id] INT IDENTITY PRIMARY KEY, [Bar] INT)", _connection);
        await cmd.ExecuteNonQueryAsync();
    }

    [Benchmark]
    public async Task NoOutput()
    {
        await using var cmd = new SqlCommand("INSERT INTO [Foo] ([Bar]) VALUES (8)", _connection);
        _ = await cmd.ExecuteScalarAsync();
    }

    [Benchmark]
    public async Task Output()
    {
        await using var cmd = new SqlCommand("INSERT INTO [Foo] ([Bar]) OUTPUT INSERTED.[Id] VALUES (8)", _connection);
        _ = await cmd.ExecuteScalarAsync();
    }

    [Benchmark]
    public async Task OutputInto()
    {
        await using var cmd = new SqlCommand(@"DECLARE @inserted TABLE ([Id] int);
INSERT INTO [Foo] ([Bar]) OUTPUT INSERTED.[Id] INTO @inserted VALUES (8);
SELECT [i].[Id] FROM @inserted i;", _connection);
        _ = await cmd.ExecuteScalarAsync();
    }

    [Benchmark(Baseline = true)]
    public async Task AdditionalQuery()
    {
        await using var cmd = new SqlCommand(@"INSERT INTO [Foo] ([Bar]) VALUES (8);
SELECT [Id] FROM [Foo] WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity()", _connection);
        _ = await cmd.ExecuteScalarAsync();
    }

    [GlobalCleanup]
    public ValueTask Cleanup()
        => _connection.DisposeAsync();
}

GSPP · 2022-02-07T12:22:27Z

If I remember correctly, the MERGE statement has capabilities beyond the normal DML. I fail to recall what it was, though. So that's a potential avenue.

MERGE OUTPUT can reference the source table and output values from it. That can be used to input and get back out row IDs. Not sure if this could help, but I'm mentioning it.

MERGE should collapse to an essentially identical plan to normal DML if there's only one kind of write clause.
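The point about MERGE's OUTPUT clause referencing the source can be sketched like this. The table `MergeDemo` and the `Position` column are hypothetical names chosen for illustration; `Position` stands in for a client-side ordinal that lets the caller match each generated ID back to the row it sent.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE [MergeDemo] ([Id] INT IDENTITY PRIMARY KEY, [Bar] INT);
GO

MERGE INTO [MergeDemo] AS t
USING (VALUES (0, 8), (1, 9), (2, 10)) AS s ([Position], [Bar])
ON 1 = 0                      -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT ([Bar]) VALUES (s.[Bar])
OUTPUT INSERTED.[Id], s.[Position];  -- source column returned next to the new ID
```

A plain INSERT's OUTPUT clause can only reference INSERTED columns, so `s.[Position]` would not be available there. Since the order of rows returned by OUTPUT is not guaranteed, returning the ordinal is what makes the correlation reliable.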

roji · 2022-02-09T10:35:45Z

@GSPP yeah, we already use MERGE in many situations when inserting (specifically when we batch multiple insertions). I'm investigating switching to it entirely.

ErikEJ · 2022-02-09T10:44:05Z

@roji Be careful out there: https://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/

roji · 2022-02-09T10:49:12Z

@ErikEJ I just read that post a couple days ago, when I found out that MERGE indeed isn't atomic... That's indeed a problem for using it to implement UPSERT.

However, in this context, note that EF Core already uses MERGE for insertions; I'd only be proposing changing when we use it (i.e. always, instead of only for more than 4 insertions), and our use of OUTPUT vs. OUTPUT INTO clauses. Will post a summary soon on #27372 and would very much appreciate any feedback!

GSPP · 2022-02-19T11:29:46Z

@roji I didn't realize that MERGE is so much on the radar. That's great to know!

With respect to atomicity: Merge is no different than any other SQL Server DML statement.

Merge is implemented in the following way: For each target row affected, a modification is computed. That modification is then applied, be it insert, update or delete. There really is no substantial difference to, say, an insert statement.

The updates are computed by first full-joining source and target, then by computing (using normal SQL logic) the modification type and parameters. All this is visible in the plan.

That full join and the logic are subject to normal optimizer behavior. This is why that full join can collapse to a simpler join type and then be optimized further.

I believe that not all of this machinery can "collapse". This might cause merge to remain a bit slower. But there should be no fundamental difference in behavior or performance.

> That's indeed a problem for using it to implement UPSERT.

Could you elaborate on where the problem lies? MERGE is exactly as atomic as any other DML statement (on SQL Server).

roji · 2022-02-19T19:07:26Z

@GSPP note that MERGE is already used today when inserting (and has been for a long time); this proposal (tracked mostly in #27372) simply extends using it to additional scenarios. See #27372 (comment) for what we plan to change for 7.0. I've also done some extensive benchmarking comparing different techniques for inserting rows (including several variations with MERGE). Any further input you have here would be appreciated.

> That's indeed a problem for using it to implement UPSERT.

> Could you elaborate on where the problem lies? MERGE is exactly as atomic as any other DML statement (on SQL Server).

That is not my understanding... See this comment with links on the UPSERT issue - I'd be happy to continue the conversation there.

But as far as I'm aware, the atomicity issue isn't really relevant for our usage of MERGE for pure insertion (as opposed to UPSERT).

roji · 2022-02-19T19:07:57Z

Closing this issue in favor of #27372, which tracks all the changes we intend to do for SQL Server insertion for 7.0.

rowanmiller closed this as completed Dec 12, 2016

rowanmiller added the closed-by-design label Dec 12, 2016

Metritutus mentioned this issue May 22, 2018

EF Core database insert fails when table has a SQL INSTEAD OF trigger #12064

Closed

ajcvickers mentioned this issue Jul 30, 2021

Frequent deadlocks on concurrent inserts with identity columns #25345

Closed

ajcvickers mentioned this issue Feb 4, 2022

SQL Server: Optimize SQL Server OUTPUT clause usage when retrieving database-generated values #27372

Closed

roji reopened this Feb 4, 2022

AndriySvyryd added needs-design area-perf area-save-changes type-enhancement and removed closed-by-design labels Feb 10, 2022

AndriySvyryd assigned roji Feb 15, 2022

AndriySvyryd added this to the 7.0.0 milestone Feb 15, 2022

roji closed this as completed Feb 19, 2022

roji removed the needs-design label Feb 19, 2022

roji removed this from the 7.0.0 milestone Feb 19, 2022

ajcvickers removed type-enhancement area-perf area-save-changes labels Feb 28, 2022

ajcvickers added customer-reported closed-duplicate labels Feb 28, 2022

ajcvickers reopened this Oct 16, 2022

ajcvickers closed this as not planned Oct 16, 2022
