-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL Server: When retrieving generated columns in insert, concurrency may cause us not to get the original generated value #27446
Comments
Design decision: we will move away from doing INSERT+SELECT. See #27372 for further decisions on insertion behavior. |
On SQL Server, this should not be a problem because the The switch to another strategy might still be wise, just mentioning it. |
@GSPP you're right... In addition, any other transaction wouldn't even see the newly-inserted row before its SELECT completes, since the INSERT+SELECT hasn't been committed yet (unless that other transaction is using READ UNCOMMITTED). In fact, I could arrive at a deadlock by doing INSERT+SELECT in one transaction and UPDATE in another - even in Read Commit Snapshot Isolation mode, which is somewhat odd. So I agree that there isn't a correctness issue here, and if we want to keep perf optimized for all scenarios (including trigger-enabled ones), we may want to keep this technique. @GSPP any other reason you see to switch away from this? Note: in the context of avoiding transactions around single updates (#27439), this is a good example of a single modification command which is executed via 2 SQL commands, which do need to be wrapped in a transaction. /cc @lauxjpn |
How do you see that deadlock happening? I can't see it. Using a single DML statement that outputs the data that the client needs feels right to me. It can't get better than this. The only downside is that To me, these idiosyncrasies in SQL Server with the various strategies seem a bit unnecessary. They seem like unforced design errors in SQL Server. We have to make do with what we've got. |
I didn't give enough details above; I could reproduce a deadlock when the INSERT+SELECT transaction waits between the two commands on an unrelated UPDATE. So INSERT acquires the lock as you wrote above, then a separate UPDATE transaction blocks on that lock, as expected. The only weirdness IMHO is that this also happens in RSCI mode, where the separate UPDATE transaction isn't even supposed to see the inserted row, and so not block on it (this is the PG behavior). But I'm not SQL Server expert, so any insights would be appreciated. Here's a code sample: Deadlock INSERT+UPDATEawait using var conn1 = new SqlConnection("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false");
await conn1.OpenAsync();
await using var conn2 = new SqlConnection("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false");
await conn2.OpenAsync();
await using var cmd1 = new SqlCommand();
cmd1.Connection = conn1;
cmd1.CommandText = @"
ALTER DATABASE test SET READ_COMMITTED_SNAPSHOT ON;
DROP TABLE IF EXISTS Foo;
CREATE TABLE Foo (Id INT IDENTITY PRIMARY KEY, Bar INT)";
await cmd1.ExecuteNonQueryAsync();
await using var tx = (SqlTransaction)await conn1.BeginTransactionAsync();
cmd1.Transaction = tx;
cmd1.CommandText = @"INSERT INTO Foo (Bar) VALUES (8)";
await cmd1.ExecuteNonQueryAsync();
await using var cmd2 = new SqlCommand("UPDATE Foo SET Bar = 1000", conn2);
await cmd2.ExecuteNonQueryAsync();
cmd1.CommandText = @"SELECT [Bar] FROM [Foo] WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity()";
Console.WriteLine(await cmd1.ExecuteScalarAsync());
I certainly agree, and what you write is true - except if there happen to be triggers on the table. This is because you can't use |
Definitely. FWIW none of this kind of complexity is present in PG, though I admit I haven't done an exhaustive perf analysis there yet. |
There are two snapshot related isolation levels for SQL Server:
Both modes come at the cost of a bit of SI provides real lock-free concurrent operations for reads and writes, but comes with the risk of failing statements that modify the same row. For more information, see:
With the write lock in mind for RCSI, the code sample would need to be change to something like the following to work as expected: Program.cs: RCSIusing System.Diagnostics;
using Microsoft.Data.SqlClient;
namespace MssqlSnapshotIsolation01;
internal static class Program
{
private static async Task Main()
{
await using (var conn0 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False"))
{
await conn0.OpenAsync();
await using var cmd0 = new SqlCommand();
cmd0.Connection = conn0;
cmd0.CommandText = @"
DROP DATABASE IF EXISTS [EFCoreIssue27446_01];
CREATE DATABASE [EFCoreIssue27446_01];
ALTER DATABASE [EFCoreIssue27446_01] SET READ_COMMITTED_SNAPSHOT ON;";
await cmd0.ExecuteNonQueryAsync();
}
await using (var conn0 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_01"))
{
await conn0.OpenAsync();
await using var cmd0 = new SqlCommand();
cmd0.Connection = conn0;
cmd0.CommandText = @"
CREATE TABLE [Foo] (Id INT IDENTITY PRIMARY KEY, Bar INT)";
await cmd0.ExecuteNonQueryAsync();
}
await using var conn1 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_01");
await conn1.OpenAsync();
await using var conn2 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_01");
await conn2.OpenAsync();
await using var tx1 = (SqlTransaction)await conn1.BeginTransactionAsync();
await using var cmd1 = new SqlCommand();
cmd1.Connection = conn1;
cmd1.Transaction = tx1;
cmd1.CommandText = @"INSERT INTO Foo (Bar) VALUES (8)";
await cmd1.ExecuteNonQueryAsync();
await using var cmd2 = new SqlCommand("UPDATE Foo SET Bar = 1000", conn2);
var startTime = DateTime.UtcNow;
var task2 = cmd2.ExecuteNonQueryAsync();
await Task.Delay(TimeSpan.FromSeconds(8));
cmd1.CommandText = @"SELECT [Bar] FROM [Foo] WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity()";
var selectResult1 = (int?)await cmd1.ExecuteScalarAsync();
await tx1.CommitAsync();
var changedCount2 = await task2;
var cmd2Duration = DateTime.UtcNow - startTime;
Trace.Assert(cmd2Duration.TotalSeconds > 6); // cmd2 had to wait for cmd1 to release its write lock
Trace.Assert(selectResult1 == 8); // this is 8, because cmd2 could only execute after cmd1 released its write lock
Trace.Assert(changedCount2 == 1); // this is 1, not 0, because cmd2 could only execute after cmd1 released its write lock
}
} Using SI instead, the following simple code (that is very close to the initial sample code) will just work and will not contain any unexpected side-effects ( Program.cs: SIusing System.Data;
using System.Diagnostics;
using Microsoft.Data.SqlClient;
namespace MssqlSnapshotIsolation02;
internal static class Program
{
private static async Task Main()
{
await using (var conn0 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False"))
{
await conn0.OpenAsync();
await using var cmd0 = new SqlCommand();
cmd0.Connection = conn0;
cmd0.CommandText = @"
DROP DATABASE IF EXISTS [EFCoreIssue27446_02];
CREATE DATABASE [EFCoreIssue27446_02];
ALTER DATABASE [EFCoreIssue27446_02] SET ALLOW_SNAPSHOT_ISOLATION ON;";
await cmd0.ExecuteNonQueryAsync();
}
await using (var conn0 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_02"))
{
await conn0.OpenAsync();
await using var cmd0 = new SqlCommand();
cmd0.Connection = conn0;
cmd0.CommandText = @"
CREATE TABLE [Foo] (Id INT IDENTITY PRIMARY KEY, Bar INT)";
await cmd0.ExecuteNonQueryAsync();
}
await using var conn1 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_02");
await conn1.OpenAsync();
await using var tx1 = (SqlTransaction)await conn1.BeginTransactionAsync(IsolationLevel.Snapshot);
await using var conn2 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_02");
await conn2.OpenAsync();
await using var tx2 = (SqlTransaction)await conn2.BeginTransactionAsync(IsolationLevel.Snapshot);
await using var cmd1 = new SqlCommand();
cmd1.Connection = conn1;
cmd1.Transaction = tx1;
cmd1.CommandText = @"INSERT INTO Foo (Bar) VALUES (8)";
await cmd1.ExecuteNonQueryAsync();
await using var cmd2 = new SqlCommand("UPDATE Foo SET Bar = 1000");
cmd2.Connection = conn2;
cmd2.Transaction = tx2;
var changedCount2 = await cmd2.ExecuteNonQueryAsync();
cmd1.CommandText = @"SELECT [Bar] FROM [Foo] WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity()";
var selectResult1 = (int?)await cmd1.ExecuteScalarAsync();
await tx1.CommitAsync();
await tx2.CommitAsync();
Trace.Assert(selectResult1 == 8);
Trace.Assert(changedCount2 == 0);
}
} Finally, using no transaction to group the Program.cs: RCSI (but only statement wide implicit transactions)using System.Diagnostics;
using Microsoft.Data.SqlClient;
namespace MssqlSnapshotIsolation03;
internal static class Program
{
private static async Task Main()
{
await using (var conn0 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False"))
{
await conn0.OpenAsync();
await using var cmd0 = new SqlCommand();
cmd0.Connection = conn0;
cmd0.CommandText = @"
DROP DATABASE IF EXISTS [EFCoreIssue27446_03];
CREATE DATABASE [EFCoreIssue27446_03];
ALTER DATABASE [EFCoreIssue27446_03] SET READ_COMMITTED_SNAPSHOT ON;";
await cmd0.ExecuteNonQueryAsync();
}
await using (var conn0 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_03"))
{
await conn0.OpenAsync();
await using var cmd0 = new SqlCommand();
cmd0.Connection = conn0;
cmd0.CommandText = @"
CREATE TABLE [Foo] (Id INT IDENTITY PRIMARY KEY, Bar INT)";
await cmd0.ExecuteNonQueryAsync();
}
await using var conn1 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_03");
await conn1.OpenAsync();
await using var conn2 = new SqlConnection(@"Server=.\MSSQL14;Integrated Security=True;Encrypt=False;Database=EFCoreIssue27446_03");
await conn2.OpenAsync();
await using var cmd1 = new SqlCommand();
cmd1.Connection = conn1;
cmd1.CommandText = @"
INSERT INTO Foo (Bar) VALUES (8);
WAITFOR DELAY '00:00:08';
SELECT [Bar] FROM [Foo] WHERE [Id] = scope_identity();";
var task1 = cmd1.ExecuteScalarAsync();
await Task.Delay(TimeSpan.FromSeconds(3)); // ensure cmd1 has been sent
await using var cmd2 = new SqlCommand();
cmd2.CommandText = @"UPDATE Foo SET Bar = 1000;";
cmd2.Connection = conn2;
var startTime = DateTime.UtcNow;
var changedCount2 = await cmd2.ExecuteNonQueryAsync();
var cmd2Duration = DateTime.UtcNow - startTime;
var selectResult1 = (int?)await task1;
Trace.Assert(cmd2Duration.TotalSeconds < 6); // cmd2 was able to run between INSERT and SELECT of cmd1
Trace.Assert(selectResult1 == 1000); // this is 1000, not 8, because cmd2 was able to run between INSERT and SELECT of cmd1
Trace.Assert(changedCount2 == 1); // this is 1, not 0, because cmd2 was able to run between INSERT and SELECT of cmd1
}
}
Between an |
@lauxjpn thanks for looking into this! Regarding locking and RCSI:
I think I'm probably missing something... Compared to my code sample above, your code gives UPDATE 8 seconds of waiting before continuing to do the SELECT and commit the transaction. Unless I'm mistaken, that means that the UPDATE is still blocking, waiting for the X-Lock from the INSERT transaction to be released. After the 8 seconds wait, your code commits the INSERT transaction and so the UPDATE is released. In other words, my point above was only to point out that UPDATE does see the row and lock it, although the transaction creating it hasn't committed yet - and all that with RCSI turned on, and with READ COMMITTED (not READ UNCOMMITTED). That's the weird part for me, and on PG it does not occur. But this has no actual bearing on this discussion on whether INSERT+SELECT is safe: the only important point is that as long as INSERT+SELECT are wrapped in a transaction, an external update cannot interfere with the values the SELECT sees (on SQL Server). |
Closing; as discussed above, there's no safety issue with INSERT+SELECT as long as they're wrapped in a transaction (thanks @GSPP). We'll keep this behavior for trigger mode. |
When inserting a new entity instance which has both identity and other generated values, we current use the INSERT+SELECT pattern:
Although the two statements are wrapped in a transaction, we use the default isolation level, which is read committed; this means that if an update happens between the INSERT and the SELECT, the SELECT returns the updated value for Foo, and not the originally-generated value.
I'm not sure this is an actual problem, but it does seem like we should ideally propagate back the original generated values, and not be sensitive to a race condition such as this. Switching to a single-command method (
SELECT ... OUTPUT
) would remove this problem, but would degrade performance when triggers are present (SELECT ... OUTPUT INTO
) - see #27372 for more info.The text was updated successfully, but these errors were encountered: