
Possibility of losing commits when polling in high concurrency scenarios with SQL Server #21

Open
fschmied opened this issue Apr 17, 2016 · 10 comments


@fschmied

As explained in this dba.stackexchange.com answer, the IDENTITY-based checkpointing mechanism used by the SqlPersistenceEngine with MsSqlDialect is not safe: commits could be skipped when using GetFrom or CommitPollingClient under high load. The probability of running into this scenario is probably very low, but it does exist.
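
To illustrate, the polling pattern in question boils down to roughly the following (a simplified plain ADO.NET sketch, not the actual SqlPersistenceEngine code; the connection string and structure are purely illustrative, only the Commits/CheckpointNumber names follow the default schema):

using System.Data.SqlClient;

// Simplified sketch of the polling pattern. 16 stands in for the last checkpoint the
// subscriber has already processed and persisted somewhere.
long lastCheckpoint = 16;

using (var connection = new SqlConnection("Server=.;Database=EventStore;Integrated Security=true"))
{
    connection.Open();

    // CheckpointNumber is an IDENTITY column: its value is assigned at INSERT time, but the row
    // only becomes visible at COMMIT time - and not necessarily in IDENTITY order. If the
    // transaction holding 18 commits before the one holding 17, this query can already return 18;
    // the poller then advances its checkpoint to 18 and the later-committed 17 is never read.
    using (var command = new SqlCommand(
        "SELECT CheckpointNumber /*, other commit columns */ FROM Commits " +
        "WHERE CheckpointNumber > @LastCheckpoint ORDER BY CheckpointNumber",
        connection))
    {
        command.Parameters.AddWithValue("@LastCheckpoint", lastCheckpoint);
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                lastCheckpoint = reader.GetInt64(0); // the checkpoint only ever moves forward
            }
        }
    }
}

Because the checkpoint only ever moves forward, a row that becomes visible later with a lower CheckpointNumber is lost to the subscriber.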

See also NEventStore/NEventStore#425 for a discussion of possible solutions.

@mgevans

mgevans commented Jul 27, 2016

I've also seen this issue on PostgreSQL.

@fschmied

fschmied commented Aug 4, 2016

@mgevans In SQL Server, the scenario is probably very unlikely to occur, as the window of opportunity is very, very small if you use the right isolation level (lock-based READ COMMITTED without the READ COMMITTED SNAPSHOT option).
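
For reference, lock-based READ COMMITTED can be forced on SQL Server either per query or database-wide. A quick sketch, written as the kind of SQL-string constants MsSqlDialect uses (the statements themselves are standard T-SQL; the constant names are just illustrative):

// Per-query table hint - leaves the rest of the database on READ COMMITTED SNAPSHOT and only
// makes the polling query take shared row locks again:
public const string GetFromWithLockHint =
    "SELECT * FROM Commits WITH (READCOMMITTEDLOCK) " +
    "WHERE CheckpointNumber > @FromCheckpointNumber ORDER BY CheckpointNumber;";

// Database-wide switch (affects all readers and needs exclusive access for a moment):
public const string DisableReadCommittedSnapshot =
    "ALTER DATABASE CURRENT SET READ_COMMITTED_SNAPSHOT OFF WITH ROLLBACK IMMEDIATE;";

// Check which mode is currently active:
public const string CheckReadCommittedSnapshot =
    "SELECT is_read_committed_snapshot_on FROM sys.databases WHERE name = DB_NAME();";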

I don't really know PostgreSQL, but if there is a similar way of using lock-based READ COMMITTED vs. snapshot-based READ COMMITTED, that might be a way of avoiding it (or making it less likely) on PostgreSQL as well.

@fschmied

fschmied commented Dec 1, 2016 via email

@PeterStephenson

Hi, we actually realised that READ COMMITTED SNAPSHOT was on shortly after I posted that, and I removed the post as it was misleading. Sorry!

@fschmied

fschmied commented Aug 9, 2019

See #14 (comment) about READ COMMITTED SNAPSHOT and Azure SQL.

@cecilyth

cecilyth commented Dec 4, 2024

We are experiencing this in our production environment. Are there any suggestions for fixing this? We've implemented the READCOMMITTEDLOCK change (ea686a1#diff-9ec321712e759f586acc6e7a5b53661f1f7cadf2e6b6faade5e6e7db28f98f30) but still occasionally miss commits in our pollers.

For example, here we pulled out checkpoints 3227030099 and 3227030104, but not 3227030103, which is in the same bucket.


Update: We ended up implementing something similar to the suggestion in this comment. So far it's working well.

@fschmied

@cecilyth Would it be possible for you to share the code of your solution? I think that if READCOMMITTEDLOCK is in place, adding NOLOCK wouldn't actually make much difference: the problem is not a lock being skipped, but an IDENTITY increment AND another full transaction happening before the lock in question is taken. (See the dba.stackexchange.com discussion linked above.) However, simply repeating the query might reduce the likelihood of the problem occurring even further (with NOLOCK or without).

Also, I'd be interested: Are you 100% sure the READCOMMITTEDLOCK hint was included in the relevant GetFrom query? I'm asking because we're also using NEventStore in production with quite a lot of load, but have never seen skipped commits. I was under the impression that with the locks in place, the chance of skipping was very, very low. If you're seeing it occasionally despite locking, that might be wrong and I'd like to know 😅.

@cecilyth

cecilyth commented Dec 16, 2024

Here's the narrative of how our journey with this bug went. We're a summer camp registration company, so we have very heavy spurts of load during a registration event, and the Azure SQL database for our eventstore is just over 3 TB, with almost 2 billion rows in the Commits table.

We originally released our eventstore upgrade (going from version 3 of the original Eventstore repo to the latest version of NEventStore) in late August. We fairly quickly saw the pollers skipping commits, implemented the READCOMMITTEDLOCK hint, and patched that out to our production environment in the first week of September. The rate at which we were "skipping" went down drastically after that release, to the point where it happened maybe once or twice a month and we had to dig pretty deep to find any instances at all. I can say that the READCOMMITTEDLOCK hint was definitely added to the query, but I can't say with certainty that it was included in the query plan. It did however seemingly fix our skip issue.

Unfortunately, shortly thereafter we realized we had an issue with the primary key on the Commits table. Our primary key included the CheckpointNumber, meaning we never got a primary key violation (because the IDENTITY value is always unique) and thus never got any ConflictingCommandExceptions. This meant we had streams with multiple commits sharing the same CommitSequence, and some of those commits conflicted with one another since they were never retried or detected as conflicting.

In late November we changed the primary key on the Commits table, dropping CheckpointNumber from the key. Unexpectedly, when we did this we started seeing many more skipped commits, anywhere between 10 and 100 per day, and it was drastically impacting our system stability and data proliferation during periods of high load. Another thing we noticed was that querying commits in the poller got about 10 times faster after the PK change.

Sequencing the commits on the client side wasn't really an option for us: we're a multi-tenant system and each commit Bucket corresponds to a different client, so holes in the CheckpointNumber sequence are expected when polling commits per bucket, and we'd have no way of knowing whether a hole was just a checkpoint belonging to a different bucket or a commit that had truly been skipped.

I implemented new functionality in SqlPersistenceEngine as well as MsSqlDialect to try the Count(*) with NOLOCK suggestion that I linked in my previous comment. In our load tests, the NOLOCK hint seemed to do nothing - we were consistently counting the same number of commits in the NOLOCK query as we were pulling out with the READCOMMITTEDLOCK hint on the GetFrom query. I then refactored a bit to open a transaction with the READ UNCOMMITTED isolation level and do the count inside it. Load testing revealed that this did frequently result in a different count (e.g. an uncommitted count of 10 but only 9 commits pulled on the first query), and re-querying commits until the two counts aligned prevented us from seeing any symptoms of skipped commits in our load tests. It's possible that just doing the second query was enough of a delay, but I never got a difference in count with the first implementation (NOLOCK hint), whereas after refactoring to set the transaction isolation level explicitly it was fairly easy to see a difference at about the rate we were expecting.

We deployed to production last week and haven't seen any skipped commits since. Finally, we seem to be in a stable place with the primary key change, read committed lock for the GetFrom, and uncommitted count. Our code for the READCOMMITTEDLOCK change is exactly what's in the changeset that I linked above. Unfortunately, I'm not able to share our exact implementation of the uncommitted count, but it's basically this:

in MsSqlDialect:

public const string GetCountFromToBucketAndCheckpoint =
	@"SELECT Count(*)
	    FROM Commits
	   WHERE BucketId = @BucketId
	     AND CheckpointNumber > @FromCheckpointNumber
	     AND CheckpointNumber <= @ToCheckpointNumber;";

// Opens an explicit READ UNCOMMITTED transaction so that the count query also sees rows whose
// IDENTITY value has already been assigned but whose transaction has not committed yet.
public IDbTransaction OpenReadUncommittedTransaction(IDbConnection connection)
{
	return connection.BeginTransaction(System.Data.IsolationLevel.ReadUncommitted);
}

in SqlPersistenceEngine:

// Dirty count of the commits in the range (fromCheckpoint, toCheckpoint]; a result larger than
// the number of rows actually pulled indicates an in-flight transaction in that range.
public int GetCountUncommitted(string bucketId, long fromCheckpoint, long toCheckpoint)
{
	return ExecuteReadUncommittedCommand((IDbStatement cmd) =>
	{
		string getCountFromBucketAndCheckpointUncommitted = _dialect.GetCountFromToBucketAndCheckpoint;
		cmd.AddParameter(_dialect.BucketId, bucketId, DbType.AnsiString);
		cmd.AddParameter(_dialect.FromCheckpointNumber, fromCheckpoint);
		cmd.AddParameter(_dialect.ToCheckpointNumber, toCheckpoint);
		return cmd.ExecuteScalar(getCountFromBucketAndCheckpointUncommitted).ToInt();
	});
}

private T ExecuteReadUncommittedCommand<T>(Func<IDbStatement, T> command)
{
	return ExecuteReadUncommittedCommand((IDbConnection _, IDbStatement statement) => command(statement));
}

protected virtual T ExecuteReadUncommittedCommand<T>(Func<IDbConnection, IDbStatement, T> command)
{
	using (TransactionScope transactionScope = OpenCommandScope())
	using (IDbConnection dbConnection = _connectionFactory.Open())
	using (IDbTransaction dbTransaction = _dialect.OpenReadUncommittedTransaction(dbConnection))
	using (IDbStatement arg = _dialect.BuildStatement(transactionScope, dbConnection, dbTransaction))
	{
		try
		{
			T val = command(dbConnection, arg);
			dbTransaction?.Commit();
			transactionScope?.Complete();
			return val;
		}
		catch (Exception ex)
		{
			if (!RecoverableException(ex))
			{
				throw new StorageException(ex.Message, ex);
			}

			transactionScope?.Complete();
			throw;
		}
	}

	bool RecoverableException(Exception e)
	{
		if (!(e is UniqueKeyViolationException))
		{
			return e is StorageUnavailableException;
		}

		return true;
	}
}
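
And roughly how the polling side uses it - again not our exact code, just the shape of it. It assumes the Int64-checkpoint GetFrom(bucketId, checkpoint) overload and ICommit.CheckpointToken of current NEventStore; the retry limit and delay are illustrative:

// Sketch of the consuming side: pull commits as usual, then compare against the dirty count
// and re-query while an in-flight row is detected in the pulled range.
protected ICommit[] GetFromWithUncommittedCheck(string bucketId, long fromCheckpoint)
{
	const int retryLimit = 5; // illustrative; in practice the first retry almost always suffices

	for (int attempt = 0; ; attempt++)
	{
		// Regular GetFrom query (with the READCOMMITTEDLOCK hint) above the last checkpoint.
		ICommit[] commits = GetFrom(bucketId, fromCheckpoint).ToArray();
		if (commits.Length == 0)
		{
			return commits;
		}

		long toCheckpoint = commits[commits.Length - 1].CheckpointToken;

		// Dirty count over the same range: sees rows whose IDENTITY value has already been
		// assigned but whose transaction hasn't committed yet.
		int uncommittedCount = GetCountUncommitted(bucketId, fromCheckpoint, toCheckpoint);

		if (uncommittedCount <= commits.Length || attempt >= retryLimit)
		{
			return commits; // counts agree (or we give up waiting) - safe to dispatch
		}

		System.Threading.Thread.Sleep(50); // give the in-flight transaction a moment, then re-query
	}
}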

@fschmied

Wow, thanks a lot for the detailed description! ❤️

So, you were still rarely seeing skipped commits with your very high load even after introducing the READCOMMITTEDLOCK hint in September, albeit much, much fewer than without the hint (which would be very much expected) - that's very valuable info to me.

It also makes some sense to me that the skip likelihood increased a lot after removing the CheckpointNumber from the Primary Key and thus, I assume, the CLUSTERED index, as the GetFrom query for pulling commits is acting on that very same index (if you're pulling by checkpoint number, which would be the default approach). So, the whole synchronization between commit pulling and creating new commits usually takes place on that index.

(BTW, there's usually another UNIQUE index called IX_Commits_CommitSequence that I think is responsible for avoiding duplicate CommitSequence values and for triggering the ConflictingCommandException. If that didn't work, that index must have been deleted or gotten corrupted somehow? Maybe restoring that would be/have been an alternative solution to your issue.)
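
For reference, I'd expect that index to be created roughly as follows by the schema initialization - the exact column list is an assumption on my part (shown here as a SQL-string constant like the dialect uses), so please check it against your actual database:

public const string AssumedCommitSequenceIndex =
	@"CREATE UNIQUE NONCLUSTERED INDEX IX_Commits_CommitSequence
	      ON dbo.Commits (BucketId, StreamId, CommitSequence);"; // one CommitSequence per stream per bucket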

Thanks also for the details about your workaround! The bad news is I think it still doesn't solve the original issue in theory, but in practice it's probably enough to get a stable solution.

In case you're still interested: the theoretical issue (as I understand it) is the following sequence of events, caused by a race involving (at least) three parallel threads. Threads A and B are writing commits, Thread C is pulling commits with a query like WHERE CheckpointNumber > 16 and with READCOMMITTEDLOCK.

  1. Thread A: Get next CheckpointNumber (IDENTITY) => 17
  2. Thread B: Get next CheckpointNumber (IDENTITY) => 18
  3. Thread B: Take ROWLOCK on CLUSTERED index entry for value 18
  4. Thread C: Run into ROWLOCK of step 3, block due to READCOMMITTEDLOCK.
  5. Thread A: Take ROWLOCK on CLUSTERED index entry for value 17
  6. Not caring about the order, Threads A and B insert and commit their rows.

As Thread C is already blocked on value 18 (step 4) and its scan has already passed the position where value 17 will go, it will skip the value 17 written in step 6.

With your solution, you still have the race condition in theory: if Thread C performs its second (read uncommitted) count query before Thread A gets to insert its row, the count will match the number of commits already pulled and value 17 is still skipped. The problem is inherent because the next CheckpointNumber value is incremented independently of the rows becoming visible or locks being taken. In practice, performing the second query probably delays Thread C long enough for Thread A to reach step 6, so it mitigates the problem ("by chance").

The only other fix I know that can solve this issue is to pause when you see a gap while pulling events, but even that can only be done heuristically (how long to wait?), and it will negatively affect performance if you have lots of aborted transactions (e.g., due to high concurrency on the same streams) :( .
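
Sketched out, such a gap pause could look roughly like this (purely illustrative, not an actual NEventStore API; the retry count and wait time are exactly the heuristic part, and it only works when pulling across all buckets, where gaps are otherwise unexpected):

// Illustrative gap-detection wrapper around whatever pulls the commits.
// ICommit.CheckpointToken is assumed to be the Int64 checkpoint number.
private ICommit[] PullWaitingForGaps(Func<ICommit[]> pull)
{
    const int maxRetries = 3;         // heuristic
    const int waitMilliseconds = 100; // heuristic

    ICommit[] commits = pull();
    for (int i = 0; i < maxRetries && HasGap(commits); i++)
    {
        // Hope that the transaction holding the missing IDENTITY value commits in the meantime
        // (if it aborted, the gap is permanent and the wait is wasted).
        System.Threading.Thread.Sleep(waitMilliseconds);
        commits = pull();
    }

    return commits;
}

private static bool HasGap(ICommit[] commits)
{
    for (int i = 1; i < commits.Length; i++)
    {
        if (commits[i].CheckpointToken != commits[i - 1].CheckpointToken + 1)
        {
            return true; // a checkpoint number is missing between two consecutive commits
        }
    }

    return false;
}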

@cecilyth

cecilyth commented Dec 17, 2024

It's a good theory about that UNIQUE index somehow being corrupt or missing somewhere! I'll look into that. I totally understand your example, and based on all the talk of the issue being very, very rare, I was suspicious of it being the cause of our problems. One thing I'm not totally getting yet is why we see the mismatch in count when querying uncommitted, and why re-querying that range results in more commits getting pulled. I guess in your example, it's because by the time we count the uncommitted, we've hit step 5 and Thread A has taken the rowlock on 17 (i.e., its not-yet-committed row now exists for the dirty count to see)?

I really appreciate your response!
