Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExecuteAsync and RunInTransactionAsync are not safe together. #1250

Open
trampster opened this issue Aug 18, 2024 · 13 comments
Open

ExecuteAsync and RunInTransactionAsync are not safe together. #1250

trampster opened this issue Aug 18, 2024 · 13 comments

Comments

@trampster
Copy link

trampster commented Aug 18, 2024

If I do an Insert (using an Execute) (which should always change 1 row) inside an async transaction and at the same time do an Update that changes all rows using ExecuteAsync. Then sometimes the Insert will return the rows changed value from the Update.

The following code reproduces it very quickly:

using SQLite;
using RowsChangedIssue;

string databaseFile = "test.db";
if(File.Exists(databaseFile))
{
    File.Delete(databaseFile);
}


var database = new SQLiteAsyncConnection("test.db");

await database.CreateTableAsync<Person>();

var insertTask = Insert();
var updateTask = Update();

await Task.WhenAny(insertTask, updateTask);


async Task Insert()
{
    for(int index =0; index < 20000; index++)
    {
        await database.RunInTransactionAsync(db =>
        {
            var person = new Person()
            {
                Name = $"Person{index}", 
                IsFriend = false
            };
            int rowsChanged = db.Execute(
                @"INSERT INTO Person (Name, IsFriend) VALUES(?1, ?2)",
                person.Name,
                person.IsFriend);

            if(rowsChanged != 1)
            {
                Console.WriteLine($"Added 1 person but rows changed was {rowsChanged}");
                throw new Exception($"Added 1 person but rows changed was {rowsChanged}");
            }
        });
    }
}

async Task Update()
{
    while(true)
    {
        // Changes all rows in the table
        await database.ExecuteAsync("UPDATE Person SET IsFriend=true");
    }
}

Interestingly if you put the Update inside a transaction it also doesn't have this problem. Which is what we have done for update as a work around.

It should be noted that the rows changed comes from a call to sqlite3_changes which happens after the query is completed.

Looking at the code ExectueAsync gets a lock on the connection. But RunInTransactionAsync gets a connection from a pool and locks on that (different connection maybe?). But I'm not sure why that would matter as the transaction should isolate it anyway.

So I'm not clear on why this is happening.

@trampster
Copy link
Author

trampster commented Aug 19, 2024

I've done a bit more digging and discovered the following:

  1. Sqlite has been compiled with threading as Serialised (allows using connection from multiple threads)
  2. Connection has been opened with OpenFlags FullMutex
  3. sqlite-net isn't using it's connection lock because OpenFlags are set to FullMutex
  4. sqlite-net is doing the transaction and the ExectueAsync on the same connection but from different threads

I think 4. is the problem. I've verified this by printing the connection and thread from inside WriteAsync and from in the RunInTransactionAsync callback.

When you begin the transaction using RunInTransactionAsync then the ExecuteAsync call ends up using that transaction if it runs before the transaction ends (its the same connection). So calling sqlite3_changes gets the value from the update.

The reason that always using a transaction works is TransactAsync gets a transactionLock as well as a connection lock, (the connection lock is skipped due to FullMutex but the transactionLock isn't)

The result is that if you mix RunInTransactionAsync calls with ExecuteAsync calls then the ExecuteAsync calls can inadvertently be included in the transaction if they execute while the transaction is open.

@trampster trampster changed the title ExecuteAsync and Execute inside RunInTransactionAsync isn't safe together. ExecuteAsync and RunInTransactionAsync are not safe together. Aug 19, 2024
@ghost
Copy link

ghost commented Oct 3, 2024

Could not reproduce, got it working flawlessly here:

Gist

Things of note:

  • Yes, it uses Godot but that is irrelevant to the actual functional code.
  • It uses the NuGet package's built in Insert function.
  • It will iterate through hundreds of records being added without ever failing. I could let it run forever without it ever failing.
  • This isn't exactly a very effective method of adding a lot of records simultaneously. InsertAll would be a better approach.

I think there might be an issue with the manual insert. Maybe especially when it's async.

@ghost
Copy link

ghost commented Oct 3, 2024

As a proof of concept and a further demonstration it does actually work reliably:

Gist

The same code but this time doing them in bulk, batches of 1000000 records. Still not missing a single beat.

@trampster
Copy link
Author

trampster commented Oct 3, 2024

Can you try again but exactly follow my example code and use Execute to add one at a time and not InsertAll.

I actually don't care if InsertAll works, because that's not what I'm trying to do.

In my use case I'm not actually doing bulk inserts, It's just that we have had this bug several times from the field and the code I posted above reliably reproduces it quickly. (in the actual app it doesn't happen very often (two reports in a year) but that's because we are not inserting/updating at such a high rate.

@trampster
Copy link
Author

InsertAll by default starts a new non async transaction so it's a completely different scenario and not relevant to this bug report.

@ghost
Copy link

ghost commented Oct 6, 2024

InsertAll by default starts a new non async transaction so it's a completely different scenario and not relevant to this bug report.

Read the first comment I made and see the Gist I posted there. It is not a bulk insert. The bulk insert was merely a proof of concept, as I wrote. To illustrate that even massive inserts will not cause the issue you're describing. The point was... the issue is with your code. Not with SQLite-NET nor with SQLite itself. There is no variation I tried where I could replicate the issue you're seeing.

@trampster
Copy link
Author

trampster commented Oct 6, 2024

Your first Gist uses Insert and not Execute as per my bug report (not sure if this matter)

I've created a reproduction github repo, so that it's super easy for you to use exactly the same code and setup as me:

https://github.com/trampster/SqliteNetRowsChangedIssue

Just clone it and do dotnet run

Here are my results:

daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 249
daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 6
daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 142
daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 17
daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 20
daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 17
daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 2
daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 155
daniel@dansDestop:~/Work/SqliteNetRowsChangedIssue$

That first failure took about 1 minute. The other failures where quicker, in none of the runs did it work.

If you think it's a problem with how I'm using Sqlite-net then all the code is there please tell me what I'm doing wrong.

@ghost
Copy link

ghost commented Oct 6, 2024

See this Gist

This is your version. Albeit with random data instead of a boolean but, well, that's irrelevant. And... it's not failing.

@trampster
Copy link
Author

Please clone my repo and run it, so you are using exactly the same code and environment as me.

Godot has a single threaded sycronization context which might change things, there could also be other differences

Please I'm begging you to actually run the exact unmodified code from my linked reproduction repo. Just clone and dotnet run change nothing else.

@ghost
Copy link

ghost commented Oct 6, 2024

Please clone my repo and run it, so you are using exactly the same code and environment as me.

Godot has a single threaded sycronization context which might change things, there could also be other differences

Please I'm begging you to actually run the exact unmodified code from my linked reproduction repo. Just clone and dotnet run change nothing else.

Godot would have no effect on this situation, since I am using .NET's Tasks, like you. Godot does not even have Tasks. There is no ambiguity. Anyhow, running your exact repo is indeed failing. I am analyzing it now.

@ghost
Copy link

ghost commented Oct 6, 2024

[ment@mentpc SqliteNetRowsChangedIssue]$ dotnet run
Ran for 600 seconds without failure.

The difference?

await database.ExecuteAsync($"UPDATE Person SET Name='{DateTime.UtcNow}'");

The explanation? Honestly, I have no clue.

Here's my altered version of your code:

Gist

TL;DR -- For some reason, changing the BOOL is causing the failure. There is no problem with any sort of mutex lock on the database between the 2 different ASYNC calls.

@trampster
Copy link
Author

It still fails for me when setting the Name to DateTime.UtcNow as per your change.

It does take a lot more iterations to fail. But this was on my work machine with a high end CPU and a fast SSD so it didn't take long.

The reason it does more iterations to fail is because getting DateTime.UtcNow is a lot slower than a hardcoded bool in the sql string.

hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 1161
hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 523
hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 72
hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 105
hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 242
hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 1363
hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 265
hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$ dotnet run
Added 1 person but rows changed was 1133
hughesd@ws5195:~/work/SqliteNetRowsChangedIssue$

@ghost
Copy link

ghost commented Oct 7, 2024

Ran for 360 seconds without failure.
-- Index count: 1421
-- Actual Person count: 1421
Ran for 360 seconds without failure.
-- Index count: 2821
-- Actual Person count: 2821
Ran for 360 seconds without failure.
-- Index count: 4207
-- Actual Person count: 4207
Ran for 360 seconds without failure.
-- Index count: 5510
-- Actual Person count: 5510
Ran for 360 seconds without failure.
-- Index count: 6640
-- Actual Person count: 6640
Ran for 360 seconds without failure.
-- Index count: 7650
-- Actual Person count: 7650
Ran for 360 seconds without failure.
-- Index count: 8700
-- Actual Person count: 8700
Ran for 360 seconds without failure.
-- Index count: 9679
-- Actual Person count: 9679
Ran for 360 seconds without failure.
-- Index count: 10627
-- Actual Person count: 10627
Ran for 360 seconds without failure.
-- Index count: 11445
-- Actual Person count: 11445

While actually changing the BOOL, not the String. The difference? Pulling the rowsChanged check out of the transaction. Where it never should've been in the first place, since the transaction isn't completed yet by that point. As you can see, I added a method of confirming that the number of records is exactly the same as the index (loop) counter. Data consistency would not only require that, you would also expect to see that. And, throughout this 3600 second (1 hour) run, 10 consecutive tests, it remained entirely consistent.

Which means there never was any actual problem, the function always did just add 1 person. rowsChanged simply reported a different number because the transaction had not yet completed. So, I stand by my original conclusion -- There is no problem running these 2 different ASYNC methods side by side. The index and actual person counters confirm this.

Gist of this version

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant