-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigate and optimize SQLite connection pool management #663
Comments
I think the first mystery is: where do all these subprocesses come from? |
No great insights yet, but I've made progress tracing these blocks of syscalls to where they're coming from in Tiled. Ultimately the subprocesses are opened by the This process is created by the
This process is created by the
This is one of the "original" child processed Tiled creates, presumably opening the directory to update the data (I believe unrelated to the problem we're chasing).
At this point This possibly comes from another db commit, although seemingly sharing the previous process from above. Also in
And another matching refresh immediately following on line 667.
Here there is another apparent call to ... followed by another call to
This is that original child process again, which I still presume is just normal operation of updating the data.
|
This may be the issue: pre-2.0 sqlalchemy did not use connection pooling by default for sqlite databases. As of 2.0, it should by default use The interesting tidbit is that this is apparently not the case.
It's not clear why the pool is not being set automatically. Supposedly we are using version >=2.0. |
Oh - it seems that aiosqlite is a bit different, and defaults to So, mystery solved. I guess the other thing to do, is consider if there are any unintended ill effects of using connection pooling with aiosqlite. |
Reading through various things, I'm having trouble understanding whether an application should close the SQLite file after a transaction or hold it open. I'm also unsure about where the subprocesses, the various pids, are coming from. It does not seem that aiosqlite uses |
Agreed that the specific thing opening these subprocesses is mysterious. I've briefly combed through sqlalchemy and aiosqlite and haven't found anything that looks suspicious. What's also strange is that strace does not pick up anything (surrounding these processes) for |
Okay, I'm becoming convinced that this is A trivial test:
Both connections result in new PIDs:
This happens even if you open a connection to a database you already have open, as aiosqlite has no built in pooling/reuse. Their documentation also explains this, if I'm understanding correctly:
|
Even more convincing:
edit: and to satisfy my own curiosity, it seems
|
In trying to gather some of the community consensus re: pooling sqlite connections, here is an attempted summary of that plus related details.
Some links to the more worthwhile stuff:
The most robust thing to do is probably benchmarking whether this is worth all of the thinking, but if I had to make a snap decision from all of this, I lean towards trying out the AsyncQueuePool & not constantly respawning processes/threads, and chattering file IO, and rebuilding the db cache, etc. |
This is so helpful! Good find with I agree with your assessment. I think we should next add some benchmarks that hit the database pretty hard to our nascent benchmark suite. (See #641.) Using those, we should verify that switching to AsyncQueuePool does not result in any unexpected regressions. If that goes well, resulting in a tie or better with NullPool performance, we should switch, because "less resource thrashing" is a good default position to take while we collect more information about our usage patterns, construct appropriate benchmarks, and identify bottlenecks across the service. I suspect the issues with file locking mentioned on the SQLAlchemy issue tracker were from users that were not taking SQLite's advice about keeping SQLite to local storage. |
In a group-coding session, an asv benchmark validated that |
Closed by #710 |
Due to aiosqlite's default, NullPool was used for sqlite. This is a suboptimal setting when spawning many sessions as we do. (See bluesky/tiled#663) Also increase busy_timeout to 30s. It allows getting rid of "database is locked" when stopping many runs (e.g. 20 at a time).
* Increase runner submitWaitDuration to 5m * Fix potentially long write DB transactions * Set PRAGMA synchronous=NORMAL * Fix _wait_to_lock_many() * Implement batch background processing * Add Server limits to Server deployment * Fix reuse instances lock * Use AsyncAdaptedQueuePool for sqlite Due to aiosqlite's default, NullPool was used for sqlite. This is a suboptimal setting when spawning many sessions as we do. (See bluesky/tiled#663) Also increase busy_timeout to 30s. It allows getting rid of "database is locked" when stopping many runs (e.g. 20 at a time).
We have observed what appears to be frequently (re-)opening of the SQLite catalog database. I have read that applications should hold a connection pool with one read-write connection and N read connections.
At minimum, we should avoid thrashing the file like this (unless I am confused). At best, we should optimize our pool management.
Demo:
Note that this has to be warmed up, as Tiled lazily imports some Python modules.
The text was updated successfully, but these errors were encountered: