-
Notifications
You must be signed in to change notification settings - Fork 26
Description
The SQLite VFS is at the lowest level of how SQLite stack. You should read the article here https://www.sqlite.org/vfs.html if you want to know more. We'll be talking about this, and also the WAL Index ( https://www.sqlite.org/walformat.html#shm )
For the purposes of this issue, here's what you need to know:
By default SQLite uses the unix VFS. To allow multiple processes to access the database at the same time, SQLite memory-maps a shm file. If you're curious, SQLite doesn't actually care about this data being persisted at all:
Since the content of the shm file does not need to be preserved across a crash, the shm file is never
fsync()-ed to disk. In fact, if there were a mechanism by which SQLite could tell the operating system to never persist the shm file to disk but always hold it in cache memory, SQLite would use that mechanism to avoid any unnecessary disk I/O associated with the shm file. However, no such mechanism exists in standard posix.
The unix VFS also does not acquire any locks on the database file itself, so each connection has to assume some other process may be changing it at the same time.
Anyways, all this allows multiple processes to access the database simultaneously, but introduces some overhead. For crucible, we don't care about multiple processes accessing our extent DBs simultaneously, and in fact, we'd really prefer they didn't.
So, we can switch to the unix-excl VFS. Under this, a process acquires an exclusive lock on the DB file when it connects to the database, and the shm file is eliminated because it's no longer needed. This results in a nice bump in perf, and as a bonus adds a barrier to multiple processes opening the same extents at the same time (perhaps we'd like more such barriers in future?)
(sorry for messing up your graph bryan, but i could not see that top line on my monitor because it is from 2003)
What this DOESNT do
This doesn't forbid multiple connections within the same process from accessing the DB. Instead, synchronization between mulitple in-process sqlite connections is just moved to the heap. However, we could also set the db to exclusive locking mode. https://www.sqlite.org/pragma.html#pragma_locking_mode , which would forbid multiple connections even in-process, and might give us some performance. See #774 for tracking that idea
