OPFS access handles to be more synchronous #67

rhashimoto · 2022-11-26T15:47:33Z

rhashimoto
Nov 26, 2022
Maintainer

In the original proposal for access handles, synchronous methods were provided for read and write, but not for other methods like file size, truncate, and flush. Recently, however, the access handle proposed spec has been changed to make all the FileSystemSyncAccessHandle methods synchronous. Apparently this will ship in Chrome 108 which goes stable next week (!). The Firefox response has been positive, so I expect they will ship that in their first OPFS release, whenever that is. ~~No word from WebKit yet.~~ Update 12/21/22: WebKit PR. UPDATE 5/7/23: Supported in Chrome 108+, Safari 16.4+, Firefox 111+.

This still isn't enough to build a fully general and fully synchronous SQLite VFS, i.e. one that would bypass the issues of connecting WASM to asynchronous Javascript APIs. Those issues are currently solved with either Asyncify (e.g. wa-sqlite) or SharedArrayBuffer (e.g. absurd-sql and the official SQLite WASM project), and both techniques have drawbacks. The reason it's not enough is that acquiring an access handle (with createSyncAccessHandle) will remain an asynchronous operation so the VFS can't open or create an arbitrary file synchronously.

However...the updated API might be sufficient to build a synchronous VFS with a pre-opened database, i.e. one that acquires all its OPFS access handles for database and journal files (and probably some extras to be temporary files) before calling SQLite. That could be used with a synchronous SQLite WASM build (so no Asyncify or SharedArrayBuffer needed). Such a VFS wouldn't support multiple connections to the same database (this might be addressable by applications using SharedWorker), but should be small, fast, and simple. I wonder if this might be the sweet spot for SQLite persistence right now.

If only there were a framework available to experiment with new VFS ideas for the browser...

rhashimoto · 2022-11-28T00:24:35Z

rhashimoto
Nov 28, 2022
Maintainer Author

This still isn't enough to build a fully general and fully synchronous SQLite VFS, i.e. one that would bypass the issues of connecting WASM to asynchronous Javascript APIs.

I now believe this statement is wrong. I believe it is possible to build a fully general and synchronous SQLite VFS with the new OPFS spec, one that would work without Asyncify and without SharedArrayBuffer. The basic concept is instead of exposing the OPFS file system to SQLite, you implement a file system using OPFS as a device. The simplest example is to use a single OPFS file as a block device and port a real file system on top of it, inodes and all that. Whether that is practical is another story - I think probably not. I don't know that much about real file systems at the device level, but it seems like there would be a lot of overhead and you might need to do an fsck equivalent every time you start it up, not to mention that it sounds like an awful lot of code. Nevertheless, I think the concept is a great starting point.

I have an idea for a less general but much simpler implementation (that would still be synchronous to SQLite). The idea is to create a pool of OPFS access handles that always remain open and can be reused as needed. This caps the total number of files in the file system (not just the number of open files) to the number of access handles, which can be configured by the user. This should work for most SQLite applications because while you can create/open an arbitrarily large number of database files in SQLite (e.g. with 1,000 ATTACH statements), users typically don't (edit: actually by default the limit is 10). And although SQLite creates more files than you might think while executing, it's still typically a manageable handful per database (2-6 would be my estimate). Deleted files don't count so it's just a cap on the maximum number of files at any point in time. And the cap could be raised (or lowered) at any time as long as it is not during the execution of a SQLite call, so if a call to SQLite reports an I/O error due to too many files then the cap could be raised and the call retried.

Here's a sketch of how I think this can be done:

All the file system files belonging to the VFS will go into one OPFS directory - remember, we're not exposing the OPFS directory structure to SQLite. Each OPFS file will have a randomly generated name, and the number of these files will be the cap. The cap will be increased by adding more OPFS files and decreased by removing unassigned OPFS files.

SQLite doesn't do anything with directories so file paths can essentially be treated as keys. Any file that SQLite creates will be uniquely associated with one of the OPFS files. That association is made by writing the path at the beginning of the OPFS file (this header will also contain some other fields described later). After the header will be all the data SQLite reads and writes.

When the VFS starts, it acquires access handles for all the OPFS files, reads all the headers, fixes any problems, and caches the associations with paths in memory. When xOpen is called, the VFS provides the access handle from an existing association if there is one for that path, or creates a new association with an unassociated access handle. All the subsequent calls on the newly opened file object - xWrite, xRead, xTruncate, etc. - are implemented using the access handle.

The tricky part is ensuring that if the VFS environment is interrupted at any point - e.g. by Worker termination, tab close, power failure, etc. - the file system is left in a recoverable state. SQLite journaling will take care of anything involving data that SQLite writes, but we need to be very careful about everything else. For example, if we open a transaction on the SQLite database file "/path/foo.db", and it starts to create a journal file "/path/foo.db-journal", what happens if it is interrupted after only writing "/path/foo.db" to the header? The next time the VFS starts, we need to have some way to know which OPFS file is the database and which OPFS file is a failed file creation.

One way to ensure OPFS file associations can recover from failures is to add a digest of the path to the header. When an association is created or cleared, both the path and the digest are written and the OPFS file is flushed. When the VFS starts, any OPFS file with an incorrect digest should be considered unassociated and repaired to reflect that state.

Such a VFS wouldn't support multiple connections to the same database

It is likely true that SQLite can't manage multiple connections itself with a VFS like this, but an application that wanted/needed multiple connections could implement its own locking arbitration above SQLite. The unlocking instance would need to release all its access handles and the locking instance would need to acquire all the handles, so it might be fairly heavy and how heavy would scale with the number of handles. A promising alternative to multiple connections is for contexts to share a single connection, as discussed here and here.

Questions and comments welcome, especially regarding crash scenarios that might not be handled properly.

0 replies

rhashimoto · 2022-11-29T17:03:16Z

rhashimoto
Nov 29, 2022
Maintainer Author

AccessHandlePoolVFS works (previously named SyncAccessHandleVFS)! On Chrome canary, anyway. The Chrome stable channel will push 108 beginning today. Edge should follow soon I expect, being based on Chromium, Firefox activity shows them working on it, and Safari uptake is still TBD.

The implementation follows the above sketch quite closely (after I changed the sketch to use a digest). Performance looks pretty promising - here's a comparison of the new VFS (on the left) versus the older OPFS VFS (that requires Asyncify) on the right (I disabled locking for the old VFS because AccessHandlePoolVFS doesn't implement locking):

I did need to correct the xDeviceCharacteristics flags (which were too aggressive) in the old VFS to make it a fair fight. When they both return the same flags to tell SQLite they provide the same storage guarantees, the new AccessHandlePoolVFS is faster across the board.

That was expected, but how does it do against IndexedDB? IndexedDB can provide stronger storage guarantees, and in addition IDBBatchAtomic doesn't need an external journal file to be ACID-compliant in typical usage (like what the benchmark tests) which is a huge advantage. Here's that result (locking again disabled for a fairer comparison):

AccessHandlePoolVFS gets smoked in the first test, which mainly measures write transaction overhead (maybe the difference is journal file flushing?). However, it is even or faster on every other test, often quite a bit faster. These results are preliminary, but very encouraging.

I'm leaving this in a branch for now. I've only exercised it by manually typing in queries and by running the benchmarks. I don't have the unit tests going, and with minimal browser support for the moment I'm not motivated to do that right away. I mainly wanted people to be able to check out the code and see what was involved.

I think this will be a very promising approach as the modified API becomes better supported. It is best suited for managing a small number of potentially large files, which is the SQLite scenario. It does have the drawback of not supporting locking within SQLite - applications will have to provide locking at a higher level if they need it - but the ability to bypass both Asyncify and SharedArrayBuffer is a big win. It should be possible to build an Emscripten filesystem (as done in sql.js-httpvfs and absurd-sql) along the same lines which then could work with just about any Emscripten-built SQLite (or other applications with similar file usage); it won't be me but I'd love to see it.

9 replies

francescotescari Jul 19, 2023

Thanks for all the info!

You might change your mind once users cry that they can't use it from two tabs at once ;).

Luckily we currently persist data from a single shared worker already, so that shouldn't be an issue for us.

rhashimoto Jul 19, 2023
Maintainer Author

It also means no library-level concurrency, so if you open your apps in two tabs at once, the second will not work.

Only if you don't handle multiple contexts at the application level, which you can.

It's a massive, painful trade-off, IMO.

I think it would be painful to shoehorn into a lot of existing applications. If you're designing a new application and understand the constraints and mitigation strategies (like this one) up front, I don't think it's so bad. There are demos using AccessHandlePoolVFS that I have run on as many as 32 browser tabs simultaneously. I think that is better than the current OSW OPFS VFS typically manages.

sgbeal Jul 19, 2023

I think that is better than the current OSW OPFS VFS typically manages.

By an order of magnitude. Three tabs in the default configuration seem to work consistently so long as nobody does any long-running work, and maybe 5-6 if the so-called "unlock-asap" option is enabled, but that comes with a major performance hit (on the order of 400% for the speedtest1 benchmarks). The most i've seen/tested is 10 at once, but that many can't consistently cooperate for even short periods in our setup. i'll take a closer look at yours demos my current todo queue runs low. After this VFS is out the door i'm obligated to try another one using JSPI (JS Promise Integration).

rhashimoto Jul 19, 2023
Maintainer Author

It's true that the access handle pool approach doesn't support multiple simultaneous connections. It's true that sharing a single connection takes some extra thought and work. However, it's demonstrably not true that it categorically will not work with multiple tabs.

sgbeal Jul 20, 2023

However, it's demonstrably not true that it categorically will not work with multiple tabs.

i'm talking only about the as-delivered state of the library (to be clear - my impl, not yours). Without the addition of client-side concurrency support and/or hypothetical library-level features, it cannot work from multiple tabs. (Edit: where "work" means only "share the same database concurrently.")

rhashimoto · 2022-12-13T23:36:30Z

rhashimoto
Dec 13, 2022
Maintainer Author

To work in production, AccessHandlePoolVFS needs some way to remove obsolete paths that are associated with files in its pool. These obsolete paths would typically be temporary files that weren't deleted, e.g. when a tab or browser is closed while a query is executing. If they aren't removed, eventually these paths will take up too many of the files in the pool and more will be needed. Although more files can be added to the pool, this creates more overhead and wastes space.

We could remove these paths each time we reattach the file system. The question is how do we identify them?

My current idea is to store the flags argument passed to xOpen when SQLite creates a file in the OPFS file header (along with the path). The flags argument includes the type of the file, and the only ones we need to preserve are SQLITE_OPEN_MAIN_DB, SQLITE_OPEN_MAIN_JOURNAL, and SQLITE_OPEN_SUPER_JOURNAL (SQLITE_OPEN_WAL would be important in other cases but not here). Any OPFS file with another type can be reset to unassociated, along with any file with SQLITE_OPEN_DELETEONCLOSE.

It will be important to include the newly added OPFS file header data in the digest computation that allows the VFS to distinguish valid and invalid headers.

Cleaning out the file system garbage is particularly important for AccessHandlePoolVFS, but it probably also makes sense for any other VFS with persistence. So keeping the creation flags in the metadata for most other VFS classes seems like a good addition for purposes of cruft removal. At this writing no wa-sqlite example VFS does that.

Update 4/16/23: AccessHandlePoolVFS now disassociates unexpected files on start in the breaking-changes branch, which will eventually be merged to the trunk.

0 replies

randName · 2024-01-04T06:35:00Z

randName
Jan 4, 2024

I'm not sure how much further this idea will be pushed, but i'm currently trying out a newer implementation with some key differences:

use a module singleton instead of classes
keep free handles in an array, and if that queue has only 1 or 2 handles left just create a new one (asynchronous, but it should not affect other operations). it should also be possible to release unused handles after some time
store the fs metadata in another file, which is flushed occasionally. most operations should just work with the metadata in-memory. this file does not need to be permanently held open since the flushing can be asynchronous, but there should be no harm holding on to it for better guarantees
use the original db filename for a file with OPEN_MAIN_DB. combined with the above it means a db file in the OPFS should correspond with a real db file and be usable directly
allow a db file to be opened in non-exclusive read-only mode by copying the actual file into the SAH pool

I've not started on any of the above points yet so I could be wrong about stuff, but hopefully it works out

2 replies

rhashimoto Jan 4, 2024
Maintainer Author

I'm all for people writing their own VFS to suit their specific needs! I particularly like detaching the metadata from the file - you might look here for my unimplemented and untested ideas on that.

Note that changes are coming to the web platform that may eventually change some of the tradeoffs a VFS needs to balance. For example, multiple readers/writers for FileSystemSyncAccessHandle will be in the next Chrome stable release.

randName Jan 5, 2024

seems like transparency is pretty much what I was describing. like you I have no immediate need for it but it is certainly nice to have, since dealing with the header offsets feels more complicated

the only other far-ahead concern is that when sqlite becomes relatively popular and in use by many libraries, that multiple pools are still able to work on the same OPFS (regardless of transparency)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OPFS access handles to be more synchronous #67

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 4 comments 11 replies

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

Select a reply

OPFS access handles to be more synchronous #67

rhashimoto Nov 26, 2022 Maintainer

Replies: 4 comments · 11 replies

rhashimoto Nov 28, 2022 Maintainer Author

rhashimoto Nov 29, 2022 Maintainer Author

francescotescari Jul 19, 2023

rhashimoto Jul 19, 2023 Maintainer Author

sgbeal Jul 19, 2023

rhashimoto Jul 19, 2023 Maintainer Author

sgbeal Jul 20, 2023

rhashimoto Dec 13, 2022 Maintainer Author

randName Jan 4, 2024

rhashimoto Jan 4, 2024 Maintainer Author

randName Jan 5, 2024

rhashimoto
Nov 26, 2022
Maintainer

Replies: 4 comments 11 replies

rhashimoto
Nov 28, 2022
Maintainer Author

rhashimoto
Nov 29, 2022
Maintainer Author

rhashimoto Jul 19, 2023
Maintainer Author

rhashimoto Jul 19, 2023
Maintainer Author

rhashimoto
Dec 13, 2022
Maintainer Author

randName
Jan 4, 2024

rhashimoto Jan 4, 2024
Maintainer Author