VFSFile iVersion 3 methods, version 2 passthrough #418
Comments
My use-case for this is that I want to track all writes to the main database file so that I can keep it in sync with a remote copy. I think this means that I can't allow any shared memory use. At the same time, I would otherwise like to forward all operations to the default VFSFile implementation. I think Shm is used only for the WAL file (not the main database file) so this should work fine in practice. However, I would feel a lot safer if attempts to call xShm* would fail loudly. It would be great if there was a way to do that. Would it be feasible to e.g. make a |
How strictly do you want to track writes? For example, do you want to block local writes until the remote is in sync, or do you just need to know something changed so you can eventually get around to it? If your use case is on the strict side, then there are already a variety of solutions out there like SQLiteCloud. It is correct that you can't detect writes when shared memory is in use. (Technically you could, by using mprotect on the area with a signal handler to detect writes, or with similarly expensive schemes.) I will add an |
I want to replicate writes asynchronously, so no need to block. Would setting (Not sure what |
There is a small combinatorial problem due to 3 sets of methods and wanting some NULL, so that would be part of exactly how many parameters there are. xFetch looks like a way for you to own the in-memory storage. Regular xRead requires you to copy the data into a buffer SQLite provides. xFetch lets you return a pointer, avoiding that copy. The current SQLite VFS implementations only implement xFetch if mmap is enabled. But even that has issues: if the file size has changed then mremap can change the address of the mapping. The VFS keeps a reference count of Fetch/UnFetch calls and only does mremap if the outstanding count is zero. It does look like there is no sense in making it possible to implement the iVersion 2 & 3 calls in Python. If you only need loose tracking, it would seem that a VFS approach is way overkill. Couldn't you just periodically poll the last modified timestamp on the database files and sync when those change? There is also a tracing VFS. |
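For illustration, the reference-counting rule described above can be modelled in a few lines of Python. This is only a toy sketch of the rule as stated in this comment, not SQLite's actual C code: `MappingModel` and all of its method names are hypothetical, and the "remap" is just a size update rather than a real mremap call.

```python
class MappingModel:
    """Toy model: a resize only takes effect when no fetches are outstanding."""

    def __init__(self, size):
        self.mapped_size = size    # size currently covered by the mapping
        self.outstanding = 0       # Fetch calls not yet matched by Unfetch
        self.pending_size = None   # resize waiting for outstanding == 0

    def fetch(self):
        # A caller now holds a pointer into the mapping.
        self.outstanding += 1

    def unfetch(self):
        if self.outstanding == 0:
            raise RuntimeError("unfetch without a matching fetch")
        self.outstanding -= 1
        self._maybe_remap()

    def file_resized(self, new_size):
        # Remember the new size; remapping now could move the address
        # underneath callers that still hold pointers.
        self.pending_size = new_size
        self._maybe_remap()

    def _maybe_remap(self):
        if self.outstanding == 0 and self.pending_size is not None:
            self.mapped_size = self.pending_size
            self.pending_size = None
```

While a fetch is outstanding, `file_resized` leaves `mapped_size` untouched; the deferred resize is applied by the `unfetch` that brings the count back to zero.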
I need to know which specific parts of the database are changing, so that I don't needlessly upload the entire file. So mtime doesn't quite do it. I could use tracing_vfs, but to me that seems like overkill (why parse text messages for all VFS operations?) |
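A minimal sketch of the bookkeeping this needs, independent of how the writes are observed. `DirtyPageTracker` is a hypothetical helper, and the 4096-byte page size is an assumption; the idea is that something like an `xWrite(data, offset)` override would call `record(offset, len(data))` before forwarding the write.

```python
from __future__ import annotations


class DirtyPageTracker:
    """Collects the pages touched by writes (hypothetical helper)."""

    def __init__(self, page_size: int = 4096) -> None:
        self.page_size = page_size  # assumed page size, adjust to the database
        self.dirty: set[int] = set()

    def record(self, offset: int, length: int) -> None:
        """Mark every page overlapped by the byte range as dirty."""
        if length <= 0:
            return
        first = offset // self.page_size
        last = (offset + length - 1) // self.page_size
        self.dirty.update(range(first, last + 1))

    def drain(self) -> list[tuple[int, int]]:
        """Return merged (offset, length) byte ranges and reset the set."""
        ranges: list[tuple[int, int]] = []
        for page in sorted(self.dirty):
            start = page * self.page_size
            if ranges and ranges[-1][0] + ranges[-1][1] == start:
                # Adjacent to the previous range: extend it by one page.
                ranges[-1] = (ranges[-1][0], ranges[-1][1] + self.page_size)
            else:
                ranges.append((start, self.page_size))
        self.dirty.clear()
        return ranges
```

An asynchronous uploader could then call `drain()` periodically and send only the returned ranges.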
I don't know the specifics of your requirements, but I'd tend to go for a simpler, more robust solution using rsync to transfer the files on change. rsync does per-block checksums and then transfers changed blocks only. It would also handle the case of data moving from the WAL to the main file, since the checksum would remain the same, so there is no need to transfer copies of that block. An inotify-style hook would then invoke it as needed. I wouldn't expect you to use the tracing VFS as is, but rather to hack it down to exactly what you need in the most convenient way. No matter what, the issue description currently has my thinking on what will and won't be implemented, and I believe it will also work for your needs.
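To make the per-block checksum idea concrete, here is a simplified sketch. It is not rsync's algorithm (rsync uses a rolling checksum so it can match blocks at arbitrary offsets); it only compares fixed-offset blocks. The function names and the 4096-byte block size are assumptions for illustration.

```python
import hashlib


def block_digests(data: bytes, block_size: int = 4096) -> list:
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: list, new: list) -> list:
    """Return indexes of blocks in `new` that differ from, or are absent in, `old`."""
    indexes = []
    for i, digest in enumerate(new):
        if i >= len(old) or old[i] != digest:
            indexes.append(i)
    return indexes
```

Only the blocks at the returned indexes would need transferring. A file that shrank is not reported by `changed_blocks`; the caller would have to compare lengths separately.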
I'm confused about the tracing VFS idea. You're saying I should write a custom C VFS (based on tracing VFS) instead of doing this in Python through APSW? (I can't use rsync because I'm dealing with a dumb REST backend, not a server under my control) |
APSW has to receive parameters in C, convert them to Python, convert back to C again to call the VFS being inherited from, then go through the conversions again with the result values and out parameters. It is doing a lot of work, when all you wanted to know was the modified ranges! Hacking down the tracing VFS would leave you with a module that has little footprint or code and gets only the information you need. It is a shame you can't update the REST backend. If it at least gave checksums for blocks then you could do the syncing without having to mess with VFS, and it would be far more resilient against transient network issues etc.
Context: https://groups.google.com/g/python-sqlite/c/IIpnmLGyhrE
Items to fix:
Items to not do: