AsyncIO support #86
Hi there, py-lmdb is not yet perfectly suitable for use in a single-threaded, non-blocking mode, for a variety of reasons; see e.g. issue #65: there is no way to defer reads or writes to a worker pool without the GIL being held for the duration of the IO.

I'll think about updating the docs somehow, but even setting aside the GIL issue, there are too many moving parts here to document a decisive "yes" or "no" answer. For example, read-only txns over a smaller-than-RAM database could only be guaranteed not to block if the sysadmin carefully audited, and disabled where necessary, other programs making use of the host OS page cache. Even after this is done, ensuring a DB remains "smaller than RAM" depends less on dataset size than on the number and duration of read txns combined with write txns (current LMDB cannot reuse existing pages while any reader exists).

Another option, at least for the read side, may be to mlock(2) LMDB's working set using the information returned by

As for async writes: unless you're writing to tmpfs or have sync completely disabled, that's probably not a good idea. I imagine you'd always want write txns to run from a blocking thread.
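To illustrate the "write txns from a blocking thread" suggestion, here is a minimal asyncio sketch that defers a blocking write to a worker thread via `run_in_executor`. The `blocking_write` function and the dict `store` are hypothetical stand-ins; with py-lmdb the body would be a real write transaction (`with env.begin(write=True) as txn: txn.put(key, value)`), which can block on fsync and holds the GIL while copying.

```python
import asyncio

# Hypothetical stand-in for a blocking LMDB write; with py-lmdb this body
# would be: with env.begin(write=True) as txn: txn.put(key, value)
def blocking_write(store, key, value):
    store[key] = value
    return True

async def put_async(store, key, value):
    # Run the blocking (GIL-holding, possibly fsync-bound) write in a
    # worker thread so the event loop stays responsive.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_write, store, key, value)

store = {}
ok = asyncio.run(put_async(store, b"key", b"value"))
```

The same pattern applies to reads, with the caveat from above: the GIL is still held inside the C extension for the duration of any page faults.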
Hello, it's not very complicated to write a thin wrapper around lmdb for use in async frameworks: you just have to defer the real database operations to background threads so that they don't block the main loop (hence my other issue about thread safety). What I really miss is some kind of method to wait/block until there really is a value available for a given key. Without such a method I'm doomed to do things like:
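The original code snippet did not survive in this copy of the thread, but the "ugly" pattern being described is presumably a polling loop along these lines. The dict `db` is a hypothetical stand-in for the database; with py-lmdb the lookup inside `poll_get` would be a `txn.get(key)` deferred to a worker thread.

```python
import asyncio

# Hypothetical in-memory stand-in for the LMDB database.
db = {}

async def poll_get(key, interval=0.01):
    # Ugly polling loop: retry until the key appears, sleeping in between.
    while True:
        value = db.get(key)
        if value is not None:
            return value
        await asyncio.sleep(interval)

async def main():
    # Simulate a writer that stores the key a little later.
    asyncio.get_running_loop().call_later(0.03, db.__setitem__, b"k", b"v")
    return await poll_get(b"k")

result = asyncio.run(main())
```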
I hate that, it's too ugly ;)
This seems to be the simplest, lowest-overhead, and most portable solution to the GIL page-fault problem: simply loop over the MDB_val, touching one byte every 4KB. Regardless of whether the value lives in a leaf page or an overflow page, or is 100 bytes or 64KB, this should ensure on all but the most heavily overloaded machines that the pages are in RAM before PyString_FromStringAndSize() attempts to copy them.
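In the binding itself this touch loop would live in C, before the value is copied into a Python string; the following Python sketch just illustrates the idea, assuming a 4096-byte page size. Reading one byte per page forces any page faults to happen inside the loop, so the subsequent bulk copy finds the pages resident.

```python
PAGE_SIZE = 4096  # assumed OS page size

def prefault(buf):
    # Touch one byte per page so any page faults happen here, before the
    # buffer is copied out; returns the number of pages touched.
    touched = 0
    for off in range(0, len(buf), PAGE_SIZE):
        _ = buf[off]
        touched += 1
    return touched

# A 10000-byte value spans parts of three 4KB pages.
pages = prefault(memoryview(bytes(10000)))
```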
Is it possible to get clarification on either how to use this with AsyncIO, or, if no special effort is required, on the fact that it works?
Given how fast LMDB is, I'm sure a number of people will look at it with an eye towards non-blocking asyncio usage.