AsyncIO support #86

Closed

techdragon opened this issue Apr 26, 2015 · 2 comments

Comments

@techdragon

Could you clarify how to use this with AsyncIO, or, if no special effort is required, confirm that it works out of the box?

Given how fast LMDB is, I'm sure a number of people will look at it with an eye towards non-blocking asyncio usage.

@dw
Collaborator

dw commented Apr 26, 2015

Hi there,

py-lmdb is not yet well suited to single-threaded, non-blocking use, for a variety of reasons; see e.g. issue #65: there is no way to defer reads or writes to a worker pool without causing the GIL to be held for the duration of the I/O.

I'll think about updating the docs somehow, but even setting the aforementioned GIL issue aside, there are too many moving parts here to document a decisive "yes" or "no" answer. For example, read-only txns over a smaller-than-RAM database could only be guaranteed not to block if the sysadmin carefully audited, and where necessary disabled, other programs making use of the host OS page cache. Even after that is done, ensuring a DB remains "smaller than RAM" depends less on dataset size than on the number and duration of read txns combined with write txns (current LMDB cannot reuse existing pages while any reader exists).

Another option, at least for the read side, may be to mlock(2) LMDB's working set using the information returned by Environment.info(), but that requires yet more knowledge of how the host OS works, and, at least on Linux, probably also requires the database file to be fully allocated ahead of time.

As for async writes, unless you're writing to tmpfs or have sync completely disabled, that's probably not a good idea. I imagine you'd always want write txns to run from a blocking thread.
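
For illustration, a minimal sketch of that last suggestion (write txns run from a worker thread, driven from asyncio via run_in_executor), assuming a database at ./db; the put_async helper is hypothetical and not part of py-lmdb:

import asyncio
from concurrent.futures import ThreadPoolExecutor

import lmdb

env = lmdb.Environment('./db', max_dbs=1)

# LMDB permits only one write transaction at a time, so one worker suffices.
_write_pool = ThreadPoolExecutor(max_workers=1)

def _put(key, value):
    # Runs in the worker thread: the commit (and its fsync) blocks here,
    # not the event loop.
    with env.begin(write=True) as txn:
        txn.put(key, value)

async def put_async(key, value):
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(_write_pool, _put, key, value)

asyncio.run(put_async(b'greeting', b'hello'))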

@stephane-martin

Hello,

It's not very complicated to write a thin wrapper around lmdb to use it in async frameworks: you just have to defer the real database operations to background threads so they don't block the main loop (hence my other issue about thread safety).

What I'm really missing is some kind of method to wait/block until a value is actually available for a given key. Without such a method I'm doomed to do things like:

while db.begin().get(key) is None:
    time.sleep(1)

I hate that, it's too ugly ;)
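
For illustration, an async variant of that polling loop, assuming the same kind of environment handle; it still polls, since LMDB offers no key-change notification to wait on, but asyncio.sleep yields to the event loop instead of blocking it:

import asyncio

import lmdb

env = lmdb.Environment('./db')

async def wait_for_key(key, interval=1.0):
    while True:
        # A fresh read transaction on each attempt sees newly committed data.
        with env.begin() as txn:
            value = txn.get(key)
        if value is not None:
            return value
        await asyncio.sleep(interval)  # yields to the event loop between polls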

dw added a commit that referenced this issue Jun 6, 2015
This seems to be the simplest, lowest overhead and most portable
solution to the GIL page fault problem: simply loop over the MDB_val,
touching one byte every 4kb. Regardless of whether the value lives in a
leaf page or an overflow page, is 100 bytes or 64kb, this should ensure
on all but the most heavily overloaded machines that the pages are in
RAM before PyString_FromStringAndSize() attempts to copy them.
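
The actual fix lives in py-lmdb's C extension, but a rough Python analogue of the prefault idea (touching one byte per 4 KiB page so the kernel faults the range into RAM before it is copied) might look like:

PAGE_SIZE = 4096

def prefault(buf, offset, length):
    # Touch one byte on every page the range spans, forcing the kernel to
    # fault those pages into RAM before the real copy happens.
    for pos in range(offset, offset + length, PAGE_SIZE):
        buf[pos]
    if length:
        buf[offset + length - 1]  # also touch the final, possibly partial, page
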
dw closed this as completed in 76c1fbb Jun 14, 2015