-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple database wrappers #1748
Simple database wrappers #1748
Conversation
6460820
to
3a97652
Compare
@longjon I took a quick pass, and although the interface is cleaner, I'm concerned that the management of cursors and transactions for LMDB could create problems. I'm not sure if the proposed way of creating a new lmdb_txn for each cursor will cause problems, specially since they are never closed. Also creating a new transaction for each batch implies creating a new lmdb_txn that it not clearly closed. As it is now one could try to write while READ, or try to read while NEW. Have you done some stress test? I mean trying to train a ImageNet like model? Have you tried that you can train and test on the same lmdb? |
virtual void Next() = 0; | ||
virtual string key() = 0; | ||
virtual string value() = 0; | ||
virtual bool valid() = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should Valid()
to align better with leveldb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with valid
since it's an accessor and we generally lowercase those, but presumably LevelDB is also following the Google C++ style guide, so I could see the argument either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case for consistency with leveldb I would vote for Valid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with lowercase valid by the accessor rule. No need to match LevelDB exactly since this is an interface over DBs.
Thanks for your apt comments @sguada, I've fixed the sloppiness as noted. Re: writing after READ, LMDB is opened read-only, I don't immediately see a corresponding option for LevelDB. We could enforce that in the wrapper or not; in this patch I've tried to provide nothing more than the functionality in master, and master of course doesn't enforce that. For LevelDB, READ still has the semantics of "don't create if missing". Re: reading after WRITE or NEW, that seems fine to me, just as files are usually opened read-only or read/write. Re: stress testing, I haven't done it yet, but can do. |
I plan to merge this shortly, so let me know if there are any objections. @longjon @sguada I stress tested this with 8x K40s doing simultaneous CaffeNet train and test on the same LMDB at a rate of 1 iter / s each. I left a CaffeNet solver running to check for leaks or other issues and there's no problem at 30,000+ iterations. |
After discussing with @shelhamer, we'd like to merge a simplified version of #1568 very soon, before addressing #1743, and in line with the discussion at #1238. To accelerate this process, I created this pared-down version of that PR. (History has been heavily rewritten, but keeps @sguada's authorship.)
This PR is meant to instantiate the original conception of #1238, i.e., a simple DB wrapper class that abstracts LevelDB vs LMDB without switch statements, and nothing more. The remaining functionality of #1568 may be rebased for continuing consideration.
Notes:
caffe::db
as suggested in Datum db #1568 (comment).Cursor
s (and onlyCursor
s), and writes (and only writes) fromTransaction
s and (onlyTransaction
s). The wrappers are uncoupled from protobuf.boost::scoped_ptr
in some instances whereunique_ptr
would be appropriate, but is disallowed by our avoidance of C++11.