Dupsort support #182
To sum up what we said earlier today:
Dbi type alias
type Dbi = u32;
Update EnvInner accordingly.
Flags
DatabaseBuilder
To avoid duplicating the create/open xxx database methods. See the Builder pattern. One method per supported flag.
pub struct DatabaseBuilder<'e> {
    env: &'e Env,
    name: Option<String>,
    types: Option<(TypeId, TypeId)>,
    database_flags: DatabaseFlags,
}
Duplicate keys support
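A minimal sketch of how the DatabaseBuilder proposed above could expose one method per flag. Everything here is an assumption made for illustration: Env is an empty stand-in, DatabaseFlags is a hand-rolled bit set (only the numeric values are taken from MDB_DUPSORT and MDB_DUPFIXED in lmdb.h), and the method names are hypothetical rather than the API this PR ends up with.

```rust
use std::any::TypeId;

/// Empty stand-in for the real environment type (hypothetical).
pub struct Env;

/// Hand-rolled flag set; values mirror MDB_DUPSORT / MDB_DUPFIXED in lmdb.h.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct DatabaseFlags(u32);

impl DatabaseFlags {
    pub const DUP_SORT: DatabaseFlags = DatabaseFlags(0x04);
    pub const DUP_FIXED: DatabaseFlags = DatabaseFlags(0x10);

    pub fn bits(self) -> u32 {
        self.0
    }

    pub fn contains(self, other: DatabaseFlags) -> bool {
        self.0 & other.0 == other.0
    }

    fn insert(&mut self, other: DatabaseFlags) {
        self.0 |= other.0;
    }
}

pub struct DatabaseBuilder<'e> {
    env: &'e Env,
    name: Option<String>,
    types: Option<(TypeId, TypeId)>,
    database_flags: DatabaseFlags,
}

impl<'e> DatabaseBuilder<'e> {
    pub fn new(env: &'e Env) -> Self {
        DatabaseBuilder { env, name: None, types: None, database_flags: DatabaseFlags::default() }
    }

    /// Names the database; an unnamed builder targets the main database.
    pub fn name(mut self, name: &str) -> Self {
        self.name = Some(name.to_owned());
        self
    }

    /// Records the key and data codec types so they can be checked on open.
    pub fn types<KC: 'static, DC: 'static>(mut self) -> Self {
        self.types = Some((TypeId::of::<KC>(), TypeId::of::<DC>()));
        self
    }

    /// One method per supported flag: enables duplicate keys.
    pub fn dup_sort(mut self) -> Self {
        self.database_flags.insert(DatabaseFlags::DUP_SORT);
        self
    }

    /// MDB_DUPFIXED is only valid together with MDB_DUPSORT, so set both.
    pub fn dup_fixed(mut self) -> Self {
        self.database_flags.insert(DatabaseFlags::DUP_SORT);
        self.database_flags.insert(DatabaseFlags::DUP_FIXED);
        self
    }

    pub fn env(&self) -> &'e Env {
        self.env
    }

    pub fn database_name(&self) -> Option<&str> {
        self.name.as_deref()
    }

    pub fn database_types(&self) -> Option<(TypeId, TypeId)> {
        self.types
    }

    pub fn flags(&self) -> DatabaseFlags {
        self.database_flags
    }
}
```

With this shape, a single create or open method on the builder could consume the accumulated flags instead of one create/open pair per database kind.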
|
I left put reserved unimplemented, since MDB_RESERVE seems incompatible with MDB_DUPSORT
|
The as uniform conversion won't be as straightforward, as there is a PolyDatabase under the hood. Off the top of my head, the solution I can think of is to:
pub struct TypedDatabase<KC, DC, DB> where DB: Database {
    pub dyndb: DB,
    marker: std::marker::PhantomData<(KC, DC)>,
}
And find a better name for PolyMultiDatabase |
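A rough sketch of how such a TypedDatabase wrapper might delegate to an untyped backend. The Database and Codec traits, the get_raw method, and the MemDb and Str types are all hypothetical stand-ins invented for this example; they are not the traits this crate defines.

```rust
use std::marker::PhantomData;

/// Hypothetical untyped backend: returns every value stored under `key`.
pub trait Database {
    fn get_raw(&self, key: &[u8]) -> Vec<Vec<u8>>;
}

/// Hypothetical codec trait standing in for the real encode/decode traits.
pub trait Codec {
    type Item;
    fn encode(item: &Self::Item) -> Vec<u8>;
    fn decode(bytes: &[u8]) -> Option<Self::Item>;
}

/// Typed view over any backend; the codecs exist only at the type level.
pub struct TypedDatabase<KC, DC, DB: Database> {
    pub dyndb: DB,
    marker: PhantomData<(KC, DC)>,
}

impl<KC: Codec, DC: Codec, DB: Database> TypedDatabase<KC, DC, DB> {
    pub fn new(dyndb: DB) -> Self {
        TypedDatabase { dyndb, marker: PhantomData }
    }

    /// Encodes the key, asks the backend, and decodes each returned value.
    /// Values that fail to decode are skipped in this sketch.
    pub fn get(&self, key: &KC::Item) -> Vec<DC::Item> {
        self.dyndb
            .get_raw(&KC::encode(key))
            .iter()
            .filter_map(|bytes| DC::decode(bytes))
            .collect()
    }
}

/// UTF-8 string codec used by the example.
pub struct Str;

impl Codec for Str {
    type Item = String;

    fn encode(item: &String) -> Vec<u8> {
        item.as_bytes().to_vec()
    }

    fn decode(bytes: &[u8]) -> Option<String> {
        String::from_utf8(bytes.to_vec()).ok()
    }
}

/// In-memory backend that allows several values per key.
pub struct MemDb(pub Vec<(Vec<u8>, Vec<u8>)>);

impl Database for MemDb {
    fn get_raw(&self, key: &[u8]) -> Vec<Vec<u8>> {
        self.0
            .iter()
            .filter(|(k, _)| k.as_slice() == key)
            .map(|(_, v)| v.clone())
            .collect()
    }
}
```

Because the codecs are only type parameters, a conversion between typed views would amount to rebuilding the wrapper around the same dyndb with different KC and DC.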
I don't think it's a good idea to leak duplicate key support everywhere.
|
What's the issue? Note that I wanted to delete the
We could return an error in this particular case with a user-friendly error message.
I am not sure of the exact difference. Is
Indeed, but that is not something that we will handle differently than with a runtime error. Do we have to test the length of the keys by ourselves? If so, we can do it only when |
Hello,
We did some tests with our data and this branch, and we ran into the limits of LMDB's dupsort. In a nutshell, when dupsort is enabled the values are keys of a sub-DB, hence the same limitations apply as for regular keys. Namely, it worked until we encountered an MDB_BAD_VALSIZE because some values are bigger than mdb_env_get_maxkeysize (511 bytes): https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h#L275-L286
So we won't be able to use the dupsort feature.
As for your previous comment:
I still think that it's necessary to have two separate APIs for dupsort and non-dupsort databases. There won't be any conversion method to implement between them, so it's ok.
👍
Lines 1710 to 1714 in ad6766f
This part shows the difference in behavior. I don't think we could do that as the behavior depends on the flag the database was opened with.
This would need an if at each call for everyone, which can be mitigated with a dupsort-specific database. I'd define all mutable iteration methods as unsafe, specify in the safety section that keys must be of the same length, and let users take their own risks. |
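The size limit reported in this comment could be checked before a write so that the caller gets a readable error instead of MDB_BAD_VALSIZE. The following is only a sketch under assumptions: the 511-byte constant is the default quoted above (a real implementation would query mdb_env_get_maxkeysize at runtime), and check_put and PutError are hypothetical names.

```rust
/// Default maximum key size quoted in the comment above; assumed constant here.
pub const MAX_KEY_SIZE: usize = 511;

/// Hypothetical error type carrying enough detail for a user-friendly message.
#[derive(Debug, PartialEq, Eq)]
pub enum PutError {
    KeyTooLong { len: usize, max: usize },
    ValueTooLong { len: usize, max: usize },
}

/// Validates sizes before a put. With dupsort enabled, values are stored as
/// keys of a sub-database, so they are subject to the same limit as keys.
pub fn check_put(key: &[u8], value: &[u8], dup_sort: bool) -> Result<(), PutError> {
    if key.len() > MAX_KEY_SIZE {
        return Err(PutError::KeyTooLong { len: key.len(), max: MAX_KEY_SIZE });
    }
    if dup_sort && value.len() > MAX_KEY_SIZE {
        return Err(PutError::ValueTooLong { len: value.len(), max: MAX_KEY_SIZE });
    }
    Ok(())
}
```

The value check only runs when dup_sort is set, which matches the idea of paying for the extra test only on dupsort databases.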
Hey @AureliaDolo 👋
Sad to hear this too! We will not be able to use the Weren't you using bdb (BerkeleyDB) before? Isn't it already the case that the values are limited to 511 bytes?
It seems that redb doesn't support multi-values, but I advise you to look at sanakirja: it is much harder to use, but it supports multi-values in a much better way internally and has better performance than LMDB.
Why can't we return a
Cannot the
A runtime
Isn't LMDB checking for the length already? |
From BDB ref manual
and
Yeah, there was no limitation (BDB is not bound to mmap semantics). It's not an issue: we will find a way that avoids key duplication and will not ask LMDB for what it cannot do. But the A lot of people, like PowerDNS or Knot, put a "pascal-esque byte-string inside the value" to circumvent the limitation; it's just costly in terms of write I/O if you don't know everything you need to write at once.
Yeah it is hence the About
Looks nice, but the crash-free (for our kind of workload) behavior of LMDB in safe mode is really pristine! Performance-wise it's also fast. :) But we will definitely take a look! Redb has https://docs.rs/redb/latest/redb/struct.MultimapTable.html but I have to check it out. https://github.com/cberner/redb/blob/master/tests/multimap_tests.rs#L40 |
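The "pascal-esque byte-string inside the value" workaround mentioned in this comment can be sketched as a length-prefixed encoding that packs several items into one value of a regular (non-dupsort) database. The format below (a 4-byte little-endian length before each item) and the pack and unpack names are assumptions for illustration, not the layout PowerDNS or Knot actually use.

```rust
use std::convert::{TryFrom, TryInto};

/// Packs several byte strings into one value, each preceded by its length
/// as a 4-byte little-endian integer.
pub fn pack(items: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for item in items {
        let len = u32::try_from(item.len()).expect("item longer than u32::MAX");
        out.extend_from_slice(&len.to_le_bytes());
        out.extend_from_slice(item);
    }
    out
}

/// Splits a packed value back into its items, borrowing from the input.
/// Returns None if the buffer is truncated or otherwise malformed.
pub fn unpack(mut bytes: &[u8]) -> Option<Vec<&[u8]>> {
    let mut items = Vec::new();
    while !bytes.is_empty() {
        if bytes.len() < 4 {
            return None;
        }
        let (len_bytes, rest) = bytes.split_at(4);
        let len = u32::from_le_bytes(len_bytes.try_into().ok()?) as usize;
        if rest.len() < len {
            return None;
        }
        let (item, tail) = rest.split_at(len);
        items.push(item);
        bytes = tail;
    }
    Some(items)
}
```

The write cost mentioned in the comment shows up here: adding one item to an existing key means reading the packed value, appending, and rewriting the whole thing.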
I was just looking for the delete key and value and came back here to mention I couldn't find it. |
Superseded by #200 and therefore closed. |
Fixes #41
Previous work on this issue #59
I chose to pass flags to Env::raw_init_database to reuse it in [create/open]_poly_multi_database.
The only difference between PolyDatabase and PolyMultiDatabase is the return type of get: it's a RoRange for PolyMultiDatabase.
For now, I added the case of a duplicate key only in the get method. If you're ok with the core of this PR, I plan to add that in the other doctests, especially the one that documents the PolyMultiDatabase struct.
Tests
Comparison methods
I'm not sure what exactly these methods should return, especially get greater than. If we keep returning just a key-value pair, we can use get to obtain the iterator, or we can return an iterator on the next key. For get_greater_than, what if the key we check against has multiple values? For last, should we return the last value of the last key?
Moreover, we could add equivalents of those methods that take both a key and a value.
set related methods
len should return the total number of key-value pairs, not just the number of keys.
iterators
I see no change in behavior here. Maybe add to the doc that a prefix iter on a key that has multiple values should be equivalent to getting that key?
modification
Should delete remove all values for the key? Should we add a delete_key_value that takes a key and a value?
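To make the questions about len, delete and delete_key_value concrete, here is an in-memory model of one possible set of answers. It is not backed by LMDB and is not this crate's API: MultiMapModel and its methods are hypothetical, and the choices it encodes (len counts pairs, delete removes every value of a key, duplicate pairs are stored once) are assumptions open to discussion.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// In-memory model of a sorted multi-value database: each key maps to a
/// sorted set of values, so an identical key-value pair is stored only once.
#[derive(Default)]
pub struct MultiMapModel {
    entries: BTreeMap<Vec<u8>, BTreeSet<Vec<u8>>>,
}

impl MultiMapModel {
    /// Inserts a pair; returns false if that exact pair was already present.
    pub fn put(&mut self, key: &[u8], value: &[u8]) -> bool {
        self.entries.entry(key.to_vec()).or_default().insert(value.to_vec())
    }

    /// Returns every value stored under `key`, in sorted order.
    pub fn get(&self, key: &[u8]) -> Vec<&[u8]> {
        self.entries
            .get(key)
            .into_iter()
            .flatten()
            .map(|value| value.as_slice())
            .collect()
    }

    /// Counts key-value pairs, not distinct keys.
    pub fn len(&self) -> usize {
        self.entries.values().map(|values| values.len()).sum()
    }

    /// Removes the key with all of its values; returns how many were removed.
    pub fn delete(&mut self, key: &[u8]) -> usize {
        self.entries.remove(key).map_or(0, |values| values.len())
    }

    /// Removes a single pair; drops the key once its last value is gone.
    pub fn delete_key_value(&mut self, key: &[u8], value: &[u8]) -> bool {
        let removed = match self.entries.get_mut(key) {
            Some(values) => values.remove(value),
            None => return false,
        };
        if removed && self.entries.get(key).map_or(false, |values| values.is_empty()) {
            self.entries.remove(key);
        }
        removed
    }
}
```

Under this model, get on a key with several values returns all of them, which is also what a prefix iteration restricted to that exact key would yield.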
conversion