Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dupsort support #182

Closed
wants to merge 5 commits into from
Closed

Conversation

AureliaDolo
Copy link
Contributor

@AureliaDolo AureliaDolo commented Jul 11, 2023

Fixes #41

Previous work on this issue #59

I chose to pass flags to Env::raw_init_database to reuse it in [create/open]_poly_multi_database.

The only difference between PolyDatabase and PolyMultiDatabase is the return type of get : it's a RoRange for PolyMultiDatabase.

For now, I added the case of a duplicate key only in the get method. If you're ok with the core of this PR, I plan to add that in the other doctests, especially the one that documents the PolyMultiDatabase struct.

Tests

Comparison methods

I'm not sure what exactly should these methods return, especially for get greater than. If we keep returning just a key value, we can use the get to have the iterator, or return an iterator on the next key. For get_greater_than what if the key we check against has multiple values ? For last we should return the last value of the last key ?
Moreover we could add equivalent for those methods that take both a key and a value.

  • get_lower_than
  • get_lower_than_or_equal_to
  • get_greater_than
  • get_greater_than_or_equal_to
  • first
  • last

set related methods

Len should return total number of key value not just number of key.

  • len
  • is_empty

iterators

I see no change in behavior here. Maybe add in the doc that prefix iter on a key that has multiple value should be equivalent to getting that key ?

  • iter
  • iter_mut
  • rev_iter
  • rev_iter_mut
  • range
  • range_mut
  • rev_range
  • rev_range_mut
  • prefix_iter
  • prefix_iter_mut
  • rev_prefix_iter
  • rev_prefix_iter_mut

modification

Should delete delete all values for the key ? Should we add a delete_key_value that takes a key and a value ?

  • put
  • put_reserved
  • append
  • delete
  • delete_range
  • clear

conversion

  • as_uniform

heed/src/db/multi.rs Outdated Show resolved Hide resolved
@AureliaDolo AureliaDolo changed the title Dupsort support Draft: Dupsort support Jul 11, 2023
@Kerollmops Kerollmops marked this pull request as draft July 11, 2023 15:26
@Kerollmops Kerollmops changed the title Draft: Dupsort support Dupsort support Jul 11, 2023
@Kerollmops Kerollmops added this to the v0.20.0 milestone Jul 12, 2023
@AureliaDolo
Copy link
Contributor Author

AureliaDolo commented Jul 12, 2023

To sum up what we said earlier today:

Dbi type alias

type Dbi = u32;

Update EnvInner accordingly.

Flags

  • rename existing flags to EnvFlags
  • make them real bitflags
  • add DatabaseFlags as bitflags

DatabaseBuilder

To avoid duplicating create/open xxx database methods. See Builder pattern. One method per supported flag,

pub struct DatabaseBuilder<'e> {
    env: &'e Env,
    name: Option<String>,
    type: Option<(typeId, TypeId)>, 
    database_flags: DatabaseFlags
}

Duplicate keys support

  • new specific iterator to iterate on one duplicate key
  • add a boolean to existing iterators to determine if we iterate on data items or unique keys

@AureliaDolo
Copy link
Contributor Author

I unimplemented put reserved as it seems MDB_RESERVED seems incompatible with MBD_DUPSORT

MDB_RESERVED - [...] This flag must not be specified if the database was opened with #MDB_DUPSORT.

@AureliaDolo
Copy link
Contributor Author

The as uniform won't be as straightforward as there is a PolyDatabase behind the hood.

On the spot the solution I think about is to:

  • rename Database to TypedDatabase
  • have a trait Database that will have all the database methods
  • both PolyDatabase and PolyMultiDatabase implement Database
pub struct TypedDatabase<KC, DC, DB> where DB:  Database {
    pub dyndb: DB,
    marker: std::marker::PhantomData<(KC, DC)>,
}

And find a better name for PolyMultiDatabase

@AureliaDolo AureliaDolo mentioned this pull request Jul 13, 2023
2 tasks
@AureliaDolo AureliaDolo mentioned this pull request Jul 13, 2023
2 tasks
@AureliaDolo
Copy link
Contributor Author

Duplicate keys support

  • new specific iterator to iterate on one duplicate key
  • add a boolean to existing iterators to determine if we iterate on data items or unique keys

I don't think it's a good idea to leak duplicate key support everywhere.

  • put_reserved is not supported
  • MDB_APPENDDUP behaves differently than MDB_APPEND
  • harder constraint on put during a mutable iteration: the key must be the same size.

@Kerollmops
Copy link
Member

The PolyDatabase::as_uniform won't be as straightforward as there is a PolyDatabase behind the hood.

What's the issue? Note that I wanted to delete the PolyDatabase type from the crate and only have a Database<K, V> type, maybe. Would it help?

  • put_reserved is not supported

We could return an error in this particular case with a user-friendly error message.

  • MDB_APPENDDUP behaves differently than MDB_APPEND

I am not sure of the exact difference. Is MDB_APPENDDUP used to append new data to a specific key? We should introduce a new Database::append_dup(&self, wtxn, key) method that accepts the key on which to append a dup data.

  • harder constraint on put during a mutable iteration: the key must be the same size.

Indeed, but that is not something that we will handle differently than with a runtime error. Do we have to test the length of the keys by ourselves? If so, we can do it only when DUPSORT is enabled.

@AureliaDolo
Copy link
Contributor Author

Hello,

We did some tests with our data and this branch and we ran into the limits of lmdb's dupsort. In a nutshell, when dupsort is enabled the values are keys of a sub DB, hence the same limitations applies than for regular keys.

Namely it worked until we encountered a MDB_BAD_VALSIZE because some values are bigger than mdb_env_get_maxkeysize, 511 bytes

https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h#L275-L286
https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#L650-L668

So we won't be able to use the dupsort feature.

As for your previous comment:

Note that I wanted to delete the PolyDatabase type from the crate and only have a Database<K, V> type, maybe. Would it help?

I still think that it's necessary to have to separated apis for dupsort and not dupsort databases. There won't be any conversion method to implement between them so it's ok.

We could return an error in this particular case with a user-friendly error message.

👍

I am not sure of the exact difference. Is MDB_APPENDDUP used to append new data to a specific key? We should introduce a new Database::append_dup(&self, wtxn, key) method that accepts the key on which to append a dup data.

heed/heed/src/db/multi.rs

Lines 1710 to 1714 in ad6766f

/// // But as opposed to the behavior without DUPSORT,
/// // you will still be able to insert everywhere
/// // as long as you keep the order inside a key
/// db.append::<BEI32, Str>(&mut wtxn, &1, "i-am-the-one")?;
/// db.append::<BEI32, Str>(&mut wtxn, &1, "i-am-the-other-one")?;

This part shows the difference in behavior. I don't think we could do that as the behavior depends on the flag the database was opened with.

Indeed, but that is not something that we will handle differently than with a runtime error. Do we have to test the length of the keys by ourselves? If so, we can do it only when DUPSORT is enabled.

This would need a if at each call for everyone, which can be mitigated with a dupsort specific database. I'd define all mutable iteration methods as unsafe, and in the safety section specify that keys must be of the same length, and let the user take their own risks.

@Kerollmops
Copy link
Member

Kerollmops commented Jul 18, 2023

Hey @AureliaDolo 👋

Namely it worked until we encountered a MDB_BAD_VALSIZE because some values are bigger than mdb_env_get_maxkeysize, 511 bytes

Sad to hear this too! We will not be able to use the DUPSORT flag in Meilisearch if this default limit of 511 bytes also applies to the values...

Weren't you using bdb (BerkeleyDB) before? Isn't it already the case that the values are limited to 511 bytes?

So we won't be able to use the dupsort feature.

It seems that redb doesn't support multi-value but I advise you to look at sanakirja, much harder to use but supports multi-values in a much better (internal) way and has better performances than LMDB.

I still think that it's necessary to have to separated apis for dupsort and not dupsort databases. There won't be any conversion method to implement between them so it's ok.

Why can't we return a Database<ByteSlice, ByteSlice> by default and let the user use the Database::remap method to change the key/data codec?

heed/heed/src/db/multi.rs

Lines 1710 to 1714 in ad6766f

/// // But as opposed to the behavior without DUPSORT,
/// // you will still be able to insert everywhere
/// // as long as you keep the order inside a key
/// db.append::<BEI32, Str>(&mut wtxn, &1, "i-am-the-one")?;
/// db.append::<BEI32, Str>(&mut wtxn, &1, "i-am-the-other-one")?;

This part shows the difference in behavior. I don't think we could do that as the behavior depends on the flag the database was opened with.

Cannot the Database::append method always be used to append a new key/data at the end of the database and the new-to-be introduced Database::append_dup method be always used to append a new data associated with the specified key?

This would need a if at each call for everyone, which can be mitigated with a dupsort specific database. I'd define all mutable iteration methods as unsafe, and in the safety section specify that keys must be of the same length, and let the user take their own risks.

A runtime if is nothing, especially when it will always be the same result. We already do a lot of security verifications.

[...] and in the safety section specify that keys must be of the same length, and let the user take their own risks.

Isn't LMDB checking for the length already?

@darnuria
Copy link
Contributor

darnuria commented Jul 19, 2023

Weren't you using bdb (BerkeleyDB) before? Isn't it already the case that the values are limited to 511 bytes?

From BDB ref manual
https://docs.oracle.com/database/bdb181/html/api_reference/C/dbset_flags.html

DB_DUPSORT

Permit duplicate data items in the database; that is, insertion when the key of the key/data pair being inserted already exists in the database will be successful. The ordering of duplicates in the database is determined by the duplicate comparison function. If the application does not specify a comparison function using the DB->set_dup_compare() method, a default lexical comparison will be used. It is an error to specify both DB_DUPSORT and DB_RECNUM.

Calling DB->set_flags() with the DB_DUPSORT flag affects the database, including all threads of control accessing the database.

If the database already exists when DB->open() is called, the DB_DUPSORT flag must be the same as the existing database or an error will be returned.

and
https://docs.oracle.com/database/bdb181/html/api_reference/C/dbset_flags.html

 If the database supports sorted duplicates, the new data value is inserted at the correct sorted location. 

Yeah, there was no limitation (BDB is not bound to mmap semantics), it's not an issue we will find a way that avoid key duplication, will not ask lmdb what it cannot do.

But the MDB_DUPSORT within it's limitation is nice! Just part of the data don't fit. :)

Lot of people do "pascal-esque byte-string inside the value" like Powerdns or Knot to circumvent the limitation, it's just costly in terms of IO-write if you don't know what you need to write at once.

Isn't LMDB checking for the length already?

Yeah it is hence the MDB_BAD_VALSIZE returned, it returns at the insert.

About sanakirja

It seems that redb doesn't support multi-value but I advise you to look at sanakirja, much harder to use but supports multi-values in a much better (internal) way and has better performances than LMDB.

Looks nice but the crash free (for our kind of workload) feature of lmdb in a safe-mode is really pristine! For the performances it's also fast. :)

But will look definitely!

Redb has the https://docs.rs/redb/latest/redb/struct.MultimapTable.html but have to check-it-out.

https://github.com/cberner/redb/blob/master/tests/multimap_tests.rs#L40

@mikedilger
Copy link

I was just looking for the delete key and value and came back here to mention I couldn't find it.

@Kerollmops
Copy link
Member

Superseded by #200 and therefore closed.

@Kerollmops Kerollmops closed this Aug 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support duplicate keys
4 participants