Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with iterator and lifetimes #15

Open
winstonewert opened this issue Apr 29, 2020 · 6 comments
Open

Issue with iterator and lifetimes #15

winstonewert opened this issue Apr 29, 2020 · 6 comments

Comments

@winstonewert
Copy link

Consider the implementation of next() for Iterator

fn next(&mut self) -> Option<Self::Item> {
        if self.is_valid() {
            let k = self.key();
            let v = self.value();
            self.next();
            Some((k, v))
        } else {
            None
        }
    }

The self.next() line invalidates the returned references from self.key() and self.value(). Thus this iterator is always returning bad references.

But why didn't Rust catch this?

pub fn key(&self) -> &'a [u8] {

key (and value) claims to return a reference with lifetime 'a. But this isn't really accurate, as it should only be valid until the next &mut call on self (probably on .next()). Rust doesn't catch the problem because of the unsafe code used to implement the function (obviously unavoidable.)

@bh1xuw
Copy link
Owner

bh1xuw commented May 2, 2020

Haven't figured out how to express the lifetime restrictions of key & value in Rust.

Is ReadOptions::pin_data solves?

  // Keep the blocks loaded by the iterator pinned in memory as long as the
  // iterator is not deleted, If used when reading from tables created with
  // BlockBasedTableOptions::use_delta_encoding = false,
  // Iterator's property "rocksdb.iterator.is-key-pinned" is guaranteed to
  // return 1.
  // Default: false
  bool pin_data;

@winstonewert
Copy link
Author

It seems to me that key() and value() ought to be simply like:

pub fn key(&self) -> &[u8] {

The references remain valid until the iterator is modified by calling .next()

pin_data looks like it would solve the problem for keys, but at the cost of keeping all the keys in memory which wouldn't work in my case since I'm iterating over a large table and really don't want to keep all those keys in memory. And it wouldn't solve for value() in any case.

I think the problem here is that iterator doesn't allow you to return references that only valid until .next() is called.

bh1xuw added a commit that referenced this issue May 6, 2020
bh1xuw added a commit that referenced this issue May 8, 2020
@bh1xuw
Copy link
Owner

bh1xuw commented May 11, 2020

The next() should be

fn next<'k, 'v>(&'a mut self) -> Option<(&'k [u8], &'v [u8])>

When pin_data enabled, 'k is the same as 'a.
Or else 'k and 'v live just before next next() call.

Haven't figure out how to express the lifetime restriction in Rust. 🤣

@bh1xuw
Copy link
Owner

bh1xuw commented May 15, 2020

Workaround:

In for _ in it {} blocks: nothing changed.

While collecting for later use:

// `it` here, for later use, you must hold the original `mut it` somewhere.
it.map(|(key, val)| (key.to_vec(), val.to_vec())).collect::<Vec<_>>();

@winstonewert
Copy link
Author

In my case, that would load the whole multi-gb datastore into memory, not a good plan.

A better workaround is to use a while loop with is_valid()

bh1xuw added a commit that referenced this issue May 15, 2020
@bh1xuw
Copy link
Owner

bh1xuw commented May 15, 2020

I will add a link to this issue in README.
Many thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants