-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs #3810
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@al13n321 has updated the pull request. |
@@ -79,6 +79,7 @@ class BaseDeltaIterator : public Iterator { | |||
void Next() override { | |||
if (!Valid()) { | |||
status_ = Status::NotSupported("Next() on invalid iterator"); | |||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an this return
was intended here, because without it the Advance()
below would fail an assert.
@@ -61,7 +65,7 @@ class InternalIterator : public Cleanable { | |||
// Return the value for the current entry. The underlying storage for | |||
// the returned slice is valid only until the next modification of | |||
// the iterator. | |||
// REQUIRES: !AtEnd() && !AtStart() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure where this came from. Very old code when iterator interface was different? Searching rocksdb code for "AtEnd" and "AtStart" returns no results, so I'm assuming they're not a thing.
@@ -528,11 +528,9 @@ class BlockBasedTableIterator : public InternalIterator { | |||
return data_block_iter_.value(); | |||
} | |||
Status status() const override { | |||
// It'd be nice if status() returned a const Status& instead of a Status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not particularly nice: Status
is 16 bytes, so it's probably faster to return by value (in two registers).
@@ -101,9 +101,7 @@ void ManagedIterator::SeekToLast() { | |||
} | |||
assert(mutable_iter_ != nullptr); | |||
mutable_iter_->SeekToLast(); | |||
if (mutable_iter_->status().ok()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This made no sense: if status is not ok, it would leave valid_
containing leftovers from previous iterator operations.
@@ -95,12 +95,27 @@ class ForwardLevelIterator : public InternalIterator { | |||
return valid_; | |||
} | |||
void SeekToFirst() override { | |||
SetFileIndex(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, Seek() seeks within the current file, which is set using SetFileIndex() called from the outside. But SeekToFirst() used to work differently and call SetFileIndex(0) for some reason, despite the fact that the caller calls SetFileIndex() before SeekToFirst(), just like it does before Seek(). So I changed SeekToFirst() to work the same way as Seek(). It doesn't affect correctness afaict, just a cleanup.
@al13n321 has updated the pull request. |
@al13n321 has updated the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for this change. It's much more complicated than I thought. It's a good job!
I just briefly looked at db_iter.cc. I'm still looking at other files.
db/db_iter.cc
Outdated
|
||
ParsedInternalKey ikey; | ||
|
||
bool DBIter::PrevInternal() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see where the return value is used.
// previous key. | ||
if (!iter_->Valid()) { | ||
iter_->SeekToLast(); | ||
range_del_agg_.InvalidateTombstoneMapPositions(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea removing this statement is correct or not. CC @ajkr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understood from comments in range_del_aggregator.h
, this method is only an optimization and doesn't affect correctness, and it should be called when the iterator is seeked to an arbitrary position. Here it's seeked to a position very close to where it used to be, so I guessed that InvalidateTombstoneMapPositions()
is not needed. @ajkr , is any of that right?
} | ||
iter_->Prev(); | ||
FindParseableKey(&ikey, kReverse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By removing this for-loop, it becomes complicated for me to understand.
Now my understanding is that, we can still rely on FindPrevUserKey() to do this for-loop and find the first key smaller than saved_key_
.
If that is the case, the function name FindPrevUserKey()
is confusing to me. It actually finds the previous user key to saved_key_
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually finds the previous user key to saved_key_?
Yes, a comment above FindPrevUserKey()
says: "Move backwards until the key smaller than saved_key_". This is similar to FindNextUserEntry()
with skipping=true
, which goes to a key strictly above saved_key_
. I can rename FindPrevUserKey()
to something like FindUserKeyBeforeSavedKey()
if you find it less confusing.
I removed the loop because it was pretty much a duplicate of the loop in FindPrevUserKey()
, only with >
instead of >=
. So, the logic used to be "move back until we see a good entry with key <= saved_key_, then move back until we see a good entry with key != saved_key_", which is just a more complicated way of saying "move back until we see a good entry with key < saved_key_".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the function has clear comment on the behavior, it is good then.
} else { | ||
iter_->Seek(last_key); | ||
if (!iter_->Valid() && iter_->status().ok()) { | ||
iter_->SeekToLast(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this SeekToLast()
logic to handle keys deleted by forward iterator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. And yes, forward iterator doesn't support Prev() right now, so this is unreachable at the moment (except in the stress test the next commit adds). I made DBIter support it anyway because maybe we'll later add support for Prev() in forward iterator, or have some other situation when DBIter is used over a non-snapshot iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I briefly looked at most of the non-test files. It's awesome! Thank you again for helping with this complicated project.
// apply next time the iterator moves. | ||
// Used for simulating ForwardIterator updating to a new version that doesn't | ||
// have some of the keys (e.g. after compaction with a filter). | ||
void Vanish(std::string _key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding this scenario. To verify the correctness of the iterator, another scenario worth covering is to add is to insert a key between the current key and the next key, because some of the looping for the next key logic has been changed a little bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stress test in the next commit should cover this scenario.
db/db_range_del_test.cc
Outdated
@@ -809,7 +809,7 @@ TEST_F(DBRangeDelTest, TailingIteratorRangeTombstoneUnsupported) { | |||
// For L1+, iterators over files are created on-demand, so need seek | |||
iter->SeekToFirst(); | |||
} | |||
ASSERT_TRUE(iter->status().IsNotSupported()); | |||
ASSERT_TRUE(iter->status().IsNotSupported()) << i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing.
(It prints the value of i
if the assertion fails. I added it for debugging when this test was failing, then left it because there's a small probability that it may come in handy for someone else later. Removing it, just to avoid touching an extra file without a good reason.)
child.SeekToLast(); | ||
} | ||
considerStatus(child.status()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find where the merging iterator is invalidated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this commit doesn't invalidate iterator, only makes status()
more correct and fast. The next commit makes Valid()
return false if status is not ok.
child.SeekForPrev(key()); | ||
if (child.Valid() && comparator_->Equal(key(), child.key())) { | ||
child.SeekForPrev(target); | ||
considerStatus(child.status()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
child.Prev(); | ||
considerStatus(child.status()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here.
child.Next(); | ||
considerStatus(child.status()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here.
@al13n321 has updated the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@al13n321 has updated the pull request. |
Alright, comments addressed, checks pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for helping with such a complicated task.
db/db_iter_stress_test.cc
Outdated
a /= 10; | ||
++len; | ||
} | ||
std::string s = std::to_string(rnd.Next() % (uint64_t)max_key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ToString() in ./util/string_util.h. Same for other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing. There are lots of existing std::to_string
calls in db_iter_test.cc
; leaving them alone.
db/db_iter_stress_test.cc
Outdated
{ | ||
const char *env = getenv("TEST_TRACE"); | ||
if (env != nullptr && strlen(env) > 0 && strcmp(env, "0") != 0) { | ||
trace = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perf_context_test.cc uses parameter "--verbose" for this purpose. I hope we keep it consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
db/db_iter_stress_test.cc
Outdated
std::cout | ||
<< "stats:\n exact matches: " << num_matching << "\n end reached: " | ||
<< num_at_end << "\n non-ok status: " << num_not_ok | ||
<< "\n mutated on the fly: " << num_recently_removed << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (verbose)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not: verbose
outputs ~80 MB of stuff, and is intended for debugging failures; in contrast, this cout
is just a few lines, to give an idea of how many times each case was hit.
db/db_iter_stress_test.cc
Outdated
else ASSERT_GT(db_iter->key().ToString(), old_key); | ||
} else { | ||
if (seek) ASSERT_LE(db_iter->key().ToString(), old_key); | ||
else ASSERT_LT(db_iter->key().ToString(), old_key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Run "make format" and see whether this is accepted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, make format
, I forgot it exists. Running it on the whole PR now.
@al13n321 has updated the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@al13n321 has updated the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@al13n321 has updated the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@al13n321 has updated the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@al13n321 is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@al13n321 clang_analyze is still failing:
|
By the way, Google C++ Style (which we are following) disallow C-style casting. In this case |
#3872 will fix clang_analyze |
Before this PR, Iterator/InternalIterator may simultaneously have non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. Likely intended uses of that are:
However, this behavior wasn't documented well (and until recently the wiki on github had misleading incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours).
This PR changes the convention to:
This does sacrifice the two use cases listed above, but @siying said it's ok.
Overview of the changes:
DBIteratorTest.NonBlockingIterationBugRepro
explains the scenario.To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types:
Iterators that didn't need changes:
Iterators with changes (see inline comments for details):
FindParseableKey()
), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back.assert(iter_->Valid())
after a seek, which is never a safe assumption.SetStatus()
withInvalidate()
to make sure non-ok BlockIter is always invalid.