Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs #3810

al13n321 · 2018-05-05T03:06:38Z

Before this PR, Iterator/InternalIterator could simultaneously have a non-ok status() and Valid() = true. That state means that the last operation failed, but the iterator is nevertheless positioned on some unspecified record. The likely intended uses of that state are:

- If some sst files are corrupted, a normal iterator can be used to read the data from files that are not corrupted.
- When using read_tier = kBlockCacheTier, read the data that's in block cache, skipping over the data that is not.

However, this behavior wasn't documented well (and until recently the wiki on GitHub had misleading, incorrect information). In the code there's a lot of confusion about the relationship between status() and Valid(), and about whether Seek()/SeekToLast()/etc. reset the status or not. There were a number of bugs caused by this confusion, both inside rocksdb and in the code that uses rocksdb (including ours).

This PR changes the convention to:

- If status() is not ok, Valid() always returns false.
- Any seek operation resets status. (Before the PR, it depended on iterator type and on particular error.)
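
To make the two rules concrete, here is a minimal self-contained sketch. It does not use the real RocksDB classes: SketchStatus, SketchIterator, fail_index and ScanAll are hypothetical names invented for illustration, and the "corruption" is simulated. It only shows the caller-side pattern the convention allows: loop while Valid(), then check status() once to tell "end of data" apart from "error".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for rocksdb::Status; only what the sketch needs.
struct SketchStatus {
  bool ok_flag;
  std::string message;
  bool ok() const { return ok_flag; }
  static SketchStatus OK() { return SketchStatus{true, ""}; }
  static SketchStatus Corruption(std::string msg) {
    return SketchStatus{false, std::move(msg)};
  }
};

// Hypothetical iterator over an in-memory key list. Landing on the key at
// fail_index simulates reading a corrupted record.
class SketchIterator {
 public:
  SketchIterator(std::vector<std::string> keys, std::size_t fail_index)
      : keys_(std::move(keys)), fail_index_(fail_index) {}

  // Rule 1: a non-ok status always means Valid() == false.
  bool Valid() const { return status_.ok() && pos_ < keys_.size(); }
  SketchStatus status() const { return status_; }
  const std::string& key() const {
    assert(Valid());
    return keys_[pos_];
  }

  // Rule 2: every seek starts by resetting the status.
  void SeekToFirst() {
    status_ = SketchStatus::OK();
    pos_ = 0;
    CheckPosition();
  }

  void Next() {
    assert(Valid());
    ++pos_;
    CheckPosition();
  }

  void set_fail_index(std::size_t i) { fail_index_ = i; }

 private:
  void CheckPosition() {
    if (pos_ < keys_.size() && pos_ == fail_index_) {
      status_ = SketchStatus::Corruption("simulated corruption");
    }
  }

  std::vector<std::string> keys_;
  std::size_t fail_index_;
  std::size_t pos_ = 0;
  SketchStatus status_ = SketchStatus::OK();
};

// Caller-side pattern: loop while Valid(), then check status() once.
// Returns the number of keys read; *final_status tells whether the scan
// stopped at the end of the data or because of an error.
std::size_t ScanAll(SketchIterator* it, SketchStatus* final_status) {
  std::size_t n = 0;
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    ++n;
  }
  *final_status = it->status();
  return n;
}
```

With the old behavior, a caller written this way could keep consuming records after a failed operation; under the new convention the loop stops at the first error and the status check after the loop reports it.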

This does sacrifice the two use cases listed above, but @siying said it's ok.

Overview of the changes:

- A commit that adds missing status checks in MergingIterator. This fixes a bug that actually affects us, and we need it fixed. DBIteratorTest.NonBlockingIterationBugRepro explains the scenario.
- Changes to lots of iterator types to make all of them conform to the new convention. Some bug fixes along the way. By far the biggest changes are in DBIter, which is a big messy piece of code; I tried to make it less big and messy but mostly failed.
- A stress-test for DBIter, to gain some confidence that I didn't break it. It does a few million random operations on the iterator, while occasionally modifying the underlying data (like ForwardIterator does) and occasionally returning non-ok status from internal iterator.

To find the iterator types that needed changes I searched for "public .*Iterator" in the code. Here's an overview of all 27 iterator types:

Iterators that didn't need changes:

- status() is always ok(), or Valid() is always false: MemTableIterator, ModelIter, TestIterator, KVIter (2 classes with this name in anonymous namespaces), LoggingForwardVectorIterator, VectorIterator, MockTableIterator, EmptyIterator, EmptyInternalIterator.
- Thin wrappers that always pass through Valid() and status(): ArenaWrappedDBIter, TtlIterator, InternalIteratorFromIterator.

Iterators with changes (see inline comments for details):

DBIter - an overhaul:
- It used to silently skip corrupted keys (FindParseableKey()), which seems dangerous. This PR makes it just stop immediately after encountering a corrupted key, just like it would for other kinds of corruption. Let me know if there was actually some deeper meaning in this behavior and I should put it back.
- It had a few code paths silently discarding subiterator's status. The stress test caught a few.
- The backwards iteration code path was expecting the internal iterator's set of keys to be immutable. It's probably always true in practice at the moment, since ForwardIterator doesn't support backwards iteration, but this PR fixes it anyway. See added DBIteratorTest.ReverseToForwardBug for an example.
- Some parts of backwards iteration code path even did things like assert(iter_->Valid()) after a seek, which is never a safe assumption.
- It used to not reset status on seek for some types of errors.
- Some simplifications and better comments.
- Some things got more complicated from the added error handling. I'm open to ideas for how to make it nicer.
MergingIterator - check status after every operation on every subiterator, and in some places assert that valid subiterators have ok status.
ForwardIterator - changed to the new convention, also slightly simplified.
ForwardLevelIterator - fixed some bugs and simplified.
LevelIterator - simplified.
TwoLevelIterator - changed to the new convention. Also fixed a bug that would make SeekForPrev() sometimes silently ignore errors from first_level_iter_.
BlockBasedTableIterator - minor changes.
BlockIter - replaced SetStatus() with Invalidate() to make sure non-ok BlockIter is always invalid.
PlainTableIterator - some seeks used to not reset status.
CuckooTableIterator - tiny code cleanup.
ManagedIterator - fixed some bugs.
BaseDeltaIterator - changed to the new convention and fixed a bug.
BlobDBIterator - seeks used to not reset status.
KeyConvertingIterator - some small change.
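
The BlockIter item (replacing SetStatus() with Invalidate()) can be illustrated with a small sketch. This is not the real BlockIter: SketchBlockIter and its members are hypothetical, and the only point is that a single entry point records the error and moves the position past the end, so a non-ok iterator cannot stay valid by accident.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical block iterator over a list of entries.
class SketchBlockIter {
 public:
  explicit SketchBlockIter(std::vector<int> entries)
      : entries_(std::move(entries)), current_(entries_.size()) {}

  bool Valid() const { return current_ < entries_.size(); }
  bool status_ok() const { return error_.empty(); }
  const std::string& error() const { return error_; }

  int value() const {
    assert(Valid());
    return entries_[current_];
  }

  // Seeks reset the error, matching the convention described in the PR.
  void SeekToFirst() {
    error_.clear();
    current_ = 0;
  }

  // The only way to set an error: it also moves the position past the end,
  // so Valid() becomes false in the same step.
  void Invalidate(std::string error) {
    assert(!error.empty());
    error_ = std::move(error);
    current_ = entries_.size();
  }

 private:
  std::vector<int> entries_;
  std::size_t current_;  // entries_.size() means "not positioned"
  std::string error_;    // empty means ok
};
```

A separate SetStatus()-style setter would leave it to each call site to also invalidate the position; folding both into one call removes that chance to forget.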

facebook-github-bot

@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2018-05-05T03:22:58Z

@al13n321 has updated the pull request.

al13n321 · 2018-05-05T03:28:20Z

utilities/write_batch_with_index/write_batch_with_index.cc

@@ -79,6 +79,7 @@ class BaseDeltaIterator : public Iterator {
  void Next() override {
    if (!Valid()) {
      status_ = Status::NotSupported("Next() on invalid iterator");
+      return;


I think this return was intended here, because without it the Advance() below would fail an assert.
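
A reduced sketch of the failure mode, using a hypothetical SketchDeltaIterator rather than the real BaseDeltaIterator: Advance() asserts that the iterator is valid, so Next() has to return right after recording the error.

```cpp
#include <cassert>
#include <string>

// Hypothetical iterator that is valid while `remaining_` is positive.
class SketchDeltaIterator {
 public:
  explicit SketchDeltaIterator(int remaining) : remaining_(remaining) {}

  bool Valid() const { return remaining_ > 0; }
  const std::string& error() const { return error_; }

  void Next() {
    if (!Valid()) {
      error_ = "Next() on invalid iterator";
      return;  // without this, Advance() below would hit its assert
    }
    Advance();
  }

 private:
  void Advance() {
    assert(Valid());
    --remaining_;
  }

  int remaining_;
  std::string error_;
};
```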

al13n321 · 2018-05-05T03:32:47Z

table/internal_iterator.h

@@ -61,7 +65,7 @@ class InternalIterator : public Cleanable {
  // Return the value for the current entry.  The underlying storage for
  // the returned slice is valid only until the next modification of
  // the iterator.
-  // REQUIRES: !AtEnd() && !AtStart()


Not sure where this came from. Very old code when iterator interface was different? Searching rocksdb code for "AtEnd" and "AtStart" returns no results, so I'm assuming they're not a thing.

al13n321 · 2018-05-05T03:34:47Z

table/block_based_table_reader.h

@@ -528,11 +528,9 @@ class BlockBasedTableIterator : public InternalIterator {
    return data_block_iter_.value();
  }
  Status status() const override {
-    // It'd be nice if status() returned a const Status& instead of a Status


Not particularly nice: Status is 16 bytes, so it's probably faster to return by value (in two registers).

al13n321 · 2018-05-05T03:41:12Z

db/managed_iterator.cc

@@ -101,9 +101,7 @@ void ManagedIterator::SeekToLast() {
  }
  assert(mutable_iter_ != nullptr);
  mutable_iter_->SeekToLast();
-  if (mutable_iter_->status().ok()) {


This made no sense: if status is not ok, it would leave valid_ containing leftovers from previous iterator operations.

al13n321 · 2018-05-05T03:48:59Z

db/forward_iterator.cc

@@ -95,12 +95,27 @@ class ForwardLevelIterator : public InternalIterator {
    return valid_;
  }
  void SeekToFirst() override {
-    SetFileIndex(0);


Here, Seek() seeks within the current file, which is set using SetFileIndex() called from the outside. But SeekToFirst() used to work differently and call SetFileIndex(0) for some reason, despite the fact that the caller calls SetFileIndex() before SeekToFirst(), just like it does before Seek(). So I changed SeekToFirst() to work the same way as Seek(). It doesn't affect correctness afaict, just a cleanup.

facebook-github-bot · 2018-05-07T22:32:56Z

@al13n321 has updated the pull request.

facebook-github-bot · 2018-05-08T02:28:24Z

@al13n321 has updated the pull request.

facebook-github-bot

@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

siying

Thank you so much for this change. It's much more complicated than I thought. It's a good job!

I just briefly looked at db_iter.cc. I'm still looking at other files.

siying · 2018-05-08T16:20:57Z

db/db_iter.cc

-
-  ParsedInternalKey ikey;
-
+bool DBIter::PrevInternal() {


I don't see where the return value is used.

siying · 2018-05-08T16:32:04Z

db/db_iter.cc

-    // previous key.
-    if (!iter_->Valid()) {
-      iter_->SeekToLast();
-      range_del_agg_.InvalidateTombstoneMapPositions();


I have no idea whether removing this statement is correct or not. CC @ajkr

As far as I understood from comments in range_del_aggregator.h, this method is only an optimization and doesn't affect correctness, and it should be called when the iterator is seeked to an arbitrary position. Here it's seeked to a position very close to where it used to be, so I guessed that InvalidateTombstoneMapPositions() is not needed. @ajkr , is any of that right?

siying · 2018-05-08T16:34:42Z

db/db_iter.cc

      }
-      iter_->Prev();
-      FindParseableKey(&ikey, kReverse);


With this for-loop removed, the code is harder for me to follow.
My understanding now is that we can still rely on FindPrevUserKey() to do this looping and find the first key smaller than saved_key_.

If that is the case, the function name FindPrevUserKey() is confusing to me. It actually finds the previous user key to saved_key_?

> It actually finds the previous user key to saved_key_?

Yes, a comment above FindPrevUserKey() says: "Move backwards until the key smaller than saved_key_". This is similar to FindNextUserEntry() with skipping=true, which goes to a key strictly above saved_key_. I can rename FindPrevUserKey() to something like FindUserKeyBeforeSavedKey() if you find it less confusing.

I removed the loop because it was pretty much a duplicate of the loop in FindPrevUserKey(), only with > instead of >=. So, the logic used to be "move back until we see a good entry with key <= saved_key_, then move back until we see a good entry with key != saved_key_", which is just a more complicated way of saying "move back until we see a good entry with key < saved_key_".
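
The equivalence claimed above can be checked on a toy model. The sketch below is an assumption-laden simplification: keys are plain ints in a sorted vector, every entry counts as "good" (no corrupted or unparseable keys), and TwoStepBackward / OneStepBackward are hypothetical names, not functions from db_iter.cc.

```cpp
#include <cassert>
#include <vector>

// Old logic: move back until key <= saved, then move back until key != saved.
// Returns the index reached, or -1 if we ran off the front.
int TwoStepBackward(const std::vector<int>& keys, int start, int saved) {
  assert(start < static_cast<int>(keys.size()));
  int i = start;
  while (i >= 0 && keys[i] > saved) --i;   // first loop: stop at key <= saved
  while (i >= 0 && keys[i] == saved) --i;  // second loop: stop at key != saved
  return i;
}

// New logic: move back until key < saved.
int OneStepBackward(const std::vector<int>& keys, int start, int saved) {
  assert(start < static_cast<int>(keys.size()));
  int i = start;
  while (i >= 0 && keys[i] >= saved) --i;
  return i;
}
```

The two functions agree only because the keys are sorted: once the first loop has reached a key <= saved, every earlier key is also <= saved, so "!= saved" and "< saved" select the same entry.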

If the function has a clear comment on the behavior, then it is good.

siying · 2018-05-08T16:58:02Z

db/db_iter.cc

+    } else {
+      iter_->Seek(last_key);
+      if (!iter_->Valid() && iter_->status().ok()) {
+        iter_->SeekToLast();


Is this SeekToLast() logic to handle keys deleted by forward iterator?

Yes. And yes, forward iterator doesn't support Prev() right now, so this is unreachable at the moment (except in the stress test the next commit adds). I made DBIter support it anyway because maybe we'll later add support for Prev() in forward iterator, or have some other situation when DBIter is used over a non-snapshot iterator.
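
A rough model of that fallback, with a sorted vector of strings standing in for the internal iterator. ReseekForReverse is a hypothetical helper, an index equal to keys.size() plays the role of "not valid", and the sketch never simulates read errors, so the status check is reduced to a constant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the index to resume backward iteration from, or keys.size() if
// there is nothing to position on.
std::size_t ReseekForReverse(const std::vector<std::string>& keys,
                             const std::string& last_key) {
  assert(std::is_sorted(keys.begin(), keys.end()));
  // Seek(last_key): first key >= last_key, or "invalid" (keys.size()).
  std::size_t pos = static_cast<std::size_t>(
      std::lower_bound(keys.begin(), keys.end(), last_key) - keys.begin());
  const bool valid = pos < keys.size();
  const bool status_ok = true;  // this sketch never simulates read errors
  if (!valid && status_ok && !keys.empty()) {
    // Everything at or after last_key has vanished: fall back to SeekToLast().
    pos = keys.size() - 1;
  }
  return pos;
}
```

The status check matters in the real code path: if the seek failed with an error, falling back to SeekToLast() would hide that error instead of reporting it.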

siying

I briefly looked at most of the non-test files. It's awesome! Thank you again for helping with this complicated project.

siying · 2018-05-08T18:11:29Z

db/db_iter_test.cc

+  // apply next time the iterator moves.
+  // Used for simulating ForwardIterator updating to a new version that doesn't
+  // have some of the keys (e.g. after compaction with a filter).
+  void Vanish(std::string _key) {


Thank you for adding this scenario. To verify the correctness of the iterator, another scenario worth covering is inserting a key between the current key and the next key, because some of the logic that loops to find the next key has changed slightly.

The stress test in the next commit should cover this scenario.

siying · 2018-05-08T18:19:45Z

db/db_range_del_test.cc

@@ -809,7 +809,7 @@ TEST_F(DBRangeDelTest, TailingIteratorRangeTombstoneUnsupported) {
      // For L1+, iterators over files are created on-demand, so need seek
      iter->SeekToFirst();
    }
-    ASSERT_TRUE(iter->status().IsNotSupported());
+    ASSERT_TRUE(iter->status().IsNotSupported()) << i;


I don't understand this.

Removing.

(It prints the value of i if the assertion fails. I added it for debugging when this test was failing, then left it because there's a small probability that it may come in handy for someone else later. Removing it, just to avoid touching an extra file without a good reason.)

siying · 2018-05-08T19:41:55Z

table/merging_iterator.cc

              child.SeekToLast();
            }
+            considerStatus(child.status());


I didn't find where the merging iterator is invalidated.

Oh, this commit doesn't invalidate the iterator; it only makes status() more correct and faster. The next commit makes Valid() return false if status is not ok.
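
One plausible shape for a helper like considerStatus(), sketched under assumptions: the body of the real function is not shown in this thread, so "remember the first non-ok status" is a guess, and MiniStatus and StatusAccumulator are hypothetical names. Caching the error as operations happen would also explain the "faster" part, since status() can return a stored value instead of polling every child.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for rocksdb::Status.
struct MiniStatus {
  bool ok_flag;
  std::string message;
  bool ok() const { return ok_flag; }
};

// Accumulates child statuses, keeping the first error seen.
class StatusAccumulator {
 public:
  // Called at the start of a seek, which resets the status.
  void Reset() { status_ = MiniStatus{true, ""}; }

  // Called after every operation on every child iterator.
  void ConsiderStatus(const MiniStatus& s) {
    if (!s.ok() && status_.ok()) {
      status_ = s;
    }
  }

  // Constant time: no loop over the children.
  const MiniStatus& status() const { return status_; }

 private:
  MiniStatus status_{true, ""};
};
```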

siying · 2018-05-08T19:42:06Z

table/merging_iterator.cc

-            child.SeekForPrev(key());
-            if (child.Valid() && comparator_->Equal(key(), child.key())) {
+            child.SeekForPrev(target);
+            considerStatus(child.status());


siying · 2018-05-08T19:42:15Z

table/merging_iterator.cc

              child.Prev();
+              considerStatus(child.status());


siying · 2018-05-08T19:42:45Z

table/merging_iterator.cc

        child.Next();
+        considerStatus(child.status());


facebook-github-bot · 2018-05-10T02:53:46Z

@al13n321 has updated the pull request.

facebook-github-bot

@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2018-05-10T23:38:48Z

@al13n321 has updated the pull request.

al13n321 · 2018-05-11T00:15:14Z

Alright, comments addressed, checks pass.

siying

Thank you for helping with such a complicated task.

siying · 2018-05-17T00:46:56Z

db/db_iter_stress_test.cc

+      a /= 10;
+      ++len;
+    }
+    std::string s = std::to_string(rnd.Next() % (uint64_t)max_key);


Use ToString() in ./util/string_util.h. Same for other places.

Changing. There are lots of existing std::to_string calls in db_iter_test.cc; leaving them alone.

siying · 2018-05-17T00:52:46Z

db/db_iter_stress_test.cc

+  {
+    const char *env = getenv("TEST_TRACE");
+    if (env != nullptr && strlen(env) > 0 && strcmp(env, "0") != 0) {
+      trace = true;


perf_context_test.cc uses parameter "--verbose" for this purpose. I hope we keep it consistent.

siying · 2018-05-17T00:53:17Z

db/db_iter_stress_test.cc

+  std::cout
+    << "stats:\n  exact matches: " << num_matching << "\n  end reached: "
+    << num_at_end << "\n  non-ok status: " << num_not_ok
+    << "\n  mutated on the fly: " << num_recently_removed << std::endl;


if (verbose)?

I'd rather not: verbose outputs ~80 MB of stuff, and is intended for debugging failures; in contrast, this cout is just a few lines, to give an idea of how many times each case was hit.

siying · 2018-05-17T00:54:37Z

db/db_iter_stress_test.cc

+                      else      ASSERT_GT(db_iter->key().ToString(), old_key);
+                    } else {
+                      if (seek) ASSERT_LE(db_iter->key().ToString(), old_key);
+                      else      ASSERT_LT(db_iter->key().ToString(), old_key);


Run "make format" and see whether this is accepted.

Oh, make format, I forgot it exists. Running it on the whole PR now.

facebook-github-bot · 2018-05-17T04:59:59Z

@al13n321 has updated the pull request.

facebook-github-bot

@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2018-05-17T05:35:49Z

@al13n321 has updated the pull request.

facebook-github-bot

@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2018-05-17T06:58:40Z

@al13n321 has updated the pull request.

facebook-github-bot

@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2018-05-17T07:36:39Z

@al13n321 has updated the pull request.

facebook-github-bot

@al13n321 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot

@al13n321 is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

siying · 2018-05-18T17:39:00Z

@al13n321 clang_analyze is still failing:

db/db_iter_stress_test.cc:413:41: warning: Division by zero
    std::string s = ToString(rnd.Next() % (uint64_t)max_key);
                             ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~

siying · 2018-05-18T17:40:02Z

By the way, the Google C++ Style Guide (which we follow) disallows C-style casts. In this case uint64_t{max_key} is preferred.
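
A small sketch combining both remarks, with assumptions stated up front: RandomKeyBelow is a hypothetical helper, the type of max_key is not shown in this thread and is assumed here to be a signed int, and std::to_string is used only to keep the example self-contained (the review asks for ToString() from util/string_util.h in the real code). If max_key is a non-constant signed int, uint64_t{max_key} is a narrowing conversion and does not compile, so the sketch uses static_cast instead. Whether an assert like this is what silences the analyzer in practice is not confirmed here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns the decimal representation of random_value % max_key.
// Precondition: max_key > 0, which rules out the division by zero.
std::string RandomKeyBelow(std::uint64_t random_value, int max_key) {
  assert(max_key > 0);  // makes the non-zero divisor explicit
  const auto bound = static_cast<std::uint64_t>(max_key);
  return std::to_string(random_value % bound);
}
```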

miasantreble · 2018-05-18T17:59:25Z

#3872 will fix clang_analyze

al13n321 requested review from lightmark and siying May 5, 2018 03:06

facebook-github-bot added the CLA Signed label May 5, 2018

facebook-github-bot reviewed May 5, 2018

al13n321 force-pushed the itrefactor branch from cd9437f to e48e7cc Compare May 5, 2018 03:22

al13n321 commented May 5, 2018

al13n321 force-pushed the itrefactor branch from e48e7cc to 72c2cef Compare May 7, 2018 22:32

al13n321 force-pushed the itrefactor branch from 72c2cef to 7ffabfa Compare May 8, 2018 02:28

facebook-github-bot reviewed May 8, 2018

maysamyabandeh assigned siying May 8, 2018

siying reviewed May 8, 2018

al13n321 force-pushed the itrefactor branch from 7ffabfa to 04976fc Compare May 10, 2018 02:53

al13n321 force-pushed the itrefactor branch from 04976fc to f99584a Compare May 10, 2018 22:58

facebook-github-bot reviewed May 10, 2018

siying approved these changes May 17, 2018

al13n321 added 2 commits May 16, 2018 21:36

Fix a bug causing kBlockCacheTier iterators to miss keys

a0e4b0f

Change and clarify the relationship between Valid(), status() and See…

fc94af7

…k*() for all iterators

al13n321 force-pushed the itrefactor branch from f99584a to 1ebafbd Compare May 17, 2018 04:59

facebook-github-bot reviewed May 17, 2018

al13n321 force-pushed the itrefactor branch from 1ebafbd to 151f0fd Compare May 17, 2018 05:35

facebook-github-bot reviewed May 17, 2018

al13n321 force-pushed the itrefactor branch from 151f0fd to d52cfcc Compare May 17, 2018 06:58

facebook-github-bot reviewed May 17, 2018

A stress test for DBIter

57466ac

al13n321 force-pushed the itrefactor branch from d52cfcc to 57466ac Compare May 17, 2018 07:36

facebook-github-bot reviewed May 17, 2018

adamretter mentioned this pull request May 17, 2018

Potential problem in RocksDBSample.java according to the description on the wiki #3864

Closed

facebook-github-bot closed this in 8bf555f May 17, 2018

al13n321 mentioned this pull request May 17, 2018

Iterator::Valid() returns true when Iterator::status() is not Status::Ok #3558

Closed
