DeleteRange Bug: random test failed #5392

Closed
mm304321141 opened this issue May 31, 2019 · 3 comments


mm304321141 commented May 31, 2019

Here is some code to test DeleteRange.
I built this test against RocksDB v5.9 and found what looks like a bug in DeleteRange.
We switched to v5.18 recently, and the test still hits the assert.
I also tried the master branch; it is still not fixed.

Does this test have issues, or is it really a bug in DeleteRange?

The test may take a few minutes; please run it in debug mode and wait patiently:
./test /path/to/db/dir

#include <memory>
#include <vector>
#include <random>
#include <chrono>
#include <cctype>
#include <cassert>
#include <cstdio>

#include <rocksdb/db.h>
#include <rocksdb/options.h>
#include <rocksdb/table.h>
#include <rocksdb/utilities/write_batch_with_index.h>
#include <utilities/merge_operators/string_append/stringappend2.h>

std::string get_key(size_t i)
{
  char buffer[32];
  snprintf(buffer, sizeof buffer, "%012zu ", i);  // %zu: i is size_t
  return buffer;
}

std::string get_value(size_t i)
{
  static std::string str = "QWERTYUIOPASDFGHJKLZXCVBNM";
  size_t pos = i % str.size();
  std::string value = get_key(i);
  value.append(str.data() + pos, str.size() - pos);
  return value;
}

int main(int argc, const char* argv[])
{
  if (argc < 2) {
    fprintf(stderr, "db dir ?\n");
    return -1;
  }
  rocksdb::Options options;
  
  rocksdb::BlockBasedTableOptions bbto;
  //bbto.no_block_cache = true;
  bbto.block_cache = rocksdb::NewLRUCache(32ULL << 10, 8, false);
  
  //options.compaction_pri = rocksdb::kMinOverlappingRatio;
  options.allow_concurrent_memtable_write = false;
  options.max_open_files = 32768;
  options.target_file_size_multiplier = 1;
  options.max_bytes_for_level_multiplier = 8;
  options.num_levels = 7;
  options.max_subcompactions = 4;
  options.level_compaction_dynamic_level_bytes = true;
  
  options.base_background_compactions = 4;
  options.max_background_compactions = 8;
  options.level0_file_num_compaction_trigger = 8;
  options.level0_slowdown_writes_trigger = 20;
  options.level0_stop_writes_trigger = 40;
  
  options.merge_operator.reset(new rocksdb::StringAppendTESTOperator(','));
  
  std::vector<rocksdb::ColumnFamilyDescriptor> cfDescriptors;
  size_t file_size_base = 8ull << 20;
  
  options.compaction_style = rocksdb::kCompactionStyleLevel;
  options.write_buffer_size = options.target_file_size_base = file_size_base;
  options.max_bytes_for_level_base = options.target_file_size_base * 4;
  options.target_file_size_multiplier = 1;
  options.table_factory.reset(rocksdb::NewBlockBasedTableFactory(bbto));
  cfDescriptors.emplace_back(rocksdb::kDefaultColumnFamilyName, options);
  
  options.compaction_style = rocksdb::kCompactionStyleLevel;
  options.write_buffer_size = options.target_file_size_base = size_t(file_size_base * 1.618);
  options.max_bytes_for_level_base = options.target_file_size_base * 4;
  options.target_file_size_multiplier = 1;
  options.table_factory.reset(rocksdb::NewBlockBasedTableFactory(bbto));
  cfDescriptors.emplace_back("__system__", options);
  
  options.env->SetBackgroundThreads(options.max_background_compactions, rocksdb::Env::LOW);
  options.env->SetBackgroundThreads(options.max_background_flushes, rocksdb::Env::HIGH);
  
  options.create_if_missing = true;
  
  rocksdb::DBOptions dbo = options;
  dbo.create_missing_column_families = true;
  rocksdb::Status s, s0, s1;
  std::vector<rocksdb::ColumnFamilyHandle*> hs;
  
  //rocksdb::DestroyDB(argv[1], options);
  rocksdb::DB* db;
  s = rocksdb::DB::Open(dbo, argv[1], cfDescriptors, &hs, &db);
  if (!s.ok()) {
    fprintf(stderr, "%s\n", s.getState());
    return -1;
  }
  rocksdb::ColumnFamilyHandle *h0 = hs[0], *h1 = hs[1];
  
  rocksdb::ReadOptions ro;
  rocksdb::WriteOptions wo;
  rocksdb::FlushOptions fo;
  uint64_t count;
  std::string key, value;
  std::string value_out0;
  std::string value_out1;
  
  std::unique_ptr<rocksdb::Iterator> iter0(db->NewIterator(ro, h0));
  std::unique_ptr<rocksdb::Iterator> iter1(db->NewIterator(ro, h1));
  
  std::vector<const rocksdb::Snapshot*> snapshot;
  rocksdb::WriteBatchWithIndex b(rocksdb::BytewiseComparator(), 0, true, 0);
  std::mt19937_64 mt;
  for (count = 1000000; count < 1000000000; ++count)
  {
    std::uniform_int_distribution<uint64_t> uid(10000, count);
    
    value = get_value(count);
    size_t r = uid(mt);
    key = get_key(r);
    b.Merge(h0, key, value);
    b.Merge(h1, key, value);
    if (count % 3 == 0) {
      r = uid(mt);
      key = get_key(r);
      b.Put(h0, key, value);
      b.Put(h1, key, value);
    }
    if (count % 47 == 0) {
      key = get_key(uid(mt));
      b.Delete(h0, key);
      b.Delete(h1, key);
    }
    if (count % 3011 == 0) {
      size_t r0 = uid(mt);
      size_t r1 = uid(mt);
      auto key0 = get_key(r0);
      auto key1 = get_key(r1);
      if (key0 > key1) std::swap(key0, key1);
      b.DeleteRange(h0, key0, key1);
      b.DeleteRange(h1, key0, key1);
    }
    if (std::uniform_int_distribution<uint64_t>(0, 1 << 3)(mt) == 0) {
      s = db->Write(wo, b.GetWriteBatch());
      b.Clear();
    }
    if (count % 103 == 0) {
      ro.snapshot = nullptr;
      iter0.reset(db->NewIterator(ro, h0));
      iter1.reset(db->NewIterator(ro, h1));
    }
    if (count % 89 == 0) {
      if (snapshot.size() < 2) {
        snapshot.emplace_back(db->GetSnapshot());
      }
      else if (snapshot.size() > 50 || (mt() & 1) == 0) {
        auto i = mt() % snapshot.size();
        db->ReleaseSnapshot(snapshot[i]);
        snapshot.erase(snapshot.begin() + i);
      }
      else {
        snapshot.emplace_back(db->GetSnapshot());
      }
    }
    if (count % 23 == 0) {
      size_t si = mt() % (snapshot.size() + 1);
      ro.snapshot = si == snapshot.size() ? nullptr : snapshot[si];
      key = get_key(uid(mt));
      s0 = b.GetFromBatchAndDB(db, ro, h0, key, &value_out0);
      s1 = b.GetFromBatchAndDB(db, ro, h1, key, &value_out1);
      if (s0.IsNotFound()) {
        assert(s1.IsNotFound());
      }
      else {
        assert(!s1.IsNotFound());
        assert(value_out0 == value_out1);
      }
    }
    auto check_iter = [](rocksdb::Iterator* i0, rocksdb::Iterator* i1) {
      if (i0->Valid()) {
        assert(i0->key() == i1->key());
        assert(i0->value() == i1->value());
      }
      else {
        assert(!i1->Valid());
      }
    };
    if (count % 29 == 0) {
      key = get_key(uid(mt));
      iter0->Seek(key);
      iter1->Seek(key);
      check_iter(iter0.get(), iter1.get());
      iter0->SeekForPrev(key);
      iter1->SeekForPrev(key);
      check_iter(iter0.get(), iter1.get());
    }
  }
  
  for (auto s : snapshot) {
    db->ReleaseSnapshot(s);
  }
  iter0.reset();
  iter1.reset();
  db->DestroyColumnFamilyHandle(h0);
  db->DestroyColumnFamilyHandle(h1);
  
  delete db;
  
  return 0;
}

@mm304321141 changed the title from "DeleteRange random test bug ?" to "DeleteRange Bug: random test failed" on May 31, 2019

ajkr commented May 31, 2019

Sorry you ran into this. There is a known problem that WriteBatchWithIndex is incompatible with DeleteRange yet does not return an error to the user. It was previously reported here: #5260.
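The shape of that incompatibility can be illustrated with a toy model (this is not RocksDB code; the class and names below are purely illustrative): point operations go into a sorted index that batch-side reads consult, while range deletions are only recorded in the unindexed log, so a lookup through the batch never sees them.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy sketch of an indexed write batch whose DeleteRange is invisible to
// batch-side reads. Illustrative only; not RocksDB's actual implementation.
struct ToyIndexedBatch {
  std::map<std::string, std::string> point_index;           // indexed Put entries
  std::vector<std::pair<std::string, std::string>> ranges;  // unindexed [begin, end)

  void Put(const std::string& k, const std::string& v) { point_index[k] = v; }
  void DeleteRange(const std::string& b, const std::string& e) {
    ranges.emplace_back(b, e);  // recorded, but GetFromBatch below never looks here
  }
  // Mirrors the problematic behavior: only the point index is consulted.
  std::optional<std::string> GetFromBatch(const std::string& k) const {
    auto it = point_index.find(k);
    if (it == point_index.end()) return std::nullopt;
    return it->second;
  }
};

// What a correct read would also have to check.
bool CoveredByRange(const ToyIndexedBatch& b, const std::string& k) {
  for (const auto& r : b.ranges)
    if (k >= r.first && k < r.second) return true;
  return false;
}
```

A batch that puts a key and then range-deletes it still returns the stale value from `GetFromBatch`, even though the key is covered by a tombstone.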

@mm304321141
Contributor Author

OK, let's remove the WriteBatchWithIndex. The test still fails:
Assertion failed: (i0->value() == i1->value())

#include <memory>
#include <vector>
#include <random>
#include <chrono>
#include <cctype>
#include <cassert>
#include <cstdio>

#include <rocksdb/db.h>
#include <rocksdb/options.h>
#include <rocksdb/table.h>
#include <utilities/merge_operators/string_append/stringappend2.h>

std::string get_key(size_t i)
{
  char buffer[32];
  snprintf(buffer, sizeof buffer, "%012zu ", i);  // %zu: i is size_t
  return buffer;
}

std::string get_value(size_t i)
{
  static std::string str = "QWERTYUIOPASDFGHJKLZXCVBNM";
  size_t pos = i % str.size();
  std::string value = get_key(i);
  value.append(str.data() + pos, str.size() - pos);
  return value;
}

int main(int argc, const char* argv[])
{
  if (argc < 2) {
    fprintf(stderr, "db dir ?\n");
    return -1;
  }
  rocksdb::Options options;
  
  rocksdb::BlockBasedTableOptions bbto;
  //bbto.no_block_cache = true;
  bbto.block_cache = rocksdb::NewLRUCache(32ULL << 10, 8, false);
  
  //options.compaction_pri = rocksdb::kMinOverlappingRatio;
  options.allow_concurrent_memtable_write = false;
  options.max_open_files = 32768;
  options.target_file_size_multiplier = 1;
  options.max_bytes_for_level_multiplier = 8;
  options.num_levels = 7;
  options.max_subcompactions = 4;
  options.level_compaction_dynamic_level_bytes = true;
  
  options.base_background_compactions = 4;
  options.max_background_compactions = 8;
  options.level0_file_num_compaction_trigger = 8;
  options.level0_slowdown_writes_trigger = 20;
  options.level0_stop_writes_trigger = 40;
  
  options.merge_operator.reset(new rocksdb::StringAppendTESTOperator(','));
  
  std::vector<rocksdb::ColumnFamilyDescriptor> cfDescriptors;
  size_t file_size_base = 8ull << 20;
  
  options.compaction_style = rocksdb::kCompactionStyleLevel;
  options.write_buffer_size = options.target_file_size_base = file_size_base;
  options.max_bytes_for_level_base = options.target_file_size_base * 4;
  options.target_file_size_multiplier = 1;
  options.table_factory.reset(rocksdb::NewBlockBasedTableFactory(bbto));
  cfDescriptors.emplace_back(rocksdb::kDefaultColumnFamilyName, options);
  
  options.compaction_style = rocksdb::kCompactionStyleLevel;
  options.write_buffer_size = options.target_file_size_base = size_t(file_size_base * 1.618);
  options.max_bytes_for_level_base = options.target_file_size_base * 4;
  options.target_file_size_multiplier = 1;
  options.table_factory.reset(rocksdb::NewBlockBasedTableFactory(bbto));
  cfDescriptors.emplace_back("__system__", options);
  
  options.env->SetBackgroundThreads(options.max_background_compactions, rocksdb::Env::LOW);
  options.env->SetBackgroundThreads(options.max_background_flushes, rocksdb::Env::HIGH);
  
  options.create_if_missing = true;
  
  rocksdb::DBOptions dbo = options;
  dbo.create_missing_column_families = true;
  rocksdb::Status s, s0, s1;
  std::vector<rocksdb::ColumnFamilyHandle*> hs;
  
  //rocksdb::DestroyDB(argv[1], options);
  rocksdb::DB* db;
  s = rocksdb::DB::Open(dbo, argv[1], cfDescriptors, &hs, &db);
  if (!s.ok()) {
    fprintf(stderr, "%s\n", s.getState());
    return -1;
  }
  rocksdb::ColumnFamilyHandle *h0 = hs[0], *h1 = hs[1];
  
  rocksdb::ReadOptions ro;
  rocksdb::WriteOptions wo;
  rocksdb::FlushOptions fo;
  uint64_t count;
  std::string key, value;
  std::string value_out0;
  std::string value_out1;
  
  std::unique_ptr<rocksdb::Iterator> iter0(db->NewIterator(ro, h0));
  std::unique_ptr<rocksdb::Iterator> iter1(db->NewIterator(ro, h1));
  
  std::vector<const rocksdb::Snapshot*> snapshot;
  std::mt19937_64 mt;
  for (count = 1000000; count < 1000000000; ++count)
  {
    std::uniform_int_distribution<uint64_t> uid(10000, count);
    
    value = get_value(count);
    size_t r = uid(mt);
    key = get_key(r);
    db->Merge(wo, h0, key, value);
    db->Merge(wo, h1, key, value);
    if (count % 3 == 0) {
      r = uid(mt);
      key = get_key(r);
      db->Put(wo, h0, key, value);
      db->Put(wo, h1, key, value);
    }
    if (count % 47 == 0) {
      key = get_key(uid(mt));
      db->Delete(wo, h0, key);
      db->Delete(wo, h1, key);
    }
    if (count % 3011 == 0) {
      size_t r0 = uid(mt);
      size_t r1 = uid(mt);
      auto key0 = get_key(r0);
      auto key1 = get_key(r1);
      if (key0 > key1) std::swap(key0, key1);
      db->DeleteRange(wo, h0, key0, key1);
      db->DeleteRange(wo, h1, key0, key1);
    }
    if (count % 103 == 0) {
      ro.snapshot = nullptr;
      iter0.reset(db->NewIterator(ro, h0));
      iter1.reset(db->NewIterator(ro, h1));
    }
    if (count % 89 == 0) {
      if (snapshot.size() < 2) {
        snapshot.emplace_back(db->GetSnapshot());
      }
      else if (snapshot.size() > 50 || (mt() & 1) == 0) {
        auto i = mt() % snapshot.size();
        db->ReleaseSnapshot(snapshot[i]);
        snapshot.erase(snapshot.begin() + i);
      }
      else {
        snapshot.emplace_back(db->GetSnapshot());
      }
    }
    if (count % 23 == 0) {
      size_t si = mt() % (snapshot.size() + 1);
      ro.snapshot = si == snapshot.size() ? nullptr : snapshot[si];
      key = get_key(uid(mt));
      s0 = db->Get(ro, h0, key, &value_out0);
      s1 = db->Get(ro, h1, key, &value_out1);
      if (s0.IsNotFound()) {
        assert(s1.IsNotFound());
      }
      else {
        assert(!s1.IsNotFound());
        assert(value_out0 == value_out1);
      }
    }
    auto check_iter = [](rocksdb::Iterator* i0, rocksdb::Iterator* i1) {
      if (i0->Valid()) {
        assert(i0->key() == i1->key());
        assert(i0->value() == i1->value());
      }
      else {
        assert(!i1->Valid());
      }
    };
    if (count % 29 == 0) {
      key = get_key(uid(mt));
      iter0->Seek(key);
      iter1->Seek(key);
      check_iter(iter0.get(), iter1.get());
      iter0->SeekForPrev(key);
      iter1->SeekForPrev(key);
      check_iter(iter0.get(), iter1.get());
    }
  }
  
  for (auto s : snapshot) {
    db->ReleaseSnapshot(s);
  }
  iter0.reset();
  iter1.reset();
  db->DestroyColumnFamilyHandle(h0);
  db->DestroyColumnFamilyHandle(h1);
  
  delete db;
  
  return 0;
}


ajkr commented Jun 3, 2019

(1) Put, (2) DeleteRange, (3) Merge, then (4) Flush seems to be a sequence of death. Good job finding it. Our correctness test uses a merge operator that only returns the latest operand for the key so wouldn't be able to detect old versions reappearing. Seems like a big deficiency in retrospect.

Here's a simplified repro:

diff --git a/db/db_range_del_test.cc b/db/db_range_del_test.cc
index c862d3dde..479aaf9e4 100644
--- a/db/db_range_del_test.cc
+++ b/db/db_range_del_test.cc
@@ -490,6 +490,26 @@ TEST_F(DBRangeDelTest, CompactionRemovesCoveredMergeOperands) {
   ASSERT_EQ(expected, actual);
 }
 
+TEST_F(DBRangeDelTest, FlushDeleteRangeCoveringMerge) {
+  Options opts = CurrentOptions();
+  opts.merge_operator = MergeOperators::CreateUInt64AddOperator();
+  Reopen(opts);
+
+  std::string val;
+  PutFixed64(&val, 1);
+  ASSERT_OK(db_->Put(WriteOptions(), "key", val));
+  ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
+                             "key", "key_"));
+  ASSERT_OK(db_->Merge(WriteOptions(), "key", val));
+  ASSERT_OK(db_->Flush(FlushOptions()));
+
+  ReadOptions read_opts;
+  std::string expected, actual;
+  ASSERT_OK(db_->Get(read_opts, "key", &actual));
+  PutFixed64(&expected, 1);
+  ASSERT_EQ(expected, actual);
+}
+
 // NumTableFilesAtLevel() is not supported in ROCKSDB_LITE
 #ifndef ROCKSDB_LITE
 TEST_F(DBRangeDelTest, ObsoleteTombstoneCleanup) {

ltamasi pushed a commit that referenced this issue Jun 4, 2019
Summary:
Flush/compaction use `MergeUntil` which has a special code path to
handle a merge ending with a non-`Merge` point key. In particular if
that key is a `Put` we forgot to check whether it is covered by a range
tombstone. If it is covered then we must not include it in the following call
to `TimedFullMerge`.

Fixes #5392.
Pull Request resolved: #5406

Differential Revision: D15611144

Pulled By: sagar0

fbshipit-source-id: ba6a7863ca2d043f591de78fd0c4f4561f0c500e
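The logic of that fix can be sketched with a toy model of the merge path (illustrative only; the function and types below are not RocksDB's actual code): a run of merge operands ends at a `Put` base value, and a range tombstone with a higher sequence number covers that `Put`. The fix amounts to the `covered` check: a covered base must not be fed into the full merge.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy model of merging operands onto a Put base during flush/compaction,
// using a comma-append merge like the issue's StringAppendTESTOperator.
// Illustrative only; not RocksDB's MergeUntil/TimedFullMerge code.
struct Entry { uint64_t seq; std::string value; };

std::string MergeWithBase(const std::vector<Entry>& operands,  // newest first
                          const Entry& put_base,
                          uint64_t tombstone_seq,
                          bool apply_fix) {
  // The fix: a range tombstone newer than the Put covers it, so the Put
  // must not participate in the merge result.
  bool covered = tombstone_seq > put_base.seq;
  std::string result = (apply_fix && covered) ? std::string() : put_base.value;
  for (auto it = operands.rbegin(); it != operands.rend(); ++it) {  // oldest first
    if (!result.empty()) result += ',';
    result += it->value;
  }
  return result;
}
```

With `Put(seq=1, "a")`, a tombstone at seq=2, and `Merge(seq=3, "b")`, the buggy path resurrects the covered base ("a,b"); the fixed path yields only "b", matching the simplified gtest repro above.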
ajkr added a commit to cockroachdb/rocksdb that referenced this issue Jun 5, 2019
vagogte pushed a commit to vagogte/rocksdb that referenced this issue Jun 18, 2019
ajkr added a commit to cockroachdb/rocksdb that referenced this issue Jun 26, 2019
ifed01 pushed a commit to ceph/rocksdb that referenced this issue Nov 8, 2019
ajkr added a commit to ajkr/rocksdb that referenced this issue Feb 14, 2020
Summary: As discovered in facebook#5260 and facebook#5392, reads on the indexed batch do not account for range tombstones. So, return `Status::NotSupported` from `WriteBatchWithIndex::DeleteRange` until we properly support it.

Test Plan: added unit test
facebook-github-bot pushed a commit that referenced this issue Feb 18, 2020
Summary:
As discovered in #5260 and #5392, reads on the indexed batch do not account for range tombstones. So, return `Status::NotSupported` from `WriteBatchWithIndex::DeleteRange` until we properly support it.
Pull Request resolved: #5393

Test Plan: added unit test

Differential Revision: D19912360

Pulled By: ajkr

fbshipit-source-id: 0bbfc978ea015d64516ca708fce2429abba524cb
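The interim guard described above — fail fast rather than silently index nothing — can be sketched as follows (a toy model; the class and status enum are illustrative, not RocksDB's actual `WriteBatchWithIndex` or `Status` types):

```cpp
#include <cassert>
#include <string>

// Toy sketch of the guard: since range tombstones are not indexed, reads
// through the batch would silently miss them, so the call is rejected
// until range deletions are properly supported. Illustrative only.
enum class ToyStatus { kOk, kNotSupported };

struct GuardedBatch {
  ToyStatus Put(const std::string& /*key*/, const std::string& /*value*/) {
    return ToyStatus::kOk;  // point writes are indexed and remain supported
  }
  ToyStatus DeleteRange(const std::string& /*begin*/, const std::string& /*end*/) {
    return ToyStatus::kNotSupported;  // reject instead of silently losing the tombstone
  }
};
```

Returning an explicit error turns a silent-corruption bug into an immediately visible one, which is the safer default until the feature works end to end.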
JunHe77 pushed a commit to JunHe77/rocksdb that referenced this issue Feb 20, 2020
yiwu-arbug pushed a commit to tikv/rocksdb that referenced this issue Aug 11, 2020
junhanLee95 pushed a commit to junhanLee95/rocksdb that referenced this issue Oct 24, 2022