leveldb: A key-value store
Authors: Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

The original Google README is now README.GOOGLE.

** Introduction

This repository contains the Google source code as modified to benefit
the Riak environment.  The typical Riak environment has two attributes
that necessitate leveldb adjustments, both in options and code:

- production servers: Riak often runs in heavy Internet environments:
  servers with many CPU cores, lots of memory, and 24x7 disk activity.
  Basho's leveldb takes advantage of the environment by adding
  hardware CRC calculation, increasing Bloom filter accuracy, and
  defaulting to integrity checking enabled.

- multiple databases open: Riak opens 8 to 128 databases
  simultaneously.  Google's leveldb supports this, but its background
  compaction thread can fall behind.  leveldb will "stall" new user
  writes whenever the compaction thread gets too far behind.  Basho's
  leveldb modifications include multiple thread blocks that each
  contain prioritized threads for specific compaction activities.

Details of Basho's customizations are in the leveldb wiki:

  http://github.com/basho/leveldb/wiki


** Branch pattern

This repository follows the Basho standard for branch management 
as of November 28, 2013.  The standard is found here:

https://github.com/basho/riak/wiki/Basho-repository-management

In summary, the "develop" branch contains the most recently reviewed
engineering work.  The "master" branch contains the most recently
released work, i.e. distributed as part of a Riak release.


** Basic options needed

Those wishing to truly savor the benefits of Basho's modifications
need to initialize a new leveldb::Options structure similar to the
following before each call to leveldb::DB::Open:

    leveldb::Options options;

    options.filter_policy=leveldb::NewBloomFilterPolicy2(16);
    options.write_buffer_size=62914560;  // 60Mbytes
    options.total_leveldb_mem=2684354560; // 2.5Gbytes (details below)
    options.env=leveldb::Env::Default();
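
The byte counts above are binary multiples.  A minimal sketch to check
the arithmetic follows; the constant names are hypothetical and are not
part of leveldb:

```cpp
#include <cstdint>

// Hypothetical constants for illustration only; not part of leveldb's API.
constexpr std::uint64_t kMByte = 1024ULL * 1024ULL;
constexpr std::uint64_t kGByte = 1024ULL * kMByte;

// 60Mbytes, the write_buffer_size used above.
constexpr std::uint64_t kWriteBufferSize = 60 * kMByte;

// 2.5Gbytes, the total_leveldb_mem used above.
constexpr std::uint64_t kTotalLeveldbMem = 2 * kGByte + kGByte / 2;

// Compile-time checks that the literals in the options match these sizes.
static_assert(kWriteBufferSize == 62914560ULL, "60Mbytes");
static_assert(kTotalLeveldbMem == 2684354560ULL, "2.5Gbytes");
```

Note that the 2.5Gbytes value does not fit in a 32-bit integer, so any
variable holding it should be 64 bits wide.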


** Memory plan

Basho's leveldb dramatically departed from Google's original internal
memory allotment plan with Riak 2.0.  Basho's leveldb uses a methodology
called flexcache.  The technical details are here:

   https://github.com/basho/leveldb/wiki/mv-flexcache

The key points are:

- options.total_leveldb_mem is an allocation for the entire process,
  not a single database

- passing a different options.total_leveldb_mem value on a subsequent Open
  call causes memory to be redistributed across all databases according to
  the most recent value

- recommended minimum for Basho's leveldb is 340Mbytes per database.  

- performance improves rapidly from 340Mbytes to 2.5Gbytes per database (3.0Gbytes
  if using Riak's active anti-entropy).  More than that still helps, but with
  diminishing returns.

- never assign more than 75% of available RAM to total_leveldb_mem.  There is
  too much unaccounted memory overhead (worse if you use the tcmalloc library).

- options.max_open_files and options.block_cache should not be used.
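
The sizing rules above can be sketched as a small calculation.  This is
an illustration only: the helper names are hypothetical and not part of
leveldb.  It also assumes that the process-wide value is the
per-database figure multiplied by the number of open databases, which
the key points above do not state explicitly.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; none of these names exist in leveldb.
constexpr std::uint64_t kMegabyte = 1024ULL * 1024ULL;
constexpr std::uint64_t kMinPerDatabase = 340 * kMegabyte;  // recommended minimum

// Upper bound for total_leveldb_mem: 75% of available RAM.
constexpr std::uint64_t MaxTotalLeveldbMem(std::uint64_t available_ram) {
    return available_ram / 4 * 3;
}

// Process-wide budget: per-database share times database count (an
// assumption of this sketch), capped at 75% of available RAM.
constexpr std::uint64_t PlanTotalLeveldbMem(std::uint64_t available_ram,
                                            std::uint64_t database_count,
                                            std::uint64_t per_database) {
    return (per_database * database_count < MaxTotalLeveldbMem(available_ram))
               ? per_database * database_count
               : MaxTotalLeveldbMem(available_ram);
}

// True when the budget leaves each database at least the recommended minimum.
constexpr bool MeetsMinimum(std::uint64_t total, std::uint64_t database_count) {
    return database_count > 0 && total / database_count >= kMinPerDatabase;
}
```

For example, with 16Gbytes of RAM and 64 databases the 75% cap is
12Gbytes, which leaves 192Mbytes per database.  That is below the
recommended minimum, so MeetsMinimum returns false for that plan.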