
storage: use buckets in BoltDB backend #1748

Closed · wants to merge 2 commits

Conversation


@fyrchik fyrchik commented Feb 16, 2021

  1. Splitting the 1-byte prefix seems to decrease restore time by about 25% under heavy workload. However, the difference is actually a flat 25s gained only during the first persist cycle (on the master branch it takes 25s, while here it takes 1s). When restoring starts from the 10th block, running times don't differ at all. The same holds with KeepOnlyLatestState: true.
  2. Single-node benchmark performance difference is negligible.

Keys inside a bucket aren't stripped of their prefix, because BoltDB prohibits putting an empty key into a bucket. With stripping, the implementation would become more complicated, and I expect no performance gains because of the need to append the prefix back in Seek.

Some other bucketing strategies can be discussed, however it is obvious that most of the load is due to MPT.

Stats on keys in the DB for this bench (100 blocks, 10k tx per block, a single storage.Put to the same contract per tx):

  1. DataMPT (stores MPT nodes) contains ~4M keys. Splitting it further into 256 sub-buckets decreases the B-tree depth from 4 to 3. The distribution is uniform because the keys are hashes, so this makes sense.
  2. DataTransaction and STStorage contain ~1M keys each.
  3. Other prefixes contain far fewer keys.

TODO:

  1. Extend dump generation to include random NEO + GAS transfers.
  2. Check if creating separate buckets for frequently used contracts (NEO, GAS) affects running time.
  3. Use sub-buckets for MPT.

fyrchik commented Feb 17, 2021

Moving NEO/GAS contract storage to a separate bucket also doesn't affect restore time.

_, err = tx.CreateBucketIfNotExists(Bucket)
if err != nil {
	return fmt.Errorf("could not create root bucket: %w", err)
}
for i := 0; i <= 255; i++ {
roman-khimov (Member):
You can create fewer buckets; all real prefixes are well-known.

@roman-khimov roman-khimov left a comment:
Have you tried more blocks (1000)? And changing block/tx ratio (1K per block, 10K blocks for example)?

-	return b.Delete(key)
+	k1, k2 := split(key)
+	b := tx.Bucket(k1)
+	return b.Delete(k2)
 })
 }

// PutBatch implements the Store interface.
func (s *BoltDBStore) PutBatch(batch Batch) error {
return s.db.Batch(func(tx *bbolt.Tx) error {
roman-khimov (Member):
And we can use Update here, I think.
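Whichever of Batch or Update ends up wrapping PutBatch, a bucketed store benefits from grouping the batch entries by their prefix first, so each bucket is looked up once per group rather than once per key. A minimal stand-alone sketch of that grouping step; all names here are illustrative, not taken from the PR:

```go
package main

import "fmt"

// kv is a hypothetical batch entry: full storage key plus value.
type kv struct {
	key, value []byte
}

// groupByBucket partitions batch entries by their 1-byte bucket prefix,
// so a PutBatch implementation can resolve each bucket once per group.
func groupByBucket(batch []kv) map[byte][]kv {
	groups := make(map[byte][]kv)
	for _, e := range batch {
		groups[e.key[0]] = append(groups[e.key[0]], e)
	}
	return groups
}

func main() {
	batch := []kv{
		{[]byte{0x01, 0xaa}, []byte("a")},
		{[]byte{0x02, 0xbb}, []byte("b")},
		{[]byte{0x01, 0xcc}, []byte("c")},
	}
	g := groupByBucket(batch)
	fmt.Println(len(g[0x01]), len(g[0x02])) // entries per bucket
}
```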

}
return key[:1], key
default:
return key[:1], key
roman-khimov (Member):
We have a very limited number of cases where key[1:] is zero-length; maybe we should try fixing them and providing a proper split here. I'm not sure it'll change anything, but who knows.

if err != nil {
return err
}
_, err = tx.CreateBucketIfNotExists([]byte{byte(DataMPT), byte(i)})
roman-khimov (Member):
MPT is probably not worth splitting.

1. Splitting 1-byte prefix _seems_ to decrease restore time
   by about  25% under heavy workload. However this difference
   is actually flat 25s gained only during first persist cycle
   (on master branch it takes 25s, while here it takes 1s).
   When restoring is done from the 10-th block, running times don't differ at all.
2. Single-node benchmark performance difference is negligible.
@roman-khimov (Member):
OK, I've played with it a little and no matter what I do nothing really changes, it's all at the level of test tolerance. So just adding buckets doesn't magically improve performance and thus it's not worth doing.

@roman-khimov roman-khimov deleted the storage/buckets branch May 30, 2022 12:41
Labels: discussion (Open discussion of some problem)