Performance degrading a lot with high number of keys #212

Open
crystalin opened this issue Jun 21, 2023 · 24 comments

@crystalin

crystalin commented Jun 21, 2023

Running the storage benchmark on 3 different networks with significant state size/content gives inconsistent results.
We have been using Moonbeam v0.32.1, which is based on Substrate 0.9.40.
The Alphanet and Moonriver networks have similar state/usage overall, but Moonbeam had a project that generated a huge amount of storage entries (all of the same size, 42 bytes IIRC).

As you can see, the Moonbeam read and write weights using paritydb are way off from the expected results that we see on Alphanet and Moonriver.

The disk configuration is AWS gp3 | 1000 GiB | 3000 IOPS, and each network/db has its own disk (a total of 6 disks).
The blocks and state are pruned to avoid using a huge amount of disk space.

Running the storage benchmark (on c6i.4xlarge AWS):

/home/ubuntu/projects/moonbeam/target/release/moonbeam \
   benchmark \
   storage \
   --db=${DB} \
   --state-version=0 \
   --mul=1.1 \
   --weight-path /home/ubuntu/projects/moonbeam/weights-${DB}-${NETWORK}.rs \
   --chain ${NETWORK} \
   --base-path /var/lib/${DB}-${NETWORK}-data

for each chain

Alphanet (~20M keys):

pub const RocksDbWeight: RuntimeDbWeight = RuntimeDbWeight {
  read: 65_167 * constants::WEIGHT_REF_TIME_PER_NANOS,
  write: 114_721 * constants::WEIGHT_REF_TIME_PER_NANOS,
};
pub const ParityDbWeight: RuntimeDbWeight = RuntimeDbWeight {
  read: 16_290 * constants::WEIGHT_REF_TIME_PER_NANOS,
  write: 65_374 * constants::WEIGHT_REF_TIME_PER_NANOS,
};

Moonriver (~30M keys state):

pub const RocksDbWeight: RuntimeDbWeight = RuntimeDbWeight {
  read: 66_865 * constants::WEIGHT_REF_TIME_PER_NANOS,
  write: 114_947 * constants::WEIGHT_REF_TIME_PER_NANOS,
};
pub const ParityDbWeight: RuntimeDbWeight = RuntimeDbWeight {
  read: 14_483 * constants::WEIGHT_REF_TIME_PER_NANOS,
  write: 64_545 * constants::WEIGHT_REF_TIME_PER_NANOS,
};

Moonbeam (~110M keys state):

pub const RocksDbWeight: RuntimeDbWeight = RuntimeDbWeight {
  read: 33_439 * constants::WEIGHT_REF_TIME_PER_NANOS,
  write: 86_828 * constants::WEIGHT_REF_TIME_PER_NANOS,
};
pub const ParityDbWeight: RuntimeDbWeight = RuntimeDbWeight {
  read: 177_320 * constants::WEIGHT_REF_TIME_PER_NANOS,
  write: 69_450 * constants::WEIGHT_REF_TIME_PER_NANOS,
};

In addition to the paritydb numbers, we can also see that the RocksDB average read is about 50% lower on Moonbeam (110M keys) than on Moonriver (30M keys), which might be related to the data on Moonbeam being smaller on average than on Moonriver.

Details about the benchmark output can be found here:
https://gist.github.com/crystalin/8e790a554b246e077c83ad04c04f330c
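For context, a minimal standalone sketch of how such per-operation costs feed into extrinsic weights (this mirrors, rather than imports, Substrate's RuntimeDbWeight from frame_support; the example numbers are the Moonbeam ParityDb results above):

// Minimal standalone sketch, assuming ref-time is counted in picoseconds
// (hence 1_000 per nanosecond). Not the actual frame_support implementation.
const WEIGHT_REF_TIME_PER_NANOS: u64 = 1_000;

#[derive(Clone, Copy)]
struct RuntimeDbWeight {
    read: u64,
    write: u64,
}

impl RuntimeDbWeight {
    // Ref-time charged for `r` storage reads and `w` storage writes.
    fn reads_writes(&self, r: u64, w: u64) -> u64 {
        self.read * r + self.write * w
    }
}

fn main() {
    // The Moonbeam ParityDb numbers reported above.
    let paritydb_weight = RuntimeDbWeight {
        read: 177_320 * WEIGHT_REF_TIME_PER_NANOS,
        write: 69_450 * WEIGHT_REF_TIME_PER_NANOS,
    };
    // A call doing 3 reads and 1 write is charged this much ref-time, so a 10x
    // difference in the benchmarked read cost shows up directly in fees.
    println!("{} ref-time", paritydb_weight.reads_writes(3, 1));
}

This is why the benchmarked read/write times matter beyond raw node performance: they are baked into the runtime's fee and block-fullness accounting.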

@crystalin
Author

Additionally, it took something like 40 hours to generate the Moonbeam storage benchmark.

@ggwpez
Member

ggwpez commented Jun 22, 2023

@cheme do you have an idea why the ParityDB time for read on the 110M keys DB is so much slower than Rocks when it is normally faster?

@crystalin
Author

You can download a recent Moonbeam state (10 GB) if you want to check it: https://s3.console.aws.amazon.com/s3/object/alan-stuff?region=us-east-1&prefix=moonbeam-state-3631095.json.lz4

@cheme
Collaborator

cheme commented Jun 22, 2023

That is definitely not expected. I could imagine worse access on a big mmap'd memory region, but nothing in these proportions. I can also think of the data not being correctly built (there is a reindexing process running in the background every N values, but it is flushed on exit/start).

" based on substrate 0.9.40." : is it substrate version (looks old)?
Would be interesting to have the parity-db version listed in the Cargo.lock (a version from a few month ago did have an issue that could explan some bad behavior cc\ @arkpar ).

Edit: https://github.com/PureStake/moonbeam/blob/6ed87ceeb65db27a9b2ce7ff32b90d062540bd67/Cargo.lock#L8942 shows the parity-db version is 0.4.6, which does not include #206, but I don't expect that to be related.

@crystalin
Author

I'm happy to cherry-pick some changes on top of it if you want to test a few things. You can also probably reproduce it using the snapshot I provided.

@cheme
Collaborator

cheme commented Jun 22, 2023

I'm happy to cherry-pick some changes on top of it if you want to test few things. You can also probably reproduce by using the snapshot I provided

I would use the latest version of parity-db (cargo update -p parity-db), but then it would only really make sense when syncing the snapshot from scratch.

Something I am thinking of right now: did the memory consumption stay reasonable during the process (looking at the bench code, I suspect it could put many items in memory)?

Edit: I just realized the snapshot is in JSON format, so there is no need to resync.

@cheme
Collaborator

cheme commented Jun 22, 2023

Actually, it would be better to use a patched parity-db master that includes #211.

@crystalin
Author

Ok, I'll try that if I find time (also be aware that the benchmark took 40 hours, so I won't get results quickly).

@ggwpez
Member

ggwpez commented Jun 22, 2023

I can't even import the snapshot on a 64 GB server… do you use 128 GB?

@arkpar
Member

arkpar commented Jun 27, 2023

I've tried using warp sync on Moonbeam. The sync went fine, although peak RAM usage was over 130 GB. However, the parachain is not finalizing blocks; the final block is still at zero. Is this a known issue? Unfinalized blocks are stored differently in the DB, and this may affect performance.

@arkpar
Member

arkpar commented Jun 27, 2023

As for possible performance issues, it could be affected by how the benchmark is implemented. RocksDB uses its own caching, while ParityDb relies on the OS cache. IIRC the benchmark warmup touches a few of the keys, and for RocksDB this causes a lot more data to be pre-cached.
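To make the caching asymmetry concrete, here is a rough illustrative sketch (not the actual substrate storage-benchmark code) of what a warmup pass over a key-value backend does, namely reading a subset of keys once before timing starts:

use std::collections::HashMap;

// Illustrative stand-ins only: the trait and in-memory backend are not substrate APIs.
trait KeyValueDb {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

struct InMemoryDb(HashMap<Vec<u8>, Vec<u8>>);

impl KeyValueDb for InMemoryDb {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
}

// Read every `step`-th key once before the timed run. With RocksDB these reads
// also fill its internal block cache; with ParityDb only the OS page cache is
// warmed, so the timed reads that follow can hit disk far more often.
fn warmup<D: KeyValueDb>(db: &D, keys: &[Vec<u8>], step: usize) {
    for key in keys.iter().step_by(step.max(1)) {
        let _ = db.get(key);
    }
}

fn main() {
    let mut data = HashMap::new();
    data.insert(b"example-key".to_vec(), b"example-value".to_vec());
    let db = InMemoryDb(data);
    warmup(&db, &[b"example-key".to_vec()], 1);
}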

@crystalin
Author

@arkpar warp sync is not fully supported yet; we are still working on it.
I also suspect the benchmark implementation is the reason for those unexpected values, but it is hard and time-consuming to verify.

@crystalin
Author

@arkpar were you able to reproduce? Let me know if I can help otherwise

@arkpar
Member

arkpar commented Jul 5, 2023

I could not access the snapshot linked above: it requires AWS registration and asks for my credit card number. I've started a regular sync instead, and it looks like it will take 3-4 days.

@arkpar
Member

arkpar commented Jul 21, 2023

@crystalin Could you give it a test with parity-db 0.4.10?
cargo update -p parity-db should do it

@crystalin
Author

I'm running it now.
This time I looked at the CPU load and IO load, and during the benchmark:

  • IOPS: ~1300 (max is 3000)
  • CPU: 2.5%
  • Memory: 10% (max: 32 GB)

@ggwpez
Member

ggwpez commented Jul 21, 2023

If the DB benchmark time is a major problem then we could add a flag to only read 10% or 1% of the total keys (randomly selected). That way you would have some preliminary results for faster iterating. Do you think that would help?
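A rough sketch of what such a sampling flag could do is below; the percentage parameter and the xorshift-based selection are hypothetical, the point is only that reading a fixed fraction of randomly chosen keys shrinks a 110M-key run proportionally while still touching the whole key space:

// Hypothetical key sub-sampling, sketched with a tiny xorshift generator to
// avoid external crates; this is not an existing substrate CLI option.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

// Keep roughly `percent`% of `keys`, chosen pseudo-randomly.
fn sample_keys(keys: &[Vec<u8>], percent: u64, seed: u64) -> Vec<Vec<u8>> {
    let mut rng = XorShift64(seed.max(1));
    keys.iter()
        .filter(|_| rng.next() % 100 < percent)
        .cloned()
        .collect()
}

fn main() {
    let keys: Vec<Vec<u8>> = (0u32..1_000).map(|i| i.to_le_bytes().to_vec()).collect();
    // A hypothetical 10% sample would reduce a 110M-key run to roughly 11M reads.
    let sampled = sample_keys(&keys, 10, 42);
    println!("benchmarking {} of {} keys", sampled.len(), keys.len());
}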

@crystalin
Author

That could make sense, yes, a percentage flag would help.

@crystalin
Author

crystalin commented Jul 21, 2023

The warmup round just finished; I might get results this weekend.
(Also, memory jumped to 95%.)

@crystalin
Author

crystalin commented Jul 24, 2023

I was able to run it (with substrate 0.9.43 and paritydb 0.4.10). It took 3 days to finish:

pub const ParityDbWeight: RuntimeDbWeight = RuntimeDbWeight {
  read: 182_722 * constants::WEIGHT_REF_TIME_PER_NANOS,
  write: 60_176 * constants::WEIGHT_REF_TIME_PER_NANOS,
};

(No improvement at all)

@cheme
Collaborator

cheme commented Jul 31, 2023

I looked a bit more into switching the chainspec loading to something that does not load the entire state in memory, but it is more work than I expected (it breaks the genesis build API quite a bit, since we need to do multiple commits while using a streaming JSON parser), so I am postponing doing this myself for now.
Still, I got a better understanding of the benchmarking process: it just uses the standard chainspec loading, which means the full state is sent to parity-db, but the bench then runs on a db that has just had a lot of keys injected.
So the db may still be doing one or two levels of table reindexing while the benchmark runs, which would explain the performance issue.

This can be checked by running "ls" on the db directory and looking at the files for the state column:
if it is still reindexing the state, there will be multiple files named paritydb/full/index_01_xx, with xx being the index sizes (a small programmatic version of this check is sketched below).
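A standalone sketch of the same check done programmatically (the column number 01 for the state column and the paritydb/full layout follow the description above and may differ on other setups):

use std::fs;
use std::path::Path;

// Count the index files of column 01 (state): more than one index_01_xx file
// means reindexing of that column has not finished yet.
fn state_column_still_reindexing(db_dir: &Path) -> std::io::Result<bool> {
    let mut index_files = 0;
    for entry in fs::read_dir(db_dir)? {
        if entry?.file_name().to_string_lossy().starts_with("index_01_") {
            index_files += 1;
        }
    }
    Ok(index_files > 1)
}

fn main() -> std::io::Result<()> {
    // Pass the node's .../paritydb/full directory as the first argument.
    let dir = std::env::args().nth(1).unwrap_or_else(|| "./paritydb/full".to_string());
    println!("state column still reindexing: {}", state_column_still_reindexing(Path::new(&dir))?);
    Ok(())
}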

If this is the case, I do not have a simple way of ensuring the reindexing finishes (changing the default index size in parity-db could be a hacky solution).

The following change in substrate would allow flushing the logs but would not force all reindexing to finish.

--- a/bin/node/cli/src/command.rs
+++ b/bin/node/cli/src/command.rs
@@ -127,6 +127,8 @@ pub fn run() -> Result<()> {
                                        ),
                                        #[cfg(feature = "runtime-benchmarks")]
                                        BenchmarkCmd::Storage(cmd) => {
+                                               // load once first to ensure db is flushed.
+                                               new_partial(&config)?;
                                                // ensure that we keep the task manager alive
                                                let partial = new_partial(&config)?;
                                                let db = partial.backend.expose_db();

but it would also need to keep the db open for a while until everything is reindexed.

Maybe simply do the bench in two steps:

  • Step 1: load the chainspec (e.g. just start the binary with no connections so that only the chainspec loading progresses, or add a specific command to do so), then wait until there is no more reindexing in parity-db (a single index_01_xx file) before exiting; a polling sketch of this wait follows the list.
  • Step 2: run the benchmark on the existing db.
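For the wait in step 1, a hedged sketch of a polling loop over the same index files (run alongside the node so the background reindexing can actually progress; path layout as described above):

use std::fs;
use std::path::Path;
use std::thread;
use std::time::Duration;

// Same check as in the earlier sketch: more than one index_01_xx file means
// the state column is still being reindexed.
fn state_index_files(db_dir: &Path) -> std::io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(db_dir)? {
        if entry?.file_name().to_string_lossy().starts_with("index_01_") {
            count += 1;
        }
    }
    Ok(count)
}

// Block until only one index file is left, polling every 30 seconds.
fn wait_for_reindexing(db_dir: &Path) -> std::io::Result<()> {
    while state_index_files(db_dir)? > 1 {
        thread::sleep(Duration::from_secs(30));
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    // Pass the node's .../paritydb/full directory as the first argument.
    let dir = std::env::args().nth(1).unwrap_or_else(|| "./paritydb/full".to_string());
    wait_for_reindexing(Path::new(&dir))
}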

Or implement a primitive that ensures all reindexing is finished in parity-db and use it before calling new_partial a second time (but that would not be very elegant, as the code at this level does not assume a specific db).

@crystalin
Author

Thank you,

I think we did run the node with no connections (we often do for other profiling) before running the benchmark, but I can try again to see if that helps.

I think having substrate support the storage benchmark on a subset of the state would probably be more effective in that case.

@ggwpez
Member

ggwpez commented Aug 21, 2023

Yes, I hope to get paritytech/polkadot-sdk#146 to some newcomer to solve; I have forwarded it to a PBA student now.

@crystalin
Author

Outside of the storage benchmark, the performance of paritydb is also generally worse than rocksdb when the state is large (100M+ keys) and the node is running as an archive (I don't know how to measure the total number of keys in the db itself):

ParityDb:

2023-09-12T15:34:57.659Z utils:storage-query Queried 55384 keys @ 2769 keys/sec, 34 MB heap used
2023-09-12T15:35:02.659Z utils:storage-query Queried 82671 keys @ 3307 keys/sec, 46 MB heap used
2023-09-12T15:35:07.659Z utils:storage-query Queried 103743 keys @ 3458 keys/sec, 27 MB heap used
2023-09-12T15:35:12.659Z utils:storage-query Queried 130776 keys @ 3736 keys/sec, 21 MB heap used
2023-09-12T15:35:17.659Z utils:storage-query Queried 159459 keys @ 3986 keys/sec, 33 MB heap used
2023-09-12T15:35:22.659Z utils:storage-query Queried 184760 keys @ 4106 keys/sec, 18 MB heap used

RocksDb:

2023-09-12T15:36:44.978Z utils:storage-query Queried 520850 keys @ 17358 keys/sec, 30 MB heap used
2023-09-12T15:36:49.978Z utils:storage-query Queried 638850 keys @ 18249 keys/sec, 17 MB heap used
2023-09-12T15:36:54.979Z utils:storage-query Queried 784850 keys @ 19618 keys/sec, 15 MB heap used
2023-09-12T15:36:59.979Z utils:storage-query Queried 894850 keys @ 19882 keys/sec, 20 MB heap used
2023-09-12T15:37:04.981Z utils:storage-query Queried 975850 keys @ 19514 keys/sec, 24 MB heap used
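For reference, the keys/sec figures in these logs are just a running count divided by elapsed time, reported on a fixed interval; a rough standalone sketch of that arithmetic (the storage-query utility itself is separate tooling, this only reproduces the reported metric):

use std::time::{Duration, Instant};

// Count keys as they stream back and print a progress line every five seconds,
// in the same shape as the logs above.
struct ThroughputMeter {
    start: Instant,
    last_report: Instant,
    queried: u64,
}

impl ThroughputMeter {
    fn new() -> Self {
        let now = Instant::now();
        Self { start: now, last_report: now, queried: 0 }
    }

    // Call once per key (or per page of keys) received from the node.
    fn record(&mut self, keys: u64) {
        self.queried += keys;
        if self.last_report.elapsed() >= Duration::from_secs(5) {
            let rate = self.queried as f64 / self.start.elapsed().as_secs_f64();
            println!("Queried {} keys @ {:.0} keys/sec", self.queried, rate);
            self.last_report = Instant::now();
        }
    }
}

fn main() {
    let mut meter = ThroughputMeter::new();
    for _ in 0..1_000 {
        meter.record(1);
    }
}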
