Read amplification on writes kills write-perf on large repositories #9678
Comments
Other issues I have found:
Another 5-10% of time is saved by removing the double cbor decoding of blocks and just doing cbor decoding once; cbor decoding showed prominently in CPU profiles. The ipld-format.Batch code, which sends batches asynchronously with NumCPU() parallelism, does not seem worth the effort (for Pebble). Not doing that actually causes less write pressure (only 1 batch committed at a time). I have no idea if anyone measured anything before opting for the current double-decoding.

Sounds like we could checksum the datastore in some way and save the current bloom filter when stopping; then, when restarting, if the datastore checksum has not changed we would just load the existing bloom filter (a rough sketch of this idea follows below).

Or just disable the bloom filter and let badger3/pebble/flatfs handle speeding up lookups.
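A rough sketch of that persistence idea (entirely hypothetical: SaveBloom, LoadBloom and the fingerprint callback are not an existing Kubo API, just an illustration of the approach):

```go
// Hypothetical sketch: persist the bloom filter next to a cheap fingerprint of
// the datastore, and only reload the filter if the fingerprint still matches.
package bloompersist

import "os"

// Fingerprint is assumed to return something cheap that changes whenever the
// datastore contents change (e.g. a manifest hash or sequence number).
type Fingerprint func() (string, error)

// SaveBloom writes the serialized filter plus the fingerprint it was built against.
func SaveBloom(fp Fingerprint, filterBytes []byte, fpPath, filterPath string) error {
	sum, err := fp()
	if err != nil {
		return err
	}
	if err := os.WriteFile(fpPath, []byte(sum), 0o644); err != nil {
		return err
	}
	return os.WriteFile(filterPath, filterBytes, 0o644)
}

// LoadBloom returns the saved filter only if the datastore fingerprint still
// matches; otherwise the caller should rebuild the filter from scratch.
func LoadBloom(fp Fingerprint, fpPath, filterPath string) ([]byte, bool) {
	saved, err := os.ReadFile(fpPath)
	if err != nil {
		return nil, false
	}
	current, err := fp()
	if err != nil || current != string(saved) {
		return nil, false // datastore changed since shutdown
	}
	filterBytes, err := os.ReadFile(filterPath)
	return filterBytes, err == nil
}
```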
Checklist
Description
In the past few days I have had the luxury of diving a little deeper into improving ingestion performance on Kubo, after `dag import` slowed to ridiculous speeds.

My test setup uses Kubo from master (0.19.0-dev) in offline mode, with the Pebble datastore backend and a ZFS setup on spinning disks with in-memory and NVMe caching. The first thing to notice is that, while overall write rates were slow during `dag import`, read rates were maxed out.

I discovered the following:
First, block-writes from the CAR are batched (ipld-format.Batch). These batches have hardcoded default values.

Assuming a modern laptop with 20 cores, we arrive at batches of 400KB at most, or 6 items per batch (!). These limits are VERY low when trying to import several GBs and millions of blocks, and they result in a large number of transactions.
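A back-of-the-envelope sketch of where those numbers come from (the 8 MiB / 128-node totals are inferred from the figures above, not copied from the library source):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Assumed per-Batch totals, split across NumCPU() parallel batch commits.
	const maxSize = 8 << 20 // ~8 MiB total across all in-flight batches
	const maxNodes = 128    // ~128 nodes total across all in-flight batches

	cpus := runtime.NumCPU() // e.g. 20 on a modern laptop
	fmt.Printf("per-batch size:  ~%d KB\n", maxSize/cpus/1024) // ~400 KB with 20 CPUs
	fmt.Printf("per-batch nodes: %d\n", maxNodes/cpus)         // ~6 with 20 CPUs
}
```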
Both the blockservice and the blockstore layers perform a Has() on every element of the Batch under the assumption that doing Has() is cheaper than writing them.
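The pattern looks roughly like this (a simplified sketch of the Has-before-Put logic described above, not the actual blockservice/blockstore code; import paths assume the pre-boxo go-ipfs-blockstore module):

```go
package sketch

import (
	"context"

	blocks "github.com/ipfs/go-block-format"
	blockstore "github.com/ipfs/go-ipfs-blockstore"
)

// putMany illustrates the Has-before-Put pattern: every block write is
// preceded by a Has() lookup, so a large import turns into a read-heavy
// workload even when most blocks are new.
func putMany(ctx context.Context, bs blockstore.Blockstore, blks []blocks.Block) error {
	var toPut []blocks.Block
	for _, b := range blks {
		has, err := bs.Has(ctx, b.Cid()) // one read per block about to be written
		if err != nil {
			return err
		}
		if !has {
			toPut = append(toPut, b)
		}
	}
	return bs.PutMany(ctx, toPut)
}
```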
The blockstore is additionally wrapped in Bloom+ARC caches, but the default size of the bloom filter is 512KiB with 7 hashes, and the ARC cache is a mere 64KiB.
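A sketch of what those defaults look like, assuming the CacheOpts/CachedBlockstore API from go-ipfs-blockstore (field names may differ in newer boxo versions); the sizes are the ones quoted above:

```go
package sketch

import (
	"context"

	blockstore "github.com/ipfs/go-ipfs-blockstore"
)

// wrapWithCaches shows approximately how the blockstore ends up wrapped in the
// Bloom+ARC caches, with the small default sizes mentioned above.
func wrapWithCaches(ctx context.Context, base blockstore.Blockstore) (blockstore.Blockstore, error) {
	opts := blockstore.CacheOpts{
		HasBloomFilterSize:   512 << 10, // the 512KiB bloom filter mentioned above
		HasBloomFilterHashes: 7,         // 7 hash functions per lookup
		HasARCCacheSize:      64 << 10,  // the "mere 64KiB" ARC cache mentioned above
	}
	return blockstore.CachedBlockstore(ctx, base, opts)
}
```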
Anything that requires writing many thousands of blocks (let alone millions) will result in:
The happy path here is that a) the block is not known and b) the user has configured a large-enough bloom filter (I'm unsure how much impact hashing the key 14 times has).
The ugly side is when Has() hits the datastore backend. A modern backend like Badger or Pebble will include an additional BloomFilter and potentially a BlockCache too. However:
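For comparison, a backend like Pebble can already be configured with its own per-level bloom filters and a block cache (a sketch assuming the cockroachdb/pebble options API; the sizes are illustrative, not Kubo defaults):

```go
package sketch

import (
	"github.com/cockroachdb/pebble"
	"github.com/cockroachdb/pebble/bloom"
)

// openPebble shows how the backend itself provides bloom filters and a block
// cache, which is part of why an extra Bloom/ARC wrapper on top adds little.
func openPebble(path string) (*pebble.DB, error) {
	cache := pebble.NewCache(256 << 20) // 256 MiB block cache (illustrative)
	defer cache.Unref()                 // the DB keeps its own reference

	opts := &pebble.Options{
		Cache:  cache,
		Levels: make([]pebble.LevelOptions, 7),
	}
	for i := range opts.Levels {
		// ~10 bits per key gives roughly a 1% false-positive rate on Has-style lookups.
		opts.Levels[i].FilterPolicy = bloom.FilterPolicy(10)
	}
	return pebble.Open(path, opts)
}
```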
So I am running tests with `dag import` using Pebble as the backend, with a WriteThrough blockservice and blockstore, no ARC/bloom filter, and increased Batch item counts and Batch sizes. The time it takes to import ~10GB (~10M blocks) of DAG went from 60+ minutes to 6 (a 10x improvement).

Disk stats show how read pressure went from being maxed out at 500MB/s while barely being able to sustain any writes, to a more sustainable read pressure and higher write throughput.
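Roughly, the write path in that test looks like the sketch below (assuming go-ipld-format's MaxSizeBatchOption/MaxNodesBatchOption and go-blockservice's NewWriteThrough constructor; the exact limits I used are not reproduced here, so the numbers are illustrative):

```go
package sketch

import (
	"context"

	blockservice "github.com/ipfs/go-blockservice"
	blockstore "github.com/ipfs/go-ipfs-blockstore"
	format "github.com/ipfs/go-ipld-format"
	"github.com/ipfs/go-merkledag"
)

// newImportBatch wires up the import path used in the test: no Bloom/ARC
// wrapper around the blockstore, a write-through blockservice, and batch
// limits far larger than the defaults.
func newImportBatch(ctx context.Context, bs blockstore.Blockstore) *format.Batch {
	bserv := blockservice.NewWriteThrough(bs, nil) // offline: no exchange
	dserv := merkledag.NewDAGService(bserv)

	return format.NewBatch(ctx, dserv,
		format.MaxSizeBatchOption(100<<20), // e.g. up to ~100 MiB per batch (illustrative)
		format.MaxNodesBatchOption(20_000), // e.g. up to 20k nodes per batch (illustrative)
	)
}
```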
The issue with Has() calls was additionally confirmed by pprof profiles. My imported CAR files have about 25% overlap with existing blocks in the repository, so I'm definitely not able to enjoy the fast path that hits the bloom filter and nothing else (even though it is configured with a large size). The fact that the Pebble backend also reads the values for the keys on Has() calls when the block exists does not make things better either, but flatfs would probably be even worse.
Overall this explains a lot of the perf bottleneck issues we have seen in the past when pinning large DAGs on Kubo nodes with large datastores (in this case using flatfs): the disks get hammered by read operations, and this affects the speed at which writes and everything else can happen.
My recommendations would be the following: