
Simple IO cost estimation based on SSD specs #7060

Closed · Tracked by #5995
jakmeier opened this issue Jun 17, 2022 · 4 comments

Labels: A-params-estimator (Area: runtime params estimator), T-contract-runtime (Team: issues relevant to the contract runtime team)

jakmeier commented Jun 17, 2022

Getting correct and deterministic gas estimation of IO costs in place will take some time. (See #5995 and #7053 + #7058 + #7059 for current efforts in that direction.)

In the meantime, we want an alternative approach that serves as a sanity-checking tool. Results only have to be approximate; we would only use it to validate that a gas parameter is roughly in line with back-of-the-envelope calculations.

One approach I want to explore is the following:

  • For a gas parameter, consider how many reads and writes to persistent data storage the workload requires. This should be independent of implementation details.
  • Then define assumptions on SSD latency for reads and writes.
  • Compute the time to serve all requests sequentially and convert to gas using 1Pgas = 1s.

Here I would assume the implementation is "perfect" in the sense that no complicated data structures need to be traversed and the in-memory cache is large enough to fit all data for a few blocks. At the same time, also assume that no "pre-fetching" happens, whether speculative or otherwise.

In other words, reading N unique values from disk is assumed to take exactly N IOPs, regardless of data locality (as long as each value is smaller than the block size of 4kB; otherwise, every 4kB block requires at least one IOP).
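A minimal sketch of this cost model in Rust, for illustration only (the function names are mine, not estimator code; only the 4kB-block rule and the 1Pgas = 1s conversion come from the description above):

```rust
// Sketch of the IO cost model described above (illustrative only):
// one IOP per unique value, plus one IOP per additional 4 kB block,
// converted to gas via 1 Pgas = 1 s.

const BLOCK_SIZE: u64 = 4096; // assumed SSD block size in bytes
const GAS_PER_NANOSECOND: u64 = 1_000_000; // 1 Pgas = 1 s => 10^15 gas / 10^9 ns

/// IOPs needed to read or write one value of `size` bytes.
fn iops_for_value(size: u64) -> u64 {
    // At least one IOP per value, regardless of data locality.
    std::cmp::max(1, (size + BLOCK_SIZE - 1) / BLOCK_SIZE)
}

/// Gas for serving `iops` sequential IO operations, given an assumed
/// per-operation latency in nanoseconds.
fn gas_for_iops(iops: u64, latency_ns: u64) -> u64 {
    iops * latency_ns * GAS_PER_NANOSECOND
}

fn main() {
    // Example: ten unique 100 B values cost 10 IOPs; one 10 KiB value costs 3.
    let iops = 10 * iops_for_value(100) + iops_for_value(10 * 1024);
    // Assuming e.g. 143 µs per sequential IOP.
    println!("{} IOPs -> {} gas", iops, gas_for_iops(iops, 143_000));
}
```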

This should give us a rough estimate of what performance we can expect from a client implementation. Let's do that for all parameters directly related to storage and see if we get similar gas numbers to what we have today.

@jakmeier jakmeier changed the title Simple IO cost estimation Simple IO cost estimation based on SSD specs Jun 17, 2022
@jakmeier jakmeier self-assigned this Jun 17, 2022
@jakmeier jakmeier added T-contract-runtime Team: issues relevant to the contract runtime team A-params-estimator Area: runtime params estimator labels Jun 17, 2022
jakmeier commented:

My replayer tool currently parses the output described in #7053. I aim to merge that change first and then submit the replayer code.

The idea is that with such a replayer, it is very easy to extract these statistics and more. It already allows me to analyse the trie node (shard and chunk) caches. Below is the output of one of its analysis options, showing two receipts executed in the same block.

```
GET BlockInfo "E2zwRdCMngXBmKa7PLVc2tsdRqnoz6vzAV3MR3bTnbBp" size=221
  process_receipt predecessor=v1.token.sweat.testnet receiver=v1.token.sweat.testnet id=fWEKRcn8NeXscFUgYYq1gYfFZHSqyH29k97UJsQBqaZ chunk-cache-hit=13 chunk-cache-too-large=1 chunk-cache-miss=1
    DB GET          122 requests for a total of   214507 B
    DB SET            0 requests for a total of        0 B
    STORAGE READ    136 requests for a total of       57 B
    STORAGE WRITE   136 requests for a total of     2217 B
    TRIE NODES     7903 /  27 / 121  (chunk-cache/shard-cache/DB)
    SHARD CACHE       18.24% hit rate
    CHUNK CACHE       98.16% hit rate,  98.49% if removing 27 shard cache hits
  process_receipt predecessor=v1.token.sweat.testnet receiver=v1.token.sweat.testnet id=2eTPuWAJNm5KX89HQtG32cbDRdh7goBrHBGb6bJdKeEb chunk-cache-too-large=1 chunk-cache-miss=1 chunk-cache-hit=11
    DB GET          121 requests for a total of   214381 B
    DB SET            0 requests for a total of        0 B
    STORAGE READ    136 requests for a total of       57 B
    STORAGE WRITE   136 requests for a total of     2217 B
    TRIE NODES     7916 /   0 / 120  (chunk-cache/shard-cache/DB)
    SHARD CACHE        0.00% hit rate
    CHUNK CACHE       98.51% hit rate
```

We can see a bunch of interesting things here.

  • The chunk cache is very effective.
  • The shard cache is useful for storage operations outside of function calls (e.g. loading contract state).
  • Inside function calls, the shard cache was only useful for the first receipt; the second receipt was fully served by the chunk cache.

(This is just an example; actual analysis of this data should be done in the storage optimization context.)

jakmeier commented:

Here are some statistics on DB operations per storage operation, based on our current estimator runs and the new instrumentation I am working on.

First, the assumption is that 1 DB operation = 1 disk access. Then I assume an IO operation latency of 143µs, derived from a measured 7000 IOPS on gcloud with a full NVMe SSD and iodepth=1: inverting gives 1/7000 s ≈ 143µs per operation. Since iodepth=1 issues accesses sequentially, this should be roughly what we can expect for sequential calls to the disk.

| | Reads | Writes | TX / block | Inner repetitions | DB ops per host function call | Time [µs] with SSD latency = 143µs | Time [µs] with current gas price |
|---|---|---|---|---|---|---|---|
| StorageHasKeyBase | 1713 | 52 | 2 | 1000 | 0.8825 | 126.2 | 54.0 |
| StorageReadBase | 2721 | 52 | 2 | 1000 | 1.3865 | 198.3 | 56.4 |
| StorageWriteBase | 188 | 2633 | 2 | 1000 | 1.4105 | 201.7 | 64.2 |
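Spelled out, the arithmetic behind a table row looks like this (a quick sketch; the helper name is mine):

```rust
/// Baseline time per host function call, in µs: total DB ops observed in
/// the estimator workload, divided by the number of calls, times the
/// assumed per-IOP latency. (Sketch reproducing the table above.)
fn baseline_us(reads: u64, writes: u64, tx_per_block: u64, inner_reps: u64, latency_us: f64) -> f64 {
    let db_ops_per_call = (reads + writes) as f64 / (tx_per_block * inner_reps) as f64;
    db_ops_per_call * latency_us
}

fn main() {
    // StorageHasKeyBase: (1713 + 52) / (2 * 1000) = 0.8825 DB ops per call
    // -> 0.8825 * 143 µs ≈ 126.2 µs, matching the first row.
    println!("{:.1} µs", baseline_us(1713, 52, 2, 1000, 143.0));
}
```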

Conclusion

There are two ways of looking at this:

  1. We undercharge storage by around a factor of 4, compared to this baseline.
  2. In our estimations today, IO operations are around 4 times more efficient (12 times, if we factor in the safety multiplier of 3) than accessing the disk for every DB operation. This is even with all caches turned off, as is today's practice in the estimator. So the efficiency presumably comes from RocksDB's various features, such as fetching entire blocks at once and keeping them in memory.

Instrumentation

I collected these numbers by first running

```
cargo run --release -p runtime-params-estimator --features=required,io_trace -- --metric time --iters 1 --warmup-iters 0 --record-io-trace=some_log.file
```

and then

```
cargo run -p runtime-params-estimator -- replay some_log.file --mode estimator-gas
```

This shows the total DB operations measured in the estimator, grouped by block. It is not perfect yet, as it requires manual effort to find the important measurements and to divide by the appropriate numbers, as is also done in the estimation. But for now, I just wanted to get the numbers and see if they are useful before putting too much effort into automation.

jakmeier commented:

For other storage parameters, looking at actual DB calls in traces is not useful.

  • Per-byte costs are meant to cover larger request sizes, so the number of requests is not interesting. But we can use the rule mentioned in the initial issue description: every 4kB requires at least one IOP.
  • The cost of touching a trie node should be exactly one DB access for reads, by definition of what it tries to account for, and two if we also count the writes that happen at the end of the chunk, when trie changes are applied.

Looking at it from that perspective, and using the same SSD assumptions as above, we get a baseline like this:

Trie Nodes

| | Reads | Writes | Time [µs] with SSD | Time [µs] with current gas price |
|---|---|---|---|---|
| wasm_touching_trie_node | 1 | 1 | 286 | 16.101955926 |

Per-byte storage cost

| | Reads | Writes | Time [µs] per 4kB with SSD | Time [µs] per 4kB with current gas price |
|---|---|---|---|---|
| wasm_storage_read_value_byte | 1/4096 | | 143 | 22.98267648 |
| wasm_storage_write_value_byte | | 1/4096 | 143 | 127.051935744 |
| wasm_storage_write_evicted_byte | | 1/4096 | 143 | 131.552489472 |
| wasm_storage_remove_ret_value_byte | | 1/4096 | 143 | 47.233253376 |
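For intuition, the per-4kB baseline can be converted into per-byte gas (a sketch of the conversion; the helper name is mine):

```rust
/// Per-byte gas implied by the SSD baseline: one IOP per 4 kB of data,
/// with 1 Pgas = 1 s. (Sketch, not estimator code.)
fn per_byte_gas(latency_us: f64) -> f64 {
    let gas_per_iop = latency_us * 1e-6 * 1e15; // latency in seconds -> gas
    gas_per_iop / 4096.0
}

fn main() {
    // 143 µs per IOP -> ~1.43e11 gas per 4 kB -> ~35 Mgas per byte,
    // versus ~5.6 Mgas per byte implied by the 22.98 µs column above.
    println!("{:.2e} gas/byte", per_byte_gas(143.0));
}
```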

Again, the baseline shows higher numbers than current gas costs. And again, this might be okay, since the DB is more efficient than one IOP per DB operation, especially since we have not succeeded in creating benchmarks that are reproducibly slower to execute than what gas costs cover. But this should motivate us to try again to find such benchmarks.

jakmeier commented:

Finally, for all remaining parameters, I am now running estimations and checking how many IO operations they actually cause. The only remaining work here is to pull out the numbers; I will post here when I have them.
