perf: More efficient touch_range #1322

zlangley · 2025-01-29T04:09:44Z

Previously, touch_range was implemented by calling touch_address on each address in pointer..pointer + len. For persistent memory, however, both MemoryMerkleChip and PersistentBoundaryChip only care about which aligned blocks of size 8 were touched. Most of the touch_address calls triggered by a single touch_range were therefore redundant: they repeatedly queried the same hash map at the same block index. This change touches each overlapping block once instead.
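To make the description concrete, here is a minimal sketch of the two strategies. It is not the OpenVM code: `TouchTracker`, `touch_block`, `insert_calls`, and `demo` are hypothetical names invented for illustration, and `CHUNK = 8` is taken from the "aligned blocks of size 8" wording above. The `len == 0` guard is also an assumption of the sketch.

```rust
use std::collections::HashSet;

/// Assumed block size, taken from the PR description ("aligned blocks of size 8").
const CHUNK: u32 = 8;

/// Hypothetical stand-in for a chip that records which blocks were touched.
#[derive(Default)]
struct TouchTracker {
    touched_blocks: HashSet<u32>,
    /// Counts hash-set inserts so the two strategies can be compared.
    insert_calls: usize,
}

impl TouchTracker {
    fn touch_block(&mut self, label: u32) {
        self.insert_calls += 1;
        self.touched_blocks.insert(label);
    }

    /// Old strategy: one insert per address, even when many addresses share a block.
    fn touch_range_per_address(&mut self, pointer: u32, len: u32) {
        for address in pointer..pointer + len {
            self.touch_block(address / CHUNK);
        }
    }

    /// New strategy: one insert per aligned block that overlaps the range.
    fn touch_range_per_block(&mut self, pointer: u32, len: u32) {
        if len == 0 {
            return; // assumption: an empty range touches nothing
        }
        let first = pointer / CHUNK;
        let last = (pointer + len - 1) / CHUNK;
        for label in first..=last {
            self.touch_block(label);
        }
    }
}

/// Touches the same 64-address range both ways and returns the insert counts.
fn demo() -> (usize, usize) {
    let (pointer, len) = (16, 64);
    let mut old = TouchTracker::default();
    old.touch_range_per_address(pointer, len);
    let mut new = TouchTracker::default();
    new.touch_range_per_block(pointer, len);
    // Both strategies must end up with the same set of touched blocks.
    assert_eq!(old.touched_blocks, new.touched_blocks);
    (old.insert_calls, new.insert_calls)
}
```

For this aligned 64-address range, `demo()` returns `(64, 8)`: the set of touched blocks is identical, but the per-block version performs one eighth of the inserts.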

jonathanpwang · 2025-01-29T05:33:32Z

Can you add an explanation of what the optimization is, and summarize the perf gain for the microbenchmarks as well as the reth benchmark?

zlangley · 2025-01-29T19:31:08Z

reth benchmark: https://github.com/axiom-crypto/openvm-reth-benchmark/blob/gh-pages/benchmarks-dispatch/refs/heads/patch-openvm-20250129165550/reth-ef1f1f920d2fe8a646a842c7ee15992cdb9d8cf3a9f9d9b58f7df7ebdd23eaf5.md

jonathanpwang · 2025-01-29T19:59:28Z

reth benchmark: https://github.com/axiom-crypto/openvm-reth-benchmark/blob/gh-pages/benchmarks-dispatch/refs/heads/patch-openvm-20250129165550/reth-ef1f1f920d2fe8a646a842c7ee15992cdb9d8cf3a9f9d9b58f7df7ebdd23eaf5.md

What are the before and after numbers?

jonathanpwang · 2025-01-29T20:07:43Z

previous nightly: https://github.com/axiom-crypto/openvm-reth-benchmark/blob/gh-pages/benchmarks-dispatch/refs/heads/main/reth-38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed.md

jonathanpwang · 2025-01-29T20:08:40Z

seems to have no perf diff

jonathanpwang · 2025-01-29T20:10:35Z

crates/vm/src/system/memory/merkle/mod.rs

+        let first_address_label = address / CHUNK as u32;
+        let last_address_label = (address + len - 1) / CHUNK as u32;
+        for address_label in first_address_label..=last_address_label {
+            self.touch_node(0, as_label, address_label);


This is still internally calling the same function; I'm guessing the compiler just optimizes it out?

zlangley · 2025-01-29

How would the compiler optimize it out? Previously this function would have been called N times; now it is called N/8 times.
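The label arithmetic in the diff above can be isolated into a small helper to check the "N versus N/8" claim on concrete ranges. This is a sketch under assumptions, not the PR's code: `block_labels` is a hypothetical name, `CHUNK = 8` comes from the PR description, and the handling of `len == 0` and of address overflow is added here because the thread does not say whether touch_range can be called with such inputs.

```rust
use std::ops::RangeInclusive;

/// Assumed block size, taken from the PR description.
const CHUNK: u32 = 8;

/// Returns the labels of the aligned CHUNK-sized blocks that overlap
/// `address..address + len`, or `None` if the range is empty or would
/// wrap around the u32 address space.
fn block_labels(address: u32, len: u32) -> Option<RangeInclusive<u32>> {
    if len == 0 {
        // Without this guard, `address + len - 1` underflows when address == 0.
        return None;
    }
    // `len - 1` is safe here because len >= 1; checked_add rejects wrap-around.
    let last_address = address.checked_add(len - 1)?;
    Some(address / CHUNK..=last_address / CHUNK)
}
```

An aligned range of N addresses covers exactly N/8 blocks, for example `block_labels(0, 8)` is `Some(0..=0)`. An unaligned range can straddle one extra block: `block_labels(6, 4)` is `Some(0..=1)`, so the call count is N/8 only up to rounding.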

jonathanpwang · 2025-01-29T20:12:20Z

Closing for now because it doesn't seem to have a real perf impact, and introduces more code

zlangley · 2025-01-29T20:21:29Z

@jonathanpwang I have a hard time believing this has no real perf impact. Local microbenchmark shows that MemoryInterface::touch_range is 17% of MemoryController::finalize before this change and 2.7% afterwards.
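As a rough reading of these two percentages, the sketch below works out what they would imply for MemoryController::finalize as a whole. It assumes that each figure is touch_range's share of finalize's runtime in its own run, and that everything else inside finalize costs the same before and after. Neither assumption is stated in this thread, and the function names are hypothetical.

```rust
/// Fractional reduction in total runtime when one component's share drops
/// from `old_share` to `new_share`, assuming the rest of the work is unchanged.
/// The old total is normalized to 1.0.
fn implied_total_reduction(old_share: f64, new_share: f64) -> f64 {
    let rest = 1.0 - old_share; // time spent outside the component
    let new_total = rest / (1.0 - new_share); // rest is (1 - new_share) of the new total
    1.0 - new_total
}

/// Speedup of the component itself under the same assumption.
fn implied_component_speedup(old_share: f64, new_share: f64) -> f64 {
    let new_total = 1.0 - implied_total_reduction(old_share, new_share);
    old_share / (new_share * new_total)
}
```

With `old_share = 0.17` and `new_share = 0.027`, the first function gives about 0.147 (finalize roughly 15% faster) and the second about 7.4, which is in line with replacing N calls by about N/8. How much of end-to-end trace generation is spent in finalize is not given here, so this says nothing by itself about the reth numbers.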

zlangley · 2025-01-29T23:19:41Z

still not seeing a lot of perf diff: https://github.com/axiom-crypto/openvm-reth-benchmark/blob/gh-pages/benchmarks-dispatch/refs/heads/main/reth-e9fe5226fd30a7646b0e44d3c1d0681ea961dcd5deeb55e4fd95bf88dd4dfd0f.md

I only expect reth.prove_e2e.block_21000000 trace_gen_time_ms to improve

zlangley · 2025-01-29T23:21:13Z

this is mimalloc: https://github.com/axiom-crypto/openvm-reth-benchmark/blob/gh-pages/benchmarks-dispatch/refs/heads/patch-openvm-20250129223724/reth-ad94cb1ed99fa48d73d343db7520bfbeb75e933a7703ab20be3e516bdc88a05c.md

Golovanov399 · 2025-01-30T00:25:19Z

Do we ever call touch_address after this change? If not, then it makes sense to remove this function, as you did with all_addresses().

github-actions · 2025-01-30T15:39:50Z

| group | app.proof_time_ms | app.cycles | app.cells_used | leaf.proof_time_ms | leaf.cycles | leaf.cells_used |
|---|---|---|---|---|---|---|
| verify_fibair | (-17 [-0.8%]) 2,156 | 513,786 | 18,710,395 | - | - | - |
| fibonacci_program | (-295 [-5.6%]) 4,978 | 1,500,095 | 51,485,080 | - | - | - |
| regex_program | (-395 [-2.6%]) 14,884 | 1,914,103 | 165,455,373 | - | - | - |
| ecrecover_program | (+17 [+0.7%]) 2,522 | 284,567 | 15,055,723 | - | - | - |

Commit: d49e76a

Benchmark Workflow

zlangley · 2025-01-30T18:23:09Z

On my machine, trace gen for the reth benchmark on block 21000000 seems to decrease by 45s (~9%). I'm not sure why this is not reflected in the reth benchmark results. I also don't see how the memory allocator or machine architecture would be relevant here; this PR should just be strictly decreasing the number of calls to HashMap::insert.

zlangley requested review from Golovanov399 and jonathanpwang January 29, 2025 04:09

jonathanpwang reviewed Jan 29, 2025

jonathanpwang closed this Jan 29, 2025

jonathanpwang reopened this Jan 29, 2025

zlangley force-pushed the touch_range branch from 2779a29 to c23898d Compare January 30, 2025 15:32

perf: More efficient touch_range

d49e76a

zlangley force-pushed the touch_range branch from c23898d to d49e76a Compare January 30, 2025 15:33
