
Offload PVF code decompression to a separate task on the blocking pool #5071

Closed

s0me0ne-unkn0wn opened this issue Jul 19, 2024 · 9 comments · Fixed by #5142

@s0me0ne-unkn0wn (Contributor) commented Jul 19, 2024

In #3122, we moved the whole candidate validation subsystem to the blocking pool because it performs PVF code decompression, which is a blocking task, albeit a small one. That wasn't a perfect solution, but given the problem space as it looked at the time, it was acceptable. Now, with agile coretime and #5012 around the corner, we must be ready to prepare many more PVFs than before, and blocking candidate validation on decompression is much more concerning.

We should offload that work to a separate task so that the candidate validation subsystem can await it asynchronously.

@s0me0ne-unkn0wn (Contributor Author)

@sandreim @alexggh @alindima I need a bit of advice here.

The idiomatic approach to running a lot of CPU-bound tasks is to run them on rayon threads (even the tokio docs recommend rayon for this case). But considering how small each task is and that we're unlikely to spawn hundreds of them at the same time, I'm more inclined to just use tokio::spawn_blocking for them. WDYT, do you have any insights?
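
For illustration, a minimal sketch of the spawn_blocking option being weighed here, assuming a plain tokio runtime; `decompress_code` is a hypothetical stand-in for the real decompression routine:

```rust
use tokio::task;

// Hypothetical stand-in for the real PVF decompression routine.
fn decompress_code(compressed: Vec<u8>) -> Vec<u8> {
    // ... actual zstd decompression would happen here ...
    compressed
}

// Offload the CPU-bound decompression to tokio's blocking pool so the
// async caller can await it without stalling its executor thread.
async fn decompress_offloaded(compressed: Vec<u8>) -> Result<Vec<u8>, task::JoinError> {
    task::spawn_blocking(move || decompress_code(compressed)).await
}
```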

@bkchr (Member) commented Jul 19, 2024

We have the Spawner that provides a spawn_blocking. I don't see any reason why you should use rayon here.
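
For reference, a sketch of how a subsystem might hand work to such a spawner. This assumes a SpawnNamed-style trait whose spawn_blocking takes a task name, an optional group, and a boxed future (as in `sp_core::traits::SpawnNamed`), and reuses the hypothetical `decompress_code` from the sketch above:

```rust
use futures::{channel::oneshot, FutureExt};
use sp_core::traits::SpawnNamed;

// Offload decompression via the node's spawner and return a receiver
// that the subsystem's main loop can await asynchronously.
fn offload_decompression(
    spawner: &impl SpawnNamed,
    compressed: Vec<u8>,
) -> oneshot::Receiver<Vec<u8>> {
    let (tx, rx) = oneshot::channel();
    spawner.spawn_blocking(
        "pvf-code-decompression",
        Some("candidate-validation"),
        async move {
            // This future runs on the blocking pool, so calling the
            // synchronous decompression routine here is fine.
            let _ = tx.send(decompress_code(compressed));
        }
        .boxed(),
    );
    rx
}
```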

@sandreim (Contributor)

As @bkchr says, just spawn a blocking task. I did the same in availability recovery; you can use it as an example.

@s0me0ne-unkn0wn (Contributor Author)

Okay, I went through the code once again, and it doesn't make much sense to me 🫠
The candidate validation subsystem decompresses the PVF and the PoV just to observe their sizes in metrics. It doesn't use them in any other way: it passes the decompressed blobs on to the PVF host, which passes them to workers that are synchronous by nature and a perfectly natural place to do the decompression.

To my mind, this should be refactored: the PVF decompression should be performed by the preparation worker, and the PoV decompression should move to the execution worker. The PVF host should return the observed decompressed sizes to the candidate validation subsystem (at the end of the day, it doesn't matter whether we observe the sizes before the validation or after it).

As I see it, the only drawback of this approach is that decompression times will be counted against the preparation and execution timeouts. But those should be negligible anyway, shouldn't they? Do we have benchmarks of how fast zstd decompression is on the reference hardware? (See the measurement sketch below.)

Any objections to this approach?
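
In case no such benchmark is at hand, a quick back-of-the-envelope measurement could look like the following. This is a rough sketch using the zstd crate directly on a synthetic blob, not the node's actual decompression path:

```rust
use std::time::Instant;

fn main() -> std::io::Result<()> {
    // Synthetic stand-in; a real measurement would load an actual
    // compressed validation code blob instead.
    let raw = vec![0u8; 10 * 1024 * 1024];
    let compressed = zstd::encode_all(raw.as_slice(), 3)?;

    let start = Instant::now();
    let decompressed = zstd::decode_all(compressed.as_slice())?;
    println!(
        "decompressed {} bytes in {:?}",
        decompressed.len(),
        start.elapsed()
    );
    Ok(())
}
```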

@sandreim (Contributor) commented Jul 22, 2024

> Okay, I went through the code once again, and it doesn't make much sense to me 🫠 The candidate validation subsystem decompresses the PVF and the PoV just to observe their sizes in metrics. It doesn't use them in any other way: it passes the decompressed blobs on to the PVF host, which passes them to workers that are synchronous by nature and a perfectly natural place to do the decompression.

Not sure why you think it's not used; the code shows that the decompressed blob is passed in ValidationParams.

@s0me0ne-unkn0wn (Contributor Author)

> the code shows that the decompressed blob is passed in ValidationParams

Exactly, but the ValidationParams themselves are not used for anything other than being passed to the PVF execution worker to provide the PVF with its parameters. Instead, we could pass the persisted validation data and the compressed PoV there and let the execution worker decompress the PoV and construct the ValidationParams itself.

My point is that everything currently done with the compressed data in the candidate validation subsystem could just as well be done in the PVF preparation/execution workers, and nothing would change. The candidate validation subsystem doesn't need the data decompressed for itself; it decompresses it for the PVF host.
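
A rough sketch of the flow proposed here, with a hypothetical request type and a simplified stand-in for the real ValidationParams (only the fields needed for illustration are included):

```rust
// Hypothetical request type: the execution worker receives the
// still-compressed PoV instead of a pre-decompressed blob.
struct ExecuteRequest {
    compressed_pov: Vec<u8>,
    parent_head: Vec<u8>,
}

// Simplified stand-in for the real ValidationParams.
struct ValidationParams {
    parent_head: Vec<u8>,
    block_data: Vec<u8>,
}

// Inside the (synchronous) execution worker, where blocking is natural:
fn build_params(req: ExecuteRequest) -> std::io::Result<(ValidationParams, usize)> {
    // Decompress the PoV here rather than in the candidate validation
    // subsystem; report the decompressed size back so the subsystem can
    // still observe it in metrics, just after validation instead of before.
    let block_data = zstd::decode_all(req.compressed_pov.as_slice())?;
    let observed_size = block_data.len();
    Ok((
        ValidationParams { parent_head: req.parent_head, block_data },
        observed_size,
    ))
}
```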

@sandreim (Contributor) commented Jul 22, 2024

I see, this is a nice micro-optimization. Its impact depends on the number of pending validation requests in the queue and on how long decompression takes. I think the next step should be to measure and see whether decompression is significantly stalling the candidate validation loop.

@AndreiEres (Contributor)

After we merged #4791, we decompress the same PVF one more time when preparing the node for the active set, which looks suboptimal. This can be solved with a slight refactoring of the message arguments. But does that mean we no longer need the subsystem on the blocking pool?

@s0me0ne-unkn0wn (Contributor Author)

I'm currently working on a PR that moves all the decompression to PVF host workers, stay tuned :)

github-merge-queue bot pushed a commit that referenced this issue Aug 9, 2024
Closes #5071 

This PR aims to
* Move all the blocking decompression from the candidate validation
subsystem to the PVF host workers;
* Run the candidate validation subsystem on the non-blocking pool again.

Upsides: no blocking operations in the subsystem's main loop. PVF
throughput is no longer limited by the subsystem's ability to decompress
a lot of stuff. Correctness and homogeneity improve, as the artifact
used to be identified by the hash of the decompressed code, and now it is
identified by the hash of the compressed code, which coincides with the
on-chain `ValidationCodeHash`.

Downsides: the PVF code decompression is now accounted for in the PVF
preparation timeout (be it pre-checking or actual preparation). Taking
into account that the decompression duration is on the order of
milliseconds, and the preparation timeout is on the order of seconds, I
believe it is negligible.
dharjeezy pushed a commit to dharjeezy/polkadot-sdk that referenced this issue Aug 28, 2024