Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cli option --wasmtime-precompiled and subcommand precompile-wasm #1641

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

librelois
Copy link
Contributor

@librelois librelois commented Sep 20, 2023

Description

What does this PR do?

This PR adds the possibility of starting the node with a precompiled WASM runtime.
This eliminates the need to compile the runtime at startup, making it possible to launch the node much more quickly.

Why are these changes needed?

A practical example of where this feature is useful is when you need to run a large number of end2end tests quickly: you precompile the runtime once in the same environment, then use the precompiled artifact every time you need to spawn the node (potentially hundreds of times).

How were these changes implemented and what do they affect?

As a security measure, a hash of the wasmtime configuration is inserted into the precompiled artifact and checked by wasmtime at startup, to ensure that, to ensure that the precompiled artifact has been generated under the same conditions.

This check is delegated to wasmtime directly thanks to the module_version function.

How to use

  1. Generate the precompiled artifact with the subcommand precompile-wasm:
precompile-wasm --chain <your-chain-spec> path/to/artifacts/folder

This will precompile the on-chain wasm at the last finalized block and save the precompiled artifact in path/to/artifacts/folder.

  1. Use the precompiled artifact with the cli option --wasmtime-precompiled path/to/artifacts/folder

@librelois librelois marked this pull request as ready for review September 21, 2023 08:22
@koute
Copy link
Contributor

koute commented Sep 21, 2023

What's your motivation to add this? Is this only supposed to be used for tests?

I'd be fine to have this to speed up tests, but I'm not convinced we should support this for running normal nodes, as it would add extra complexity and be yet another place where something can go wrong for not much benefit.

There's also the issue that for PVFs in Polkadot we already have caching implemented, so this would essentially duplicate that functionality in an incompatible way. (Although the requirements are different, as for PVFs we need sandboxing, so it's not really obvious how you would unify those.)

How about we fundamentally simplify this and do something like this instead?

  • Remove the extra CLI arguments/subcommands you've added.
  • Remove all other changes outside of the sc-executor-wasmtime.
  • In sc-executor-wasmtime check if a special environment variable is set, let's say, WASMTIME_PRECOMPILED_CACHE_DIR or something like that.
  • If it is set then the executor would automatically and transparently use that directory as a cache.
  • As the key for the cache use BLAKE3 hash of the semantics struct, the raw .wasm blob and of the current executable (std::fs::read the std::env::current_exe); unlike CARGO_PKG_VERSION this would guarantee that the executable has to be bit-exact for cache to be used.
  • If there are errors anywhere in this process they should be warn! printed and ignored (that is, the node should compile the module normally)

This would be, I think, simpler, need less code, be more maintainable, be more foolproof, and be easier to use.

@bkchr
Copy link
Member

bkchr commented Sep 25, 2023

Yeah I was proposing to go this route. They are running a lot of tests in their CI and that got really slow with the removal of wasmi as they will now need to recompile it all the time.

I'm not sure that a env variable is the way to go here either. No one will find out about this env variable besides the three of us :P I would propose that you extend WasmExecutorBuilder to add a precompiled_path or whatever. The CLI stuff is really not required here in our repo. You can just configure the executor as you like in moonbeam and use a CLI argument or a env variable or whatever. I didn't check your pr in detail, but if you need functionality exposed to precompile a binary, it is also fine to expose this function to be used in moonbeam.

@koute
Copy link
Contributor

koute commented Sep 25, 2023

I'm not sure that a env variable is the way to go here either. No one will find out about this env variable besides the three of us :P I would propose that you extend WasmExecutorBuilder to add a precompiled_path or whatever.

Yeah, that also sounds good to me.

@crystalin
Copy link

The command should output the name of the file also IMO

@crystalin
Copy link

The expected wasm and the generated one don't have the same hash:

2023-09-28 09:38:09 artifact_version: 0x3bb7567dc2e545fc246e248f2d0d3e3da0ba14c65bb01948001586448da83e56
2023-09-28 09:38:09 wasm file_name: precompiled_wasm_0x0x91b7bcbc9e993dded10ed74930ca3f2805a470980b47889f4d5cff4aab63e952

@crystalin
Copy link

crystalin commented Sep 28, 2023

The wasm is actually loaded at multiple places with different hashes:
In resolve_state_version_from_wasm where it takes the :code value but uses a special hash (Which I think is due to not having storage yet as we are building genesis)

The other one is initiated by grandpa where it retrieve the runtime and hash in the call_executor, computed in the state-machine using the storage_hash

This mean that for the same code, we have 2 different hashes and so we won't load the precompiled wasm

@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/3840831

@crystalin
Copy link

Something to verify. I replaced the hasher of the genesis code to use the Blake2Hasher to be compatible with the precompiled and on-chain wasm calculated hash:
https://github.com/paritytech/polkadot-sdk/pull/1641/files#diff-680ae468913e48e81b977d13c888b391a21b3254e62ed1cc40bfb0e7310b7389R51-R52

After that change and marking allow_missing_func_imports to false, I was able on Moonbeam to boot the node without any compilation.

Something to note also, by default the "precompile-wasm" command will use the default base-path if none is provided, making this command:
../target/release/moonbeam precompile-wasm -lwasmtime-runtime --chain moonbase-dev wasm not use the moonbase-dev chain spec code but the one from your $HOME/.local/share/moonbeam/chains/moonbase_dev which might not be the same

@crystalin
Copy link

The precompile-wasm actually store the chain data in the base-path, like when launching a network. I don't think we should include that

@librelois
Copy link
Contributor Author

The CLI stuff is really not required here in our repo. You can just configure the executor as you like in moonbeam and use a CLI argument or a env variable

This isn't a problem specific to moonbeam, it concerns all substrate chains that need to spawn a lot of nodes for their end2end tests. For example, Tanssi also has many end2end typescrit tests.

The precompile-wasm actually store the chain data in the base-path, like when launching a network. I don't think we should include that

This is linked to the fact that the command doesn't know if you want to retrieve the runtime code from the DB or not, so it has to be initialized. Adding an explicit flag to retrieve from chain spec should solve the problem.

@girazoki
Copy link
Contributor

Any update on this? as @librelois this would be heplful for other ecosystem projects as well (like Tanssi in this case)

tmpolaczyk pushed a commit to moondance-labs/polkadot-sdk that referenced this pull request Feb 23, 2024
This cherry-pick should match this pull request:

paritytech#1641
girazoki pushed a commit to moondance-labs/polkadot-sdk that referenced this pull request May 27, 2024
This cherry-pick should match this pull request:

paritytech#1641
girazoki pushed a commit to moondance-labs/polkadot-sdk that referenced this pull request May 30, 2024
This cherry-pick should match this pull request:

paritytech#1641
tmpolaczyk pushed a commit to moondance-labs/polkadot-sdk that referenced this pull request Aug 5, 2024
This cherry-pick should match this pull request:

paritytech#1641
tmpolaczyk pushed a commit to moondance-labs/polkadot-sdk that referenced this pull request Aug 5, 2024
This cherry-pick should match this pull request:

paritytech#1641
@crystalin
Copy link

crystalin commented Nov 12, 2024

@bkchr Do you think this could be merged ? (We are maintaining it in our cherry-picks but it would be ideal to have it merged)
(if yes, I'll work on the conflicts and additional comments if any)

chexware pushed a commit to moondance-labs/polkadot-sdk that referenced this pull request Nov 14, 2024
This cherry-pick should match this pull request:

paritytech#1641
Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@librelois please apply my comments and merge master and then we can go ahead merging this.

substrate/client/executor/src/wasm_runtime.rs Outdated Show resolved Hide resolved
substrate/client/executor/src/wasm_runtime.rs Outdated Show resolved Hide resolved
);
let mut buffer = Vec::new();
buffer.extend_from_slice(code_hash);
buffer.extend_from_slice(sp_wasm_interface::VERSION.as_bytes());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use the wasmtime version.

/// wasmtime modules.
///
/// By default there is no `wasmtime_precompiled_path` given.
pub fn with_wasmtime_precompiled_path(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also one with_optional_wasmtime_precompiled_path function. Then we don't need the if let Some code when setting up the builder.

@@ -234,6 +251,8 @@ pub struct WasmExecutor<H> {
cache_path: Option<PathBuf>,
/// Ignore missing function imports.
allow_missing_host_functions: bool,
/// TODO
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Todo

Comment on lines +463 to +474
let mut file = std::fs::File::create(
wasmtime_precompiled_path.join(format!("precompiled_wasm_{}", &artifact_version)),
)
.map_err(|e| {
WasmError::Other(format!(
"Fail to create file 'precompiled_wasm_0x{}', I/O Error: {}",
&artifact_version, e
))
})?;
file.write_all(&serialized_precompiled_wasm).map_err(|e| {
WasmError::Other(format!("Fail to write precompiled artifact, I/O Error: {}", e))
})?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut file = std::fs::File::create(
wasmtime_precompiled_path.join(format!("precompiled_wasm_{}", &artifact_version)),
)
.map_err(|e| {
WasmError::Other(format!(
"Fail to create file 'precompiled_wasm_0x{}', I/O Error: {}",
&artifact_version, e
))
})?;
file.write_all(&serialized_precompiled_wasm).map_err(|e| {
WasmError::Other(format!("Fail to write precompiled artifact, I/O Error: {}", e))
})?;
std::fs::write(wasmtime_precompiled_path.join(format!("precompiled_wasm_{artifact_version}")), &serialized_precompiled_wasm).map_err(|e| {
WasmError::Other(format!("Fail to write precompiled artifact, I/O Error: {}", e))
})?;

code_fetcher: &sp_core::traits::WrappedRuntimeCode(
wasm_bytecode.as_slice().into(),
),
hash: sp_core::blake2_256(&wasm_bytecode).to_vec(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
hash: sp_core::blake2_256(&wasm_bytecode).to_vec(),
hash: B::Hasher::hash(&wasm_bytecode).to_vec(),

The hash depends on the configured hasher.

Comment on lines +343 to +369
for entry in std::fs::read_dir(wasmtime_precompiled_dir).map_err(handle_err)? {
let entry = entry.map_err(handle_err)?;
if let Some(file_name) = entry.file_name().to_str() {
// We check that the artifact was generated for this specific artifact
// version and with the same wasm interface version and configuration.
if file_name.contains(&artifact_version.clone()) {
log::info!(
target: "wasmtime-runtime",
"Found precompiled wasm: {}",
file_name
);
// We change the version check strategy to make sure that the file
// content was serialized with the exact same config as well
maybe_compiled_artifact = Some((
entry.path(),
sc_executor_wasmtime::ModuleVersionStrategy::Custom(
artifact_version.clone(),
),
));
}
} else {
return Err(WasmError::Instantiation(
"wasmtime precompiled folder contain a file with invalid utf8 name"
.to_owned(),
))
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to iterate? We know the artifact version and thus, can just generate the file name and check if the file exists?

use sp_state_machine::backend::BackendRuntimeCode;
use std::{fmt::Debug, path::PathBuf, sync::Arc};

/// The `precompile-wasm` command used to serialize a precompiled WASM module.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also mention which WASM is actually pre-compiled.

// `prepare_runtime_artifact` and with the same wasmtime
// version and configuration.
unsafe {
sc_executor_wasmtime::create_runtime_from_artifact::<H>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sc_executor_wasmtime::create_runtime_from_artifact::<H>(
return sc_executor_wasmtime::create_runtime_from_artifact::<H>(

If you make this an early return, below we can move the create_runtime out of the if to only have it once.

chexware pushed a commit to moondance-labs/polkadot-sdk that referenced this pull request Nov 23, 2024
This cherry-pick should match this pull request:

paritytech#1641
RomarQ and others added 2 commits November 26, 2024 10:45
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Bastian Köcher <git@kchr.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants