Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize the Cranelift compilation loop #56

Closed
sunfishcode opened this issue Feb 26, 2019 · 4 comments
Closed

Parallelize the Cranelift compilation loop #56

sunfishcode opened this issue Feb 26, 2019 · 4 comments

Comments

@sunfishcode
Copy link
Member

The loop in compile_module in lib/environ/src/cranelift.rs iterates over all the functions in a module, compiling one at a time. Cranelift is designed to compile each function independently (and we've demonstrated it doing so in SpiderMonkey), so this loop ought to be relatively straightforward to parallelize. In the Rust ecosystem, the Rayon framework is the obvious choice for simple parallelization of loops like this. The main work will be arranging places to put the outputs (currently the functions and relocations vector elements).

@maxmcd
Copy link
Contributor

maxmcd commented Feb 27, 2019

I have a first pass at this:

use rayon::iter::ParallelBridge;
use rayon::prelude::{IntoParallelRefIterator, ParallelIterator};

/// Compile the module using Cranelift, producing a compilation result with
/// associated relocations.
pub fn compile_module<'data, 'module>(
    module: &'module Module,
    function_body_inputs: PrimaryMap<DefinedFuncIndex, &'data [u8]>,
    isa: &dyn isa::TargetIsa,
) -> Result<(Compilation, Relocations), CompileError> {
    let mut functions = PrimaryMap::new();
    let mut relocations = PrimaryMap::new();

    function_body_inputs
        .into_iter()
        .collect::<Vec<(DefinedFuncIndex, &&'data [u8])>>()
        .par_iter()
        .map(|(i, input)| {
            let func_index = module.func_index(*i);
            let mut context = Context::new();
            context.func.name = get_func_name(func_index);
            context.func.signature = module.signatures[module.functions[func_index]].clone();

            let mut trans = FuncTranslator::new();
            trans
                .translate(
                    input,
                    &mut context.func,
                    &mut FuncEnvironment::new(isa.frontend_config(), module),
                )
                .map_err(CompileError::Wasm)?;

            let mut code_buf: Vec<u8> = Vec::new();
            let mut reloc_sink = RelocSink::new();
            let mut trap_sink = binemit::NullTrapSink {};
            context
                .compile_and_emit(isa, &mut code_buf, &mut reloc_sink, &mut trap_sink)
                .map_err(CompileError::Codegen)?;
            Ok((code_buf, reloc_sink.func_relocs))
        })
        .collect::<Result<Vec<_>, CompileError>>()?
        .into_iter()
        .for_each(|(function, relocs)| {
            functions.push(function);
            relocations.push(relocs);
        });

    // TODO: Reorganize where we create the Vec for the resolved imports.
    Ok((Compilation::new(functions), relocations))
}

Any thoughts on how to avoid the additional allocations? Or other improvements?

I also think this pattern is a more standard use of rayon:

    use rayon::iter::ParallelBridge;
    use rayon::prelude::ParallelIterator;

    let mut functions = PrimaryMap::new();
    let mut relocations = PrimaryMap::new();

    function_body_inputs
        .into_iter()
        .par_bridge()
        .map(|(i, input)| {
            let func_index = module.func_index(i);

But for some reason when I use par_bridge some of my tests start trapping. The wasmtime tests all pass, but my wasabi tests output: Trap from within function run: wasm trap at 0x11006dbb1

@sunfishcode
Copy link
Member Author

I'm not too worried about the extra allocations in this part of the code; if that becomes a concern, I think we can add a fast path for small modules that doesn't launch any extra threads.

I'm not experienced with Rayon, but the documentation for ParallelBridge suggests that it's not the most efficient for dealing with iterators which could be parallel, and PrimaryMap is just a Vec inside.

In any case, does par_bridge() guarantee that the functions will be collected in their original order? If they came out in a different order that could cause lots of trouble because wasm uses function indices in lots of places.

@maxmcd
Copy link
Contributor

maxmcd commented Mar 2, 2019

Ah, you are correct: https://docs.rs/rayon/*/rayon/iter/trait.ParallelBridge.html

The resulting iterator is not guaranteed to keep the order of the original iterator.

So that seems to be the issue.

I think PrimaryMap could implement rayon::iter::IntoParallelIterator, but I've had a little trouble understanding how to get that working. And, since I don't have access to the underlying vec I'm not sure if would be functionally different.

edit: Ah, it seems implementing a foreign trait is disallowed. I've pr'd the non-par_bridge version. It retains order on the final collect call

@sunfishcode
Copy link
Member Author

This now fixed in #58. Thanks!

howjmay pushed a commit to howjmay/wasmtime that referenced this issue Jan 24, 2022
dhil added a commit to frank-emrich/wasmtime that referenced this issue Aug 20, 2023
dhil added a commit to dhil/wasmtime that referenced this issue Nov 20, 2023
mooori pushed a commit to mooori/wasmtime that referenced this issue Dec 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants