AOT Engine error: `Method too large` #405

bhelx · 2024-06-24T19:17:25Z

This is a known error, but wanted to create this for tracking and discussion:

Method too large: com/dylibso/chicory/$gen/CompiledModule.func_153

I believe the strategy here is to split the method up after it we finish it (assuming it breaks the limit). There is some prior art for this that works with ASM, but is fairly old. I assume there are some modern open source optimizers or tools that work with ASM we can use here.

The text was updated successfully, but these errors were encountered:

electrum · 2024-06-26T19:37:48Z

I'm working on a solution for this, with the goal to get Python running in AOT. The Python interpreter has a single huge method for its interpreter loop. Do you have other examples of WASM programs with large functions?

I looked at msplit, but couldn't get it to work, and I'm not sure it's the right approach anyway, as it's trying to split arbitrary bytecode that would never be generated by our AOT.

My current idea is to have a different code generation mode for large functions:

generate a state class (public primitive fields) for WASM locals instead of JVM locals
identify interior blocks to split out into separate methods
- choosing targets of a BR_TABLE is probably good enough for real programs
use exceptions to handle non-local jumps out of blocks in methods
- preallocate exception instances for performance (no stack trace overhead) when there is nothing to return

electrum · 2024-06-26T19:44:24Z

WASM locals are actually simpler than JVM locals, as they are declared up front for the function, whereas JVM locals can be reused with different types. Similarly, WASM blocks are more restrictive than JVM bytecode, as you can only jump to the start or end of a block, but in JVM bytecode there are fewer restrictions.

So while the idea of splitting bytecode seems nice from separation of concerns perspective, it seems more difficult to do, as you need to analyze it and rediscover the constraints from the original WASM. I'm exploring both approaches.

evacchi · 2024-06-26T21:14:59Z

I wonder if these old projects had a specific way to deal with this problem at codegen time

electrum · 2024-06-26T23:25:08Z

@evacchi Thanks for the pointers. Cibyl doesn't seem to handle large methods, which likely didn't matter as it was a developer environment targeting J2ME phones. You could adjust the original source code if the method size was a problem.

I remember reading the NestedVM paper many years ago. Reading it again, it's very interesting to see how similar it is to WebAssembly. They represent CPU registers as class fields and structure the code execution using the original binary addresses with an explicit program counter. The method size limit is worked around by using a trampoline transformation.

NestedVM is implementing an efficient interpreter for MIPS binaries, retaining their original semantics, whereas WASM is defined in a more abstract way and is intended to be efficiently transformed to the target environment. We do an idiomatic transformation to JVM bytecode rather than trying to implement an interpreter for the WASM abstract machine. This should allow the JVM JIT to generate code that is as efficient as possible.

WASM is a stack machine with a very similar structure to JVM bytecode, which allows us to translate in simple manner, instruction at a time, without needing to build an IR and full compiler. This is why we use static methods and pass Instance and Memory explicitly, rather than using a stateful class. Calling an instance method requires the receiver object to be pushed first, before the other parameters. But because we translate directly, by the time we get to the CALL instruction, the other parameters are already on the stack. So instead of an instance method, we use static methods with the special parameters at the end (and thus pushed last). This is why I'm leaning towards using an explicit state object rather than a stateful class.

bhelx · 2024-06-26T23:33:53Z

I'm working on a solution for this, with the goal to get Python running in AOT. The Python interpreter has a single huge method for its interpreter loop. Do you have other examples of WASM programs with large functions?

I tried this with my SNES emulator project. It maybe has a large function for the same reason as python. a large emulator loop. I'll upload the repo tomorrow if that's helpful.

My current idea is to have a different code generation mode for large functions:

That's an interesting idea. I hadn't really considered it because my assumption was you need to know the structure of the whole function before you can split it. It makes sense that msplit maybe doesn't work anymore. It's been a while since it's had an update.

electrum · 2024-06-26T23:46:08Z

I'll upload the repo tomorrow if that's helpful.

Please do!

andreaTP · 2024-06-27T19:50:27Z

Now that I read the evolution of this @enebo might be able to share his experience with Ruby here.

bhelx · 2024-06-27T20:45:20Z

@electrum Here is the SNES repo and an AOT branch https://github.com/dylibso/chicory-snes/tree/aot

you'll need to LEGALLY acquire an SNES rom to run it!

enebo · 2024-06-28T16:17:10Z

JRuby has had this issue but it has been rare enough that we never had enough motivation to address it. It only happened in a single case which was a project which used the state machine ragel to generate a Ruby parser. Our fallback when we cannot compile is to just interpret the code. So that library never ran very fast in JRuby. Prism came to the rescue so now that case has left the building.

We have also spent time examining bytecode to reduce how much we emit (and invokedynamic can help reduce overall bytecode). This is more of a mitigation than an actual solution.

JRuby's IR is a register-based IR and those registers end up being locals. For us to implement something like method splitting we have to deal with this by likely backing those into a primitive array and generating synthetic methods for the splits. Our IR design had not considered the need for it so I think we would do something different if we started again. Splitting is also desirable from the standpoint that uncommon paths can be split out so that the JVM can see a smaller common path method.

andreaTP · 2024-07-02T09:58:58Z

Thanks for sharing your experience @enebo !

Our fallback when we cannot compile is to just interpret the code.

Let see how this goes, but this is a safe net we might want to implement for multiple reasons.

electrum changed the title ~~ATO Engine error: Method too large~~ AOT Engine error: Method too large Jun 26, 2024

bhelx mentioned this issue Jul 30, 2024

AOT Engine Error: Class too large #453

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AOT Engine error: `Method too large` #405

AOT Engine error: `Method too large` #405

bhelx commented Jun 24, 2024

electrum commented Jun 26, 2024

electrum commented Jun 26, 2024

evacchi commented Jun 26, 2024

electrum commented Jun 26, 2024 •

edited

Loading

bhelx commented Jun 26, 2024

electrum commented Jun 26, 2024

andreaTP commented Jun 27, 2024

bhelx commented Jun 27, 2024

enebo commented Jun 28, 2024

andreaTP commented Jul 2, 2024

AOT Engine error: Method too large #405

AOT Engine error: Method too large #405

Comments

bhelx commented Jun 24, 2024

electrum commented Jun 26, 2024

electrum commented Jun 26, 2024

evacchi commented Jun 26, 2024

electrum commented Jun 26, 2024 • edited Loading

bhelx commented Jun 26, 2024

electrum commented Jun 26, 2024

andreaTP commented Jun 27, 2024

bhelx commented Jun 27, 2024

enebo commented Jun 28, 2024

andreaTP commented Jul 2, 2024

AOT Engine error: `Method too large` #405

AOT Engine error: `Method too large` #405

electrum commented Jun 26, 2024 •

edited

Loading