Embed compilation cache contents via go:embed #1733
Comments
See: Or: If you're creating a library, putting the embed file in another package that sets a Do cache the compilation result, but don't compile on |
I do not want to embed the |
I'll reopen, and leave it to the core team to address, but I'm pretty sure that won't happen any time soon. I have, in the past [1], tried to adapt the compilation cache to use embed. So it is possible, but it's not very practical, for various reasons. Footnotes |
I don't think we have support for this considering the complexity vs the practical gain. Why not just use the temporary dir? |
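A minimal sketch of the temporary-directory approach suggested above, using only the Go standard library. The directory name `myapp-wazero` and the helper `pickCacheDir` are hypothetical, and the sketch assumes the chosen directory would then be handed to wazero's directory-backed compilation cache (`wazero.NewCompilationCacheWithDir`, as far as I can tell from cache.go). On a read-only filesystem no candidate is writable, so the program would fall back to compiling without a persistent cache.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// pickCacheDir returns the first candidate directory that can be created and
// written to. It returns "" when no candidate is writable (for example on a
// read-only filesystem), in which case the caller compiles without a
// persistent cache.
func pickCacheDir(candidates ...string) string {
	for _, dir := range candidates {
		if dir == "" {
			continue
		}
		if err := os.MkdirAll(dir, 0o755); err != nil {
			continue
		}
		// Probe writability by creating and removing a temporary file.
		f, err := os.CreateTemp(dir, "probe-*")
		if err != nil {
			continue
		}
		f.Close()
		os.Remove(f.Name())
		return dir
	}
	return ""
}

func main() {
	// "myapp-wazero" is a hypothetical directory name for this sketch.
	userCache, _ := os.UserCacheDir() // empty if it cannot be determined
	if userCache != "" {
		userCache = filepath.Join(userCache, "myapp-wazero")
	}
	dir := pickCacheDir(userCache, filepath.Join(os.TempDir(), "myapp-wazero"))
	if dir == "" {
		fmt.Println("no writable directory: compiling without a persistent cache")
		return
	}
	fmt.Println("compilation cache directory:", dir)
}
```

The trade-off discussed in this thread still applies: the first run on each machine pays the compilation cost, but the cached code is always produced by the CPU that runs it.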
For me, one of the major appeals of the Go runtime/ecosystem is its focus on operational simplicity. A Go program typically has minimal runtime dependencies: it only needs the kernel, it builds into a single-file binary, and that binary can run from a read-only filesystem. IMHO, this should also extend to Go-based libraries/runtimes like wazero. In wazero's case, for me, that means a simpler operational side too: a way to ship as a single-file binary that can also run from a read-only filesystem. FWIW, I do not really know the wazero implementation. I only used it for the first time yesterday, so I've barely scratched its surface. I just wanted to say that this is one of its aspects that I find a bit unexpected, and it should have a way to load everything from the go |
The cache is CPU-specific (it depends not just on the architecture but also on the generation of that architecture), and you really don't want to embed it inside the executable: how do you know the exact CPU architecture of the users of your executable? |
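To make the CPU-specificity point concrete, here is a hedged sketch of a cache key that folds the target OS, architecture, and detected CPU features into a hash of the module. This is an illustration only, not how wazero keys its cache. The function `cacheKey` and the feature names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"runtime"
	"sort"
	"strings"
)

// cacheKey derives a key for compiled output. features is a hypothetical list
// of CPU feature names detected on the machine that compiled the module.
func cacheKey(wasm []byte, goos, goarch string, features []string) string {
	// Sort a copy so the key does not depend on detection order.
	sorted := append([]string(nil), features...)
	sort.Strings(sorted)

	h := sha256.New()
	h.Write(wasm)
	fmt.Fprintf(h, "\x00%s\x00%s\x00%s", goos, goarch, strings.Join(sorted, ","))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	wasm := []byte("\x00asm\x01\x00\x00\x00") // header of an empty wasm module

	// Two machines with the same GOOS/GOARCH but different CPU generations.
	// The feature names are illustrative only.
	older := cacheKey(wasm, runtime.GOOS, runtime.GOARCH, []string{"sse4.1"})
	newer := cacheKey(wasm, runtime.GOOS, runtime.GOARCH, []string{"sse4.1", "lzcnt"})

	fmt.Println("same key across CPU generations:", older == newer) // prints false
}
```

Under this scheme, an entry embedded at build time would only be a cache hit for users whose CPU reports the same feature set as the build machine. Everyone else would recompile anyway.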
what I am saying is the cache you mentioned |
that's why what @ncruces mentioned, and what everyone in the community does, is to embed |
Having said that, @ncruces also said that it is possible to provide read-only caches, so what I really want to see is compelling real-world use cases and real numbers/reasons why embedding |
without knowing the long-term goals of wazero, one way is to redefine/simplify the problem and limit the emitted machine code like the go compiler does with
indeed, that part was quite straightforward and worked nicely at my first try of the wazero library. not being able to embed all the compiled binary code is what surprised me. please note that embedding the
I did look at the suggestion and I'm still trying to understand it. from what I can tell from the comments at https://github.com/tetratelabs/wazero/blob/v1.5.0/cache.go#L19-L27, it seems it's not possible to provide our own cache implementation without forking wazero. and from what you've said, it seems that currently it's-not/will-never-be possible to limit the emitted instructions, so this looks like it will be quite infeasible to implement/maintain outside wazero.
as I've tried to describe, not having a cache eliminates a whole class of problems. having it embedded minimizes those problems, while compromising the instruction set that can be used. |
The compiled code actually does not (currently) depend on A JIT can make an important assumption: the CPU generating the code is the same CPU that runs the code. This means it can test CPU features once, at compile time, and then generate code accordingly. The Go compiler, OTOH, has to gate usages of these kinds of CPU instructions, or not use them at all. This is not theoretical: here are two cases where the generated code already depends on optional features of the wazero/internal/engine/compiler/impl_amd64.go Lines 1271 to 1277 in 3b8b3fb
wazero/internal/engine/compiler/impl_amd64.go Lines 1334 to 1340 in 3b8b3fb
Also, these code paths have been a source of issues in the past, because they're harder to test (see: #1111, #1112).
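The assumption described above can be sketched as follows. This is not wazero's code: the types `cpuFeatures` and `compiled`, the functions `compileClz` and `safeOn`, and the choice of LZCNT as the optional feature are all hypothetical, picked only to illustrate how code generated after a compile-time feature test becomes unsafe once it is moved to a different CPU.

```go
package main

import "fmt"

// cpuFeatures is a hypothetical snapshot of optional CPU features.
type cpuFeatures struct {
	hasLZCNT bool
}

// compiled is a stand-in for generated machine code, plus the optional
// features that code depends on.
type compiled struct {
	instrs   []string
	requires cpuFeatures
}

// compileClz picks an instruction sequence for "count leading zeros" by
// testing the host CPU once, at compile time.
func compileClz(host cpuFeatures) compiled {
	if host.hasLZCNT {
		// Fast path: a single instruction, valid only where LZCNT exists.
		return compiled{instrs: []string{"lzcnt"}, requires: cpuFeatures{hasLZCNT: true}}
	}
	// Portable path: bit scan plus a fixup step, no optional features needed.
	return compiled{instrs: []string{"bsr", "fixup"}}
}

// safeOn reports whether previously compiled code may run on the target CPU.
func safeOn(c compiled, target cpuFeatures) bool {
	return !c.requires.hasLZCNT || target.hasLZCNT
}

func main() {
	build := cpuFeatures{hasLZCNT: true} // machine that produced the cache
	user := cpuFeatures{hasLZCNT: false} // machine that loads the cache

	code := compileClz(build)
	fmt.Println("emitted:", code.instrs)
	fmt.Println("safe on build machine:", safeOn(code, build))
	fmt.Println("safe on user machine:", safeOn(code, user))
}
```

When compilation and execution happen on the same machine, the `safeOn` check is trivially true and can be skipped. An embedded cache would need either such a check with a fallback to recompilation, or a compiler restricted to the portable path.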
Yes, it's hard to maintain outside of wazero. I did that to understand if:
Maybe you can take this forward? ISTM wazero doesn't want to shoulder the complexities of this distribution model, and I actually agree with that sentiment. So, for 3 I think you want to figure out the minimal subset of APIs necessary to make it so you can handle the complexity outside wazero. Then, if you still think the benefits (speed of instantiation) would outweigh the costs (much larger binaries, complex I mean, if you find the right balance, I'll probably go as far as to add this option to my own libraries, so you have my upvote. I simply gave up after trying, as "not worth the effort." |
I'm probably kinda mistaken about this, but JIT != AOT, and AOT is what wazero advertises itself as ("Compiler compiles WebAssembly modules into machine code ahead of time (AOT)"). To me, AOT sets some expectations, like being able to cross-compile the code in a way that is not tied to the current CPU/hardware; it's only tied to whatever CPU/features/hardware I mention in the compilation options. Maybe the "compile mode" of wazero should come with some kind of disclaimer about that. Since that is not what wazero currently does, it invalidates/answers the whole premise of trying to include the compiled wasm in the compiled go. It actually gave me another reason to be wary of the cache. I did appreciate all the explanations! Thank You! :-) |
JIT vs. AOT is blurry. Yes, wazero is not compiling, right before execution, only the bits that actually get executed. It compiles entire WASM modules at load time. Is “load time” “just in time” or “ahead of time”? That's just naming. The important thing to consider here is that wazero can assume the runtime CPU is the same as the compile-time CPU. So it can test the compile-time CPU for features and generate code for that specific CPU. If you break that with a cache, it's on you to fix it. Even beyond CPU features, wazero is not built to cross compile. You'd have to change it to produce |
Is your feature request related to a problem? Please describe.
At rgl/python-wazero-poc I'm playing with the idea of creating a single-file binary that embeds python and a python script.
Is this even possible? If so, how? :-)
Describe the solution you'd like
Having a way to embed the compiled wasm module in the go binary (e.g. via //go:embed python.wasm.bin).