Python.wasm roadmap #46

Open
5 of 8 tasks
assambar opened this issue Jan 27, 2023 · 31 comments

Comments

@assambar
Contributor

assambar commented Jan 27, 2023

This is a loose list of next steps that we think would improve the python.wasm binary that we build.

Any feedback in comments is greatly appreciated.

@brettcannon

Three questions and a request.

Question one: do you know about https://discuss.python.org/c/webassembly/28 ? It's where some announcements and discussions happen around Python and WebAssembly.

Question two: do you know about the #Python channel on the WebAssembly Discord? It's the other place some of us participate around Python and WebAssembly.

Question three: Is anyone from VMware looking to come to PyCon US this year? There may be a WebAssembly summit on Thursday, April 20 for a select number of people (space will be limited).

The request: please feel free to make suggestions upstream to improve how WASI is built, packaged, etc.! I want to get WASI to tier 2 support for CPython, and part of that will be creating WASI builds as part of releases. Trying to make sure those work as best as possible for the community would be great, so help would be appreciated. Discussions around that can happen at discuss.python.org or on a GitHub issue at https://github.com/python/cpython .

@cohix

cohix commented Jan 31, 2023

@assambar this is all excellent. I would love to see documentation/examples of how to do bi-directional function calls between the Wasm guest and the host. Specifically, calling a python function from the outside, and allowing python to call imported host functions other than those provided by WASI.

@assambar
Contributor Author

assambar commented Feb 2, 2023

@brettcannon

Questions 1 and 2 - I did not know about either discussion channel. Subscribed to both. Many thanks!
Question 3 - I don't know of anyone going, but I will ask around the Wasm enthusiasts here and let you know if anyone is.

The request: please feel free to make suggestions upstream to improve how WASI is built, packaged, etc.! I want to get WASI to tier 2 support for CPython, and part of that will be creating WASI builds as part of releases. Trying to make sure those work as best as possible for the community would be great, so help would be appreciated. Discussions around that can happen at discuss.python.org or on a GitHub issue at https://github.com/python/cpython .

Would love to. Suggestions will definitely come up as we see how people use python.wasm and what more they need. One thing: we are currently working to streamline building and publishing reproducible, versioned WASI builds of popular libraries like libuuid, libz, libsqlite, etc. We will add more as the need arises. Our idea is to publish those as release assets so that it will be easier to build CPython with more modules out of the box. I will let you know when we have something, if you're interested.

@assambar
Contributor Author

assambar commented Feb 2, 2023

@assambar this is all excellent. I would love to see documentation/examples of how to do bi-directional function calls between the Wasm guest and the host. Specifically, calling a python function from the outside, and allowing python to call imported host functions other than those provided by WASI.

Thanks for the feedback @cohix. I added this to the list of things to do at the beginning of this thread. Note that it's not a prioritized list.

@brettcannon

I will let you know when we have something, if you're interested.

Yes please! My hope is we can get a "fat" binary build for WASI distributed on python.org with as much statically linked as possible so people can take that python.wasm file and run it anywhere they want.

@gzurl
Contributor

gzurl commented Feb 2, 2023

@brettcannon that feedback is very useful. Indeed, we just released two different flavors for PHP 8.2.0. A standard build with common extensions and a slimmed-down version with minimal size. I think we could follow a similar approach with Python.

Also, I believe a specific build focused on ML, including the popular packages (NumPy, Pandas, Scikit, etc.), makes a lot of sense for running inference at the edge.

PS: We are waitlisted for a "charla" in PyCon

@codefromthecrypt

Is it possible to move some of the blog content into a README alongside the python directories here? While I play devil's advocate sometimes, I'm really interested in what you are doing.

For example, I think your blog fairly mentions that pip currently isn't usable, which reduces the size of Docker images accordingly. Some people look at image size alone and miss that, firstly, size isn't the main goal and, secondly, such a comparison isn't fair. Your blog is far more balanced, yet those points aren't visible in this repo.

https://wasmlabs.dev/articles/python-wasm32-wasi/

So, how about putting some of that here? The part about pip especially seems really important for a README, but I bet folks in the Python community have their favorite things to say too.

@bivald

bivald commented Feb 15, 2023

Numpy would be awesome to have in this :) Let me know if I can test out anything when you get closer

@assambar
Contributor Author

assambar commented Feb 20, 2023

I will let you know when we have something, if you're interested.

Yes please! My hope is we can get a "fat" binary build for WASI distributed on python.org with as much statically linked as possible so people can take that python.wasm file and run it anywhere they want.

Hey @brettcannon we published a "fat" binary with the latest release, relying on wasi-vfs, which turned out to be pretty easy. Take a look at python-3.11.1.wasm and python-3.11.1-wasmedge.wasm in python/3.11.1+20230217-15dfbed

Basically, you need to:

  • link libwasi_vfs.a into the final wasm binary
  • run wasi-vfs CLI to pack it with the folders you want

Take a look at scripts/build-helpers/wasi_vfs.sh and how the two functions from there are used in python/v3.11.1/wl-build.sh
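The packing step can be sketched as a small helper that only assembles the command line. This is an illustration rather than the contents of wasi_vfs.sh: the `pack` subcommand with `--mapdir GUEST::HOST` and `-o` follows my reading of the wasi-vfs README and should be checked against `wasi-vfs pack --help`, and the file names and directories below are hypothetical.

```python
import shlex


def wasi_vfs_pack_command(input_wasm, output_wasm, mappings):
    """Build the argv for a `wasi-vfs pack` call (flags are an assumption).

    `mappings` is a list of (guest_dir, host_dir) pairs; each host directory
    is embedded into the binary and shows up at guest_dir inside the module.
    """
    argv = ["wasi-vfs", "pack", str(input_wasm)]
    for guest_dir, host_dir in mappings:
        argv += ["--mapdir", f"{guest_dir}::{host_dir}"]
    argv += ["-o", str(output_wasm)]
    return argv


# Hypothetical paths: embed ./build/usr so the guest sees it as /usr.
cmd = wasi_vfs_pack_command(
    "python.wasm", "python-packed.wasm", [("/usr", "./build/usr")]
)
print(shlex.join(cmd))
```

The helper returns a list so it can be handed to `subprocess.run(cmd, check=True)` once the flags are confirmed for the installed wasi-vfs version.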

@brettcannon

relying on wasi-vfs

Are you referring to https://github.com/kateinoigakukun/wasi-vfs ? Or are you referring to something else?

  • run wasi-vfs CLI to pack it with the folders you want

So you're trying to use wasi-vfs to ship files with the WASI binary so it's as self-contained as possible (short of the runtime)? I assume this only works with pure Python files and there isn't some magical dlopen() support in there? And the big bonus compared to freezing the code in with the binary is avoiding the compile step and instead relying on the wasi-vfs CLI to do the joining?

@assambar
Contributor Author

assambar commented Feb 28, 2023

Are you referring to https://github.com/kateinoigakukun/wasi-vfs ?

Yep. That one.

So you're trying to use wasi-vfs to ship files with the WASI binary so it's as self-contained as possible (short of the runtime)?

Yep. We just packaged the usr/local/libs folder at / and we got an "all-in-one" python.wasm

I assume this only works with pure Python files and there isn't some magical dlopen() support in there?

Exactly. I'm still looking into how we could do this with modules that have C extensions. Without doing the uncharted dlopen support, I'm currently thinking of compiling those extension files to static wasm libs and then linking them along with libpython into a monolithic python.wasm, then packaging that with wasi-vfs. Naturally, a dlopen approach would be the best option, but I'm not sure how much time it would take to get it right. And it would require a host function.

And the big bonus compared to freezing the code in with the binary is avoiding the compile step and instead relying on the wasi-vfs CLI to do the joining?

I'd rather say the ease of use. It seems people get confused when they have to pre-open the standard library in order to use it.

@brettcannon

I'm still looking into how we could do this with modules that have C extensions. Without doing the uncharted dlopen support, I'm currently thinking of compiling those extension files to static wasm libs and then linking them along with libpython into a monolithic python.wasm

This sounds similar to something we are starting to explore a little for VS Code: create an embedded interpreter scenario where we use https://docs.python.org/3/c-api/import.html#c.PyImport_AppendInittab to make extension modules act as built-in modules.
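From inside the interpreter, one way to check whether a module really ended up as built-in is to look at the origin recorded in its import spec. The sketch below uses only `importlib.util.find_spec`; the helper name `module_origin` is made up for illustration, and the expectation that a module registered through PyImport_AppendInittab reports "built-in" is my assumption about how such builds behave, not something confirmed in this thread.

```python
import importlib.util


def module_origin(name):
    """Report how a module would be provided: 'built-in', 'frozen',
    a file path, 'unknown' (no origin recorded), or 'missing'."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised for a missing parent package or a module without a spec.
        return "missing"
    if spec is None:
        return "missing"
    return spec.origin or "unknown"


# "no_such_module_xyz" is a deliberately non-existent name.
for name in ("sys", "json", "no_such_module_xyz"):
    print(f"{name}: {module_origin(name)}")
```

The same helper also distinguishes a frozen stdlib from one loaded off a mounted or packed directory, since those modules report "frozen" or a file path respectively.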

I'd rather say the ease of use. It seems people get confused when they have to pre-open the standard library in order to use it.

For some future WASI release on python.org, I've been thinking of freezing the stdlib into the binary and then letting people mount the stdlib if they want it for tracebacks. That way the simple, easy-to-deploy solution is available there, while you and the rest of the community innovate on nicer, fancier WASI solutions.

@assambar
Contributor Author

assambar commented Mar 1, 2023

... create an embedded interpreter scenario where we use https://docs.python.org/3/c-api/import.html#c.PyImport_AppendInittab to make extension modules act as built-in modules.

That's exactly what I plan on doing.

For some future WASI release on python.org, I've been thinking of freezing the stdlib into the binary and then letting people mount the stdlib if they want it for tracebacks.

That would be awesome to have.

@brettcannon

That's exactly what I plan on doing.

If you get to do the work in the open, do let us know and we can potentially coordinate or at least share notes (we are still just playing around, so no code to share, but it's being done in the open so we can talk about it, etc.).

@zifeo

zifeo commented Mar 5, 2023

Great job on the fat binary, works like a charm!

Add documentation/examples of how to do bi-directional function calls between the Wasm guest and the host.

Are there already any pointers or tests I could use for that? I am especially interested to see how the host (e.g. WasmEdge via Rust) can run the Python guest continuously and trigger a specific Python function on an "event". Is there already something to that end (instead of relaunching _start with different arguments)?

@brettcannon

I am especially interested to see how the host (e.g. WasmEdge via Rust) can run the Python guest continuously and trigger a specific Python function on an "event".

Depending on how it's compiled, you could use Python's C API to accomplish this. So basically you could embed the Python interpreter and then call it that way from your own code.

@zifeo

zifeo commented Mar 7, 2023

@brettcannon Thanks for the insight, but it's not yet 100% clear in my mind. How would Rust call the Python C API running inside the Wasm runtime? Which part of the C API should I be able to call from Rust?

@codefromthecrypt

To make a comparison with other languages: in Rust and TinyGo, you can export functions so that they can be called outside the scope of WASI. Is there a way to export functions in Python (or any interpreter)? Otherwise, my guess is users will have to use some sort of busy loop and pass inputs and outputs via stdio or something.
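The stdio fallback mentioned above could look roughly like the sketch below: the guest stays alive, reads one JSON request per line from stdin, and writes one JSON reply per line to stdout. The message shape (`call`, `args`, `result`, `error`) and the `serve` helper are invented for illustration and are not taken from any existing runtime or project.

```python
import io
import json
import sys


def serve(handlers, stdin=sys.stdin, stdout=sys.stdout):
    """Answer JSON-lines requests until stdin is closed.

    `handlers` maps a function name to a Python callable.
    """
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between requests
        request = json.loads(line)
        handler = handlers.get(request.get("call"))
        if handler is None:
            reply = {"error": "unknown function"}
        else:
            reply = {"result": handler(*request.get("args", []))}
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()  # the host is waiting on this line


# Simulate the host with in-memory streams instead of real pipes.
fake_in = io.StringIO('{"call": "upper", "args": ["wasm"]}\n{"call": "nope"}\n')
fake_out = io.StringIO()
serve({"upper": str.upper}, fake_in, fake_out)
print(fake_out.getvalue(), end="")
```

Reading line by line blocks until the host sends something, so the loop does not spin while idle. The cost of this design is that every call is serialized through text, which is why exported functions are the more attractive option when the build allows them.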

@brettcannon

@zifeo You can call any part of the C API from Rust via unsafe or using something like PyO3. But I think you're after something more dynamic/external than compiling all of Python into your Rust code.

@zifeo

zifeo commented Mar 9, 2023

@brettcannon I am not looking to integrate Python code into a Rust app, but rather to use WasmEdge and interact with a running Python VM. This means that I would need a way to call host functions from within the runtime, but I am not sure which path to take. Shall I use Python ctypes? But to load what in the runtime?

@assambar
Contributor Author

assambar commented Mar 9, 2023

@zifeo we worked with Suborbital on something similar to what you want (I think). It does what @brettcannon suggested.

Take a look at this example, where the Python interpreter is wrapped by a Wasm module (written in C) and we have end-to-end host-to-python and python-to-host functions - https://github.com/vmware-labs/webassembly-language-runtimes/tree/bindings/experiments/se2-bindings. Of course, this requires some translation in the Wasm module.

If you want your Wasm module to behave like a full-blown Python interpreter on top of functionality like the above, it's as easy as calling Py_Main or Py_BytesMain in the main method, after initializing your "glue" module for python-to-host calls (called sdk in the above example). The important line here is PyImport_AppendInittab(SDK_MODULE, &PyInit_SdkModule), which ensures that sdk is a "builtin" module for the Python interpreter.
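On the Python side, the host-to-python direction can be pictured as a single entry point that the wrapping module looks up once and then calls for every event. The sketch below is a hypothetical illustration of that idea only: `register`, `dispatch`, and the JSON payload convention are invented here and are not the API of the se2-bindings experiment.

```python
import json

_HANDLERS = {}


def register(name):
    """Decorator that records a function under an event name."""
    def decorator(func):
        _HANDLERS[name] = func
        return func
    return decorator


def dispatch(name, payload_json):
    """Single entry point a wrapper could call for each host event.

    Takes and returns JSON strings so only text crosses the boundary.
    """
    handler = _HANDLERS.get(name)
    if handler is None:
        return json.dumps({"ok": False, "error": f"unknown event: {name}"})
    try:
        result = handler(**json.loads(payload_json))
    except Exception as exc:  # report the failure instead of crashing the guest
        return json.dumps({"ok": False, "error": repr(exc)})
    return json.dumps({"ok": True, "result": result})


@register("add")
def add(a, b):
    return a + b


print(dispatch("add", '{"a": 2, "b": 3}'))
```

Keeping one string-in, string-out function means the C glue only has to marshal two strings per call, whatever handlers the Python code defines.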

@zifeo

zifeo commented Mar 9, 2023

@assambar Awesome, thanks, exactly what I was looking for. I am close to reproducing this example with a Rust wrapper. I have encountered 2 issues so far:

  • using Rust as host, it seems difficult to patch for the sock_accept export. I understand this is a compatibility issue between WasmEdge and the WASI standard that came later, but I am unsure what steps are needed (or which GitHub issue I should track/open) for it to be solved.
  • using V8 as host, it seems that the call stack size has to be significantly increased to get past Python initialization. Is this known/expected, or should it be tracked somewhere?

Happy also to move the discussion elsewhere if you feel it does not belong here.

@brettcannon

... create an embedded interpreter scenario where we use https://docs.python.org/3/c-api/import.html#c.PyImport_AppendInittab to make extension modules act as built-in modules.

That's exactly what I plan on doing.

It turns out that @kesmit13 has already tested this and got it working with an example extension for Singlestore Labs' WASI build of CPython! He's currently trying to get NumPy to work but running into a circular import issue.

@assambar
Contributor Author

  • using Rust as host, it seems difficult to patch for the sock_accept export. I understand this is a compatibility issue between WasmEdge and the WASI standard that came later, but I am unsure what steps are needed (or which GitHub issue I should track/open) for it to be solved.

This is the WasmEdge issue - WasmEdge/WasmEdge#2056. Until they address it, here's what you can quickly do to make your code run on WasmEdge (however, sock_accept will be broken) - patch_wasmedge_wat_sock_accept.sh. Just note that this works only on optimized binaries (given the fast and ugly approach we took for the script). Alternatively, you could just try using Wasmtime as a host. It already has sock_accept, so you don't need to provide it as a host function.

  • using V8 as host, it seems that the call stack size has to be significantly increased to get past Python initialization. Is this known/expected, or should it be tracked somewhere?

I have not explored this. Maybe just log a separate issue (ideally with how you run V8 and also what you import in Python). Please note that there's already a known issue with the static libpython, which we built for the example - #79 - and I am looking at that one first, with priority.

@assambar
Contributor Author

It turns out that @kesmit13 has already tested this and got it working with an example extension for Singlestore Labs' WASI build of CPython! He's currently trying to get NumPy to work but running into a circular import issue.

This is great! I've been following your discussion since last week, but don't have the cycles to join in debugging yet.

@zifeo

zifeo commented Mar 16, 2023

@assambar Thanks for the answer. I managed to get a wrapper in Rust working and will soon release a repo with a full example!

Regarding the network, I have tried importing requests but all attempts using network failed so far (even IP only).

@brettcannon

Regarding the network, I have tried importing requests but all attempts using network failed so far (even IP only).

Outbound networking isn't supported in WASI preview 1, so that will very likely come down to whether your WASI runtime has support for outbound networking.
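A quick way to see what a given runtime allows is to attempt a plain TCP connection from the guest and report the error instead of raising. This is a diagnostic sketch only: `can_connect` is a made-up helper, the address in the example is hypothetical, and how a particular WASI build of the `socket` module fails (import error versus OSError) is an assumption.

```python
def can_connect(host, port, timeout=3.0):
    """Return (True, None) on success or (False, reason) on failure."""
    try:
        import socket  # may be missing or limited in some WASI builds
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except (ImportError, OSError) as exc:
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Hypothetical target: replace with a host and port you expect to reach.
    ok, reason = can_connect("127.0.0.1", 8000)
    print("outbound TCP works" if ok else f"outbound TCP failed: {reason}")
```

Trying a numeric IP first and a hostname second helps separate missing socket support from missing DNS resolution.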

@zifeo

zifeo commented Mar 22, 2023

@brettcannon @assambar Here is the example: https://github.com/metatypedev/python-wasi-reactor. I managed to make everything work thanks to your advice 🙏. I am now waiting on #71 to add some more tests and experiment further with async using Tokio. Happy to have your feedback.

@assambar
Contributor Author

@zifeo this looks like a good piece of work! I was happy to learn about Metatype. #71 is now fixed, and you can get an official WasmLabs build of libpython from https://github.com/vmware-labs/webassembly-language-runtimes/releases/tag/python%2F3.11.3%2B20230428-7d1b259.
Note that for a full-blown Python application to work you need a decent stack size. The suggested C linker options can be found in lib/wasm32-wasi/pkgconfig/libpython311.pc inside the tarball.

@zifeo

zifeo commented Jun 6, 2023

@assambar Thanks. We are now updating the reactor to support more than just registering lambdas.

With WasmEdge release 0.12.1 it seems that networking is finally available. Could you elaborate on what is required for pip to work, or at least on a way to install pure-Python dependencies? It would also be great to have a few hints on how to compile libraries with native code.

@brettcannon

@zifeo the command to have pip install pure Python dependencies only can be found in https://snarky.ca/testing-a-project-using-the-wasi-build-of-cpython-with-pytest/ , but you will probably need to do that outside of WebAssembly (I don't know how WasmEdge has "networking", but my guess is it isn't complete enough for CPython's socket support to work).

As for native dependencies, you will need to compile that into your Python binary as built-in extensions.
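As a rough sketch of the first point, the helper below assembles a pip invocation, meant to run with a native Python outside WebAssembly, that only accepts wheels tagged as pure Python. The flags are standard pip options, but I have not verified that they match the exact command in the linked post, and the package name and target directory are hypothetical.

```python
import sys


def pip_pure_python_command(requirements, target_dir, python_version="3.11"):
    """Build a pip argv that installs only pure-Python wheels into target_dir."""
    return [
        sys.executable, "-m", "pip", "install",
        "--target", target_dir,          # install into a plain directory
        "--only-binary", ":all:",        # wheels only, never build from source
        "--implementation", "py",        # no interpreter-specific wheels
        "--abi", "none",                 # no compiled ABI
        "--platform", "any",             # no platform-specific wheels
        "--python-version", python_version,
        *requirements,
    ]


# Hypothetical package and directory names.
cmd = pip_pure_python_command(["attrs"], "wasi-site-packages")
print(" ".join(cmd[1:]))
```

The resulting directory can then be made visible to the guest, for example by pre-opening it or packing it into the binary, and added to the module search path. Packages that only ship native wheels or source distributions will fail to install under these constraints, which matches the second point about compiling native dependencies in as built-in extensions.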
