Python support building extension modules in C. These extension modules are usually linking into a shared library (so
) and then loaded during runtime using dlopen
.
A good example on how this can be done is here.
Interesting to note is that the compilation of these C
modules is usually done using a 'setup.py` script. For the above mentioned example that script looks like this:
from distutils.core import setup, Extension
def main():
setup(name="fputs",
version="1.0.0",
description="Python interface for the fputs C library function",
author="Dirk Baeumer",
author_email="dirk.baeumer@gmail.com",
ext_modules=[Extension("fputs", ["fputsmodule.c"])])
if __name__ == "__main__":
main()
This script takes care of all the right setup of the compiler and linker tool chain.
To support C extension modules under WASM we need to find answer to the following problems:
- how to build the code. The setup.py script can not be used in that form since running python in a WebAssembly will not give us access to the WASM compiler tool chain to use.
- how to load the code. Shared libraries and
dlopen
are not supported under WASM-WASI and only support with limitation under Emscripten.
Best option is to use the wasm-env
script from CPython's Tools\wasm
folder to setup the same environment as CPython does when building python. We also must make sure to use the pyconfig.h
from the WASM Python build. The other header files can be used from a normal Python Dev install, however we need to ensure that the versions match to the WASM Python build.
To ease this we should start considering bundling the include files into a separate dev tar and make it available for download together with the corresponding WASM builds.
When it comes to shared libraries we need to distinguish two areas:
- the compiler and linker tool chain to produce shared libraries.
- WASM support to load a shared library and to do the dynamic linking.
A proposal for the toolchain exists here. It specifies which object code needs to be emitted to build relocatable WASM libraries.
I tested the Emscripten tool chain and the latest WASI-SDK and both are able to emitted the proposed format. However both claim the following
wasm-ld: warning: creating shared libraries, with -shared, is not yet stable
Loading a shared library either during start of the main wasm or using dlopen
is currently not supported in WASI-SDK / LLVM. The WASI-SDK doesn't ship a dlfcn.h
header hence dlopen
can't be used.
Emscripten offers support for dynamic libraries however it is limited compared to the support you usually have on an OS like Linux. Emscripten supports dynamic libraries as side modules and a main module:
- Main modules, which have system libraries linked in. A project should contain exactly one main module.
- Side modules, which do not have system libraries linked in.
The actual linking between side and main modules happens in JavaScript in Emscripten.
For Python extensions in C this would mean that the python.wasm
would be the main module and the native extension would be a side module. Since the side module needs to get system libraries from the main module (which need to be linked in statically) we need to know upfront when compiling python.wasm
which functions a possible side module needs. For simple side modules this might work but will very likely fail if the native Python extension (side module) is more complex.
Since we need to curate native extensions at the beginning and need to make sure that all system library functions are available at runtime we could alternatively link them statically.
In the native example from above the native code resides in a file called fputsmodule.c
. Here would be the steps to link a new python.wasm containing the fputs
native module:
-
compile
fputsmodule.c
to a WSAM obj file:clang -c fputsmodule.c -o fputsmodule.o
-
now we need to create a new
python.wasm
using the following command:clang -z stack-size=524288 -Wl,--stack-first -Wl,--initial-memory=10485760 -lwasi-emulated-signal -lwasi-emulated-getpid -lwasi-emulated-process-clocks -o python.wasm Programs/python.o libpython3.12.a Modules/_decimal/libmpdec/libmpdec.a Modules/expat/libexpat.a fputsmodule.o -Wl,--export=PyInit_fputs
I extracted parts of the command from the Python makefile. This produces a
python.wasm
with an exported entryPyInit-fputs
which we could then call when thefputs
module gets imported (e.g.import fputs
)
However such an approach would require some modification to
- the published package. We would need to include the libraries
libmpdec.a
andlibexpat.a
- the way Python loads a native module. Ideas are (not sure how feasible they are in Python):
- we have special code in Python for
wasm-wasi
that directly reaches to that function - we provide a library with
dlopen
anddlsym
with a special implementation that calls the exported WASM function via JavaScript. - we treat the native modules as builtin and initialize them with other builtin features.
- we have special code in Python for
My preferred solution would actually be our own dlfcn.h
implementation for that special purpose.