Allow statically linked Python like PythonCall?  #496

Open
@marius311

Description

@marius311
Contributor

I'm not an expert and don't know the internals, but is there a reason PyCall can't do whatever PythonCall / juliacall does to let the user use any Python executable, including ones with a statically linked libpython? Is there anything preventing their approach from being used here? A probably related question is posted here: JuliaPy/PyCall.jl#988

Activity

oschulz

oschulz commented on Jul 17, 2022

@oschulz

That would be so awesome!

mkitti

mkitti commented on Nov 8, 2022

@mkitti
Member

Here is cjdoris's response to marius311 on the topic:
https://discourse.julialang.org/t/ann-pythoncall-and-juliacall/76778/16

I’ve encountered that issue before in pyjulia but don’t actually know its cause.

I imagine the difference is in how the packages load libpython. In JuliaCall, we pass ctypes.pythonapi._handle to PythonCall, which is a pointer to an already-open libpython. I assume PyJulia/PyCall opens libpython itself.

Indeed, he's right:
https://docs.python.org/3/library/ctypes.html

ctypes.pythonapi
An instance of PyDLL that exposes Python C API functions as attributes. Note that all these functions are assumed to return C int, which is of course not always the truth, so you have to assign the correct restype attribute to use these functions.
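As a minimal sketch of what the quoted documentation means in practice: Py_GetVersion returns a C string, so its restype has to be set before calling it through ctypes.pythonapi. The variable names here are illustrative only, and the printed values depend on the interpreter.

```python
import ctypes

# ctypes.pythonapi exposes the C API of the interpreter that is already running.
get_version = ctypes.pythonapi.Py_GetVersion

# ctypes assumes a C int return value by default; Py_GetVersion returns const char *.
get_version.restype = ctypes.c_char_p
get_version.argtypes = []

version = get_version().decode()
print(version)

# The system handle that juliacall is said to pass on to PythonCall.
handle = ctypes.pythonapi._handle
print(hex(handle))
```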

$ ldd `which python`
	linux-vdso.so.1 (0x00007ffe269cb000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fcf5c547000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fcf5c53f000)
	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fcf5c537000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fcf5c52f000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fcf5c447000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fcf5c21f000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fcf5c917000)

$ `which python`
Python 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:06:46) [GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes
>>> ctypes.pythonapi
<PyDLL 'None', handle 7f106618a2e0 at 0x7f10655c2980>

oschulz

oschulz commented on Nov 8, 2022

@oschulz

@mkitti so PyCall/pyjulia could do that as well?

mkitti

mkitti commented on Nov 8, 2022

@mkitti
Member

I think so. We technically just need the pointer.

oschulz

oschulz commented on Nov 8, 2022

@oschulz

Oh that would be awesome! I guess packages like PySR (@MilesCranmer), diffeqpy (@ChrisRackauckas) and so on would profit a lot from that as well.

mkitti

mkitti commented on Nov 23, 2022

@mkitti
Member

I would like to review the situation here.

Part of the issue is that pyjulia is only half of the equation here. The other half is PyCall.jl.

In JuliaPy/PyCall.jl#612, they were trying to load the python executable as libpython due to PIE (Position Independent Executables).

In the linked comment above, @cjdoris demonstrates that we do not need to load python executable or libpython since we could just reuse ctypes.pythonapi._handle as is done in juliacall / PythonCall. In juliacall, the pointer is passed through an environment variable.
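A rough sketch of the hand-off described here, done as a round trip inside one Python process. The variable name EXAMPLE_LIBPYTHON_PTR is hypothetical and is not claimed to be the name juliacall uses. The receiving side wraps the integer back into a PyDLL through the handle argument instead of opening a library by name.

```python
import ctypes
import os

# Hypothetical variable name, chosen for this example only.
ENV_VAR = "EXAMPLE_LIBPYTHON_PTR"


def export_handle():
    """Publish the already-open libpython handle as a decimal string."""
    os.environ[ENV_VAR] = str(ctypes.pythonapi._handle)


def import_handle():
    """Rebuild a PyDLL from the published handle without calling dlopen."""
    handle = int(os.environ[ENV_VAR])
    return ctypes.PyDLL(None, handle=handle)


export_handle()
lib = import_handle()
initialized = lib.Py_IsInitialized()
print(hex(lib._handle), initialized)
```

In the real setting the consumer would be Julia code in the same process, which would parse the integer into a pointer and pass it to dlsym.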

How is ctypes.pythonapi._handle loaded when Python is statically linked to libpython?

Looking into ctypes we see that pythonapi is set to PyDLL(None). The name argument and the _name field of PyDLL, a subclass of CDLL, are both set to None.

>>> import ctypes
>>> ctypes.pythonapi
<PyDLL 'None', handle 7f084713e2e0 at 0x7f08464d3e10>
>>> ctypes.pythonapi._name
>>> ctypes.pythonapi._name == None
True

_name is subsequently passed to _dlopen, which on POSIX systems is just the libdl C routine dlopen.

If we look at the man page for dlopen(3) we see this call to dlopen will return a handle to the executable.

If filename is NULL, then the returned handle is for the main program.
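To illustrate, here is a small sketch that opens the main program a second time with PyDLL(None) and compares the result with ctypes.pythonapi. It assumes a POSIX system; the handles are only expected to match where ctypes itself builds pythonapi this way, as on Linux. The helper name open_main_program is made up for this example.

```python
import ctypes
import sys


def open_main_program():
    """Open the running process itself via a NULL filename (POSIX only)."""
    if sys.platform == "win32":
        raise OSError("dlopen(NULL) has no direct equivalent on Windows")
    # A name of None is forwarded to dlopen as a NULL filename.
    return ctypes.PyDLL(None)


if sys.platform != "win32":
    main_program = open_main_program()
    # Symbols of the C API can be resolved through this handle.
    is_initialized = main_program.Py_IsInitialized()
    same_handle = main_program._handle == ctypes.pythonapi._handle
    print(hex(main_program._handle), is_initialized, same_handle)
```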

Can we obtain the pythonapi pointer handle with dlopen in Julia?

This suggests that we can use dlopen from Julia to obtain the same pointer. While there are a few layers of indirection involved, passing an empty string to Julia's Libdl.dlopen appears to work.

# Start from ipython
In [1]: import ctypes

In [2]: hex(ctypes.pythonapi._handle)
Out[2]: '0x7f054fcae2e0'

In [3]: from julia.api import LibJulia

In [4]: api = LibJulia.load()

In [5]: api.init_julia()

In [6]: api
Out[6]: <julia.libjulia.LibJulia at 0x7f054c735fd0>

# Launch Julia REPL from Python
In [7]: api.jl_eval_string(b"""
   ...: import REPL;
   ...: term = REPL.Terminals.TTYTerminal("dumb", stdin, stdout, stderr);
   ...: repl = REPL.LineEditREPL(term, true);
   ...: REPL.run_repl(repl);
   ...: """)
julia> using Libdl

julia> python_ptr = dlopen("")
Ptr{Nothing} @0x00007f054fcae2e0

We see above that the pointer from ctypes.pythonapi._handle is exactly the same pointer we obtain by invoking Libdl.dlopen("") in Julia.

Can we obtain symbols from this pointer?

julia> Py_IsInitialized = dlsym(python_ptr, :Py_IsInitialized)
Ptr{Nothing} @0x0000555d21fd3890

julia> ccall(Py_IsInitialized, Cint, ())
1

julia> Py_GetVersion = dlsym(python_ptr, :Py_GetVersion)
Ptr{Nothing} @0x0000555d21fe0580

julia> ccall(Py_GetVersion, Cstring, ()) |> unsafe_string
"3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:24:40) [GCC 10.4.0]"

Concluding statements

We can obtain ctypes.pythonapi._handle by calling dlopen("") in Julia when Julia is started from Python. For juliacall, an environment variable may not have to be used to transmit the pointer. For pyjulia and PyCall.jl, this simplifies the method of obtaining the pythonapi pointer.

cjdoris

cjdoris commented on Nov 23, 2022

@cjdoris

That's cool!

I just took a quick look from JuliaCall and it's true dlopen("") returns the same handle on Linux, but it throws an error on Windows:

could not load library ""
The parameter is incorrect.

Plus the behaviour of dlopen("") is undocumented, so personally I'm steering clear of it.

mkitti

mkitti commented on Nov 23, 2022

@mkitti
Member

While I agree that dlopen("") is undocumented at the Julia API level, it does correspond to the documented behavior at the C API level.

The use of ctypes.pythonapi._handle is also equally undocumented. The underlying mechanism basically depends on the same behavior.

cjdoris

cjdoris commented on Nov 23, 2022

@cjdoris

Actually ctypes.pythonapi is documented to be a PyDLL and PyDLL._handle is documented to be the system handle - in this case the underscore is not indicating an internal attribute, but is to avoid name clashes with symbols in the DLL.

mkitti

mkitti commented on Nov 23, 2022

@mkitti
Member

You're right, I concede the point.

https://docs.python.org/3/library/ctypes.html#ctypes.PyDLL._handle

Also, dlopen("") does not work on macOS. It really should be dlopen(C_NULL), but that doesn't work either. See JuliaLang/julia#22318. One would have to do

ccall(:jl_load_dynamic_library, Ptr{Cvoid}, (Ptr{Nothing},UInt32,Cint), C_NULL, RTLD_GLOBAL, Cint(1))

That does work.

xref: JuliaLang/julia#22318

mkitti

mkitti commented on Nov 23, 2022

@mkitti
Member

On macOS ctypes.pythonapi._handle is 0xfffffffffffffffe.

In [1]: import ctypes

In [2]: ctypes.pythonapi._handle
Out[2]: 18446744073709551614

In [3]: hex(ctypes.pythonapi._handle)
Out[3]: '0xfffffffffffffffe'

This is actually the value of RTLD_DEFAULT: https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/dlsym.3.html

If dlsym() is called with the special handle RTLD_DEFAULT, then all mach-o images in the process (except those loaded with dlopen(xxx, RTLD_LOCAL)) are searched in the order they were loaded. This can be a costly search and should be avoided.
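For reference, the arithmetic behind that value can be sketched as follows. It assumes the macOS definition of RTLD_DEFAULT as the pointer value -2, reinterpreted as an unsigned pointer-sized integer. The helper names are made up for this example.

```python
import ctypes


def as_unsigned_pointer(value):
    """Reinterpret a signed integer as an unsigned pointer-sized value."""
    bits = 8 * ctypes.sizeof(ctypes.c_void_p)
    return value % (1 << bits)


# Assumption: macOS defines RTLD_DEFAULT as ((void *) -2).
MACOS_RTLD_DEFAULT = as_unsigned_pointer(-2)
print(hex(MACOS_RTLD_DEFAULT))


def uses_rtld_default(handle):
    """Report whether a ctypes handle is the macOS RTLD_DEFAULT pseudo-handle."""
    return handle == MACOS_RTLD_DEFAULT
```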

cjdoris

cjdoris commented on Nov 23, 2022

@cjdoris

🤯

I've never actually tried JuliaCall on Mac. I wonder if it works. I should really set up tests and CI.

Edit: It works fine! And indeed the handle is that special value.

That very last sentence ("this can be a costly search") may explain why loading in ~100 symbols takes so long in PythonCall (~1sec), one reason why PyCall is much faster to load.

mkitti

mkitti commented on Nov 23, 2022

@mkitti
Member

On macOS, you can just dlopen the executable. At the moment the timing does not look terrible.

In [1]: from julia.api import LibJulia

In [2]: api = LibJulia.load()

In [3]: api.init_julia()

In [4]: api.jl_eval_string(b"""
   ...: import REPL;
   ...: term = REPL.Terminals.TTYTerminal("dumb", stdin, stdout, stderr);
   ...: repl = REPL.LineEditREPL(term, true);
   ...: REPL.run_repl(repl);
   ...: """)

julia> python_path = ccall(:_dyld_get_image_name, Cstring, (UInt32,), 0) |> unsafe_string
"~/miniforge3-x86_64/envs/pyjulia_test_x86_64/bin/python3.11"

julia> python_handle = dlopen(python_path)
Ptr{Nothing} @0x000000021ba297e0

julia> Py_IsInitialized = dlsym(python_handle, :Py_IsInitialized)
Ptr{Nothing} @0x0000000104d78020

julia> ccall(Py_IsInitialized, Cint, ())
1

julia> @btime dlsym(python_handle, :Py_IsInitialized)
  253.612 ns (1 allocation: 16 bytes)
Ptr{Nothing} @0x0000000104d78020

julia> RTLD_DEFAULT = Ptr{Nothing}(0xfffffffffffffffe)
Ptr{Nothing} @0xfffffffffffffffe

julia> @btime dlsym(RTLD_DEFAULT, :Py_IsInitialized)
  270.565 ns (1 allocation: 16 bytes)
Ptr{Nothing} @0x0000000104d78020

PyCall.jl does a lot of symbol loading during precompilation. That is going to make it difficult to use this pointer, and it is also why PyCall.jl doesn't work with a statically linked python executable unless compiled_modules = false (i.e. no precompilation).

My thought is that this could benefit from a lazy symbol loading scheme such as the one I put into GR.jl:
https://github.com/jheinen/GR.jl/blob/db3e5f53738be892b23317d673179a32b0e50910/src/funcptrs.jl#L74-L86
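To make the idea concrete, here is a rough Python sketch of lazy symbol loading. It is not the GR.jl implementation; the LazySymbols class is hypothetical. Each symbol is resolved the first time it is requested and the address is cached, so nothing is looked up at load (or precompile) time.

```python
import ctypes


class LazySymbols:
    """Resolve symbol addresses on first use and cache them (illustrative only)."""

    def __init__(self, library):
        self._library = library
        self._cache = {}
        self.lookups = 0  # counts real lookups, to show the cache working

    def address(self, name):
        if name not in self._cache:
            self.lookups += 1
            func = getattr(self._library, name)  # the symbol lookup happens here
            self._cache[name] = ctypes.cast(func, ctypes.c_void_p).value
        return self._cache[name]


symbols = LazySymbols(ctypes.pythonapi)
first = symbols.address("Py_IsInitialized")
second = symbols.address("Py_IsInitialized")  # served from the cache
print(hex(first), symbols.lookups)
```

In Julia the same pattern would keep the pointers out of the precompiled image and fill them in at runtime, once the handle is known.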

cjdoris

cjdoris commented on Nov 24, 2022

@cjdoris

My thought is that this could benefit from a lazy symbol loading scheme such as the one I put into GR.jl:
https://github.com/jheinen/GR.jl/blob/db3e5f53738be892b23317d673179a32b0e50910/src/funcptrs.jl#L74-L86

Yeah thanks, I've got something similar in a branch somewhere....


          Allow statically linked Python like PythonCall? · Issue #496 · JuliaPy/pyjulia