Allows to introspect Python modules from cdylib: first step #3977
Thanks for moving this forward! The idea of using custom data sections is new to me. I see the upsides of it, though I am slightly worried by the extra complexity of needing to worry about linker details in yet another way.
I agree that having the macros generate file(s) is unlikely to be the right solution 👍
For the library which converts the metadata to In fact, I wonder if for option (3) explored here then the using a test to generate and update the
If we agree that using a test to update stubs is a good solution, then I think the choice between runtime code like (2) and data segments like (3) is probably just influenced by whatever is easier for us to implement. We might even be able to swap back and forth between these two options as an implementation detail as we learn. |
Yes! Having played a bit with both, runtime code like (2) is way easier (the difference between this MR and #2454 is quite significant). If I try to summarize the pros of each approach:
Approach 3: add introspection to the cdylib and let maturin write the stubs on build:
Approach 4: use a test to write/update stubs:
|
You make a very good point that automated tests to update a One thing that I expect is that Overall I don't have a good sense of whether option 3 or 4 is better. In an ideal world we might offer both options. Which one do you think would meet your needs better at present? Maybe we start by implementing that and we learn a lot by doing so! |
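For readers following the "use a test to update stubs" idea, here is a minimal sketch of what such a check could look like. It is not code from this PR: the `sync_stub_file` helper, the `StubStatus` enum and the idea of an opt-in update flag are all assumptions made for illustration. The helper compares freshly generated stub text with the checked-in `.pyi` file and only rewrites it when asked to.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Outcome of comparing freshly generated stubs with the checked-in file.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
enum StubStatus {
    /// The file on disk already matches the generated stubs.
    UpToDate,
    /// The file was missing or outdated and has been rewritten.
    Updated,
    /// The file is missing or outdated and was left untouched.
    Stale,
}

/// Compares `generated` with the file at `path`.
/// When `update` is true, a missing or outdated file is rewritten;
/// otherwise the caller is only told that the file is stale.
fn sync_stub_file(path: &Path, generated: &str, update: bool) -> io::Result<StubStatus> {
    // A missing file is treated like an outdated one, not like an error.
    let current = match fs::read_to_string(path) {
        Ok(content) => Some(content),
        Err(e) if e.kind() == io::ErrorKind::NotFound => None,
        Err(e) => return Err(e),
    };
    if current.as_deref() == Some(generated) {
        return Ok(StubStatus::UpToDate);
    }
    if update {
        fs::write(path, generated)?;
        Ok(StubStatus::Updated)
    } else {
        Ok(StubStatus::Stale)
    }
}
```

A test could call this with `update` derived from an opt-in environment variable (the name would be the project's choice) and fail when the result is `StubStatus::Stale`, so CI catches outdated stubs while a developer can refresh them with a single command.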
I would love to avoid people handwriting customizations in stubs, because it makes automatically updating the stubs when the Rust code changes very hard. IMHO, automatically updating stubs to reflect changes in the Rust code is the main value proposition of auto-generating stubs in the first place.
An idea: add entry points in PyO3 macros to extend the stubs. For example (rough idea, not sure about the actual details):
#[pymodule]
#[py(extra_stub_content = "
K = TypeVar('K')
V = TypeVar('V')
")]
mod module {
    #[pyclass(stub_parent = [collections::abc::Mapping::<K,V>])]
    struct CustomDict;
    #[pymethods]
    impl CustomDict {
        #[pyo3(signature = (key: K), return_type = V | None)]
        fn get(&self, key: &Bound<'_, PyAny>) -> Option<Bound<'_, PyAny>> {
        }
    }
}
would generate
K = TypeVar('K')
V = TypeVar('V')
class CustomDict(collections.abc.Mapping[K,V]):
    def get(self, key: K) -> V | None: ...
This way stubs would stay auto-generated but can be improved by the author.
A possible way to mix options 3 and 4:
|
Agreed that having the proc macros be able to collect all the necessary information would be nice. I think only time will tell whether they can meet all user needs! I'm slightly wary of coupling to cc @messense do you see any concerns with adding this to So what are the next steps here? Do you want me to start reviewing this code, or will you push more first? Regarding the data sections, I happened to hear yesterday that UniFFI's proc macros can do something similar about shipping definitions in the shared library, so it might be interesting to look at / ask them how that was implemented. |
No concern, I think a |
Thank you!
Yes! My hope is to cover as many as possible.
In addition to
Thank you! I'm going to have a look at it.
I think the current draft already shows the relevant direction; a very high-level code review to check whether it's going in the right direction would be welcome. Maybe wait for me to have a look at UniFFI first: I might change this MR a bit if I find interesting things there. Thank you! |
This is very exciting, looking forward to being able to generate type stubs! Currently we have this lengthy and hard to maintain Python script for doing so, which we have to update by hand: https://github.com/Chia-Network/chia_rs/blob/main/wheel/generate_type_stubs.py This would be a major improvement. Happy to help out however I can (testing, implementation, whatever) as time allows, to hopefully get this out the door 😄 |
@Rigidity Thank you! I plan to work on this MR to get the basics done. Then there will be a lot of features to add incrementally on top of it (support for all PyO3 features...), so help with coding and testing will be very welcome! |
Sorry for the very long reaction delay (a lot of priorities + vacations).
I had a look at UniFFI: they basically use the same approach as us, embedding the metadata in the binary and then parsing the binary using the same @davidhewitt If you have time, could you have a quick look at the MR to see if the overall design is going in a good direction? If yes, I will fix the many shortcuts I took and get the MR ready for review. |
Thanks, yes I'm happy with us proceeding with this approach!
I'm definitely convinced by the technical direction here for the generation process; the main risk I still see is how to give users the full power to customise the stubs with generic args etc.
I think we can learn that piece by piece as we proceed.
I think so too; I also imagine we might want to strip the custom link section after the stubs have been extracted, it feels like it's probably easier to do that by having a distinct section. |
Thank you! I agree on the risk. My guess is that we will introduce a set of macro arguments for that, but getting them right won't be easy. |
Isn't type stub generation only something that would have to happen on a development host machine? Maybe cross platform support would be fine to be somewhat limited as long as it's not done in regular builds |
The challenge with this approach is that it is very hard in Rust to know all the elements (classes, functions...) that might be exposed on any platform, because code disabled for a platform is simply not compiled and its macros are not evaluated. Hence, if a library has some WASM-specific code, it is very hard to build introspection data for it while doing a Linux build. I therefore went in the direction of generating stubs for each platform at compile time. Distribution-side, it would mean that the source |
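To illustrate the point about platform-gated code, here is a small sketch (all names are hypothetical and not taken from PyO3). An item behind `#[cfg(...)]` is removed before macros attached to it would run, so a build for one target has no way to describe items that exist only on another target.

```rust
// Exists only when compiling for wasm32; on any other target this item is
// removed before any macro attached to it could run.
#[cfg(target_arch = "wasm32")]
#[allow(dead_code)]
fn wasm_only() {}

#[allow(dead_code)]
fn always_available() {}

/// Names of the items that this particular build knows about.
fn exposed_items() -> Vec<&'static str> {
    let mut items = Vec::new();
    items.push("always_available");
    // This statement is compiled out on non-wasm targets, mirroring what
    // happens to introspection data attached to a cfg-gated item.
    #[cfg(target_arch = "wasm32")]
    items.push("wasm_only");
    items
}
```

On a Linux build `exposed_items()` contains only `always_available`, which is why stubs generated at compile time describe the current target only.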
@Tpt Do you need help with implementing/testing this? Or are you pretty much done and waiting for review? |
@yogevm15 A basic MVP (generating stubs with just the list of classes and functions) is done. I would love to get a review on it before adding more features, to avoid a huge MR. I already have some nearly finished code on my laptop for function signatures, but a lot of work still needs to be done for class introspection. Help will be very welcome in this area once this MR is merged |
Sorry for the huge delay from me on this one. I am generally beginning to catch up with the backlog from paternity leave; I will try to review this ASAP in the coming days so that we can unblock and move forward. |
@davidhewitt Congratulations! Take your time, babies grow up so fast. Thank you! |
@Tpt I was trying to test your feature, but I couldn't make it work with debug builds:
I am on macOS. Code to reproduce:
use pyo3::prelude::*;
#[pyclass]
struct DummyClass {}
#[pymethods]
impl DummyClass {
#[staticmethod]
fn get_42() -> PyResult<usize> {
Ok(42)
}
}
#[pymodule]
pub mod pyo3_pure {
#[pymodule_export]
use super::DummyClass;
}
I added a
use std::env;
use std::path::PathBuf;
use pyo3_introspection::{introspect_cdylib, module_stub_files};
fn main() {
let binary = PathBuf::from(env::args().nth(1).unwrap());
let module = introspect_cdylib(binary, "pyo3_pure").unwrap();
let actual_stubs = module_stub_files(&module);
dbg!(actual_stubs);
}
Then I take the output wheel of |
…eneration-static # Conflicts: # pyo3-macros-backend/Cargo.toml # pyo3-macros-backend/src/pyclass.rs # pyo3-macros/Cargo.toml # pytests/src/lib.rs
This is a first step to introspect Python modules built by PyO3.
A missing piece in the story listed in #2454 is how tools like Maturin move the introspection information generated by PyO3 into the type stub files included in the built wheels.
I see three approaches for it:
1. pyo3-macros generates a file with the stubs after having processed all macros of the crate. This has the advantage of being self-contained in the crate, but it falls short in cases like Python classes declared in one crate but exposed in another: there is no guarantee that the proc macros of a crate and its dependencies are compiled in the same process, nor that proc macros will still be able to write files in the future (for example under the proposal to run them in a WASM sandbox).
2. __pyo3_stubs_MODULE_NAME function. However, for the build system to execute it, a compatible Python interpreter must be present to link with, as well as a compatible CPU or VM to run it, which makes generation during cross-compilation very hard. I guess it's what Python Interface (.pyi) generation and runtime inspection #2454 was heading toward.
Architecture:
- pyo3_data0 section that contains a JSON "chunk" with the introspection data. Code in pyo3-macros-backend/src/introspection.rs. I had to do some bad hack to generate the segments properly via Rust static elements.
- PYO3_INTROSPECTION_ID constants, allowing the code building the JSON chunk to get the global id of e.g. a class C via C::PYO3_INTROSPECTION_ID. This allows chunks to refer to other chunks easily (e.g. to list module elements). A bad hack is used to generate the ids (TypeId::of would have been a nicer approach but is not const on Rust stable yet).
- The 0 at the end of pyo3_data0 is a version number, allowing breaking changes in the introspection format.
- pyo3-introspection crate parses the binary using goblin (a library also used by Maturin), fetches all the pyo3_data0 segments (only implemented for macOS Mach-O in this experiment), and builds nice data structures.
- pyo3-introspection crate would implement a to_stubs function converting the data structures to type stubs.
- pyo3-introspection has an integration test doing introspection of the pyo3-pytests library.
Current limitations:
- FromPyObject::type_input into an associated constant or a const function, and similarly for IntoPy::type_output. This is mandatory in order to make use of them in the static values added to the generated binary.