Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add path option for Python source. #335

Closed
wants to merge 1 commit into from
Closed

Conversation

sebpuetz
Copy link

Forcing the Python package to be in the project root leads to difficulties testing and evaluating the package:

  • it's not possible to import the installed mixed package, import my_package will import from the folder in the project instead of the install under site-packages.
  • running pytest from the project root suffers the same consequences: tests will import from the folder in the project instead of the installed version.
  • consequently, only in-place builds with maturin develop can be tested in the current setup.

Not sure whether I missed something or if there are reasons that make this solution sub-optimal, looking forward to some feedback!

@sebpuetz sebpuetz force-pushed the py-src branch 2 times, most recently from 2480206 to 658b25f Compare July 26, 2020 14:53
@sebpuetz
Copy link
Author

Not sure what the failing CI on Mac is about, it reports that serde is missing? Shouldn't be related to this PR...

@ijl
Copy link
Contributor

ijl commented Jul 27, 2020

I think sdists require an option in pyproject.toml that is read in get_config_options() and made a command line argument.

@sebpuetz
Copy link
Author

Not sure what you mean with that, maturin sdist seems to work and includes package_dir={"": "py_src"} from setup.py.

Do you mean something different?

@ijl
Copy link
Contributor

ijl commented Jul 28, 2020

You could include setup.py, but maturin doesn't write or read setup.py. Maybe you have specified a call to maturin there, pip prefers it to pyproject.toml, and are not seeing the normal PEP 517 flow? I am thinking of running maturin sdist without a setup.py file and then pip install ./target/wheels/*.tar.gz. This should fail when trying to compile the contents of for example ./my_package rather than what is specified by py-src. It would need some configuration key/value in pyproject.toml that is read by get_config_options() and passed to a flag for the call to maturin, e.g. in the end maturin pep517 build-wheel -i python --py-src custom in https://github.com/PyO3/maturin/blob/master/maturin/__init__.py#L51-L59

@sebpuetz
Copy link
Author

Thanks for the explanation, I'll look into it some time tomorrow!

@davidhewitt
Copy link
Member

@sebpuetz thanks for this. Might it be worth also adding this option to the toml configuration in [package.metadata.maturin] ?

Also, I think CI issue now fixed on master if you rebase 👍

@sebpuetz sebpuetz force-pushed the py-src branch 5 times, most recently from 1bd3483 to 2cb5c3e Compare July 29, 2020 10:45
@sebpuetz
Copy link
Author

Thanks for the feedback, the directory for the Python source is now configurable through pyproject.toml and through the --py-src argument on the maturin subcommands. I changed the pyo3-mixed test crate to the nested layout and it seems to work as expected:

$ pip install . --no-build-isolation

installs directly from the directory and

$ maturin sdist
$ pip install target/wheels/*.tar.gz --no-build-isolation

works, too.

--no-build-isolation is required since otherwise maturin is fetched from pypa instead of using the locally installed maturin version. Should be possible to drop this flag once the changes are released.

@sebpuetz
Copy link
Author

Is there something other than the conflicting files that needs to be addressed to merge this PR?

@davidhewitt
Copy link
Member

I think the overall design question is one for @konstin, so I would prefer wait for their feedback on this proposal.

Questions in my mind that I'm not sure what the right answer is:

  • Do we need both --py-src and configuration through TOML? It keeps the project simpler if we only have one, and I think TOML configuration is what most maturin configuration uses so I prefer to just have TOML.
  • Should it be in pyproject.toml or Cargo.toml? I think most maturin config is in the Cargo.toml.
  • Even going further, should we be supporting custom project layouts at all? Sure, it gives people flexibility, but in the same fashion cargo does not allow custom project layouts, and that has led to the Rust ecosytem having very uniform projects. IMO this makes it easy for me to open up new Rust projects and start hacking. I appreciate that there is an issue here testing specific project layouts, but I wonder if this is something we should solve with e.g. maturin test subcommand rather than opening up the configuration. Or alternatively, we could add a "new" standard mixed layout where all python code lives inside e.g. a python/ subdirectory, which we encourage users to migrate towards.

Sorry I have done a bit of a u-turn when thinking about this more :(

@sebpuetz
Copy link
Author

sebpuetz commented Aug 1, 2020

  • Even going further, should we be supporting custom project layouts at all? Sure, it gives people flexibility, but in the same fashion cargo does not allow custom project layouts, and that has led to the Rust ecosytem having very uniform projects. IMO this makes it easy for me to open up new Rust projects and start hacking. I appreciate that there is an issue here testing specific project layouts, but I wonder if this is something we should solve with e.g. maturin test subcommand rather than opening up the configuration. Or alternatively, we could add a "new" standard mixed layout where all python code lives inside e.g. a python/ subdirectory, which we encourage users to migrate towards.

There's good reason to split up the project structure like this. Importing from the project root will always import from the package in the project rather than from site-packages (unless you remove the cwd from the path).

$ maturin build -i python
$ pip install target/wheels/pyo3_mixed-2.1.1-cp37-cp37m-manylinux1_x86_64.whl
$ python test_pyo3_mixed.py
Traceback (most recent call last):
  File "test_pyo3_mixed.py", line 3, in <module>
    import pyo3_mixed
  File "/data/rust_projects/maturin/test-crates/pyo3-mixed/pyo3_mixed/__init__.py", line 2, in <module>
    from .pyo3_mixed import get_21
ModuleNotFoundError: No module named 'pyo3_mixed.pyo3_mixed'

In a setting with pytest and with a tests folder containing a single test, trying to import the package:

$ maturin build -i python
$ pip install target/wheels/pyo3_mixed-2.1.1-cp37-cp37m-manylinux1_x86_64.whl
$ pytest
[...]
Traceback:
/home/seb/.pyenv/versions/3.7.7/lib/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_tests.py:1: in <module>
    import pyo3_mixed
pyo3_mixed/__init__.py:2: in <module>
    from .pyo3_mixed import get_21
E   ModuleNotFoundError: No module named 'pyo3_mixed.pyo3_mixed'

Although, this can be fixed by setting pytest --import-mode=append, which puts the cwd to the end of the path rather than beginning, thus yielding to the site-packages install.

If you want to drop into a interactive session and quickly check something, you need to navigate out of the project root or need an editable install. python -c "import pyo3_mixed" fails without that.

There's a decent blog-post about why the src project layout is preferable: https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure cited by pytest docs which offer some more reasoning why different layouts can be preferable https://docs.pytest.org/en/stable/goodpractices.html.

Imo, changing the folder structure let's you avoid quite a number of annoyances.

@konstin
Copy link
Member

konstin commented Aug 4, 2020

First of all thank you for the pull request and sorry I took so long to respond! Also thanks to @ijl and @davidhewitt for reviewing this!

  • it's not possible to import the installed mixed package, import my_package will import from the folder in the project instead of the install under site-packages.

  • running pytest from the project root suffers the same consequences: tests will import from the folder in the project instead of the installed version.

  • consequently, only in-place builds with maturin develop can be tested in the current setup.

I actually thought of that as feature, as this way you can edit python code and don't need to reinstall with maturin, or get outdated code from site-packages, but I guess that really depends on the workflow and I feel like we should definitely support module-not-in-root layouts such as the src-layout. (I actually thought about using the src/module_name when adding that feature, however I decided against it since it would confusing that the top level python module looks like a submodule of the rust tree)

Do we need both --py-src and configuration through TOML? It keeps the project simpler if we only have one, and I think TOML configuration is what most maturin configuration uses so I prefer to just have TOML.

Should it be in pyproject.toml or Cargo.toml? I think most maturin config is in the Cargo.toml.

I would prefer a key called python_source in Cargo.toml. I don't think we need a cli argument as the python source directory shouldn't be different for different invocations. I've put most of the config it in Cargo.toml since maturin already use the cargo metadata in there and because it's mandatory, while you can do without pyproject.toml if you don't build source distributions.

One thing that I might have just missed in the code is how the name mismatch is handled: If you package a module called my_module from some_dir/alt_python_source, is the directory renamed to my_module in the wheel?

  • Even going further, should we be supporting custom project layouts at all? Sure, it gives people flexibility, but in the same fashion cargo does not allow custom project layouts, and that has led to the Rust ecosytem having very uniform projects. IMO this makes it easy for me to open up new Rust projects and start hacking. I appreciate that there is an issue here testing specific project layouts, but I wonder if this is something we should solve with e.g. maturin test subcommand rather than opening up the configuration. Or alternatively, we could add a "new" standard mixed layout where all python code lives inside e.g. a python/ subdirectory, which we encourage users to migrate towards.

A key problem is that there currently is no consensus about package structure in python outside from the layout of a wheel (this is also what makes dependency resolution so hard in python: sdists don't have reliable metadata). There are discussions about standardizing metadata in pyproject.toml and a standard way to specify dependencies, but both seem to have stalled.

Another key difference is that unlike cargo run, python-the-command does not have a concept of a project or packages. Before PEP 517, the packaging standard was mostly "whatever setuptools does", and many packaging-related things are still underspecified.

I completely agree that a stricter layout is easier to use (and also less error-prone), which is why maturin has rather strong constraints, but in this case I think it's worth supporting an extra mode.

I'd also really like to offer proper editable installs which would make this a lot smoother from a user perspective, but unfortunately they are neither properly documented nor standardized. Another option for at least the shared library would be symlinks, however symlinks have issues on windows.

Although, this can be fixed by setting pytest --import-mode=append, which puts the cwd to the end of the path rather than beginning, thus yielding to the site-packages install.

If you want to drop into a interactive session and quickly check something, you need to navigate out of the project root or need an editable install. python -c "import pyo3_mixed" fails without that.

There's a decent blog-post about why the src project layout is preferable: https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure cited by pytest docs which offer some more reasoning why different layouts can be preferable https://docs.pytest.org/en/stable/goodpractices.html.

As I wrote above using src would confusing because it's already occupied by cargo/rust, but would an automatic fallback from my_module to python/my_module also suffice?

@ijl
Copy link
Contributor

ijl commented Aug 13, 2020

Using python_source seems like a pragmatic solution to me. It seems more likely to work than waiting on Python packaging for standardization or even then trying to enforce a standardized layout on existing projects. Multilingual projects' layout won't be as clear as a Rust-only project--that seems fine? This would solve the problem of not being able to ship type annotations alongside the shared library so even with future todos I would find this useful as is.

@konstin
Copy link
Member

konstin commented Aug 13, 2020

This would solve the problem of not being able to ship type annotations alongside the shared library so even with future todos I would find this useful as is.

Could you elaborate? (I don't have experience shipping type annotations, I've only seen that orjson has a orjson.pyi file)

@ijl
Copy link
Contributor

ijl commented Aug 13, 2020

Type annotations will only be read in module directories so it requires a mixed project. If you just include a pyi alongside the shared object, it will be at site-packages/project.pyi and not found. But site-packages/project/project.pyi is used assuming packages/project/py.typed also exists. I don't find it practical to use a mixed project right now because build scripts and CI would need some hack to move ./project to avoid the local directory taking precedence over what's in site-packages and causing ModuleNotFoundError.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants