Replace setup.py with pyproject.toml #9021
@jameslamb You might be interested in this work. |
Thanks for tagging me! I'm very interested in this and happy to provide a review or answer questions if you want some extra help. For reference, in LightGBM I'm pursuing using |
Some thoughts here:
|
Just out of curiosity, is it possible that we split the XGBoost package into two like what the conda forge package does? One as libxgboost and another one as (py)xgboost.
CMake is now on PyPI, which has two implications. Firstly, it's a C++ project with only a CLI interface. The mentioned libxgboost would be similar: a C++ project with only a C interface. Secondly, I am not sure if there is a distinction between build-time and run-time dependencies for toml projects. If so, then we can have cmake as a build-time dependency for the source distribution.
Lastly, it's a little bit weird that the packaging logic might be installed and a user can import it. Not really an issue, it just feels unnatural.
There is! This is one of the nice benefits of the new backends... you don't have to have something like Also on the topic of CMake being available on PyPI, two very important things to consider:
|
It's worth noting that this is in fact one of the motivating reasons for the creation of the modern build infrastructure, namely PEP 517 as previously mentioned as well as PEP 518, which defines the
The |
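To make the build-time versus run-time distinction discussed above concrete, here is a minimal sketch. It assumes a hypothetical pyproject.toml (the package name and dependency lists are illustrative, not XGBoost's actual configuration) and only shows where each kind of dependency lives: PEP 518's [build-system] requires list for build-time, and the [project] dependencies list for run-time.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Pythons

# Hypothetical pyproject.toml contents, for illustration only.
PYPROJECT_TOML = """
[build-system]
requires = ["hatchling", "cmake"]
build-backend = "hatchling.build"

[project]
name = "demo-package"
version = "0.1.0"
dependencies = ["numpy", "scipy"]
"""

config = tomllib.loads(PYPROJECT_TOML)

# Installed into an isolated environment only while building the package.
build_time = config["build-system"]["requires"]
# Installed alongside the package for end users.
run_time = config["project"]["dependencies"]

print("build-time:", build_time)
print("run-time:", run_time)
```

With a layout like this, cmake would be fetched only when building from a source distribution and would not become a dependency of the installed package.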
@vyasr Thanks for your input. I will try to remove boilerplate from the custom backend by re-using logic from other backends. @trivialfis I suspect that a libxgboost-only package on PyPI will not be terribly useful. The reason is that different virtual environment managers for Python would install libxgboost.so in different locations, so it will be difficult for other packages to link with libxgboost. One of the main benefits of Conda is that native libs get installed in standard locations, so that other packages can easily find them. |
It's not necessary to be useful, just a way for us to split c++ packaging from the Python package. |
@trivialfis Let's not split the packaging. It will introduce headaches, since different virtual env managers for Python install native libs in different locations. We don't want to make |
@vyasr I managed to use @trivialfis The |
@trivialfis Editable installation is now supported. Run: |
I am removing
I tried a couple things to salvage the error, but none worked.
I regret adding Conclusion: Let's not bundle |
note to myself: need to update the release script. |
@trivialfis I updated |
The change looks good to me. Thank you again for working on this! I'm not entirely sure how all the tools interact with each other. Seems complicated as there's code generation during build/install. Would be great if others can take a look as well.
This looks good! I'm not too familiar with hatch, so I needed to dig a bit through that backend to review some of the more internal components, but this approach seems pretty reasonable to me.
This is looking good! There are a couple of outstanding comments, but almost everything has been addressed and nothing is left that I would block merging on. Great to see this work here!
python-package/packager/pep517.py
Outdated
if locate_local_libxgboost(TOPLEVEL_DIR, logger=logger) is None:
    raise AssertionError(
        "To use the editable installation, first build libxgboost with CMake. "
        "See https://xgboost.readthedocs.io/en/latest/build.html for detailed instructions."
    )
Out of curiosity, what happens if you remove this? Is it possible to use an editable installation that also builds libxgboost? In that situation running pip install -e would need to rebuild the library as well.
Not saying this feature should be implemented, just want to understand the current setup of the backend to figure out how best to work with this in the future for other RAPIDS libraries too.
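For readers following the thread, the guard under review can be sketched roughly as below. This is not the code from packager/pep517.py: the helper names, the lib/ subdirectory, and the per-platform file names are assumptions made for illustration. The only behavior taken from the snippet above is that the lookup returns None when no prebuilt library is found and that the editable install refuses to continue in that case.

```python
import logging
import sys
import tempfile
from pathlib import Path
from typing import Optional


def shared_lib_name() -> str:
    """Platform-specific file name of the shared library (assumed names)."""
    if sys.platform == "win32":
        return "xgboost.dll"
    if sys.platform == "darwin":
        return "libxgboost.dylib"
    return "libxgboost.so"


def find_prebuilt_lib(toplevel_dir: Path, logger: logging.Logger) -> Optional[Path]:
    """Hypothetical lookup: return the library path, or None if it is absent."""
    candidate = toplevel_dir / "lib" / shared_lib_name()  # assumed location
    if candidate.is_file():
        logger.info("Found prebuilt library at %s", candidate)
        return candidate
    logger.info("No prebuilt library at %s", candidate)
    return None


def require_prebuilt_lib(toplevel_dir: Path, logger: logging.Logger) -> Path:
    """Fail early instead of building the C++ code during an editable install."""
    lib = find_prebuilt_lib(toplevel_dir, logger)
    if lib is None:
        raise RuntimeError("Build the shared library with CMake first.")
    return lib


# Small demonstration in a throwaway directory.
logger = logging.getLogger("packager-sketch")
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    missing = find_prebuilt_lib(root, logger)      # nothing built yet
    (root / "lib").mkdir()
    (root / "lib" / shared_lib_name()).touch()     # pretend CMake produced it
    found = require_prebuilt_lib(root, logger)
    found_name = found.name
```

A check of this shape keeps the editable install cheap: it never invokes the C++ build, which is why the library must already exist.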
> what happens if you remove this?
Editable installation won't build libxgboost.so, so import xgboost will error out.
> Is it possible to use an editable installation that also builds libxgboost?
I decided against it, as there is no reliable way for the packager module to detect changes in the C++ code. If we want to rebuild libxgboost.so upon changes made to C++, scikit-build-core will probably be a better choice.
I ultimately decided against scikit-build-core because:
- It doesn't yet support editable installation.
- I couldn't customize it enough to fit the current use cases of XGBoost.
- Much of what scikit-build-core offers is not useful for XGBoost, since XGBoost does not use compiled extensions. XGBoost uses ctypes exclusively to access C API functions from libxgboost.so.
The calculus will be different for RAPIDS libraries, which use compiled extensions extensively. IMHO, if I were to use compiled extensions a lot, I would use scikit-build-core and give up on editable installation.
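As a small illustration of the ctypes-only approach mentioned above, the sketch below calls a C function from an already-built shared library without any compiled extension module. It uses ctypes.pythonapi (the CPython runtime library itself) as a stand-in, since that is always available; it does not touch libxgboost.so. For a separately built library, the handle would instead come from ctypes.CDLL with the path to the shared object.

```python
import ctypes
import sys

# ctypes.pythonapi is a handle to the CPython runtime library; it stands in
# here for a handle obtained via ctypes.CDLL("/path/to/some/library.so").
get_version = ctypes.pythonapi.Py_GetVersion

# Declare the C signature: const char *Py_GetVersion(void)
get_version.argtypes = []
get_version.restype = ctypes.c_char_p

# Call straight into C and convert the returned bytes to a Python string.
c_version = get_version().decode("utf-8")
print(c_version)
```

Because nothing here is compiled against the Python headers, the Python side stays pure Python, and the build only has to produce the shared library and ship it next to the package.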
For what it's worth, scikit-build-core just added experimental support for editable installs: https://github.com/scikit-build/scikit-build-core/releases/tag/v0.3.0
Not sure exactly what "experimental" means, to be fair, and to what extent that's acceptable for a project as widely-used as XGBoost. But just wanted to note it.
I am planning to pursue scikit-build-core for LightGBM (microsoft/LightGBM#5759), to not have to have any lingering Python glue code that's exposed to the changes in Python packaging. That is still very much a work in progress though, so I'm definitely not recommending that you switch to that over hatchling ... just noting the link as something you may want to watch as we all figure out the setup.py-free world together 😊
@jameslamb Guess my knowledge was outdated :) In the future, I'll definitely consider scikit-build-core as a potential alternative.
I will be migrating RAPIDS to use scikit-build-core at some point in the not-too-distant future. We're already using scikit-build, so the transition is the most natural way to get us onto a PEP 517 build.
Support for editable installs is a must for RAPIDS, and is one of the reasons I haven't attempted the migration yet. That feature request is something they're aware of and are working on. My understanding is that the current experimental support only covers some possible cases around library changes but not all. I will definitely be keeping a close eye on this. I expect full support for editable installs is a precondition of a 1.0 release, and the project has been moving at a reasonable pace so it shouldn't be too long before that exists.
That said, at this stage scikit-build-core is definitely less mature than hatchling and I can't fault your choice since you seem to have been able to get everything working fairly painlessly 🙂
@vyasr @hcho3 @trivialfis FYI I have a PR ready over in LightGBM for using scikit-build-core: microsoft/LightGBM#5759. If you have time and are interested, I'd appreciate any comments you have on it.
We don't care about editable installs for lightgbm, and I'm willing to go contribute to scikit-build-core (both code and discussions) as we discover issues in it, to help that tool mature.
Thanks for the ping! I've started testing the migration for a couple RAPIDS libraries as well, see rapidsai/rmm#1287 and rapidsai/cudf#13531. Editable installs are already in significantly better shape than they were a couple of months ago FWIW.
Nice! I learned a lot about scikit-build-core in this process, @ me any time on those RAPIDS PRs if you want an opinion or another set of eyes.
Note that I don't have permissions to resolve threads on this PR so I tried to leave comments or otherwise clearly indicate where I thought things were resolved. |
Closes #8090
pyproject.toml
packager directory. Build logic from setup.py has been refactored and migrated into the new backend.
pip wheel . (build wheel), python -m build --sdist . (source distribution)
TODOs
packager/pep517.py.
packager.
Context
setup.py for installing the Python package is deprecated. python setup.py install will throw a warning:
setup.py is PEP 517 and PEP 621.
pyproject.toml. The TOML file is declarative and cannot contain arbitrary code, unlike setup.py.
pyproject.toml: setuptools, poetry, hatch, etc. In addition, individual Python packages can specify a custom (bespoke) build backend by specifying backend-path:
A build backend implements two methods:
def build_wheel(wheel_directory, config_settings=None, metadata_directory=None)
def build_sdist(sdist_directory, config_settings=None)
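To show what implementing those two hooks involves, here is a toy backend using only the standard library. It is a sketch, not the packager backend from this PR (which builds on hatchling): the package name demo_pkg, its version, and its single source file are hypothetical, and real backends handle far more (file discovery, metadata validation, platform tags for native libraries).

```python
import base64
import hashlib
import io
import os
import tarfile
import tempfile
import zipfile

NAME, VERSION = "demo_pkg", "0.1.0"  # hypothetical package
METADATA = f"Metadata-Version: 2.1\nName: {NAME}\nVersion: {VERSION}\n"
WHEEL = "Wheel-Version: 1.0\nGenerator: toy-backend\nRoot-Is-Purelib: true\nTag: py3-none-any\n"
PYPROJECT = '[build-system]\nrequires = []\nbuild-backend = "toy_backend"\nbackend-path = ["."]\n'
SOURCES = {f"{NAME}/__init__.py": f'__version__ = "{VERSION}"\n'}


def _record_line(path, data):
    # RECORD entries are "path,sha256=<urlsafe base64 without padding>,size".
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
    return f"{path},sha256={digest.decode()},{len(data)}\n"


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    """Write a pure-Python wheel and return its file name."""
    dist_info = f"{NAME}-{VERSION}.dist-info"
    files = {path: text.encode() for path, text in SOURCES.items()}
    files[f"{dist_info}/METADATA"] = METADATA.encode()
    files[f"{dist_info}/WHEEL"] = WHEEL.encode()
    record = "".join(_record_line(p, d) for p, d in files.items())
    record += f"{dist_info}/RECORD,,\n"  # RECORD lists itself without a hash
    files[f"{dist_info}/RECORD"] = record.encode()
    basename = f"{NAME}-{VERSION}-py3-none-any.whl"
    target = os.path.join(wheel_directory, basename)
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, data in files.items():
            zf.writestr(path, data)
    return basename


def build_sdist(sdist_directory, config_settings=None):
    """Write a .tar.gz source distribution and return its file name."""
    root = f"{NAME}-{VERSION}"
    files = dict(SOURCES)
    files["PKG-INFO"] = METADATA
    files["pyproject.toml"] = PYPROJECT
    basename = f"{root}.tar.gz"
    with tarfile.open(os.path.join(sdist_directory, basename), "w:gz") as tf:
        for path, text in files.items():
            data = text.encode()
            info = tarfile.TarInfo(f"{root}/{path}")
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return basename


# A frontend such as pip calls the hooks with an output directory.
with tempfile.TemporaryDirectory() as out:
    wheel_name = build_wheel(out)
    sdist_name = build_sdist(out)
    with zipfile.ZipFile(os.path.join(out, wheel_name)) as zf:
        wheel_members = zf.namelist()
    with tarfile.open(os.path.join(out, sdist_name)) as tf:
        sdist_members = tf.getnames()
```

Both hooks return only the base name of the file they created; the frontend already knows the directory it passed in. An in-tree backend like this is found through the backend-path setting, which is the mechanism the description above refers to.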