Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support package_data-like of non-Python resource files in Python packages #22655

Closed
embray opened this issue Mar 20, 2017 · 22 comments
Closed

Support package_data-like of non-Python resource files in Python packages #22655

embray opened this issue Mar 20, 2017 · 22 comments

Comments

@embray
Copy link
Contributor

embray commented Mar 20, 2017

This enhances sage_setup.find.find_extra_files to better support non-Python resources found in Python package sources. Previously it could include extra files found under build/cythonized, but now it can also include arbitrary files (including support for fnmatch patterns) that should be installed alongside a Python package. For example,

find_extra_files('sage.foo', '.', special_filenames=['*.html'])

will find any .html files under the ./sage/foo/ package.

This will be useful for #21732 and possibly other tickets.

This also adds a package_data variable in module_list.py that can list all non-Python resources on a per-package basis (using the same format as the standard package_data option to setup.py. This includes handling of package data in a way that's compatible with sage_setup.clean.clean_install_dir.

CC: @mkoeppe

Component: build

Author: Erik Bray

Branch/Commit: u/embray/build/package-data @ 5d30652

Issue created by migration from https://trac.sagemath.org/ticket/22655

@embray embray added this to the sage-8.0 milestone Mar 20, 2017
@embray

This comment has been minimized.

@embray
Copy link
Contributor Author

embray commented Mar 20, 2017

comment:1

Hopefully this makes sense. Let me know if there's any more I can clarify. Note that, unlike my original attempt at this in #21732, I avoided moving module_list.py for now.

@jdemeyer
Copy link

comment:3

First of all, there is an easy conflict with #22662.

The function find_extra_files was obviously meant to find extra Cython files since it looks for .pxd and .pxi files.

It makes sense to generalise this, but then it should no longer handle .pxd and .pxi files by default: those should be passed as *.pxd and *.pxi arguments by sage_build_ext.copy_extra_files.

@jdemeyer
Copy link

comment:4

I think it's a bit silly to duplicate copy_extra_files. There are two reasons for this:

  1. The build_py command already contains lots of code to deal with "data files". It might be better to hook into this, for example by overriding build_py.get_data_files(). The only reason that I didn't do this in Use custom build_ext to compile Cython code #21600 is because build_ext is run after build_py.

  2. If you really want to copy the files manually, I don't see much reason to treat the Cython data files differently from the package_data data files. The current code makes a distinction between cmd_build_ext.cythonized_files and cmd_build_py.extra_files. There is no reason to do that.

See also #21682 comment:34

@jdemeyer
Copy link

comment:5

There is also an easy conflict with #22106.

@embray
Copy link
Contributor Author

embray commented Mar 28, 2017

comment:6

build_ext is run after build_py

Only if you run ./setup.py build. The two commands can be run independently as well and serve different purposes. In particular, it makes sense to make a distinction here because one case is for installing files that are found in the Python source tree, while the other is for installing files found under build/cythonized (which normally would actually be the Python source tree but we build this separate hierarchy--I never liked that, but whatever).

@jdemeyer
Copy link

comment:7

Replying to @embray:

build_ext is run after build_py

Only if you run ./setup.py build.

What do you mean? If I run pip install ... or ./setup.py install, it would still run build_py before build_ext, right? Or am I missing something?

@embray
Copy link
Contributor Author

embray commented Mar 29, 2017

comment:8

setup.py install calls setup.py build, yes. All of these commands can be run individually, and should work individually, in principle. It's just that some of them of them (namely "install" and "build") work primarily by running a number of simpler sub-commands in some order.

@jdemeyer
Copy link

comment:9

Replying to @embray:

setup.py install calls setup.py build, yes. All of these commands can be run individually, and should work individually, in principle.

For what value of "work"? Do you mean "work" as in: doesn't raise an exception? Or "work" as in: actually does something meaningful?

It makes little sense to run just build_ext without anything else. In the real world, build scripts have a natural order and dependencies. You cannot cherry-pick pieces of a build system and expect those pieces to be meaningful individually. Especially if you want something like cython/cython#1514 (which I think is a good idea) you cannot run build_ext individually.

Splitting the installation of data files in two, just because it adheres to some theoretical principle that nobody cares about, looks like a bad idea to me.

@jdemeyer
Copy link

comment:10

Together with #21682, I would like to move to the following build steps:

  1. Run build_cython to run Cython, which generates .c and .h files.

  2. Run build_py which "builds" Python files and which copies all package data, from Cython and non-Cython. Preferably, copying this package data can be done by using some existing machinery from build_py.

  3. Run build_ext as usual.

Erik, what's your opinion on this?

@embray
Copy link
Contributor Author

embray commented Mar 29, 2017

comment:11

It makes little sense to run just build_ext without anything else.

I do that almost every day on 'normal' projects. ./setup.py build_ext --inplace is the best thing to doing in-source development of a project that uses extension modules.

In any case, by overthinking it you're actually making it more complicated and less reliable. It works fine as is.

@jdemeyer
Copy link

comment:12

Replying to @embray:

It works fine as is.

...at the cost of extra complexity. I am a strong believer in the https://en.wikipedia.org/wiki/KISS_principle

@embray
Copy link
Contributor Author

embray commented Mar 29, 2017

comment:13

You don't need to link me to a Wikipedia article on "KISS".

In fact complexity is sometimes subjective. You're upset because you see what looks like a little bit of code duplication (actually, I don't see it as duplication; it's doing two somewhat different things that nonetheless look similar due to using the same API), but without it in fact you're relying on a fragile, poorly documented/specified state machine to do things in the right order or else everything breaks. That to me is complexity.

@embray
Copy link
Contributor Author

embray commented Mar 29, 2017

comment:14

(I do agree about rebasing on / integrating with #21682 except that's been stalled for months for no apparent reason)

@embray
Copy link
Contributor Author

embray commented Mar 29, 2017

comment:15

Alright, I rebased #21682, so I'll at least see about integrating this ticket with it properly. If there's any opportunity I can find to reduce code duplication I'll take it, but I do think in this case files generated from Cython should be handled separately from static files.

@fchapoton
Copy link
Contributor

comment:16

does not apply

@jdemeyer
Copy link

jdemeyer commented Apr 1, 2017

comment:17

Erik: I'm confused how you see the bigger picture with this ticket and #21682. In your ideal world, how would the build process work? In other words: what is your alternative to [comment:10]?

@jdemeyer
Copy link

comment:18

TODO: discuss next week if possible.

@embray
Copy link
Contributor Author

embray commented Apr 19, 2017

comment:19

Replying to @jdemeyer:

Together with #21682, I would like to move to the following build steps:

  1. Run build_cython to run Cython, which generates .c and .h files.

  2. Run build_py which "builds" Python files and which copies all package data, from Cython and non-Cython. Preferably, copying this package data can be done by using some existing machinery from build_py.

  3. Run build_ext as usual.

Erik, what's your opinion on this?

I think this makes sense, actually. The only thing is that build_cython needs to be responsible for providing a list of resource files it generates (.h files, etc.), but we could free it from responsibility for actually copying them. I don't think it makes a difference either way really but I don't care much at this point either.

@mkoeppe
Copy link
Member

mkoeppe commented Nov 17, 2019

comment:20

Let's get this ticket done to support #28752

@mkoeppe mkoeppe modified the milestones: sage-8.0, sage-9.0 Nov 17, 2019
@embray
Copy link
Contributor Author

embray commented Dec 13, 2019

comment:21

Yeah, I need to revisit this. I don't remember why I was so protective of my original approach, except that I do recall there being a good reason. I just don't know what it was anymore. I know one reason had to do with ensuring that _find_stale_files could work properly, but I don't know for sure that there isn't a simpler way...

@embray
Copy link
Contributor Author

embray commented Jan 6, 2020

comment:22

Ticket retargeted after milestone closed

@embray embray modified the milestones: sage-9.0, sage-9.1 Jan 6, 2020
@mkoeppe mkoeppe removed this from the sage-9.1 milestone Apr 9, 2020
@mkoeppe mkoeppe closed this as not planned Won't fix, can't repro, duplicate, stale Jan 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants