Split packages proposal #1338
Comments
My "ideas" are in conda/conda#793 (comment). I would prefer a file-based package split, mainly because I think (but without hard data...) it's the only approach that would scale across different build systems and prevent filename collisions (e.g. if all matplotlib install features install a base file, how do you add it to only one package?). [The rest is from the linked comment:] I would find it better if there were an additional way to add binary packages that take specific files, with the rest taken by the main package. Like:
This would build 4 packages: mypackage-tests, mypackage-docs, mypackage-pyqt and mypackage. Each package can be installed as a normal package... In this case, the three additional packages depend on the exact version of the main package, so that updates to e.g. mypackage-pyqt will also update the main package and keep them in sync. See also the debian dir for the matplotlib Debian package, which works similarly, except that the above info is split across multiple files: https://anonscm.debian.org/cgit/python-modules/packages/matplotlib.git/tree/debian
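A minimal Python sketch of the file-based split described above, not the recipe syntax from the linked comment: each subpackage claims files by glob pattern and the main package takes whatever is left. The function names, the patterns, the file paths and the pin string format are all hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatch


def split_files(installed, rules, main_name):
    """Give each file to the first subpackage whose pattern matches it.

    Files that no subpackage claims fall through to the main package.
    """
    packages = {name: [] for name in rules}
    packages[main_name] = []
    for path in installed:
        for name, patterns in rules.items():
            if any(fnmatch(path, pattern) for pattern in patterns):
                packages[name].append(path)
                break
        else:
            # No subpackage matched, so the main package owns the file.
            packages[main_name].append(path)
    return packages


def exact_pin(main_name, version):
    """Hypothetical dependency string tying a subpackage to one main version."""
    return f"{main_name} =={version}"


# Hypothetical install tree and split rules.
installed = [
    "lib/mypackage/__init__.py",
    "lib/mypackage/tests/test_core.py",
    "share/doc/mypackage/index.html",
    "lib/mypackage/backends/backend_qt.py",
]
rules = {
    "mypackage-tests": ["lib/mypackage/tests/*"],
    "mypackage-docs": ["share/doc/*"],
    "mypackage-pyqt": ["lib/mypackage/backends/*qt*"],
}
result = split_files(installed, rules, "mypackage")
```

Because every file is assigned exactly once, this style of split cannot put the same file into two packages, which is the collision argument made above.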
|
Thanks @JanSchulz I have thought about your idea, and come up with this syntax that I hope captures most or all of what you want, along with some additions:
CC @jjhelmus - for wheels. |
In discussion with @mingwandroid, he would like to be able to take something like thrift, which has several language backends supported, and break out the backends within one recipe. I think this will not work well, because each backend may then be split further (for example, Python 2.7/3.4/3.5). I think that a constraint here is a one-to-one mapping between output entries in the recipe and outputs. Additionally, although the script entry may be used to build arbitrarily and deploy, I would discourage it, as it skirts most of conda-build's logic for setup and such, and ultimately may not work well. |
Do I understand correctly that if I want to split a package, e.g. libpng into header and library, I would have to create two install scripts and compile and install it twice, plus do different "deletes" in PREFIX to get the right filesets? If so: how would that handle long-running builds of big libs, which would now need to be done twice? Or is the build script kept, and only the install step is done in the new scripts under outputs? I still think using something like the following is easier:
[This is inspired by the Debian system of splitting a package.] Why: as far as I know, you can't really split a package in any language other than Python? At least makefile-based systems are either split at configure time or by selectively "including" files from the installed directory tree. So if the build script is kept (= only one build) and the scripts under outputs are used only for installing stuff, you are either running |
No, not really. The way it would work would be:
The actual building is done once, but the packaging can be split across more steps. Also, you don't need to create separate files for simple commands. For example, if there were several different I'm not opposed to a filelist, but I do think it is more trouble. I absolutely will not remove scripts in favor of a filelist, but would be OK with supporting both. |
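One way to picture "build once, package in several steps" is to snapshot the install prefix around each packaging step and treat the newly created files as that output's contents. The sketch below is only an assumption about how this could work, not a description of conda-build's implementation; `list_files`, `run_output_step` and `touch` are hypothetical helpers, and the file names stand in for a header/library split.

```python
import os
import tempfile


def list_files(prefix):
    """Return the set of file paths under prefix, relative and '/'-separated."""
    found = set()
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            full = os.path.join(root, name)
            found.add(os.path.relpath(full, prefix).replace(os.sep, "/"))
    return found


def run_output_step(prefix, step):
    """Run one packaging step and report which files it added to prefix."""
    before = list_files(prefix)
    step(prefix)
    return sorted(list_files(prefix) - before)


def touch(prefix, relpath):
    """Stand-in for an install command: create an empty file."""
    path = os.path.join(prefix, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write("")


# The expensive compile step is assumed to have run once already (not shown);
# each output step below only installs its own files.
with tempfile.TemporaryDirectory() as prefix:
    lib_files = run_output_step(prefix, lambda p: touch(p, "lib/libfoo.so"))
    header_files = run_output_step(prefix, lambda p: touch(p, "include/foo.h"))
```

In this model no step has to delete files that belong to another output, since only additions are recorded.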
I still don't get it: let's assume a package where the upstream installer installs two files (
./configure --prefix=$PREFIX # probably not needed due to conda's rewriting?
make
make install DESTDIR=$PREFIX
After the "build" step, the PREFIX dir will have both files installed. Would they both end up in a package named after the source package name, as under the "old" rules? Not sure if that is still the case, even if there is a What would the output scripts then do? Does each one do another install and then remove the files that should not belong to the current package? |
Here's a few examples we have come up with:
If file overlaps in these packages are a problem, we can perhaps work around it by making the subpackages be the "owners" of files, and compose the top-level packages as metapackages of the subpackages. We can hybridize, too - partially metapackage, partially real files. In this case, it may be advantageous to create all of the subpackages first, and then have the main package pick up any unhandled files, as you mention. |
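To make the overlap concern concrete, here is a small Python sketch, under the assumption that each package is just a list of relative file paths. It reports files claimed by more than one subpackage and computes the unhandled files that a main package could pick up afterwards. The helper names and the example packages are hypothetical.

```python
from collections import defaultdict


def find_collisions(packages):
    """Map each file claimed by more than one package to its claimants."""
    owners = defaultdict(list)
    for name, files in packages.items():
        for path in files:
            owners[path].append(name)
    return {path: names for path, names in owners.items() if len(names) > 1}


def unhandled_files(installed, subpackages):
    """Files that no subpackage owns; candidates for the main package."""
    claimed = {path for files in subpackages.values() for path in files}
    return [path for path in installed if path not in claimed]


# Hypothetical subpackages where the library file is listed twice.
subpackages = {
    "libfoo": ["lib/libfoo.so"],
    "libfoo-dev": ["include/foo.h", "lib/libfoo.so"],
}
collisions = find_collisions(subpackages)
leftover = unhandled_files(
    ["lib/libfoo.so", "include/foo.h", "bin/foo"], subpackages
)
```

Building the subpackages first and the main package last, as suggested above, means `leftover` is known by the time the main package is assembled.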
I think the most important use case is the "split header and lib" case, and if that ends up being the hardest case to achieve, that's not good. E.g. for matplotlib and Python scripts in general, I'm not sure if there is a way to do a In that case, the existence of an output key would indicate that only the package names below the for 1), if you want to build a source package, you would need to do a
Debian packages and RPM based packages are both based on creating packages based on filespecs: why throw that "experience" away? |
There are many use-cases of split packages, and no one type should be used to limit the design of this feature.
My issue with filelists is that they are limiting and require constant gardening (e.g. updates to a package mean that a new file extension gets added, or some *.dat files now belong in one package and some in another). Being able to rely on the build system's sub-targets (where present) seems to me to be a good thing. |
In https://groups.google.com/a/continuum.io/forum/#!topic/conda/qss8IlzxweI I proposed the idea of having multiple build environments for a single recipe - the build script is then responsible for putting appropriate files in the appropriate build environment (rather than listing files a-la RPM). I think my suggestion from 18 months ago still has some legs, though perhaps some of the metadata could be refined somewhat. |
Thanks for pointing that out @pelson. I think that's very compatible with what I've proposed above. Sorry I wasn't around to see it the first time! I'm going to start hacking a prototype out. My intent is to support both file lists and scripts, but both will be based on "clean build environments" which conda then treats as it does already - just two different ways to copy files there. Thanks everyone for your feedback. |
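A rough Python sketch of the "two ways to copy files into a clean environment" idea, written as an assumption about the shape of the prototype rather than its real code: `populate` fills an empty staging directory either from a file list or by calling a script-like function. All names and paths here are hypothetical.

```python
import os
import shutil
import tempfile


def populate(build_prefix, staging, files=None, script=None):
    """Fill a clean staging directory from a file list, a script, or both."""
    for relpath in files or []:
        target = os.path.join(staging, relpath)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(os.path.join(build_prefix, relpath), target)
    if script is not None:
        # The script decides for itself what to place in the staging directory.
        script(build_prefix, staging)
    return sorted(
        os.path.relpath(os.path.join(root, name), staging).replace(os.sep, "/")
        for root, _dirs, names in os.walk(staging)
        for name in names
    )


def copy_lib_dir(build_prefix, staging):
    """Script-style output: copy the whole lib directory."""
    shutil.copytree(os.path.join(build_prefix, "lib"), os.path.join(staging, "lib"))


with tempfile.TemporaryDirectory() as work:
    # Stand-in for the result of a single build.
    build_prefix = os.path.join(work, "build")
    for relpath in ("include/foo.h", "lib/libfoo.so"):
        path = os.path.join(build_prefix, relpath)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    dev_stage = os.path.join(work, "stage-dev")
    lib_stage = os.path.join(work, "stage-lib")
    os.makedirs(dev_stage)
    os.makedirs(lib_stage)

    dev_files = populate(build_prefix, dev_stage, files=["include/foo.h"])
    lib_files_staged = populate(build_prefix, lib_stage, script=copy_lib_dir)
```

Either way, the package contents are whatever ends up in the staging directory, so the later packaging logic does not need to know which mechanism was used.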
Right now I'm pondering how to handle file collisions and association of subpackages with parent packages. I think that we should disallow file collisions. What I've come up with as a way to avoid them is the following scheme:
|
@mingwandroid and I have been discussing split packages as an urgent prerequisite to enabling easier construction of build toolchains (perhaps with crosstool-ng, http://crosstool-ng.org/).
In coming up with ideas for implementation, we're looking at precedent set by Linux distributions.
What we have as initial design ideas for conda-build are:
CC @pelson @ocefpaf @JanSchulz - we wanted to involve you, given your involvement with some of these systems. Do you have opinions or battle scars to share? Naming is all negotiable here at this point - if something like the "outputs" field seems like a good idea with a bad name.