Organising the conda communities and establishing best practices. #299

Closed
bgruening opened this issue Apr 9, 2016 · 79 comments

Comments

@bgruening
Contributor

We all love conda and there are many communities that build awesome packages that are easy to use. I would like to see more exchange between these communities to finally share more build-scripts, to develop one best-practice guide and finally to have channels that can be used together without breaking recipes - a list of trusted channels with similar guidelines.

For example, the bioconda community specialises in bioinformatics software. They have some very nice guides on how to develop packages, and they review and bulk-patch recipes when new conda features appear, to make the overall experience even better.
ping @johanneskoester, @daler and @chapmanb from BioConda fame

Omnia has a lot of cheminformatic software and a nice build-box based on phusion/holy-build-box-64 + CUDA and AMD APP SDK.
ping @kyleabeauchamp, @jchodera

With conda-forge there is now a new one, and it would be great to get all interested people together to join forces here, so that we don't replicate our recipes or copy them from one channel to another just to make them compatible.

Another point is that we probably want to move recipes to default at some point and deliver our work back to Continuum - so that we can benefit from each other.

I can imagine that we all form a group of trusted communities and channels and activate them by default in our unified build-box - or we have one giant community channel. All this I would like to discuss with everyone that is interested and come up with a plan how to make this happen :)

What do you all think about this?
As a next step I would like to create a doodle to find a meeting date where at least one representative from each community can participate.

Many thanks to Continuum Analytics for their continuous support and the awesome development behind scientific python and this package manager.
ping @jakirkham @msarahan

@bgruening
Contributor Author

@kyleabeauchamp, @jchodera, @jakirkham @msarahan, @johanneskoester, @daler, @chapmanb, @jxtx, @jmchilton: please feel free to ping others and invite them :)

bgruening referenced this issue in conda-archive/conda-recipes Apr 9, 2016
@jchodera
Contributor

jchodera commented Apr 9, 2016

Definitely interested in learning more! For now, pinging @rmcgibbo, @mpharrigan, @cxhernandez, @marscher, @franknoe, @pgrinaway, @bas-rustenburg.

@ocefpaf
Member

ocefpaf commented Apr 9, 2016

@bgruening thanks for the message! In fact we just discussed that yesterday!!

Conda-forge was born from two communities similar to bioconda and omnia (the SciTools and IOOS channels) with the goal to reduce redundancy and join forces to produce high quality recipes and binaries. I would love to see more communities join us here. We are not the dark side but we do have cookies 😉 (Well... a cookie cutter... sorry for the false advertisement.)

I am trying to put a blog post online next week with more info. We are also planning on public (Google?) hangouts so we can have some online face-time and QnA sessions.

Meanwhile feel free to ask anything here, or in new issues if you have a very specific question.

Here is the gist of conda-forge:

  • we are a community of conda-packaging-loving geeks;
  • the entry step is this repository (staged-recipes). Contributors must add their recipes here via PRs;
  • once the PR is reviewed and merged a feedstock (an individual repo) is created for that recipe;
  • each feedstock has a team of maintainers. The maintainers are specified in the recipe via the extra/maintainers field;
  • the feedstock creation and the GitHub setup happen automagically;
  • we have a cookie cutter/do-it-all tool called conda-smithy. We use it to lint the recipes, update the feedstocks, and provide some convenience tools for working with the many-repos model;
  • the maintainers' "job" is mostly to update the software version and merge the occasional auto-maintenance PR, such as CI config updates.

There are many details I am leaving out and much more to talk about, but I will stop here for now.

The number one question we get is: why multiple repositories instead of one with all the recipes? We had (and still have) many discussions like this. However, all I have to say is: we tried the single repo model and now we are trying the multiple repos model. So far, the multiple repos model has scaled much better, and none of the major worries we had came true.

@jchodera
Contributor

jchodera commented Apr 9, 2016

This sounds great. @rmcgibbo is much more qualified to comment than I am here---he pioneered most of the omnia conda framework---but we ended up converging on our own build system (modeled loosely on conda/conda-recipes) simply because we weren't aware of any other way to tackle this.

Where should we look for all the gory technical details about the build systems and automation? This was the hardest part for us, since we needed (1) broad platform support (hence the use of a phusion/holy-build-box-64 build system for linux), (2) CUDA and OpenCL support (via the AMD APP SDK), and (3) automated builds in reproducible environments for win, linux, and osx. We're also trying to keep old versions of packages live for scientific reproducibility---we frequently publish code with our papers and provide environment.yml files to ensure reproducibility with identical versions. Our approach started with actual local hardware and evolved to use cloud services (currently travis-ci and AppVeyor).

I'd love to understand more about how the conda-forge build system differs from what we currently use in omnia's build system.

@jakirkham
Member

We are not the dark side but we do have cookies 😉

Which ones? For humans or browsers? 😆 Ok, it was terrible, but I had no self-control.


Yes, welcome all. 😄

Please feel free to peruse what is going on at conda-forge and ask questions. The best place to get acquainted with or propose general discussion topics is probably the website repo (in particular the issue tracker). There are many issues there that are likely of interest, and healthy discussion of thoughts and personal experiences is welcome. Also, there may be a few closed issues there worth reading up on just to get a little bit of history (we are still quite young 😉).

If you would like, feel free to submit a simple recipe or a few to get a feel for how everything works here. Also, feel free to check out our gitter channel for any generic questions you may have.

Once everyone has had a chance to get a feel for how everything works and what seems personally relevant, we can figure out meeting discussion topics in some place YTBD.

Again welcome.

@jakirkham
Member

Welcome @jchodera.

Where should we look for all the gory technical details about the build systems and automation?

This varies depending on the question. Let's try and direct you based on the points raised.

(1) broad platform support (hence the use of a phusion/holy-build-box-64 build system for linux

This issue has basically moved in the direction of various proposals for how to move the Linux build system forward, though there is a current strategy in place as well.

(2) CUDA and OpenCL support (via the AMD APP SDK)...

This is under active discussion, because it is tied to several issues including build system constraints, how features work, how and which of these libraries get distributed, etc. See this issue. There is a proposed example there of how we might get this to work. However, we haven't settled on something yet.

(3) automated builds in reproducible environments for win, linux, and osx.

This is all over the map. 😄 In general, we use AppVeyor (Windows), Travis CI (Mac), and Circle CI (Dockerized Linux)

If you just want to read code, we can point you there. Proper documentation isn't quite there yet. Also, there isn't one singular issue for this, but it is discussed at various points in various issues. What sort of things would you like to know?

@daler
Contributor

daler commented Apr 9, 2016

Hi all, checking in from bioconda. I've been poking around the conda-forge code and can't pin down where the magic is happening. Could you point to some code or to a description of what's happening to aggregate the one-recipe-per-repos?

To further the discussion, here's a description of the bioconda build system and where you can find the code.

  • Contributor submits PR
  • .travis.yml calls scripts/travis-setup.sh on OSX and Linux, which starts a docker container on Linux or does the OSX setup otherwise
  • scripts/build-packages.py is run. This does most of the work, specifically:
    • Up-to-date recipes are skipped. This lets us support 800+ recipes with very little overhead
    • Any recipes that need to be built are toposorted and added to a local file:// channel after being built. This ensures that a single PR can contain a batch of interdependent recipes and everything gets built correctly.
    • Recipe builds are treated as nosetests and therefore get the nice infrastructure that comes along with that for free (hiding stdout when not needed; keeping track of how many failures, etc)
    • If we're on a PR branch, don't upload to anaconda
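
To make the toposort step above concrete, here is a minimal sketch. It is not the code from scripts/build-packages.py: the function name build_order, the up_to_date argument and the recipe names are made up for illustration, and it uses Python 3.9's standard-library graphlib.

```python
from graphlib import TopologicalSorter


def build_order(recipes, up_to_date=frozenset()):
    """Return recipe names ordered so that dependencies are built first.

    `recipes` maps a recipe name to the names it depends on. Recipes listed
    in `up_to_date` are skipped, and dependencies that are not part of this
    batch (assumed to be available from a channel already) are ignored.
    """
    pending = {name for name in recipes if name not in up_to_date}
    graph = {
        name: {dep for dep in recipes[name] if dep in pending}
        for name in pending
    }
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical batch of interdependent recipes from a single PR.
batch = {
    "libfoo": [],
    "foo-tools": ["libfoo", "zlib"],  # zlib is not in the batch
    "foo-pipeline": ["foo-tools", "libfoo"],
}
order = build_order(batch)
print(order)  # ['libfoo', 'foo-tools', 'foo-pipeline']
```

Each recipe in the returned order could then be built and added to the local channel before the next one starts, which is what lets later recipes in the same PR find their freshly built dependencies.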

The workflow is just like most anything else on github: submit a PR and wait for it to be tested. Once it passes, someone on the team merges it into master. Upon merging, travis-ci then runs again but on the master branch and this time upon completing, the built packages are uploaded to anaconda.

Aside from differences in the moving parts of the build systems, it sounds like we're all dealing with similar issues with respect to CUDA and gcc, etc. Would be nice to work out some best-practices that we could all use.

@jakirkham
Member

Welcome @daler.

Could you point to some code or to a description of what's happening to aggregate the one-recipe-per-repos?

Sorry, I'm not following this question. Could you please clarify what you mean by aggregate? It is a little unclear and I am a bit worried that there may be some misunderstanding of what is going on here. I'll try to clarify the big picture below.

To further the discussion, here's a description of the bioconda build system and where you can find the code....

Yes, SciTools and IOOS behave in a similar manner. However, those recipes along with many from conda-recipes are being ported over here as people from those groups seem to like this model.

Just to clarify, the model for building is very different here than the many recipes in a single repo. The reasons are varied, but I think the biggest difference is it allows people to take ownership of recipes/packages that are important to them and the tools (CIs) used to test, build, and deploy. This includes making bug fixes, releases, feature support, etc. Similarly it allows relevant discussion to break along those lines. In practice, this appears to be a huge asset. However, there are plenty of other reasons for one to consider this model.

How this works:

  1. Propose recipe(s) in a PR with a list of maintainers to staged-recipes (here 😉).
  2. Iterate on the recipe with CIs and reviewers (at least one of them is automated 😄).
  3. PR is merged and generates repo(s) each with a single recipe and all the needed build tools.
  4. Automatically, recipe maintainers are given commit privileges and control of relevant build tools.
  5. Automatically, the first build is made and released (assuming all dependencies are available).
  6. Do what is needed to maintain your recipe.
  7. Periodically merge CI tool maintenance updates.

While understanding this infrastructure may at first seem daunting, it is actually not so bad and is not really necessary. However, if you are curious, we are more than happy to explain the details.

Maybe if you could please rephrase your question in terms of these steps, we can do a better job at answering your questions and providing you places to look for more information.

Aside from differences in the moving parts of the build systems, it sounds like we're all dealing with similar issues with respect to CUDA and gcc, etc. Would be nice to work out some best-practices that we could all use.

Absolutely, we would be happy to point you to relevant issues where these are being discussed. Just please let me know which of these you would like to know more about.

@msarahan
Member

msarahan commented Apr 9, 2016

@daler, aggregation is done at the https://github.com/conda-forge/feedstocks/tree/master/feedstocks repo. This is created with conda-smithy, particularly this module: https://github.com/conda-forge/conda-smithy/blob/master/conda_smithy/feedstocks.py

Continuum is very interested in this particular aspect (I am Continuum's representative here, though others are also involved in contributing recipes and discussing build tools). The one-repo-per-recipe model is necessary, I think, for two reasons:

- keep the load on the CI services small, and avoid their log size and build time limits
- divide responsibilities and authority for each recipe with much finer granularity

The latter is the bigger issue here, since you all have had reasonable success with CI.

Continuum has started a community channel (https://anaconda.org/pycommunity), with the long-term plan to have that as a package aggregation center. In my mind, the most important facet of this effort is to unite the recipes and have a single canonical source for each recipe. I don't care whether it's on some project's page (e.g. matplotlib), or on conda-forge, or whatever - so long as one place is the official source, and finding that source and contributing to it is straightforward. Conda forge is a great place to host recipes because it provides the CI of those recipes, and I like the distributed maintainer model, but I also think that hosting recipes directly at projects, and having conda-forge build from indirectly-hosted sources would be the ideal - that way the recipe would be holistically managed by the package originators.

For the pycommunity channel, we'll mirror or link packages from other channels. In the case of multiple package sources, we haven't quite figured out how to prioritize them (activity level? origin of package?) The hope is that rather than many organizations having to say "add our channel!" we'd instead have just one, and that one may be enabled by default for some "community edition" of miniconda/anaconda - or otherwise could be enabled with conda install pycommunity

@daler
Contributor

daler commented Apr 10, 2016

@jakirkham and @msarahan thanks for your pointers. One missing piece for me was that submitting a PR to staged-recipes triggers the CI (only travis, right?) to call .CI/create_feedstocks, which sets up the infrastructure, tokens etc via conda-smithy and transforms the repo into something similar to what's in the feedstocks repo of submodules. Is that correct?

@msarahan -- Wholeheartedly agree that a single canonical source for each recipe is critical, and that finding that source and contributing needs to be straightforward. conda-forge/conda-smithy and pycommunity look like great tools to make that happen.

@jakirkham
Member

@jakirkham and @msarahan thanks for your pointers.

Glad to help, @daler. Hope it wasn't too much. Just wanted to make sure we had common context for our discussion. 😄

One missing piece for me was that submitting a PR to staged-recipes triggers the CI (only travis, right?)...

When a PR is submitted all CIs (Travis/Mac, Circle CI/Linux, AppVeyor/Windows) are run and used to attempt to build the recipe, but do not release it.

...to call .CI/create_feedstocks which sets up the infrastructure, tokens etc via conda-smithy and transforms the repo into something similar to what's in the feedstocks repo of submodules. Is that correct?

Once the PR is merged, a Linux job in the Travis CI build matrix does the setup for the feedstock. It goes something like this for each recipe unless otherwise specified (steps 7, 8, and 9).

  1. Uses GitPython to create a new git repo on the VM with a few things committed.
    1. Cleaned up recipe
    2. Readme
    3. License
    4. conda-forge.yml
    5. CI files and CI support files
    6. .gitignore
  2. Create an empty repo on GitHub (via PyGithub).
  3. Push to the GitHub feedstock repo.
  4. Configure all of the CIs for the repo on GitHub.
  5. Commit info to CI configuration files for uploading binaries to Anaconda.org.
  6. Push to the GitHub feedstock repo.
  7. Delete all the recipes from the staged-recipes and commit.
  8. Push to the GitHub staged-recipe repo.
  9. Trigger a global feedstock update.

As you have mentioned, this all basically happens through conda-smithy. However, there is some code that lives here for that purpose too. Take a look at this log for configparser and entrypoints to get a better idea.
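
For readers trying to picture the control flow of the list above, here is a rough dry-run sketch. It does not call conda-smithy, GitPython or PyGithub. The function name plan_feedstock_creation and the step wording are my own, and it assumes steps 1 to 6 run once per recipe while steps 7 to 9 run once per merge.

```python
# Steps 1-6 from the list above, assumed to run once for every merged recipe.
PER_RECIPE_STEPS = [
    "create a local git repo with recipe, readme, license, conda-forge.yml, CI files and .gitignore",
    "create an empty repo on GitHub",
    "push to the GitHub feedstock repo",
    "configure all of the CIs for the repo",
    "commit upload info to the CI configuration files",
    "push to the GitHub feedstock repo",
]

# Steps 7-9, assumed to run once after all recipes have been converted.
ONCE_PER_MERGE_STEPS = [
    "delete the converted recipes from staged-recipes and commit",
    "push to the GitHub staged-recipes repo",
    "trigger a global feedstock update",
]


def plan_feedstock_creation(recipe_names):
    """Return the ordered list of actions for a merged staged-recipes PR."""
    plan = []
    for name in recipe_names:
        for step in PER_RECIPE_STEPS:
            plan.append(f"{name}-feedstock: {step}")
    plan.extend(ONCE_PER_MERGE_STEPS)
    return plan


for action in plan_feedstock_creation(["configparser", "entrypoints"]):
    print(action)
```

With the two recipes mentioned above this prints twelve per-feedstock actions followed by the three shared ones.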

After generating a feedstock, a global feedstock update is run. It is pretty simple. It updates the feedstocks with the latest commit of each feedstock on master at conda-forge. It also updates the listing. However, changes may not be reflected in the listing immediately even if the changes have been made to the HTML source code due to how GitHub caches GitHub Pages.

@daler
Contributor

daler commented Apr 10, 2016

Perfect, these were just the kinds of details I was looking for. Thanks. Hopefully it can help get others up to speed as they join the discussion as well.

@johanneskoester
Contributor

Hi guys,
thanks for initiating this. It is very interesting to exchange ideas of how to build.
I have two questions:

  1. Have you ever considered using the anaconda build service? I recently had a look at it, and it seems to me centered on packages instead of repositories/organizations, which is kind of unfortunate, because it needs to be set up for each package, right?
  2. With your conda-forge model, how do you deal with dependencies between recipes?

@msarahan
Member

Have you ever considered using the anaconda build service? I recently had a look at it, and it seems to me centered on packages instead of repositories/organizations, which is kind of unfortunate, because it needs to be set up for each package, right?

Yes, especially for Windows builds. Mapping conda-forge's model to Anaconda.org should be OK - the organization would be conda-forge, and each package would be a different build. Maybe I'm missing how this is different from the other CI services? Anyway, the hangup has been that anaconda.org has some kinks that need to be worked out.

With your conda-forge model, how do you deal with dependencies between recipes?

ATM, I think the answer is "we don't." There has been discussion about coming up with networkx-driven guidance of what recipes to work on next, but that has been for human consumption more than automated buildout of dependency trees. Before getting involved in conda-forge, Continuum developed a build script that also uses networkx, and builds out these trees. That code assumes a single folder of packages, which can be created from conda-forge using conda-smithy. The dependency building code is part of ProtoCI: https://github.com/ContinuumIO/ProtoCI/blob/master/protoci/build2.py

@johanneskoester
Contributor

Thanks for the clarification.
My point is the following: if the anaconda build service could be set up per repository and not per package, CI job limits would no longer be a reason to have separate repositories per recipe, right?

@msarahan
Member

I think separate repos per recipe are still a good thing, because it gives you complete control over who has permission to accept changes to a recipe. I don't know how we'd do that with many recipes under one umbrella.

@jakirkham
Member

Before getting involved in conda-forge, Continuum developed a build script that also uses networkx, and builds out these trees. That code assumes a single folder of packages, which can be created from conda-forge using conda-smithy. The dependency building code is part of ProtoCI: https://github.com/ContinuumIO/ProtoCI/blob/master/protoci/build2.py

Would this work on the feedstocks repo possibly with some tweaks? This might be a good way to get things going and it would also avoid having several scripts created here that kind of do something like this. Thoughts?

@msarahan
Member

Sure, I think so. It would need to be adapted to look into the nested recipes folder, but I think otherwise, it would work fine. It may also have trouble with jinja vs. static version numbers - but again, that's tractable.

@johanneskoester
Contributor

@msarahan I agree, this is in general a nice advantage. I asked, because the situation is different for bioconda. There, we have a rather controlled collaborative community, and it is much more convenient to have all recipes in one repository (e.g. for toposorting builds).

@msarahan
Member

Yeah, the one thing we don't have figured out well yet is how to edit multiple recipes at once. For aggregating them and building them as a set, I think conda-smithy + ProtoCI abstract away the difficulties with one repo per recipe.

@johanneskoester
Contributor

But if you build them as a set, you have the problem with job limits in the CI again, haven't you?

@jakirkham
Member

Yeah, I figure the nested directory structure needs to be addressed. Otherwise adding jinja template handling is probably valuable no matter where it is used, no?

@msarahan
Member

adding jinja template handling is probably valuable no matter where it is used, no?

Absolutely. In case you missed it, @pelson has a nice snippet at conda-forge/shapely-feedstock#5 (comment)

@jakirkham
Member

But if you build them as a set, you have the problem with job limits in the CI again, haven't you?

Well, one could consider some sort of debouncing to handle this. Namely even though one has made the change together and one is submitting them all ultimately, we manage submissions/builds somehow so that they are staggered. This will likely require some thought, but it is useful for some workflows with the recipes.
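
As a very rough sketch of the staggering idea: this is plain batching rather than true debouncing, and both the function name staggered_batches and the max_concurrent limit are hypothetical. How each batch would actually be submitted to a CI service is left out.

```python
from itertools import islice


def staggered_batches(feedstocks, max_concurrent):
    """Yield lists of at most `max_concurrent` feedstocks, keeping their order."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    remaining = iter(feedstocks)
    # Take the next slice until the iterator is exhausted.
    while batch := list(islice(remaining, max_concurrent)):
        yield batch


# Hypothetical set of recipes that were edited together.
changed = ["a-feedstock", "b-feedstock", "c-feedstock", "d-feedstock", "e-feedstock"]
for batch in staggered_batches(changed, max_concurrent=2):
    print(batch)
```

A caller would submit one batch, wait for it to finish (or for a fixed delay), and only then submit the next, so that a bulk edit never exceeds the job limit at any one time.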

@msarahan
Member

But if you build them as a set, you have the problem with job limits in the CI again, haven't you?

With anaconda.org, we don't have artificial limits. There are still strange practical limits - like logs that get too large end up making web servers time out. These are tractable problems.

@jakirkham
Member

Interesting, thanks for the link. I'll take a closer look.

@johanneskoester
Contributor

@msarahan, I know, you don't have these limits, but my understanding was that anaconda.org cannot out of the box build recipes as a set, right? You have to register an individual trigger for each of them? And then, their order of execution is no longer determined, and they can't depend on each other. Or am I missing something here?

@msarahan
Member

@johanneskoester there would need to be some intermediate representation as a collection of recipes. The ProtoCI tool would then be able to build things that changed. It is written to build packages based on which packages are affected by a git commit. Here, obviously only one recipe could trigger, rather than many changing at once. That does not affect its ability to build requirements dependencies, though - and they'll be built in topological order.
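
A minimal sketch of "build what a change affects, in topological order" might look like the following. It is not ProtoCI's code: it uses the standard-library graphlib (Python 3.9+) instead of networkx, the name affected_build_order is invented, and it assumes "affected" means the changed recipe plus everything that requires it, which may differ from what ProtoCI actually does.

```python
from graphlib import TopologicalSorter


def affected_build_order(requirements, changed):
    """Return changed recipes and their dependents, dependencies first.

    `requirements` maps each recipe name to the recipes it requires.
    """
    # Invert the mapping: recipe -> recipes that require it.
    dependents = {name: set() for name in requirements}
    for name, reqs in requirements.items():
        for req in reqs:
            dependents.setdefault(req, set()).add(name)

    # Walk downstream from every changed recipe.
    affected, stack = set(), list(changed)
    while stack:
        name = stack.pop()
        if name not in affected:
            affected.add(name)
            stack.extend(dependents.get(name, ()))

    # Order only the affected recipes; unaffected requirements are ignored.
    graph = {
        name: {req for req in requirements.get(name, ()) if req in affected}
        for name in affected
    }
    return list(TopologicalSorter(graph).static_order())


# Illustrative dependency data, not taken from any real channel.
requirements = {
    "zlib": [],
    "libpng": ["zlib"],
    "pillow": ["libpng", "zlib"],
    "numpy": [],
}
print(affected_build_order(requirements, ["libpng"]))  # ['libpng', 'pillow']
```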

@Lnaden
Contributor

Lnaden commented Apr 25, 2017

Hello! I want to let you all know that the omnia community has decided to start our migration over to conda-forge, and I was hoping to revive the discussion or at least get help on how we can handle some packages which would require things beyond the current scope of the linux-anvil.

Right now our plan is to handle the migration in stages, with the end goal of fully moving off the omnia conda channel.

  1. switch from our Docker file running from the phusion/holy-build-box to a modified version of conda-forge's linux anvil.
  2. Have our individual package maintainers start moving their packages to conda-forge such that they no longer depend on the omnia channel.
  3. Move packages which require additional software not currently available through conda(-forge) over
  4. Have all of omnia's packages on conda-forge.

With regard to the 3rd item on the list, I was hoping to start a discussion on how to get the additional tools some of our packages need into conda-forge.

Here are the capabilities as I see them we would need and would like to discuss:

  1. Ability to install extra yum packages
    • I think this already exists through the yum_requirements.txt file, but I can find very little documentation on how this should work
    • It would be nice to get packages off the EPEL as well
  2. Installing TeX packages, but that might already be possible through the texlive-core package
  3. Installing GPU libraries to build against such as nVidia's CUDA and AMD's OpenCL SDK

I would also like to discuss the possibility of supporting extended Docker images which is how we are currently handling step 1 of our migration process, although this would have to be highly regulated to ensure compatibility across conda-forge.

There is a lot to digest here, so I'm more than happy to split off parts of this to other issues as needed.

Pinging @jchodera

@bgruening
Contributor Author

@Lnaden this made my day! Awesome!
Regarding 3) do you have a list of packages that is needed?

A few of the omnia packages are already in BioConda which is using the same Docker Image as conda-forge, so it might be easier to port the BioConda package or use BioConda as well.

@ocefpaf
Member

ocefpaf commented Apr 25, 2017

@Lnaden that is really good news. We are going to hold a meeting tomorrow; see

https://gitter.im/conda-forge/conda-forge.github.io?at=58fe073ccfec9192726d5141

and

https://conda-forge.hackpad.com/conda-forge-meetings-2YkV96cvxPG

for the details. If you are available at that time try to participate.

@Lnaden
Contributor

Lnaden commented Apr 25, 2017

@bgruening It may take a while to migrate all the packages since we will be relying on individual package developers to move their own packages, but the first step will be to force them to add conda-forge as a channel. ;) I personally will also be very happy to not have to maintain our own conda-build-all script anymore.

To see everything we are doing differently, you can look at the extended linux-anvil we are preparing to use (not merged in yet). It's a bit crude since I know several packages are already installed, but here is the list of things we need as of now:

yum:
perl dkms libvdpau git wget libXext libSM libXrender groff dkms libvdpau

of which I know libXext, libSM, libXrender, and groff are already installed based on our Docker build log. dkms and libvdpau are for the CUDA toolkits

texlive
The core + 21 extra packages, not going to list them here. Several are to ensure OpenMM can build docs against Sphinx 1.5.1 which stopped shipping a number of their own latex packages

GPU Related

  • AMD SDK 3.0
  • CUDA Toolkit 8.0

Not every package however needs all of those additions. The big one is OpenMM (openmm.org) which requires the GPU files, lots of the TeX, and some of the yum packages. Because we have a number of tools which rely on OpenMM though, we won't be able to move a number of the packages until we can move it first.

The packages which do not depend on OpenMM should be much easier to move over, and some packages such as mdtraj moved quite a while ago.

@Lnaden
Contributor

Lnaden commented Apr 25, 2017

@ocefpaf I should be able to attend the meeting. I also want to ping @peastman @mpharrigan and @rmcgibbo who may also be interested in being on that meeting if they can (5-6 PM UTC)

@jchodera
Contributor

I suspect we may be able to avoid requiring texlive if we can get the texlive-core conda-forge package to work, but we had some issues getting tlmgr to correctly install the needed tex packages.

@bgruening
Contributor Author

@Lnaden any reason you need those libraries inside the container? From a quick look it seems that many of your libs are already packaged, like the X.org stack, wget and perl, so there is no need to put them into the container.

@Lnaden
Contributor

Lnaden commented Apr 26, 2017

@bgruening The split Docker image is a port from the CentOS 5-based Docker file here and here and I have not refined it yet.

I fully acknowledge there are packages which are already part of the base conda-forge linux-anvil that could be removed, but there are other packages such as dkms and libvdpau which we would still need. I'm also not sure about how much of the CUDA downloads we actually need, what we have now may be overkill, but we would need at least some of them, and those are not available through yum.

@jakirkham
Member

I suspect we may be able to avoid requiring texlive if we can get the texlive-core conda-forge package to work, but we had some issues getting tlmgr to correctly install the needed tex packages.

I tried some time ago to get this to install tlmgr. FWIR it doesn't get installed with the way we are installing TexLive. It seems most distros prefer not to include tlmgr and instead package everything themselves. TBH this sort of makes sense as it is possible to end up with broken things when two package managers get involved. For instance, suppose tlmgr downloaded some packages that require some image libraries. How does it resolve this? Does it use system ones from some standard location? What if it can't find them? Does it try to install the libraries itself? How do we teach it where conda's libraries are? There may be sensible solutions to this. Then again, we may end up hacking tlmgr and every package it provides to the point where we are better off packaging these ourselves as it will be less work. Something to consider.

@jakirkham
Member

Ability to install extra yum packages

Yeah, just add a yum_requirements.txt file in the recipe directory. Here's an example. Though if anything else needs a package that uses a yum_requirements.txt file, it will need to include all the content from that package's yum_requirements.txt file and anything else it might need. Beyond that, everything is tooled to correctly handle this case.
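
To illustrate the "include all the content from that package's file" rule, here is a small sketch that merges such lists by hand. It assumes the file holds one yum package name per line, with blank lines and lines starting with # ignored. The helper names are invented and are not part of conda-smithy.

```python
def parse_yum_requirements(text):
    """Return the package names listed in yum_requirements.txt-style text."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Assumed format: skip blank lines and '#' comments.
        if line and not line.startswith("#"):
            names.append(line)
    return names


def merged_yum_requirements(own, *from_dependencies):
    """Combine a recipe's own list with its dependencies' lists, dropping duplicates."""
    merged = []
    for text in (own, *from_dependencies):
        for name in parse_yum_requirements(text):
            if name not in merged:
                merged.append(name)
    return merged


# Illustrative file contents for a recipe and one recipe it depends on.
own = "libXext\nlibSM\n"
dependency = "# X11 bits\nlibXext\n\nlibXrender\n"
print("\n".join(merged_yum_requirements(own, dependency)))
```

The printed lines could be written back as the dependent recipe's own yum_requirements.txt.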

It would be nice to get packages off the EPEL as well

We should talk more about this in a separate issue. Maybe on the webpage repo. It would be good to hear what packages are required from this source. Generally we have been moving to use yum less, not more. So we would need to understand why these can't otherwise be packaged.

@Lnaden
Contributor

Lnaden commented Apr 26, 2017

most distros prefer not to include tlmgr and instead package everything themselves.

I would love if all distros actually packaged their own tex packages themselves, and I would very much prefer not to rely on a conda-based TeX install. From my very brief testing of the texlive-core package on conda-forge, the tlmgr it installs fails horribly if you have a global tex installed on your computer since the libraries are mis-matched. One problem we had recently was upgrading Sphinx from 1.3.1 to 1.5+ where they actually stopped shipping several tex packages so we had to include them.

The only thing I think the packages in omnia use TeX for is building the docs to ship with the build. Maybe an option is to make TeXLive available on the image so packages can add tlmgr commands to their build.sh scripts and ship their compiled docs. If users want to build the docs themselves, they will have to rely on their own local TexLive. Just a thought.

Yeah, just add a yum_requirements.txt file in the recipe directory.

I have seen several packages do this. My larger concern is the lack of documentation on this feature; the best I could find was a mention of it on conda-smithy's page of the docs.

We should talk more about this in a separate issue. Maybe on the webpage repo.

Happy to. I know at least DKMS and libvdpau from EPEL are needed to make the CUDA toolkits work correctly, so if we can solve the GPU libraries instead, that may remove the need to get EPEL and allow us to shift away from heavy yum usage.

@jakirkham
Member

My larger concern is the lack of documentation on this feature...

Yeah, documentation is a weak point for conda-forge still. That said, @pelson and I wrote the yum_requirements.txt functionality and there are many important recipes (e.g. qt) that depend on it. So if you are worried about it doing something surprising or disappearing, I don't think that should be a concern. That said, maybe worth adding an issue at the webpage repo to document its behavior.

@jakirkham
Member

Installing GPU libraries to build against such as nVidia's CUDA and AMD's OpenCL SDK

Basically this has been a problem of licensing. The same issue occurs with MKL for instance. It's been a while since I've looked into it.

I know that Continuum was able to get permission to build and distribute CUDA and cuDNN libraries. Though they are using CUDA 7.5 not 8 currently. If there are only one or two low-level packages like that, we could request they get built and added to defaults. That said, we would still need the recipes for them.

Alternatively we could try emailing NVIDIA directly and asking how to use their toolchain in CIs and how to distribute the CUDA libraries. Considering the size and audience of conda-forge, I think NVIDIA would be interested in a mutually beneficial solution. Would you have interest in writing such a letter? We could put this in a Google Doc to get feedback from others.

Haven't looked into AMD's OpenCL SDK yet. Though am getting the impression that sending an email their way wouldn't hurt either.

In any event, let's move this discussion over to issue ( conda-forge/conda-forge.github.io#63 ).

@jjhelmus
Contributor

Continuum is distributing the CUDA and cuDNN libraries and headers in the cudatoolkit and cudnn packages. To build packages which compile against these libraries you need to install the CUDA command line tools (nvcc, etc.) on the bare metal/VM/Docker container as well as have a host machine with the correct NVIDIA drivers installed. AFAIK, on CentOS the drivers require DKMS, which is only supported by installing EPEL packages. I do not believe these tools can be distributed. NVIDIA does provide Docker images with these tools installed, but it is unclear to me if new images created from these can be redistributed.

@jchodera
Contributor

jchodera commented Apr 26, 2017

To answer @bgruening's question:

@Lnaden any reason you need those libraries inside the container, from a short look it seems that many of your libs already package, like the X.org stack, wget, perl - so no need to put this into the container.

When we set up our omnia-build-box docker image, we found those yum dependencies were required in order to install the CUDA toolkit and libraries, as @jjhelmus suggests. I don't think we'd be able to migrate most of those dependencies out to packages.

Alternatively we could try emailing NVIDIA directly and asking how to use their toolchain in CIs and how to distribute the CUDA libraries. Considering the size and audience of conda-forge, I think NVIDIA would be interested in a mutually beneficial solution. Would you have interest in writing such a letter? We could put this in a Google Doc to get feedback from others.

@jakirkham : The first thing I would do would be to bring in Mark Berger, the Senior Alliance Manager for Life/Materials Sciences at NVIDIA, to help connect us directly with someone at NVIDIA who can help negotiate any technical/licensing issues. He is incredibly supportive of building an ecosystem for scientific computing that can exploit NVIDIA hardware (that's his job!), and making CUDA-enabled packages more accessible could only further that goal. We can help initiate that contact by email on our side as needed---@Lnaden can help coordinate.

@jakirkham
Member

To build packages which compile against these libraries you need to install the CUDA command line tools (nvcc, etc.) on the bare metal/VM/Docker container as well as have a host machine with the correct NVIDIA drivers installed. AFAIK, on CentOS the drivers require DKMS, which is only supported by installing EPEL packages.

Thanks for the details @jjhelmus. Maybe we should come up with some scripts to automate this process on the CIs. If they just live in CI builds, that should minimize the chances of accidentally distributing these.

The first thing I would do would be to bring in Mark Berger...

That's fantastic @jchodera. Please do.

@jjhelmus
Contributor

Do any of the CI providers have nodes with NVIDIA GPUs? I do not know if these would be strictly needed to compile GPU packages but they would be needed to properly test the packages.

@Lnaden
Contributor

Lnaden commented Apr 26, 2017

That is one problem we have with OpenMM right now: our builds on Travis and AppVeyor cannot test the GPU components correctly. Our current process involves building and uploading OpenMM on a dev label to the omnia conda channel, testing the packages locally on machines with GPUs, then moving them to the main label when they check out. Not the most automated process.

IIRC you can set up local Jenkins tests to handle the GPU, but I think that requires local physical boxes you can set up with private access, which is clearly not a viable solution for conda-forge.

@johanneskoester
Contributor

johanneskoester commented May 1, 2017

At Bioconda, we are currently experimenting with Buildkite. Basically, Buildkite provides an interface and management framework for CI agents that are deployed to local machines. This has two major advantages:

  1. Full control over the used hardware (including GPUs)
  2. No build time limits or bottlenecks. The build system can be easily scaled by connecting new machines.

With such an approach, security of the used systems becomes important of course. For example, they should always be in some kind of DMZ, and build jobs should be executed in a volatile, containerized or virtualized environment.

@jakirkham
Member

How does Buildkite compare to say Concourse CI, @johanneskoester? Asking because @msarahan has been working on using Concourse CI to build conda packages.

ref: https://github.com/conda/conda-concourse-ci

@msarahan
Member

msarahan commented Jun 26, 2017 via email

@johanneskoester
Contributor

johanneskoester commented Jun 26, 2017

Indeed, looks similar. With Buildkite, dynamic batch jobs are possible as well (via their agent tool). I don't know if Concourse makes this easier though.

Note that we have stopped our experiments with Buildkite though. We were mainly interested because of the build time limits on Travis, but we found a way to circumvent that nicely within Travis CI. For large bulk updates, we simply have one fixed branch. When we update our global pinnings on that branch, all affected recipes are rebuilt. Whenever a build succeeds, the package is uploaded to anaconda. Since hundreds of recipes can be affected, the build time limits can be exceeded. However, since we upload immediately after success, no recipe has to be built twice. Instead, we can simply fix all failed recipes, and trigger a rebuild by pushing the fixes. Now, in the next iteration, new recipes will be built in addition to the fixed ones, and so on. This way, we can reduce the number of remaining recipes to zero with a couple of iterations on the bulk branch. This worked really well when recently updating Python to 3.6 and R to 3.3.2. And we have all recipes in a single repository, so it is easy to keep track of the dependency DAG in order to build in the correct order.
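To make the iteration idea concrete, here is a rough sketch of one run on the bulk branch. It is only a toy model under stated assumptions, not the actual Bioconda tooling: the function name, the `budget` parameter (a stand-in for the Travis time limit, counted in build attempts), the `broken` set, and the recipe names are all hypothetical. It needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def bulk_iteration(deps, uploaded, broken, budget):
    """Simulate one CI run on the bulk branch.

    deps:     recipe -> set of recipes it depends on (the dependency DAG)
    uploaded: recipes already uploaded; updated in place on each success
    broken:   recipes whose build currently fails
    budget:   number of build attempts that fit into the time limit
    Returns the recipes built and uploaded in this run, in build order.
    """
    built = []
    attempts = 0
    # static_order() yields dependencies before their dependents.
    for recipe in TopologicalSorter(deps).static_order():
        if recipe in uploaded:
            continue  # uploaded in an earlier run, never built twice
        if attempts >= budget:
            break  # time limit reached; the rest waits for the next run
        if not deps.get(recipe, set()) <= uploaded:
            continue  # a dependency is still missing; retry in a later run
        attempts += 1
        if recipe in broken:
            continue  # build failed; needs a fix pushed to the branch
        uploaded.add(recipe)  # upload immediately after success
        built.append(recipe)
    return built


# Hypothetical DAG: numpy and pysam need python, scipy needs numpy.
deps = {"python": set(), "numpy": {"python"}, "scipy": {"numpy"}, "pysam": {"python"}}
uploaded = set()

first = bulk_iteration(deps, uploaded, broken={"numpy"}, budget=3)
second = bulk_iteration(deps, uploaded, broken=set(), budget=3)  # after pushing a fix
print(sorted(first), second)
```

In this toy run the first iteration uploads python and pysam, and the second one picks up the fixed numpy plus scipy, which could not be built before. Nothing uploaded in the first run is rebuilt.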

@jakirkham
Member

Closing as I think this has served its purpose. Happy to continue relevant discussions in new issues.
