
Fix octopus binary "invalid instruction" (-march too new) #11822

Merged
merged 2 commits into bioconda:master on Nov 17, 2018

Conversation

holtgrewe
Contributor

  • I have read the guidelines for bioconda recipes.
  • This PR adds a new recipe.
  • AFAIK, this recipe is directly relevant to the biological sciences (otherwise, please submit to the more general purpose conda-forge channel).
  • This PR updates an existing recipe.
  • This PR does something else (explain below).

@chapmanb @kyleabeauchamp What do you think?

@chapmanb
Member

chapmanb commented Nov 8, 2018

Manuel --
Thanks for looking at this. Would you be able to explain more about the motivation and consequences of this change? Not being a C++ expert, it seems to me this would be less portable, since it would require Intel Sandy Bridge chips on any machine downloading and using these binaries. I don't know what the defaults do without this, so I could use a little handholding on the approach you're taking. Thanks again.

@holtgrewe
Contributor Author

@chapmanb Ah, sorry for not giving much context here. We are hitting an issue where the octopus binaries actually require a Haswell architecture (one or two generations after Sandy Bridge), so we get an "invalid instruction" error on our Sandy Bridge cluster nodes. I asked elsewhere about a policy for this, but there doesn't really seem to be one.

I think #2354 contains some relevant discussion and links. Maybe @kyleabeauchamp can give some insights into this. I would be fine with going back further, or even somehow shipping multiple architectures at the same time.

Right now the package is broken for us, and I imagine this is also the case for people with even older cluster nodes (e.g., high-memory nodes tend to be kept alive longer than the average compute node because they are pricey).

@chapmanb
Member

chapmanb commented Nov 8, 2018

Manuel -- thanks much for the context. I'd definitely be keen to make this as portable as possible. I don't know enough about this to comment usefully, but it would be helpful to have @dancooke in on the conversation. Dan, we're working on making the octopus binary more portable. Do you know the potential impact of compiling against older or multiple architectures on runtime? Any tips/tricks for us doing this in the best way?

@epruesse
Member

Recipes should not set -march or -mtune at all. They should actually take care to remove any such flags from upstream build systems. (I'm surprised the CB3 compilers don't flat-out override those flags.)

The main argument here is that if someone needed to eke out those extra 3%, they could take the parts of Bioconda they need, adjust the conda_build_config.yaml, and rebuild into their own channel. If the recipe messes with those flags, that might not work as desired.

That someone might be us at some point. Or just you. Or perhaps a new project called Biogentoo. In any case, the default for Bioconda at this point needs to be "works everywhere".

just my $0.02

@holtgrewe
Contributor Author

@epruesse I agree. But how do we achieve this in Bioconda for this package in a short timeframe?

@epruesse
Member

@holtgrewe Just add a patch removing the -march=native bit from src/CMakeLists.txt, and open an issue upstream notifying them that their Release build produces code that won't run on any CPU older than the build machine.
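For illustration, such a patch might look roughly like the following; the exact variable name and line in octopus's src/CMakeLists.txt may differ, so treat this as a sketch of the idea rather than the actual patch:

```diff
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@
-set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -march=native")
+set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE}")
```

With -march=native gone, the compilers configured by conda-build decide the target architecture, which is exactly what a "works everywhere" channel wants.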

@dancooke
Contributor

@chapmanb Thanks for looking into this. I can't say that I've done much benchmarking across different architectures. However, I did find that compiling on one of our centre's cluster nodes that supports AVX-512 sped up runtimes considerably - most noticeably for somatic calling - compared with nodes that only support AVX or AVX2. I imagine this is due to better vectorisation in the maths-heavy methods used for somatic calling.

@epruesse epruesse changed the title This attempts to make the binary more portable by only requiring Sand… Fix octopus binary "invalid instruction" (-march too new) Nov 12, 2018
@epruesse
Member

@dancooke All 64-bit x86 CPUs have SSE2, so we always get that. Depending on the workload, SSE4.2 brings significant extra performance, while AVX and AVX2 aren't worth it in my experience. AVX-512 again brings a lot, but it's actually a collection of extensions of which each CPU implements only some.

Options:

  • follow the RAxML approach and compile a bunch of variants of the binary.
  • go beyond that and add a wrapper script that selects the right one.
  • go the vsearch route and do the dispatch inside your binary with hand-optimized SIMD instructions (although vsearch doesn't get this 100% right and breaks on a small subset of the Travis nodes - testing is really hard).
  • build the compute kernels in a separate DSO, build it multiple times for a range of -march targets, and dlopen the right one at run time.
  • create a channel, or multiple channels, of your own, in which you place copies of the packages built for specific architectures. You can then layer the appropriate channel above Bioconda to override packages with custom-optimized ones.
    Bear in mind, though, that this needs a separate Miniconda folder for each CPU architecture.

@chapmanb
Member

Thanks for all this helpful discussion. My take here is that we should merge Manuel's fix. The upstream approach provides some value on specific architectures if pre-compiling so will likely stay in place. I don't see anyone jumping to try and maintain multiple optimized versions in bioconda right now, so the fix here will give us better compatibility without an obvious runtime issue. What does everyone else think?

@holtgrewe
Contributor Author

I think we need @epruesse's suggestion of patching the CMakeLists.txt. I can look into this tomorrow.

@epruesse
Member

There you go. Let's see if that works.

@lh3
Contributor

lh3 commented Nov 16, 2018

Intel developers suggest that the right strategy is to use so-called CPU dispatch. The idea is to compile different versions of the same functionality at compile time; at run time, the program dynamically chooses the fastest version based on the host CPU. GATK uses this approach.

ksw2_dispatch.c from minimap2 shows a working example: it prefers SSE4.1 if present and falls back to SSE2 otherwise.
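The pattern described above can be sketched in a few lines: resolve the implementation on first use and cache a function pointer. This is an illustration of the idea, not minimap2's actual code; the function names here are made up:

```c
/* Function-level CPU dispatch in the spirit of ksw2_dispatch.c. */
#include <stddef.h>

static int dot_generic(const int *a, const int *b, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; ++i) s += a[i] * b[i];
    return s;
}

/* In a real build this would live in a separate file compiled with
 * -msse4.1 and use intrinsics; here it is a stand-in body. */
static int dot_sse41(const int *a, const int *b, size_t n)
{
    return dot_generic(a, b, n);
}

typedef int (*dot_fn)(const int *, const int *, size_t);

int dot(const int *a, const int *b, size_t n)
{
    static dot_fn impl;  /* NULL until the first call */
    if (!impl) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
        impl = __builtin_cpu_supports("sse4.1") ? dot_sse41 : dot_generic;
#else
        impl = dot_generic;
#endif
    }
    return impl(a, b, n);
}
```

Only the first call pays for the feature check; every later call goes straight through the cached pointer.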

@epruesse
Member

@lh3 Thanks for the links!

In the end, dynamically selecting code paths suited to the CPU's capabilities at run time is the way to go.

It can happen at the whole-binary level (e.g. raxml-AVX - though it's missing a dispatcher), at the library level (e.g. Intel MKL, where libmkl_core contains a dispatcher and libmkl_avx2 and friends are the optimized variants), or at the function level (e.g. minimap2 or vsearch).

The higher levels are easier if you want to let the compiler optimize for CPU features; the lower levels if you want to do it by hand.

The catch with hand-optimized SIMD code, such as vsearch and minimap2 are using, is that it's not just hard to write, it's also hard to test without having a large selection of computers with different CPU generations on hand. (We have an open issue here regarding a vsearch problem on some Travis instances...)

The catch with letting the compiler do the optimization is that the tree vectorizers are somewhat fickle about the way your loops are written. So you have to monitor, with diagnostic flags, that your code is actually still being vectorized. And, as happened to DADA2, you may find that users with older compilers suddenly lose a lot of speed (Bioconda builds with GCC 4.8 were orders of magnitude slower).

No free lunch anymore... :)

@lh3
Contributor

lh3 commented Nov 16, 2018

When vectorized, some algorithms take a distinct form. Automatically vectorizing such algorithms is beyond the capability of compilers; hand-written intrinsics are the only option. Debugging CPU dispatch is actually easy: because dispatching is dynamic, you can implement a command-line option to target a particular instruction set at run time. You only need one recent computer to test all older architectures.

CPU dispatch does pose a challenge to conda because it may require a fairly recent CPU and compiler. Purely from an end-user point of view, the best solution is to provide a portable binary precompiled on a recent machine (GATK follows this route); the conda recipe then only downloads the binary rather than compiling from source. This way, all users get the maximum performance on their machines. From a developer point of view, however, this can be very complicated to implement.
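The debugging override described above can be sketched like this. For brevity it uses an environment variable rather than a command-line flag; FORCE_ISA is a hypothetical name, not an option of any tool discussed here:

```c
/* Let the user force a lower instruction set at run time for testing. */
#include <stdlib.h>
#include <string.h>

const char *choose_isa(void)
{
    static const char *known[] = { "avx2", "sse41", "sse2" };
    const char *force = getenv("FORCE_ISA");  /* hypothetical override */
    if (force) {
        /* honor the override only if it names a variant we ship */
        for (size_t i = 0; i < 3; ++i)
            if (strcmp(force, known[i]) == 0) return known[i];
    }
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("avx2"))   return "avx2";
    if (__builtin_cpu_supports("sse4.1")) return "sse41";
#endif
    return "sse2";
}
```

Running the test suite once per variant (FORCE_ISA=sse2, FORCE_ISA=sse41, ...) on a single recent machine exercises every code path, though - as noted below - it cannot prove the binary runs on genuinely older CPUs.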

@dancooke
Contributor

@lh3 This is an interesting approach, but I would guess that the number of routines where vectorisation makes a meaningful difference to overall runtime, and that the compiler can't vectorise automatically, is small for most programs. In Octopus, the only algorithm that uses hand-coded SIMD instructions is the pair HMM, which uses SSE2. In this routine, the size of the SIMD registers determines the band size of the DP table, so using a different SSE instruction set actually changes the behaviour of the algorithm (by allowing bigger gaps). Of course, the SIMD code could be modified to keep functionality equivalent to the largest-register variant, but then this penalises the smaller-register variants by enforcing a larger band size. The band size should be defined by the program or user, not the system architecture, so we'd also need to write routines for each possible band size and select the correct implementation based on that and the system architecture.

On the other hand, compilers are very good at vectorising other methods, such as inner products, that account for much of the computation time in Octopus (e.g. in expectation-maximisation). Writing custom dispatch-based SIMD code for these methods would likely result in little/no performance improvement, but a maintenance headache.

So ultimately, I think that the high-level approaches suggested by @epruesse are the way to go. The multiple-binary 'wrapper script' approach seems a good option. @epruesse, I presume this means the script selects the correct binary on installation, rather than at runtime?

Thanks, everyone, for all the help and suggestions on this issue.

@lh3
Contributor

lh3 commented Nov 16, 2018

On the pair HMM: you are explicitly using SSE2, and keeping it that way requires no changes to the code. On inner products etc., you can create different versions of sdot() and choose the best one at runtime. Another option is to use a BLAS library and let the library deal with the rest; that is one extra dependency, though. Note that the point of CPU dispatch is to let users conveniently and consistently get the best performance out of our tools. CPU dispatch does make development more difficult. That is the price we developers pay for a better user experience.

If you go for the wrapper approach, the wrapper should choose the right version at runtime. In a cluster environment, we install a tool on one machine and run it on other machines that may have different specs. The launcher script has to choose the right version at runtime, or the tool may hit the invalid-instruction error (if the install machine has a higher spec) or underperform (if the install machine has a lower spec).

By the way, back to this PR, I fully agree that we should remove "-march=native". CPU dispatch is more for future considerations.

@epruesse
Member

epruesse commented Nov 16, 2018

I presume this means the script selects the correct binary on installation, rather than at runtime?

@dancooke You can't assume that the CPU features available at installation time are also available at run time. Small clusters in which new generations of compute nodes are added over time are probably pretty common. So I'd definitely detect the CPU at run time: the cost in storage and CPU cycles is negligible, whereas explaining to each new Bioconda user how CPUs differ is anything but.

@epruesse epruesse closed this Nov 16, 2018
@epruesse epruesse reopened this Nov 16, 2018
@epruesse
Member

(Build failed because conda.anaconda.org had a 5xx error - let's retry that)

@epruesse
Member

[off topic]

When vectorized, some algorithms take a distinct form. Automatically vectorizing such algorithms is beyond the capability of compilers. Hand written intrinsics is the only option.

@lh3 My experience is very limited, so I'm not contesting your claim here. When I was working on ARB's parsimony implementation, I found it possible to write that "distinct form" in C, allowing the compiler to vectorize while avoiding intrinsics. The code ended up looking rather unusual and required checks in the build system to prevent minor edits from breaking vectorization. The hope was that this would be more future-proof than assembly-style direct use of intrinsics.
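Writing such a "distinct form" in plain C mostly means giving the vectorizer aliasing guarantees and a simple counted loop. A minimal sketch of the style (not the ARB code):

```c
/* restrict promises the compiler the arrays don't alias, and the flat
 * counted loop is a shape GCC's tree vectorizer handles well.  Whether it
 * actually vectorized can be checked with -O3 -fopt-info-vec (GCC). */
#include <stddef.h>

void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The fragility epruesse describes is real: an innocent-looking edit (an early exit, a function call in the loop body, a pointer that might alias) can silently defeat the vectorizer, which is why build-system checks on the vectorization reports are worthwhile.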

Debugging CPU dispatch is actually easy. Because dispatching is dynamic, you can implement a command-line option to target a particular instruction set at run time. You only need one recent computer to test all older architectures.

While that allows you to test each implementation, it doesn't allow you to test whether an implementation works on a CPU with a smaller instruction set. You could still have an AVX2 instruction in your Pentium 4 implementation and wouldn't catch it (in reality it's going to be more subtle than that, of course). I think that's what's causing the vsearch issue.

I wish there were some kind of test framework that allowed checking this. I've never found anything, though.

CPU dispatch does pose a challenge to conda because it may require fairly recent CPU and compilers. Purely from an enduser point of view, the best solution is to provide a portable binary precompiled on a recent machine (GATK follows this route). The conda recipe only downloads the binary, not compiling from the source. This way, all users get the maximal performance on their machines. From a developer point of view, however, this can be very complicated to implement.

Why does the compile host matter? You may not be able to run tests, but "cross compiling" for more capable CPUs should only be a function of the compiler.

BTW: there is a WG21 proposal for extending C++ with standard templates abstracting over SIMD intrinsics (P0214R9; partial implementations in dimsum, vc, pik/simd).

@lh3
Contributor

lh3 commented Nov 16, 2018

We are talking about slightly different things, but anyway I agree with you. If you know what the assembly- or intrinsics-based code should look like, you can write unusual but pure C code and let the compiler vectorize for you. I know some Intel developers do this; however, their implementation at the time only worked with the Intel compiler.

On debugging: you can disable advanced instruction sets at compile time, such that calling those instructions leads to a compile error.

Why does the compile host matter? You may not be able to run tests, but "cross compiling" for more capable CPUs should only be a function of the compiler.

That is host-dependent. GCC on several of my machines is recent enough, but they don't have the advanced instruction sets built in.

@epruesse epruesse merged commit 4d91d8d into bioconda:master Nov 17, 2018
@epruesse
Member

@bgruening Can you take care of the "broken" build?

bgruening pushed a commit that referenced this pull request Dec 9, 2018
* Fix octopus binary "invalid instruction" (-march too new) (#11822)

* This attempts to make the binary more portable by only requiring Sandybridge.

* Fix march set to native

* Update perl-bio-asn1-entrezgene to 1.73 (#11914)

* Update perl-bio-asn1-entrezgene to 1.73

* Update perl-bio-asn1-entrezgene to 1.73

* Update meta.yaml

* Update perl-file-slurp to 9999.25 (#12127)

* Update manta to 1.5.0 (#12126)

* Update nanocomp to 1.0.0 (#12124)

* Update masurca to 3.2.9 (#12121)

* Update beagle-lib to 3.1.1 (#12119)

* Update r-spocc to 0.9.0 (#12118)

* Update kallisto to 0.45.0 (#12117)

* Update validators to 0.12.3 (#12116)

* Update illumina-interop to 1.1.7 (#12115)

* Update umi_tools to 0.5.5 (#12123)

* Update tiptoft to 1.0.0 (#12122)

* Update: TitanCNA with fix for hg38 plotting (#12130)

* blockclust (#12111)

* added eden 1.1 version used in blockclust

* Update meta.yaml

* eden checksum and test

* added zlib requirement

* add include and lib paths to build.sh

* add include and lib paths to build.sh

* modify makefile flags using sed

* modify makefile flags using sed

* use c++

* Update build.sh

* Update build.sh

* LDFLAGS not used in makefile

* remove v1.1

* add blockclust 1.1.0 recipe

* skip osx build

* added tarball url and checksum; removed conda_build_config.yaml

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* Update chewbbaca to 2.0.16 (#12132)

* Update abeona to 0.40.0 (#12131)

* ping to eden 1.1 (#12134)

* clean swap file (#12137)

* Update corset to 1.07 (#12143)

* Update peddy to 0.4.3 (#12144)

* blockclust latest source and added cloudpickle as requirement (#12142)

* Update nanomath to 0.22.0 (#12149)

* Update pymvpa to 2.6.5 (#12152)

* Updated to version 1.4.0 (#12138)

* Updated to version 1.4.0

* Resetting build number to 0 and build->host

* Update transdecoder to 5.5.0 (#12172)

* Update gridss to 2.0.1 (#12190)

* Update tracer to 1.7.1 (#12189)

* Update flowcraft to 1.4.0 (#12179)

* Update seqbuster to 3.2 (#12182)

* Update perl-io-compress to 2.081 (#12201)

* Update twobitreader to 3.1.6 (#12205)

* Update twobitreader to 3.1.6

* Update meta.yaml

* Update gtfparse to 1.2.0 (#12186)

* Update gtfparse to 1.2.0

* remove compiler

* Update htseq to 0.11.0 (#12177)

* Update htseq to 0.11.0

* add compiler

* Update htseq to 0.11.0 (#12239) [testing co-author]

* Update htseq to 0.11.0

* add compiler

Co-authored-by: Bjoern Gruening <bjoern.gruening@gmail.com>

* alfred v0.1.16 (#12240)

* alfred v0.1.16

* makefile patch

* makefile patch

* Update scrm to 1.7.3 (#12150)

* Update scrm to 1.7.3

* Update scrm to 1.7.3

* Mob suite update 1.4.8 --> 1.4.9 (#12244)

We fixed a small bug related to repetitive elements database reconstruction on each run. Should only run once during the first mob_typer run. Should be good now for parallel runs.

* PeptideShaker updated to v1.16.35 & SearchGUI updated to 3.3.9 (#12241)

* PeptideShaker updated to v1.16.32
SearchGUI updated to v3.3.6

* PeptideShaker updated to v1.16.35

* SearchGUI updated to v3.3.9

* First attempt at a recipe for ncrf (#12101)

* First attempt at a recipe for ncrf

* changed from copies to symlinks

* including absolute path in symlinks

* PacBio: Update `pbmm2` to version 0.11.0 (#12246)

* Gimmemotifs version 0.13.0 (#12245)

* new release to fix numpy incompatibility

* GimmeMotifs 0.13.0 release test

* numpy 1.15

* version 0.13.0

* Fix about:home and extra:

* PacBio: Update `isoseq3` to version 3.1.0 (#12263)

* add python to host section (#12264)

* Update confindr to 0.4.7 (#12266)

* new build of GRiD (#12269)

* update grid

* update build num

* edit grid to use mosdepth for depth coverage

* edit grid to use mosdepth for depth coverage

* Update meta.yaml

* new build for grid

* Update meta.yaml

* peptideshaker, searchgui: version bump (#12136)

* peptideshaker, searchgui: version bump

solving compomics/searchgui#192

* Update peptide-shaker.py

* searchgui: added c compiler for building

test if linking errors go away this way

* searchgui: removed noarch

* Update: bcbio, bcbio-vm with CWL non-human fixes (bcbio/bcbio-nextgen#2473) (#12273)

* Make snakemake a noarch package. (#12274)

* noarch: python in combination with version constraint

* Cleanup deps and simplify version definition.

* Genenotebook v0.1.9 (#12277)

* genenotebook recipe

* About

* Package name

* License

* sha256sum

* Update build.sh

* Make bin dir

* Change ln to cp

* New symlink strategy

* Build from prebundled tarball

* genenotebook v0.1.2

* version fix

* retry

* fix

* build number

* genenotebook v0.1.3

* genenotebook v0.1.3

* new build procedure

* change run dependencies

* build

* restore build procedure

* Fix summary

* dependency versions

* v0.1.5

* version fix

* v0.1.6

* reset build number

* build number

* genenotebook v0.1.7

* v0.1.8

* v0.1.9

* Create r-airr recipe (#12284)

* Create r-airr recipe using bgruening/conda_r_skeleton_helper

* Update meta.yaml

* Findbin::Real added (#12291)

* Added version bump to 1.3 (#12292)

* ngs-bits 2018_11 (#12262)

* Updated ngs-bits to version 2018_10

* Updated ngs-bits to version 2018_11

* minor edit in grid (#12287)

* update grid

* update build num

* edit grid to use mosdepth for depth coverage

* edit grid to use mosdepth for depth coverage

* Update meta.yaml

* new build for grid

* Update meta.yaml

* minor edit to grid

* Fred2 update (#12259)

* Update FRED2 Recipe

* msstitch version 2.9 (#12296)

* r-mvr (#12300)

This is a non-parametric method for joint adaptive mean-variance regularization and variance sta-
bilization of high-dimensional data.

* Bump to Gromacs 2018.3 + re-enable OSX build (#12299)

* Gromacs 2018.3 + re-enable osx build

See also #7825 on OSX build

Include RRID for Gromacs (see bioconda/bioconda-utils#252)

* disabled ocl-icd on osx; not available (or needed?)

https://anaconda.org/conda-forge/ocl-icd

* pyGenomeTracks update of dependencies (#12279)

* Update pyGenomeTracks recipe to enable py3 support and updating outdated dependecy list

* Increase build number

* Update meta.yaml

* Exclude Python 3.5

* Gromacs 2018.4 (#12302)

* quast 5.0.2 (#12303)

* Update meta.yaml (#12275)

* Update meta.yaml

r-rsqlite version dependency added

* Update meta.yaml

Updated build number to 1

* Use the extended-base container because of wget issues when downloading databases (#12297)

* Update duphold to 0.1.0 (#12257)

* Update flashlfq to 0.1.109 (#12256)

* Update planemo to 0.57.0 (#12251)

* Update meta.yaml (#12258)

* Update scalpel to 0.5.4 (#12249)

* Update scalpel to 0.5.4

* Fix url

* edit grid (#12298)

* update grid

* update build num

* edit grid to use mosdepth for depth coverage

* edit grid to use mosdepth for depth coverage

* Update meta.yaml

* new build for grid

* Update meta.yaml

* minor edit to grid

* minor edit to recipe grid

* added feather to deps (#12293)

* R tigger (#12286)

* Create recipe for r-tigger (#12271)

* Create recipe for r-tigger using bgruening/conda_r_skeleton_helper.

* Delete bld.bat

* Remove windows support

* update scprep to v0.8.1 (#12306)

* Update nanoget to 1.7.5 (#12252)

* Update nanoget to 1.7.5

* Bump to version 1.7.6

should fix the README.md problem

* gffcompare: version bump (#12308)

* Update: cromwell (36), bcbio (Cromwell Docker runs) (#12307)

* Update: cromwell (36), bcbio (Cromwell Docker runs)

* Pin java to 8 to avoid compile errors on 11

* HiCExplorer update to version 2.2-beta (#12309)

* HiCExplorer update to version 2.2-beta

* Remove of '-' from version number

* Numpy set to 1.15

* Update meta.yaml

* upgrade to version 2.8.1 (#12316)

* Create Airr recipe (#12318)

* Create AIRR recipe using conda skeleton pypi.

* Add noarch python

* Clean host requirements

* Added nimnexus [v0.1] (#12133)

* added nimnexus

* Fix sha

* Update build.sh

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* [WIP] Add R-loom (#12320)

* initial WIP skeleton

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* try osx

* Create Changeo recipe (#12321)

* Create changeo recipe using conda skeleton pypi.

* EDIT about section

* ADD noarch generic

* Update noarch to python

* Clean host requirements

* Add test for scripts

* Change requirement to python>=3.4

* Add noarch pyton

* r-corbi (#12325)

* r-nam (#12327)

* Update pybedtools to 0.8.0 (#12324)

* Update pybedtools to 0.8.0

* rm cython requirement

* MSGFplus does not work with newer Java (#12330)

* Trying to not use a very new java version since the latest versions have dropped some classes

* Increment build

* Update atropos to 1.1.21 (#12322)

* Update CONCOCT to 0.4.2 (#12317)

* Update CONCOCT to 0.4.2

* Wrong shasum given

* Reset build number for new version

* New: gvcfgenotyper for joint calling with Illumina strelka2 (#12331)

* Apparently we need openJDK to be below 9, not below 11 as in the previous PR (#12332)

* Update Picard to 2.18.17. (#12333)

* Unblacklist IgDiscover (#12335)

* PacBio: Add missing licenses and post-link message (#12334)

* Bioconductor DEqMS package (#12336)

* Bioconductor DEqMS package

* Limma version could be set lower (checked with the bioconductor recipe page) to include limma in bioconda

* Genenotebook v0.1.10 (#12337)

* genenotebook recipe

* About

* Package name

* License

* sha256sum

* Update build.sh

* Make bin dir

* Change ln to cp

* New symlink strategy

* Build from prebundled tarball

* genenotebook v0.1.2

* version fix

* retry

* fix

* build number

* genenotebook v0.1.3

* genenotebook v0.1.3

* new build procedure

* change run dependencies

* build

* restore build procedure

* Fix summary

* dependency versions

* v0.1.5

* version fix

* v0.1.6

* reset build number

* build number

* genenotebook v0.1.7

* v0.1.8

* v0.1.9

* v0.1.10

* removed cloudpickle requirement for blockclust (#12242)

* blockclust latest source and added cloudpickle as requirement

* removed cloudpickle requirement for blockclust

* added a tagged release to the source

* back to source url but with release tarball and checksum

* R guilds (#12341)

* Strict rec (#12113)

* added metawrap

* fixed perl and jdk

* added more strict req

* fixed java and perl

* removed extra boost

* Update qcat to v1.0.1 (#12343)

* Add better error checking (#12345)

* Add mhcnames python package (#12349)

* Add MHCflurry pMHC class I binding affinity prediction tool (#12346)

* Bump pilon (#12354)

* Update pronto to 0.11.1 (#12377)

* Update perl-json-xs to 4.0 (#12373)

* Update planemo to 0.57.1 (#12370)

* Update snippy to 4.3.6 (#12372)

* Update intarna to 2.3.1 (#12367)

* Update agfusion to 1.231 (#12366)

* Update bioconda-utils to 0.15.0 (#12359)

* Update mapdamage2 to 2.0.9 (#12358)

* Update scanpy to 1.3.4 (#12357)

* Update goatools to 0.8.11 (#12361)

* Update perl-math-random-mt-auto to 6.23 (#12362)

* Update kipoi to 0.6.3 (#12374)

* Update kipoi_veff to 0.2.1 (#12364)

* Update perl-object-insideout to 4.05 (#12380)

* Update sevenbridges-python to 0.17.1 (#12379)

* Update nanosv to 1.2.3 (#12378)

* Update starfish to 0.0.30 (#12368)

* Update clust to 1.8.10 (#12356)

* Update pypairs to 2.0.6 (#12381)

* Update khmer to 3.0.0a2 (#12360)

* Update flashlfq to 0.1.110 (#12385)

* Update perl-algorithm-cluster to 1.57 (#12384)

* Update arvados-python-client to 1.2.0.20181121194423 (#12382)

* Update avro-cwl to 1.8.9 (#12355)

* Update avro-cwl to 1.8.9

* Set noarch: python

* r-breakaway (#12386)

* R sads (#12388)

* Update python-sortedcontainers to 2.1.0 (#12383)

* r-ebimetagenomics (#12339)

* pin armadilo on major version (#12389)

* IgDiscover version 0.11 (#12351)

* IgDiscover version 0.11

* IgDiscover requires Python 3.6

* Fix build on macOS with a small patch

* Remove unneeded host dependencies

* Update goatools to 0.8.12 (#12391)

* Update bioconda-utils to 0.15.1 (#12390)

* R ebimetagenomics (#12393)

* Add dimspy recipes for reference purposes (#12387)

* Add dimspy recipes for reference purposes

* Remove filename

* remove from blacklist

* Update build-fail-blacklist

* Added BioExcel_SeqQC to bioconda-recipes (#12276)

* PacBio: Update pbalign to 0.3.2 (#12396)

* PacBio: Update pbalign to 0.3.2

Closes: PacificBiosciences/pbbioconda#39

* add bgreat to bioconda (#12350)

* bgreat addition

* zlib in build should not be needed

* btrim integration to conda (#12395)

* Format sleuth recipe. Trigger rebuild because latest version was never uploaded for some reason. (#12399)

* PacBio: Update `pbsv2` to version 2.1.0 (#12400)

* Genenotebook v0.1.11 (#12398)

* genenotebook recipe

* About

* Package name

* License

* sha256sum

* Update build.sh

* Make bin dir

* Change ln to cp

* New symlink strategy

* Build from prebundled tarball

* genenotebook v0.1.2

* version fix

* retry

* fix

* build number

* genenotebook v0.1.3

* genenotebook v0.1.3

* new build procedure

* change run dependencies

* build

* restore build procedure

* Fix summary

* dependency versions

* v0.1.5

* version fix

* v0.1.6

* reset build number

* build number

* genenotebook v0.1.7

* v0.1.8

* v0.1.9

* v0.1.10

* v0.1.11

* prosolo: new package version 0.6.0 (#12397)

* Update sevenbridges-python to 0.17.2 (#12404)

* Add recipe for ICED (#12406)

* Downgrade version for iced (#12409)

* Update recipe for libstatgen 1.0.5 (#12348)

* Add recipe for libstatgen-1.0.5

* Update iced to 0.5.0 (#12411)

* Update perl-test2-suite to 0.000116 (#12412)

* fix compatibility with other tools (#12415)

* Restore dexseq python helper scripts (#12352)

* Restore dexseq python helper scripts

* Add python requirement for helper scripts

* Add python noarch to build section

* pin it to python2k

* Add htseq requirement, support only python <3 for now

* Genenotebook v0.1.12 (#12418)

* genenotebook recipe

* About

* Package name

* License

* sha256sum

* Update build.sh

* Make bin dir

* Change ln to cp

* New symlink strategy

* Build from prebundled tarball

* genenotebook v0.1.2

* version fix

* retry

* fix

* build number

* genenotebook v0.1.3

* genenotebook v0.1.3

* new build procedure

* change run dependencies

* build

* restore build procedure

* Fix summary

* dependency versions

* v0.1.5

* version fix

* v0.1.6

* reset build number

* build number

* genenotebook v0.1.7

* v0.1.8

* v0.1.9

* v0.1.10

* v0.1.11

* v0.1.12

* updated source

* added medpy recipe (#12417)

* added medpy recipe

* removed osx from build

* Update meta.yaml

* Update meta.yaml

* add boost and a compiler

* Update meta.yaml

* Update meta.yaml

* add itk

* Updated ddrage to version 1.6.1. (#12421)

* Add r-mcpcounter recipe.  (#12261)

* Add MCPcounter recipe.

* fix lint: remove 'fn'

* fix license_file, version number and doi

* Update DEXSeq requirements to force compatible HTSeq version (#12423)

* ARB: Pin glib (#11782)

* Pin glib

* Update meta.yaml

* Update meta.yaml

* Work around CB3 issues

* Work around bioconda-utils lint false positive

* Disable lint check should_not_be_noarch

* Can't reference other packages built in recipe from anything but run

* Move perl to host section. Maybe that helps.

* constraining interpreter version breaks with CB3?

* disable perl version constraint :(

* Update meta.yaml (#12426)

* stacks: fix for fix 'fixing' @ in exe_path (#12420)

* stacks: fix for fix 'fixing' @ in exe_path

The previous fix (#11580)
tried to solve the problem with the @ in the exe_path by setting this
variable to an empty string. But the perl scripts
- append a / to the empty string (i.e., the bin path becomes /binary) and
- check for file presence (problem: neither binary nor /binary is
  present).

Now I quote the @ in the exe_path instead.
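The concatenation failure described above can be sketched in plain shell (the path and variable names here are hypothetical illustrations; the real wrappers are Perl scripts in the stacks recipe):

```shell
# Sketch of the failure mode: with exe_path set empty, the wrapper
# builds "/binary", and the subsequent presence check then fails.
exe_path=""                       # previous fix: variable left empty
bin="${exe_path}/binary"          # scripts unconditionally append a slash
echo "$bin"                       # prints "/binary"
if [ ! -e "$bin" ]; then          # presence check on the mangled path
    echo "presence check fails"
fi
```

Quoting the @ placeholder instead keeps the substituted path intact, so the joined path points at a real file and the presence check passes.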

* stacks: add refmap bugfix

* Genenotebook v0.1.13 (#12424)

* genenotebook recipe

* About

* Package name

* License

* sha256sum

* Update build.sh

* Make bin dir

* Change ln to cp

* New symlink strategy

* Build from prebundled tarball

* genenotebook v0.1.2

* version fix

* retry

* fix

* build number

* genenotebook v0.1.3

* genenotebook v0.1.3

* new build procedure

* change run dependencies

* build

* restore build procedure

* Fix summary

* dependency versions

* v0.1.5

* version fix

* v0.1.6

* reset build number

* build number

* genenotebook v0.1.7

* v0.1.8

* v0.1.9

* v0.1.10

* v0.1.11

* v0.1.12

* updated source

* v0.1.13

* metaQuantome (#12413)

replaces metaquant, which is now deprecated.

* New MMseqs2 release 7-4e23d (#12432)

New mmseqs2 release 7-4e23d

* update spades to 3.13.0 (#12408)

* update spades to 3.13.0

* spades: removed dipspades; added 3.12.0 recipe in subdir

* bumped build number for 3.11.1 due to CI failure

* Update Crossmap to 0.3.1 (#12439)

* Update Crossmap to 0.3.1

* Make metaquantome dependencies more specific (#12442)

* less stringent deps

* bump build

* pin goatools

* test extra pins

* integration of bcool to bioconda (#12422)

* Update hifive to 1.5.7 (#12433)

* Update scvi to 0.2.3 (#12434)

* Update Subread from 1.6.2 to 1.6.3. (#12425)

* Update Subread from 1.6.2 to 1.6.3.

* Subread executable coverageCount has been removed.

https://groups.google.com/d/msg/subread/Au1CpKGAXaA/KndhDbrfAwAJ

* UCSC Cell Browser 0.4.23 (#12347)

* Starting files for ucsc-cell-browser

* Customised recipe for ucsc-cell-browser

* meta update for ucsc-cell-browser

* Changes cbTrackHub to cbHub

* Point to static release

* Moves to 0.25

* Updates sha hash for ucsc-browser

* Reverts to 0.1.9

* At ebi

* Update meta.yaml

* pins numpy to last known working version

* Back to 0.25, no numpy pinning.

* Remove git, pin python and numpy. Add build section.

* Update meta.yaml

* Update meta.yaml

* Move to 0.4.20 without skeleton on provisional commit

* Typo on executable name

* Moves to release version and adds dependencies for converters

* Update libstatgen to 1.0.14 (#12437)

* Update arvados-python-client to 1.2.1 (#12438)

* Update tirmite to 1.1.3 (#12435)

* Bump metaquantome (#12449)

* less stringent deps

* bump build

* pin goatools

* test extra pins

* bump metaquantome to 0.99.3

* Update ADAM to 0.25.0 (#12445)

* Update: bcbio, bcbio-vm with viral QC, Docker fixes (#12453)

* update rgi4.2.2 recipe (#12282)

* update to enforce python3.6 and load card.json

* remove lines from build.sh

* update build number

* make PR comment changes

* add test to verify db is loaded

* update build script

* update build script

* fix lint errors

* fix linting errors

* fix linting errors

* fix linting errors

* fix linting errors

* fix linting errors

* fix linting errors

* fix linting errors

* fix linting errors

* fix linting errors

* fix linting errors

* revert test change

* Add CRAN R package leapp. (#12428)

* Bump DECIPHER to Bioconductor 3.7 (#12463)

* Bump DECIPHER to Bioconductor 3.7

* Change r-rsqlite version dependency

* Update meta.yaml

* updated scVI to 0.2.3 (#12429)

* Add recipe for PyAAVF 0.1.0 (#12451)

* Add recipe for PyAAVF 0.1.0

* Fix linting error

* seqkit 0.9.3 (#12455)

* HTSeq - Pin numpy version (#12467)

* update eigensoft to 7.2.1 (#12469)

* Update scnic to 0.6.0 (#12436)

* Update scnic to 0.6.0

* Work around fastspar insufficient armadillo pin

* fix pin

* GimmeMotifs 0.13.1 (#12471)

* Bumped version to newest, with introspective text (#12461)

* Update Segway 2.0.2 recipe to use older dependencies (#11803)

* Update Segway 2.0.2 recipe to use older dependencies

* Update maximum genomedata version supported

* Update build number

* Fix missing older dependency information

* Update meta.yaml

* updating dependencies for reparation_blast (#12443)

* updating dependencies for reparation_blast

* changed dependency of pysam

* Added DropletUtils package (#12448)

* Create recipe for SVIM (#12272)

* add svim recipe

* fix bugs in svim recipe

* replace source file

* remove license file

* Lowercase biopython

* upgrade to svim 0.4.1

* add GPL LICENSE file, allow python 3.6.* patch releases, add minimap2 dependency

* fix python version

* try to fix lint error

* replace "skip: True" with "noarch: python"

* Update bioconductor-biocgenerics (#12477)

* [X] I have read the [guidelines for bioconda recipes](https://bioconda.github.io/guidelines.html).
* [ ] This PR adds a new recipe.
* [ ] AFAIK, this recipe **is directly relevant to the biological sciences** (otherwise, please submit to the more general purpose [conda-forge channel](https://conda-forge.org/docs/)).
* [X] This PR updates an existing recipe.
* [ ] This PR does something else (explain below).

* Irida sistr results 0.6.0 (#12478)

* Update irida-sistr-results to 0.6.0

* Fixed dependency string

* Adding recipe for clinvar-tsv v0.1.0. (#12481)

* no fixed boost version (#12483)

* New recipe: pysradb (#12470)

* Pyseer 1.2.0 (#12444)

* Update pyseer to 1.1.2

* Update pyseer to 1.2.0

* Update pyseer to 1.2.0

Update pyseer to 1.2.0

Update pyseer to 1.2.0 (fixed)

* Final touches to pyseer 1.2.0

* Additional update to pyseer recipe

* sentieon: minor version bump to 201808.01 (#12480)

* adding java to mutations recipe (#12485)

* Update ncrf to 1.00.06 (#12369)

* Update ncrf to 1.00.06

* Update test string

* Bump umitools (#12486)

* Mob suite version 1.4.9 no arch build number 2 (#12479)

* Updated to version 1.4.9

* Downgraded to python >= 3.4 to accommodate lowandrew request

* new build

* no arch commit for python 3.4

* no arch conda package version

* Set build number to 1

* Update r-goeveg to 0.4.2 (#12375)

* Update r-goeveg to 0.4.2

* Add r-hmisc as dependency

* Removes r-seurat-scripts from blacklisting (#12489)

* Removes r-seurat-scripts from blacklisting

* Unblacklist some non bioconductor packages

* For VarDict-Java, install utility scripts. (#12488)

* For VarDict-Java, install utility scripts.

Utility scripts that were previously only shipped with VarDict are now
part of VarDict-Java as well, making installation of both packages
unnecessary.

* Bump build.

* Improve variable name.

* Make vardict depend on vardict-java for utility scripts.

* Pin compatible numpy for older HTSeq versions (#12490)

* Pin compatible numpy for older HTSeq versions

This is required to keep older versions of HTSeq functional.

* HTSeq - update old versions to build properly with new build system

* HTSeq 0.6.1 - Increment build number
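A numpy pin of this kind lives in the recipe's meta.yaml requirements; the fragment below is an illustrative sketch only (the version bound and layout are hypothetical, not the actual diff from #12490):

```yaml
# recipes/htseq/meta.yaml -- illustrative fragment, not the real recipe
requirements:
  host:
    - python
    - numpy <1.15                      # hypothetical upper bound for older HTSeq
  run:
    - python
    - {{ pin_compatible('numpy') }}    # constrain run-time numpy to the build ABI
```

`pin_compatible` makes the run requirement track the numpy version the package was actually built against, which is what keeps older HTSeq builds importable.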

* add recipe for r-epic (#12473)

* Add new recipe - RVTESTS (#12465)

* Add rvtests

* Update pre-req

* Update meta.yaml

* Update build.sh

* Update build.sh

* Update meta.yaml

* Update build.sh

* Update build.sh

* Update build.sh

* Update build.sh

* Update build.sh

* Update build.sh

* Update build.sh

* Update build.sh

* Update meta.yaml

* Update meta.yaml

* Update build.sh

* Update build.sh

* Update build.sh

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* Create LICENSE

* Update meta.yaml

* Update meta.yaml

* Update build.sh

* Update build.sh

* Update build.sh

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* Update build.sh

* Create run_test.sh

* Update meta.yaml

* Delete run_test.sh

* Update build.sh

* Update meta.yaml

* Update build.sh

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* Update build.sh

* Update meta.yaml

* Update build.sh

* Update build.sh

* Update build.sh

* Update build.sh

* Update build.sh

* Update meta.yaml

* Update meta.yaml

* Update build.sh

* Update meta.yaml

* Update meta.yaml

* Update meta.yaml

* Adding recipe for var-agg v0.1.0 (#12482)

* Adding recipe for var-agg v0.1.0

* runtime deps do have run-exports defined

* move rust into the host section

* Adding dependency to clangdev

* Adding var-agg v0.1.1. (#12495)

* [WIP] Salmon v0.12.0 --- try to fix OSX build for real (#12441)

Salmon v0.12.0 --- version bump and fix OSX build

* Considerable updates in terms of features and fixes (check release notes).
* Fix OSX build that would compile, but segfault in quant (but only when built on "old" OSX).

* PerformanceAnalytics (#12498)

* Add bioconductor package RNASeqR

* Add performanceanalytics

* PerformanceAnalytics (#12499)

* Add bioconductor package RNASeqR

* Add performanceanalytics

* Move recipes/r-performanceanalytics and add r-rafalib

* Move r-performanceanalytics/ to recipes

* Update recipes/r-performanceanalytics

* Update r-rafalib

* Remove recipes/r-performanceanalytics/bld.bat

* Remove recipes/r-performanceanalytics

* Update r-performanceanalytics/

* Update r-performanceanalytics/

* Add r-performanceanalytics/

* remove r-performanceanalytics

* Add performanceanalytics

* Remove r-performanceanalytics

* Update Picard to 2.18.20. (#12501)

* Update vsearch to 2.10.0. (#12504)

* Update Purge Haplotigs to v1.0.4 (#12497)

* Update duphold to 0.1.1 (#12516)

* Update starfish to 0.0.31 (#12515)

* Update perl-term-app-roles to 0.02 (#12512)

* Update perl-dbd-sqlite to 1.60 (#12511)

* Update perl-json to 4.00 (#12509)

* Update kipoi to 0.6.5 (#12508)

* Update illumina-interop to 1.1.8 (#12507)

* Update beagle-lib to 3.1.2 (#12523)

* Update pysradb to 0.3.0 (#12522)

* Update bioconda-utils to 0.15.2 (#12525)

* Update perl-json-pp to 4.0 (#12524)

* Update perl-term-table to 0.013 (#12529)

* Update abyss to 2.1.5 (#12528)

* Update perl-test2-suite to 0.000117 (#12527)

* Update wtforms-alchemy to 0.16.8 (#12505)

* Update perl-carp-clan to 6.07 (#12518)

* Update perl-date-manip to 6.75 (#12519)

* Update to bioconductor 3.8, use gcc7 in bulk

* Should have double checked the circleci yaml

* fix a few shell files

* copy over bulk change