Support for read alignment and variant calling in Adam? (e.g. BWA + Freebayes) #1311
Comments
Check out CS-BWAMEM. It needs some updating, but it is an implementation of bwa via spark/adam.
+1 @waltermblair. I've got a WIP update PR at ytchen0323/cloud-scale-bwamem#9
Will this be integrated with the ADAM project itself? Alignment with BWA is the critical missing link in ADAM.
Long term, the cloud-scale-bwamem repository may migrate under the bigdatagenomics organization, to facilitate support and tighter integration with ADAM release cycles. It is not likely that the code will be migrated into the adam repository though, as most applications are developed as separate repositories.

Note there are a few other options for integrating BWA and ADAM:

BWA with ADAM on Apache Spark using workflow engine

BWA and ADAM can be run as part of the same pipeline, as is demonstrated here, with Toil as the workflow engine and Docker as the container technology: Docker images for this pipeline are developed in the cgl-docker-lib repository and hosted on quay.io.

ADAM on Apache Spark with BWA using ADAM Pipe API

An alternative execution model is being developed in the cannoli repository, where the data are partitioned using Apache Spark and ADAM and then streamed over pipes to an external BWA process on each compute node. This takes advantage of the ADAM Pipe API, which in turn builds on Apache Spark's

Reimplement BWA algorithm on ADAM on Apache Spark

Another option would be to reimplement the BWA algorithm in Scala on ADAM on Apache Spark. We currently have no plans to do this. If someone is interested and willing however, ... :)
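To make the pipe-based execution model a bit more concrete, here is a minimal sketch in plain Python. It is not the ADAM Pipe API or cannoli code: it only illustrates the idea of splitting records into partitions and streaming each partition over stdin/stdout to an external process. The function names (`partition`, `pipe_partition`, `pipe_all`) are hypothetical, and the `UPPERCASE` command is a stand-in for a real external tool such as BWA. On a Spark cluster the partitions would be processed in parallel on the compute nodes; here they are processed one after another.

```python
import subprocess
import sys


def partition(records, num_partitions):
    """Split records into num_partitions contiguous, roughly equal chunks."""
    size, extra = divmod(len(records), num_partitions)
    chunks, start = [], 0
    for i in range(num_partitions):
        # The first `extra` chunks each take one additional record.
        end = start + size + (1 if i < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def pipe_partition(lines, command):
    """Stream one partition to an external process; return its stdout lines."""
    if not lines:
        return []
    result = subprocess.run(
        command,
        input="\n".join(lines) + "\n",  # one record per line on stdin
        capture_output=True,
        text=True,
        check=True,  # raise if the external tool exits with an error
    )
    return result.stdout.splitlines()


def pipe_all(records, command, num_partitions=2):
    """Pipe every partition through the command and concatenate the output."""
    output = []
    for chunk in partition(records, num_partitions):
        output.extend(pipe_partition(chunk, command))
    return output


# Stand-in for an external tool (assumption: a real pipeline would invoke
# an aligner here). It simply upper-cases each input line.
UPPERCASE = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())",
]
```

Calling `pipe_all(["acgt", "ttga", "ggcc"], UPPERCASE, num_partitions=2)` launches the stand-in process once per partition and returns the transformed lines in their original order.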
In addition to calling the native BWA code, CS-bwamem has a Scala implementation of several of the core BWA algos.
Thank you @fnothaft and @heuermh for this information on how to run BWA and ADAM together on a Spark cluster. I look forward to trying one or more of these options later this year to run a read alignment (bwa) and variant calling (freebayes/gatk) pipeline on a Spark cluster. I see that GATK is supported downstream and that a Freebayes wrapper is also being developed in the cannoli repository.
Thank you @NeillGibson for asking good questions! Ping us when you're ready to give things a go, maybe the story will be clearer by then. Meanwhile, if you might be interested, we host a weekly video call for our team and collaborators. Email my username at berkeley.edu for details.
How can I use Cannoli with ADAM? Please help.
Hi,
Are you planning to support read alignment and variant calling in Adam? For example with BWA and Freebayes?
As far as I know most development work in Adam is focused on:
And that the focus is not on developing new software for read alignment or variant calling.
I did see that work was done on adding pipes to stream FASTQ, BAM and VCF to legacy tools.
#1112
Are you planning to support / test / develop read alignment + variant calling pipelines on Spark + Adam that make use of external read aligners / variant callers + your own data formats + bam post processing tools?
For Spark + Adam to be a real alternative to a normal HPC cluster for genomics data analysis, read alignment + variant calling support is essential.
Thank you.