Add Tiddit #10

Merged 12 commits on Aug 7, 2019
15 changes: 8 additions & 7 deletions CHANGELOG.md
@@ -12,37 +12,38 @@ Initial release of `nf-core/sarek`, created with the [nf-core](http://nf-co.re/)
### `Added`

- [#2](https://github.com/nf-core/sarek/pull/2) - Create `nf-core/sarek` `environment.yml` file
- [#2](https://github.com/nf-core/sarek/pull/2), [#3](https://github.com/nf-core/sarek/pull/3), [#4](https://github.com/nf-core/sarek/pull/4), [#5](https://github.com/nf-core/sarek/pull/5), [#7](https://github.com/nf-core/sarek/pull/7), [#9](https://github.com/nf-core/sarek/pull/9), [#10](https://github.com/nf-core/sarek/pull/10), [#11](https://github.com/nf-core/sarek/pull/11), [#12](https://github.com/nf-core/sarek/pull/12) - Add CI for `nf-core/sarek`
- [#3](https://github.com/nf-core/sarek/pull/3) - Add preprocessing to `nf-core/sarek`
- [#4](https://github.com/nf-core/sarek/pull/4) - Add variant calling to `nf-core/sarek` with `HaplotypeCaller`, and single mode `Manta` and `Strelka`
- [#5](https://github.com/nf-core/sarek/pull/5) - Add variant calling to `nf-core/sarek` with `Manta`, `Strelka`, `Strelka Best Practices`, `MuTect2`, `FreeBayes`, `ASCAT`, `ControlFREEC`
- [#6](https://github.com/nf-core/sarek/pull/6) - Add default containers for annotation to `nf-core/sarek`
- [#7](https://github.com/nf-core/sarek/pull/7) - Add MultiQC
- [#7](https://github.com/nf-core/sarek/pull/7) - Add annotation
- [#7](https://github.com/nf-core/sarek/pull/7) - Add social preview image in `png` and `svg` format
- [#7](https://github.com/nf-core/sarek/pull/7), [#8](https://github.com/nf-core/sarek/pull/8), [#11](https://github.com/nf-core/sarek/pull/11) - Add helper script `run_tests.sh` to run different tests
- [#7](https://github.com/nf-core/sarek/pull/7), [#8](https://github.com/nf-core/sarek/pull/8), [#9](https://github.com/nf-core/sarek/pull/9) - Add automatic build of specific containers for annotation for `GRCh37`, `GRCh38` and `GRCm38` using `CircleCI`
- [#7](https://github.com/nf-core/sarek/pull/7), [#8](https://github.com/nf-core/sarek/pull/8), [#9](https://github.com/nf-core/sarek/pull/9), [#11](https://github.com/nf-core/sarek/pull/11) - Add helper script `build_reference.sh` to build small reference from [nf-core/test-datasets:sarek](https://github.com/nf-core/test-datasets/tree/sarek)
- [#7](https://github.com/nf-core/sarek/pull/7), [#9](https://github.com/nf-core/sarek/pull/9), [#11](https://github.com/nf-core/sarek/pull/11), [#12](https://github.com/nf-core/sarek/pull/12) - Add helper script `download_image.sh` to download containers for testing
- [#8](https://github.com/nf-core/sarek/pull/8) - Add test configuration for easier testing
- [#9](https://github.com/nf-core/sarek/pull/9), [#11](https://github.com/nf-core/sarek/pull/11) - Add scripts for `ASCAT`
- [#10](https://github.com/nf-core/sarek/pull/10) - Add `TIDDIT` to detect structural variants
- [#11](https://github.com/nf-core/sarek/pull/11) - Add automatic build of specific containers for annotation for `CanFam3.1` using `CircleCI`
- [#11](https://github.com/nf-core/sarek/pull/11), [#12](https://github.com/nf-core/sarek/pull/12) - Add posters and abstracts
- [#12](https://github.com/nf-core/sarek/pull/12) - Add helper script `make_snapshot.sh` to make an archive for usage on a secure cluster
- [#12](https://github.com/nf-core/sarek/pull/12) - Add helper scripts `filter_locifile.py` and `selectROI.py`
- [#12](https://github.com/nf-core/sarek/pull/12) - Use `label` for process configuration
- [#13](https://github.com/nf-core/sarek/pull/13) - Add Citation documentation
- [#13](https://github.com/nf-core/sarek/pull/13) - Add `BamQC` process
- [#13](https://github.com/nf-core/sarek/pull/13) - Add `CompressVCFsnpEff` and `CompressVCFvep` processes
- [#18](https://github.com/nf-core/sarek/pull/18) - Add `--no-reports` option for tests + add snpEff,VEP,merge to MULTIPLE test
- [#18](https://github.com/nf-core/sarek/pull/18) - Add logo to MultiQC report
- [#18](https://github.com/nf-core/sarek/pull/18) - Add params `--skip` to skip specified QC tools
- [#18](https://github.com/nf-core/sarek/pull/18) - Add possibility to download other genome for `sareksnpeff` and `sarekvep` containers
- [#20](https://github.com/nf-core/sarek/pull/20) - Add `markdownlint` config file

### `Changed`

- [#1](https://github.com/nf-core/sarek/pull/1), [#2](https://github.com/nf-core/sarek/pull/2), [#3](https://github.com/nf-core/sarek/pull/3), [#4](https://github.com/nf-core/sarek/pull/4), [#5](https://github.com/nf-core/sarek/pull/5), [#6](https://github.com/nf-core/sarek/pull/6), [#7](https://github.com/nf-core/sarek/pull/7), [#8](https://github.com/nf-core/sarek/pull/8), [#9](https://github.com/nf-core/sarek/pull/9), [#10](https://github.com/nf-core/sarek/pull/10), [#11](https://github.com/nf-core/sarek/pull/11), [#12](https://github.com/nf-core/sarek/pull/12), [#18](https://github.com/nf-core/sarek/pull/18), [#20](https://github.com/nf-core/sarek/pull/20) - Update docs
- [#4](https://github.com/nf-core/sarek/pull/4) - Update `cancerit-allelecount` from `2.1.2` to `4.0.2`
- [#4](https://github.com/nf-core/sarek/pull/4) - Update `gatk4` from `4.1.1.0` to `4.1.2.0`
- [#7](https://github.com/nf-core/sarek/pull/7) - `--sampleDir` is now deprecated, use `--sample` instead
2 changes: 2 additions & 0 deletions docs/containers.md
@@ -32,6 +32,7 @@ For annotation, the main container can be used, but the cache has to be download
- Contain **[samtools][samtools-link]** 1.9
- Contain **[snpEff][snpeff-link]** 4.3.1t
- Contain **[Strelka2][strelka-link]** 2.9.10
- Contain **[TIDDIT][tiddit-link]** 2.7.1
- Contain **[VCFanno][vcfanno-link]** 0.3.1
- Contain **[VCFtools][vcftools-link]** 0.1.16
- Contain **[VEP][vep-link]** 96.0
@@ -111,6 +112,7 @@ The `environment.yml` file can easily be modified if particular versions of too
[sareksnpeff-docker-badge]: https://img.shields.io/docker/automated/nfcore/sareksnpeff.svg
[sareksnpeff-docker-link]: https://hub.docker.com/r/nfcore/sareksnpeff
[strelka-link]: https://github.com/Illumina/strelka
[tiddit-link]: https://github.com/SciLifeLab/TIDDIT
[vcfanno-link]: https://github.com/brentp/vcfanno
[vcftools-link]: https://vcftools.github.io/index.html
[vep-link]: https://github.com/Ensembl/ensembl-vep
43 changes: 41 additions & 2 deletions docs/output.md
@@ -24,6 +24,7 @@ The pipeline processes data using the following steps:
* [`Strelka2`](#Strelka2)
* Structural variants
* [`Manta`](#Manta)
* [`TIDDIT`](#TIDDIT)
* Sample heterogeneity, ploidy and CNVs
* `alleleCounter`
* [`ConvertAlleleCounts`](#ConvertAlleleCounts)
@@ -48,6 +49,7 @@ The pipeline processes data using the following steps:
* [`MultiQC`](#MultiQC)

## Preprocessing

Sarek preprocesses raw FastQ files or unmapped BAM files, based on [GATK best practices](https://software.broadinstitute.org/gatk/best-practices/).

BAM files with recalibration tables can also be used as input to start from the recalibration of those BAM files. For more information see [TSV files output information](#TSV-files).
@@ -79,6 +81,7 @@ For all samples:
* BAM file and index

### TSV files

The TSV files are autogenerated and can be used by Sarek for further processing and/or variant calling.

For further reading and documentation see the [input documentation](https://github.com/nf-core/sarek/blob/master/docs/input.md).
@@ -89,14 +92,16 @@ For all samples:
* `duplicateMarked.tsv` and `recalibrated.tsv`
* TSV files to start Sarek from `recalibration` or `variantcalling` steps.
* `duplicateMarked_[SAMPLE].tsv` and `recalibrated_[SAMPLE].tsv`
* TSV files to start Sarek from `recalibration` or `variantcalling` steps for a specific sample.

## Variant Calling

All the results regarding variant-calling are collected in this directory.

Recalibrated BAM files can also be used as input to start from variant calling. For more information see [TSV files output information](#TSV-files).

### FreeBayes

[FreeBayes](https://github.com/ekg/freebayes) is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment.

For further reading and documentation see the [FreeBayes manual](https://github.com/ekg/freebayes/blob/master/README.md#user-manual-and-guide).
@@ -108,6 +113,7 @@ For a Tumor/Normal pair only:
* VCF with Tabix index

### HaplotypeCaller

[GATK HaplotypeCaller](https://github.com/broadinstitute/gatk) calls germline SNPs and indels via local re-assembly of haplotypes.

Germline calls are provided for all samples, to enable comparison of tumor and normal samples in case of a possible mix-up.
@@ -121,6 +127,7 @@ For all samples:
* VCF with Tabix index

### GenotypeGVCFs

[GATK GenotypeGVCFs](https://github.com/broadinstitute/gatk) performs joint genotyping on one or more samples pre-called with HaplotypeCaller.

Germline calls are provided for all samples, to enable comparison of tumor and normal samples in case of a possible mix-up.
@@ -134,6 +141,7 @@ For all samples:
* VCF with Tabix index

### MuTect2

[GATK MuTect2](https://github.com/broadinstitute/gatk) calls somatic SNVs and indels via local assembly of haplotypes.

For further reading and documentation see the [MuTect2 manual](https://software.broadinstitute.org/gatk/documentation/tooldocs/4.1.2.0/org_broadinstitute_hellbender_tools_walkers_mutect_Mutect2.php).
@@ -144,7 +152,33 @@ For a Tumor/Normal pair only:
* `MuTect2_[TUMORSAMPLE]_vs_[NORMALSAMPLE].vcf.gz` and `MuTect2_[TUMORSAMPLE]_vs_[NORMALSAMPLE].vcf.gz.tbi`
* VCF with Tabix index

### TIDDIT

[TIDDIT](https://github.com/SciLifeLab/TIDDIT) identifies intra- and inter-chromosomal translocations, deletions, tandem duplications and inversions.

Germline calls are provided for all samples, to enable comparison of tumor and normal samples in case of a possible mix-up.
Low-quality calls are removed internally to simplify processing of the variant calls, but Sarek still saves them.

For further reading and documentation see the [TIDDIT manual](https://github.com/SciLifeLab/TIDDIT/blob/master/README.md).
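
As a rough illustration of the variant classes mentioned above, the sketch below tallies records in a structural-variant VCF by their `SVTYPE` INFO key. It assumes the VCF carries `SVTYPE` in the INFO column, as the VCF specification describes for structural variants; the function names are hypothetical and not part of Sarek or TIDDIT.

```python
import gzip
import re
from collections import Counter

# SVTYPE is the INFO key the VCF specification reserves for structural variants.
SVTYPE_RE = re.compile(r"(?:^|;)SVTYPE=([^;]+)")


def count_sv_types(lines):
    """Tally SVTYPE values from an iterable of VCF text lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record (INFO is the 8th column)
        match = SVTYPE_RE.search(fields[7])
        counts[match.group(1) if match else "UNKNOWN"] += 1
    return counts


def count_sv_types_in_file(path):
    """Open a plain or gzip-compressed VCF and tally its SVTYPE values."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_sv_types(handle)
```

Calling `count_sv_types_in_file` on a compressed VCF returns a `Counter` keyed by variant type, which is usually enough for a quick sanity check of a run.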

For all samples:
**Output directory: `results/VariantCalling/[SAMPLE]/TIDDIT`**

* `TIDDIT_[SAMPLE].vcf.gz` and `TIDDIT_[SAMPLE].vcf.gz.tbi`
* VCF with Tabix index
* `TIDDIT_[SAMPLE].signals.tab`
* tab file describing coverage across the genome, binned per 50 bp
* `TIDDIT_[SAMPLE].ploidy.tab`
* tab file describing the estimated ploidy and coverage across each contig
* `TIDDIT_[SAMPLE].old.vcf`
* VCF including the low-quality calls
* `TIDDIT_[SAMPLE].wig`
* wiggle file containing coverage across the genome, binned per 50 bp
* `TIDDIT_[SAMPLE].gc.wig`
* wiggle file containing fraction of GC content, binned per 50 bp
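
The file list above can be sketched as a small helper that builds the expected paths for one sample. The naming pattern and output directory are taken from the list; the function name, the `results_dir` default and the sample name in the usage note are assumptions made for illustration only.

```python
from pathlib import PurePosixPath

# Suffixes of the per-sample TIDDIT files listed in the documentation above.
TIDDIT_SUFFIXES = (
    ".vcf.gz",
    ".vcf.gz.tbi",
    ".signals.tab",
    ".ploidy.tab",
    ".old.vcf",
    ".wig",
    ".gc.wig",
)


def expected_tiddit_outputs(sample, results_dir="results"):
    """Return the expected TIDDIT output paths for one sample (hypothetical helper)."""
    out_dir = PurePosixPath(results_dir) / "VariantCalling" / sample / "TIDDIT"
    return [str(out_dir / f"TIDDIT_{sample}{suffix}") for suffix in TIDDIT_SUFFIXES]
```

For a sample named `S1`, the first entry would be `results/VariantCalling/S1/TIDDIT/TIDDIT_S1.vcf.gz`; comparing the returned list with the files on disk is one way to check that a run completed.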

### Strelka2

[Strelka2](https://github.com/Illumina/strelka) is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs.

For further reading and documentation see the [Strelka2 user guide](https://github.com/Illumina/strelka/blob/v2.9.x/docs/userGuide/README.md).
@@ -167,15 +201,17 @@ For a Tumor/Normal pair:

Using [Strelka Best Practices](https://github.com/Illumina/strelka/blob/v2.9.x/docs/userGuide/README.md#somatic-configuration-example) with the `candidateSmallIndels` from `Manta`:
**Output directory: `results/VariantCalling/[TUMOR_vs_NORMAL]/Strelka`**

* `StrelkaBP_[TUMORSAMPLE]_vs_[NORMALSAMPLE]_somatic_indels.vcf.gz` and `StrelkaBP_[TUMORSAMPLE]_vs_[NORMALSAMPLE]_somatic_indels.vcf.gz.tbi`
* VCF with Tabix index
* `StrelkaBP_[TUMORSAMPLE]_vs_[NORMALSAMPLE]_somatic_snvs.vcf.gz` and `StrelkaBP_[TUMORSAMPLE]_vs_[NORMALSAMPLE]_somatic_snvs.vcf.gz.tbi`
* VCF with Tabix index

### Manta

[Manta](https://github.com/Illumina/manta) calls structural variants (SVs) and indels from mapped paired-end sequencing reads.
It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs.
`Manta` also provides a candidate list of small indels that can be fed to `Strelka`, following [Strelka Best Practices](https://github.com/Illumina/strelka/blob/v2.9.x/docs/userGuide/README.md#somatic-configuration-example).

For further reading and documentation see the [Manta user guide](https://github.com/Illumina/manta/blob/master/docs/userGuide/README.md).

@@ -188,10 +224,12 @@ For all samples:
* VCF with Tabix index

For Normal sample only:

* `Manta_[NORMALSAMPLE].diploidSV.vcf.gz` and `Manta_[NORMALSAMPLE].diploidSV.vcf.gz.tbi`
* VCF with Tabix index

For a Tumor sample only:

* `Manta_[TUMORSAMPLE].tumorSV.vcf.gz` and `Manta_[TUMORSAMPLE].tumorSV.vcf.gz.tbi`
* VCF with Tabix index

@@ -208,6 +246,7 @@ For a Tumor/Normal pair only:
* VCF with Tabix index

### ConvertAlleleCounts

[ConvertAlleleCounts](https://github.com/nf-core/sarek/blob/master/bin/convertAlleleCounts.r) is an R script for converting output from AlleleCount to BAF and LogR values.

For a Tumor/Normal pair only:
1 change: 1 addition & 0 deletions environment.yml
@@ -26,5 +26,6 @@ dependencies:
- samtools=1.9
- snpeff=4.3.1t
- strelka=2.9.10
- tiddit=2.7.1
- vcfanno=0.3.1
- vcftools=0.1.16
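
A minimal sketch of how pins like the ones above could be checked, for example to confirm that `tiddit` is pinned to the version the docs mention. It assumes each dependency line has the simple `- name=version` form shown in this hunk; channel prefixes and other conda spec syntax are not handled, and the function name is hypothetical.

```python
def parse_pins(lines):
    """Map package name to pinned version for '- name=version' lines."""
    pins = {}
    for raw in lines:
        entry = raw.strip()
        if not entry.startswith("- "):
            continue  # not a list item, e.g. 'dependencies:' or a comment
        name, sep, version = entry[2:].strip().partition("=")
        if sep:  # keep only entries that actually carry a pin
            pins[name] = version
    return pins
```

With the lines from this hunk, `parse_pins(...)["tiddit"]` evaluates to `"2.7.1"`, which matches the version listed in `docs/containers.md`.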