Merge pull request #64 from uab-cgds-worthey/decouple_readme_readthedocs
Decouples github readme from readthedocs
ManavalanG authored Mar 2, 2023
2 parents baf972a + 1dd1757 commit 7a123d6
Showing 5 changed files with 106 additions and 51 deletions.
69 changes: 20 additions & 49 deletions README.md
@@ -1,13 +1,15 @@
[![Snakemake](https://img.shields.io/badge/snakemake-6.0.5-brightgreen.svg?style=flat)](https://snakemake.readthedocs.io)
[![ReadTheDocs](https://readthedocs.org/projects/quac/badge/?version=latest)](https://quac.readthedocs.io/en/stable/)


# QuaC

🦆🦆 Don't duck that QC thingy 🦆🦆


!!! Note

In the past life, QuaC repo used to live at [UAB
Gitlab](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/pipelines/quac). It was
migrated to Github in Jan 2023, and the Gitlab version has been archived.
> **_NOTE:_** In a past life, QuaC used a different remote Git management provider, [UAB
> Gitlab](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/public/quac). It was migrated to
> Github in Jan 2023, and the Gitlab version has been archived.

## What is QuaC?
@@ -20,68 +22,37 @@ In summary, QuaC performs the following:
- Runs several QC tools using `BAM` and `VCF` files as input. At our center CGDS, these files are produced as part of
the [small variant caller
pipeline](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/pipelines/small_variant_caller_pipeline).
- Using [QuaC-Watch](./quac_watch.md) tool, it performs QC checkup based on the expected thresholds for certain QC metrics and summarizes
- Using [QuaC-Watch](./docs/quac_watch.md) tool, it performs QC checkup based on the expected thresholds for certain QC metrics and summarizes
the results for easier human consumption
- Aggregates QC output as well as QuaC-Watch output using MultiQC, both at the sample level and project level.
- Optionally, above mentioned QuaC-Watch and QC aggregation steps can accept pre-run results from few QC tools (fastqc,
fastq-screen, picard's markduplicates) when run with flag `--include_prior_qc`.


!!! note "CGDS users only"

* At CGDS, BAM and VCF files produced by the
[small variant caller pipeline](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/pipelines/small_variant_caller_pipeline)
are used as input to QuaC.
* Tools fastqc, fastq-screen, and picard's markduplicates, whose output are accepted by QuaC when used with
flag `--include_prior_qc`, are produced by this small_variant_caller_pipeline.

!!! info

QuaC is built to use with Human WGS/WES data. If you would like to use it with non-human data, please modify the pipeline as needed -- especially the thresholds used in QuaC-Watch configs.


## QC tools

### Tools run by QuaC

QuaC quacks using the tools listed below:
> **_NOTE:_** QuaC is built to use with Human WGS/WES data. If you would like to use it with non-human data, please
> modify the pipeline as needed -- especially the thresholds used in QuaC-Watch configs.
| Tool | Use | QC Type |
| -------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | ---------------------------------------- |
| [Qualimap](http://qualimap.conesalab.org/) | Summarizes several alignment metrics using BAM file | BAM quality |
| [Picard-CollectMultipleMetrics](https://broadinstitute.github.io/picard/command-line-overview.html#CollectMultipleMetrics) | Summarizes alignment metrics from BAM file using several modules | BAM quality |
| [Picard-CollectWgsMetrics](https://broadinstitute.github.io/picard/command-line-overview.html#CollectWgsMetrics) | Collects metrics about coverage and performance using BAM file | BAM quality |
| [mosdepth](https://github.com/brentp/mosdepth) | Fast alignment depth calculation using BAM file | BAM quality |
| [indexcov](https://github.com/brentp/goleft/tree/master/indexcov) | Estimate coverage from BAM index for GS <br />(*Skipped in exome mode*) | BAM quality |
| [covviz](https://github.com/brwnj/covviz) | Identifies large, coverage-based anomalies for GS using Indexcov output <br />(*Skipped in exome mode*) | BAM quality |
| [bcftools stats](https://samtools.github.io/bcftools/bcftools.html#stats) | Summarizes VCF file stats | VCF quality |
| [verifybamid](https://github.com/Griffan/VerifyBamID) | Estimates within-species (i.e., cross-sample) contamination using BAM file | Within-species contamination |
| [somalier](https://github.com/brentp/somalier) | Estimation of sex, ancestry and relatedness using BAM file | Sex, ancestry and relatedness estimation |

## Documentation

### Optional QC output consumed by QuaC
Full documentation, including installation and how to run QuaC, is available at https://quac.readthedocs.io.

Optionally QuaC can also utilize QC results produced by the tools listed below when run with flag `--include_prior_qc`.

## Repo owner

| Tool | Use | QC Type |
| ------------------------------------------------------------------------------------------------------------ | ------------------------------------------------- | ------------- |
| [fastqc](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) | Performs QC on raw sequence reads data (FASTQ) | FASTQ quality |
| [FastQ Screen](https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/) | Screens FASTQ for other-species contamination | FASTQ quality |
| [Picard's MarkDuplicates](https://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicates) | Determines level of read duplication on BAM files | BAM quality |
* **Mana**valan Gajapathy


!!! note "CGDS users only"
## License

* At CGDS, these optional tools were run by our small_variant_caller_pipeline.
[GNU GPLv3](./LICENSE)


## Documentation

Full documentation, including installation and how to run QuaC, is available at https://quac.readthedocs.io.
## Contributing

See [here](./docs/CONTRIBUTING.md) for contributing guidelines.

## Repo owner

* **Mana**valan Gajapathy
## Changelog

See [here](./docs/Changelog.md)
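
The README hunks above replace MkDocs-style `!!! note` admonitions with `> **_NOTE:_**` blockquotes, a form that plain Markdown renderers can display. A minimal sketch of that kind of conversion follows, assuming admonition bodies are indented by four spaces. The function name `admonitions_to_blockquotes` is hypothetical, and nothing in this commit indicates the author used a script rather than editing by hand.

```python
import re

# Matches an admonition opener such as `!!! note` or `!!! note "CGDS users only"`.
ADMONITION = re.compile(r'^!!!\s+(\w+)(?:\s+"([^"]*)")?\s*$')


def admonitions_to_blockquotes(markdown: str) -> str:
    """Rewrite `!!! type` blocks (4-space indented body) as `> **_LABEL:_**` blockquotes.

    Hypothetical helper for illustration; not part of QuaC.
    """
    lines = markdown.splitlines()
    out = []
    i = 0
    while i < len(lines):
        match = ADMONITION.match(lines[i])
        if match is None:
            out.append(lines[i])
            i += 1
            continue
        # Prefer the quoted title; fall back to the upper-cased admonition type.
        label = match.group(2) or match.group(1).upper()
        i += 1
        body = []
        # The body is every following line that is blank or indented by 4 spaces.
        while i < len(lines) and (lines[i].startswith("    ") or not lines[i].strip()):
            body.append(lines[i][4:] if lines[i].startswith("    ") else "")
            i += 1
        # Blank lines after the body separate it from the next block; keep their count.
        trailing_blanks = 0
        while body and not body[-1]:
            body.pop()
            trailing_blanks += 1
        while body and not body[0]:
            body.pop(0)
        first = body[0] if body else ""
        out.append(f"> **_{label}:_** {first}".rstrip())
        out.extend(f"> {text}".rstrip() for text in body[1:])
        out.extend([""] * trailing_blanks)
    return "\n".join(out)
```

Keeping the conversion line-based leaves all non-admonition Markdown untouched, so tables and links pass through unchanged.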
2 changes: 1 addition & 1 deletion docs/CONTRIBUTING.md
@@ -1,6 +1,6 @@
# Contributing Guidelines

:grin: :tada: Thank you for taking the time to contribute! :grin: :tada:
😁 🎉 Thank you for taking the time to contribute! 😁 🎉

The following is a set of guidelines for contributing to QuaC.

5 changes: 5 additions & 0 deletions docs/Changelog.md
@@ -12,6 +12,11 @@ YYYY-MM-DD John Doe
```
---

2023-03-01 Manavalan Gajapathy

* Decouples readme.md from readthedocs setup


2023-02-28 Manavalan Gajapathy

* Adds license
78 changes: 77 additions & 1 deletion docs/index.md
Original file line number Diff line number Diff line change
@@ -1 +1,77 @@
{!README.md!}
# QuaC

🦆🦆 Don't duck that QC thingy 🦆🦆


!!! Note

In a past life, QuaC used a different remote Git management provider, [UAB
Gitlab](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/public/quac). It was
migrated to Github in Jan 2023, and the Gitlab version has been archived.


## What is QuaC?

QuaC is a snakemake-based pipeline that runs several QC tools for WGS/WES samples and then summarizes their results
using pre-defined, configurable QC thresholds.

In summary, QuaC performs the following:

- Runs several QC tools using `BAM` and `VCF` files as input. At our center CGDS, these files are produced as part of
the [small variant caller
pipeline](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/pipelines/small_variant_caller_pipeline).
- Using [QuaC-Watch](./quac_watch.md) tool, it performs QC checkup based on the expected thresholds for certain QC metrics and summarizes
the results for easier human consumption
- Aggregates QC output as well as QuaC-Watch output using MultiQC, both at the sample level and project level.
- Optionally, above mentioned QuaC-Watch and QC aggregation steps can accept pre-run results from few QC tools (fastqc,
fastq-screen, picard's markduplicates) when run with flag `--include_prior_qc`.
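
To make the threshold-based checkup described above concrete, here is a minimal sketch of comparing QC metrics against configured bounds. The metric names, threshold values, and the `check_metrics` function are all hypothetical; this does not reproduce QuaC-Watch's actual implementation or config format.

```python
from typing import Dict, Optional, Tuple

Bounds = Tuple[Optional[float], Optional[float]]

# Hypothetical thresholds: metric name -> (minimum, maximum); None means unbounded.
THRESHOLDS: Dict[str, Bounds] = {
    "mean_coverage": (30.0, None),
    "contamination_fraction": (None, 0.02),
}


def check_metrics(
    metrics: Dict[str, float],
    thresholds: Dict[str, Bounds] = THRESHOLDS,
) -> Dict[str, str]:
    """Label each thresholded metric as 'pass', 'fail', or 'missing'."""
    results = {}
    for name, (low, high) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            # The QC tool did not report this metric for the sample.
            results[name] = "missing"
        elif (low is not None and value < low) or (high is not None and value > high):
            results[name] = "fail"
        else:
            results[name] = "pass"
    return results


# Illustrative values only, not real sample data.
summary = check_metrics({"mean_coverage": 35.2, "contamination_fraction": 0.05})
```

Treating an absent metric as `missing` rather than `fail` keeps a tool that did not run distinguishable from a sample that ran and fell outside the bounds.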


!!! note "CGDS users only"

* At CGDS, BAM and VCF files produced by the
[small variant caller pipeline](https://gitlab.rc.uab.edu/center-for-computational-genomics-and-data-science/sciops/pipelines/small_variant_caller_pipeline)
are used as input to QuaC.
* Tools fastqc, fastq-screen, and picard's markduplicates, whose output are accepted by QuaC when used with
flag `--include_prior_qc`, are produced by this small_variant_caller_pipeline.

!!! info

QuaC is built to use with Human WGS/WES data. If you would like to use it with non-human data, please modify the pipeline as needed -- especially the thresholds used in QuaC-Watch configs.


## QC tools

### Tools run by QuaC

QuaC quacks using the tools listed below:

| Tool | Use | QC Type |
| -------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | ---------------------------------------- |
| [Qualimap](http://qualimap.conesalab.org/) | Summarizes several alignment metrics using BAM file | BAM quality |
| [Picard-CollectMultipleMetrics](https://broadinstitute.github.io/picard/command-line-overview.html#CollectMultipleMetrics) | Summarizes alignment metrics from BAM file using several modules | BAM quality |
| [Picard-CollectWgsMetrics](https://broadinstitute.github.io/picard/command-line-overview.html#CollectWgsMetrics) | Collects metrics about coverage and performance using BAM file | BAM quality |
| [mosdepth](https://github.com/brentp/mosdepth) | Fast alignment depth calculation using BAM file | BAM quality |
| [indexcov](https://github.com/brentp/goleft/tree/master/indexcov) | Estimate coverage from BAM index for GS <br />(*Skipped in exome mode*) | BAM quality |
| [covviz](https://github.com/brwnj/covviz) | Identifies large, coverage-based anomalies for GS using Indexcov output <br />(*Skipped in exome mode*) | BAM quality |
| [bcftools stats](https://samtools.github.io/bcftools/bcftools.html#stats) | Summarizes VCF file stats | VCF quality |
| [verifybamid](https://github.com/Griffan/VerifyBamID) | Estimates within-species (i.e., cross-sample) contamination using BAM file | Within-species contamination |
| [somalier](https://github.com/brentp/somalier) | Estimation of sex, ancestry and relatedness using BAM file | Sex, ancestry and relatedness estimation |


### Optional QC output consumed by QuaC

Optionally QuaC can also utilize QC results produced by the tools listed below when run with flag `--include_prior_qc`.


| Tool | Use | QC Type |
| ------------------------------------------------------------------------------------------------------------ | ------------------------------------------------- | ------------- |
| [fastqc](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) | Performs QC on raw sequence reads data (FASTQ) | FASTQ quality |
| [FastQ Screen](https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/) | Screens FASTQ for other-species contamination | FASTQ quality |
| [Picard's MarkDuplicates](https://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicates) | Determines level of read duplication on BAM files | BAM quality |


!!! note "CGDS users only"

* At CGDS, these optional tools were run by our small_variant_caller_pipeline.

3 changes: 3 additions & 0 deletions docs/input_output.md
@@ -2,10 +2,13 @@

## Input

<!-- markdown-link-check-disable -->

Samples belonging to a project are provided as input via `--pedigree` to QuaC in [pedigree file
format](https://gatk.broadinstitute.org/hc/en-us/articles/360035531972-PED-Pedigree-format). Only the samples that are
supplied in pedigree file will be processed by QuaC and all of these samples must belong to the same project.

<!-- markdown-link-check-enable -->
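
As an illustration of the pedigree input, the sketch below lists the sample IDs found in a PED file. It assumes the standard six whitespace-separated columns (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) and skips lines starting with `#`. The function name `read_sample_ids` and the example rows are hypothetical, and QuaC's own parsing may differ.

```python
import io
from typing import List, TextIO

# Standard PED columns, in order.
PED_COLUMNS = (
    "family_id",
    "individual_id",
    "paternal_id",
    "maternal_id",
    "sex",
    "phenotype",
)


def read_sample_ids(handle: TextIO) -> List[str]:
    """Return the individual IDs listed in a PED file, in file order."""
    sample_ids = []
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        # Assumption: blank lines and '#' comment/header lines carry no samples.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < len(PED_COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(PED_COLUMNS)} columns, "
                f"found {len(fields)}"
            )
        sample_ids.append(fields[1])
    return sample_ids


# Made-up trio for demonstration.
example = io.StringIO(
    "FAM1\tchild\tfather\tmother\t1\t2\n"
    "FAM1\tfather\t0\t0\t1\t1\n"
    "FAM1\tmother\t0\t0\t2\t1\n"
)
samples = read_sample_ids(example)
```

Listing the IDs up front is a quick way to confirm which samples a run would cover, since only samples present in the pedigree file are processed.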

!!! note "CGDS users only"

