feat: NGS-bits SampleAncestry (#3502)
This PR adds a second tool from
[ngs-bits](https://github.com/imgag/ngs-bits/blob/master/doc/tools/SampleAncestry/index.md).

### QC

* [X] I confirm that I have followed the [documentation for contributing
to
`snakemake-wrappers`](https://snakemake-wrappers.readthedocs.io/en/stable/contributing.html).

While the contribution guidelines are more extensive, please
particularly ensure that:
* [X] `test.py` was updated to call any added or updated example rules
in a `Snakefile`
* [X] `input:` and `output:` file paths in the rules can be chosen
arbitrarily
* [X] wherever possible, command line arguments are inferred and set
automatically (e.g. based on file extensions in `input:` or `output:`)
* [X] temporary files are either written to a unique hidden folder in
the working directory, or (better) stored where the Python function
`tempfile.gettempdir()` points to
* [X] the `meta.yaml` contains a link to the documentation of the
respective tool or command under `url:`
* [X] conda environments use a minimal amount of channels and packages,
in recommended ordering


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Introduced `environment.linux-64.pin.txt` and `environment.yaml` files
    for easy environment setup on Linux.
  - Added a new tool, `NGS-bits SampleAncestry`, to estimate sample
    ancestry based on genetic variants.
  - Implemented a Snakemake wrapper for executing the `SampleAncestry`
    command.
  - Created a sample VCF file for testing the ancestry estimation
    functionality.

- **Bug Fixes**
  - Enhanced test coverage with new test functions for `SampleAncestry`
    and `autoindex` workflows.

- **Documentation**
  - Added comprehensive metadata for the `NGS-bits SampleAncestry` tool,
    detailing its functionality and requirements.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: tdayris <tdayris@gustaveroussy.fr>
Co-authored-by: tdayris <thibault.dayris@gustaveroussy.fr>
Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: snakedeploy-bot[bot] <115615832+snakedeploy-bot[bot]@users.noreply.github.com>
Co-authored-by: Felix Mölder <felix.moelder@uni-due.de>
Co-authored-by: Christopher Schröder <christopher.schroeder@tu-dortmund.de>
Co-authored-by: Filipe G. Vieira <1151762+fgvieira@users.noreply.github.com>
9 people authored Dec 2, 2024
1 parent f46427c commit 8600d44
Showing 7 changed files with 248 additions and 0 deletions.
166 changes: 166 additions & 0 deletions bio/ngsbits/sampleancestry/environment.linux-64.pin.txt

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions bio/ngsbits/sampleancestry/environment.yaml
@@ -0,0 +1,6 @@
channels:
  - conda-forge
  - bioconda
  - nodefaults
dependencies:
  - ngs-bits=2024_11
13 changes: 13 additions & 0 deletions bio/ngsbits/sampleancestry/meta.yaml
@@ -0,0 +1,13 @@
name: NGS-bits SampleAncestry
url: https://github.com/imgag/ngs-bits/blob/master/doc/tools/SampleAncestry/index.md
description: Estimates the ancestry of a sample based on variants.
authors:
  - Thibault Dayris
input:
  - Path to one or multiple VCF file(s).
output:
  - Path to results table (TSV)
params:
  - extra: Optional parameters besides IO
notes: |
  To estimate ancestry, the input VCF file must contain enough variants, and these variants must overlap the known human variants used for ancestry estimation.
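The note above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of this commit: `count_variant_records` and `has_enough_variants` are hypothetical helper names, and the check only counts data lines in a VCF. It cannot tell whether those variants overlap the SNP set the tool relies on, so passing it does not guarantee that `SampleAncestry` will accept the file. The threshold mirrors the `-min_snps` value passed through `extra` in the test Snakefile.

```python
import io


def count_variant_records(lines):
    """Count VCF data lines, skipping header lines ('#...') and blank lines."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


def has_enough_variants(lines, min_snps):
    """Return True if the VCF stream holds at least `min_snps` data lines.

    Assumption: this is only a rough lower bound. Variants that the tool
    does not recognise still count here.
    """
    return count_variant_records(lines) >= min_snps


# Illustrative three-record VCF; INFO values are placeholders.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.0\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t69270\t.\tA\tG\t.\t.\t.\n"
    "chr1\t69897\t.\tT\tC\t.\t.\t.\n"
    "chr1\t325155\t.\tC\tA\t.\t.\t.\n"
)

if __name__ == "__main__":
    print(has_enough_variants(io.StringIO(EXAMPLE_VCF), min_snps=4))
```

For a file on disk, the same functions accept an open text handle, for example `with open("sample.vcf") as fh: has_enough_variants(fh, 4)`.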
13 changes: 13 additions & 0 deletions bio/ngsbits/sampleancestry/test/Snakefile
@@ -0,0 +1,13 @@
rule test_ngsbits_sampleancestry:
    input:
        # Either a single VCF or a list of VCF files
        "sample.vcf",
    output:
        "ancestry.tsv",
    threads: 1
    log:
        "ancestry.log",
    params:
        extra="-min_snps 4 -build hg19",
    wrapper:
        "master/bio/ngsbits/sampleancestry"
11 changes: 11 additions & 0 deletions bio/ngsbits/sampleancestry/test/sample.vcf
@@ -0,0 +1,11 @@
##fileformat=VCFv4.0
##INFO=<ID=AF_AFR,Number=1,Type=String,Description="no description available">
##INFO=<ID=AF_EAS,Number=1,Type=String,Description="no description available">
##INFO=<ID=AF_EUR,Number=1,Type=String,Description="no description available">
##INFO=<ID=AF_SAS,Number=1,Type=String,Description="no description available">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1
chr1 69270 . A G . . AF_AFR=0.360;AF_EAS=0.998;AF_EUR=0.911;AF_SAS=0.902 GT 1/1
chr1 69897 . T C . . AF_AFR=0.312;AF_EAS=0.777;AF_EUR=0.844;AF_SAS=0.805 GT 0/1
chr1 325155 . C A . . AF_AFR=0.653;AF_EAS=0.971;AF_EUR=0.769;AF_SAS=0.497 GT 1/1
chr1 881627 . G A . . AF_AFR=0.130;AF_EAS=0.658;AF_EUR=0.636;AF_SAS=0.565 GT 0/1
chr1 914852 . G C . . AF_AFR=0.249;AF_EAS=0.722;AF_EUR=0.596;AF_SAS=0.622 GT 1/1
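To make the fixture easier to read, here is a small sketch of how the `AF_*` annotations in each record's INFO column can be pulled apart. The function names `parse_info` and `population_frequencies` are hypothetical and not part of the wrapper. Whether `SampleAncestry` itself reads these INFO fields from its input is not stated in this commit, so the sketch only illustrates what the test file contains.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; keys without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def population_frequencies(info):
    """Return {population: allele frequency} for every AF_<POP> key."""
    return {
        key[len("AF_"):]: float(value)
        for key, value in parse_info(info).items()
        if key.startswith("AF_") and value is not True
    }


if __name__ == "__main__":
    # INFO column of the first record in sample.vcf
    info = "AF_AFR=0.360;AF_EAS=0.998;AF_EUR=0.911;AF_SAS=0.902"
    print(population_frequencies(info))
```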
23 changes: 23 additions & 0 deletions bio/ngsbits/sampleancestry/wrapper.py
@@ -0,0 +1,23 @@
# coding: utf-8

"""Snakemake wrapper for NGS-bits SampleAncestry"""

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2024, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake.shell import shell

log = snakemake.log_fmt_shell(
    stdout=False,
    stderr=True,
)
extra = snakemake.params.get("extra", "")

shell(
    "SampleAncestry {extra}"
    " -in {snakemake.input}"
    " -out {snakemake.output:q}"
    " {log}"
)
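For readers who want to see what the wrapper's `shell(...)` call amounts to without running Snakemake, the sketch below assembles a comparable command line in plain Python. `build_command` is a hypothetical helper, not part of the wrapper. It assumes that `log_fmt_shell(stdout=False, stderr=True)` renders as a `2>` redirection to the log file, and it shell-quotes every path, which is a choice of this sketch.

```python
import shlex


def build_command(inputs, output, extra="", log=None):
    """Assemble a SampleAncestry command line as a single shell string.

    inputs: list of VCF paths; output: TSV path; extra: raw option string;
    log: optional file that receives stderr.
    """
    parts = ["SampleAncestry"]
    if extra:
        # Split the free-form option string into individual tokens.
        parts.extend(shlex.split(extra))
    parts.append("-in")
    parts.extend(inputs)
    parts.extend(["-out", output])
    command = shlex.join(parts)  # quotes tokens only where needed
    if log:
        command += f" 2> {shlex.quote(log)}"
    return command


if __name__ == "__main__":
    print(
        build_command(
            ["sample.vcf"],
            "ancestry.tsv",
            extra="-min_snps 4 -build hg19",
            log="ancestry.log",
        )
    )
    # prints: SampleAncestry -min_snps 4 -build hg19 -in sample.vcf -out ancestry.tsv 2> ancestry.log
```

With a list of several VCF files, the paths are appended one after another following `-in`, which matches how the rule's `input:` comment describes single or multiple inputs.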
16 changes: 16 additions & 0 deletions test_wrappers.py
@@ -3579,6 +3579,20 @@ def test_nanosim_metagenome(run):
    )


def test_ngsbits_sampleancestry(run):
    run(
        "bio/ngsbits/sampleancestry",
        [
            "snakemake",
            "--cores",
            "1",
            "--use-conda",
            "-F",
            "ancestry.tsv",
        ],
    )


def test_ngsderive(run):
    run(
        "bio/ngsderive",
@@ -5959,12 +5973,14 @@ def test_vg_autoindex_giraffe(run):
        ["snakemake", "--cores", "1", "resources/genome.dist", "--use-conda", "-F"],
    )


def test_vg_autoindex_map(run):
    run(
        "bio/vg/autoindex",
        ["snakemake", "--cores", "1", "resources/genome.xg", "--use-conda", "-F"],
    )


def test_vg_construct(run):
    run(
        "bio/vg/construct",