feat: NGS-bits SampleAncestry #3502
* perf: update bio/bcftools/index/environment.yaml.
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: snakedeploy-bot[bot] <115615832+snakedeploy-bot[bot]@users.noreply.github.com>
* Add autobump action
* fix paths
* dbg
* dbg branch
* add checkout
* dbg
* trigger rerun
* entity regex and add label
* dbg
* Update autobump.yml
* Update autobump.yml
## Walkthrough
The changes in this pull request introduce several new files related to the `NGS-bits SampleAncestry` tool. A Conda environment specification is provided through `environment.linux-64.pin.txt` and `environment.yaml`, detailing package dependencies and channels. The `meta.yaml` file contains metadata about the tool's functionality and requirements. Additionally, a new Snakefile rule for testing and a sample VCF file for input are created, along with a wrapper script to facilitate command execution. Test functions for the workflow are also added to enhance testing coverage.
## Changes
| File Path | Change Summary |
|---------------------------------------------------|-----------------------------------------------------------------------------------------------------|
| `bio/ngsbits/sampleancestry/environment.linux-64.pin.txt` | New file created for Conda environment specifications, listing package URLs and hashes. |
| `bio/ngsbits/sampleancestry/environment.yaml` | New file created specifying Conda channels and dependencies, including `ngs-bits` version `2024_11`. |
| `bio/ngsbits/sampleancestry/meta.yaml` | New metadata file created detailing tool functionality, authorship, input/output specifications, and notes. |
| `bio/ngsbits/sampleancestry/test/Snakefile` | New rule `test_ngsbits_sampleancestry` added for processing VCF files and generating output TSV. |
| `bio/ngsbits/sampleancestry/test/sample.vcf` | New VCF file created with metadata and variant entries for testing purposes. |
| `bio/ngsbits/sampleancestry/wrapper.py` | New Snakemake wrapper script created for executing `SampleAncestry` command with logging and parameters. |
| `test_wrappers.py` | New test functions `test_ngsbits_sampleancestry` and `test_vg_autoindex_map` added for workflow validation. |
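
To make the role of the wrapper script in the table above more concrete, here is a minimal sketch of how such a command line might be assembled. It is not the code from `wrapper.py`: the helper name `build_sampleancestry_command` is hypothetical, the `-in` flag is an assumption (only `-out` and the `SampleAncestry` command name appear in this review), and the real wrapper relies on `snakemake.shell` and the injected `snakemake` object rather than a plain function.

```python
import shlex
from typing import Sequence


def build_sampleancestry_command(
    inputs: Sequence[str], output: str, extra: str = "", log: str = ""
) -> str:
    """Assemble a SampleAncestry call (hypothetical helper for illustration)."""
    parts = ["SampleAncestry"]
    if extra:
        # Extra parameters are passed through verbatim, as wrappers usually do.
        parts.append(extra)
    # Assumption: the tool reads its VCF input(s) via `-in`.
    parts.append("-in " + " ".join(shlex.quote(path) for path in inputs))
    # `-out` is the flag quoted in the review of wrapper.py.
    parts.append("-out " + shlex.quote(output))
    if log:
        # Stand-in for the redirection string produced by log_fmt_shell().
        parts.append(log)
    return " ".join(parts)


command = build_sampleancestry_command(
    ["sample.vcf"], "ancestry.tsv", extra="-min_snps 4"
)
print(command)
```

Keeping the command assembly in a pure function like this makes it easy to check without running the tool; the file names and the `-min_snps 4` value are placeholders only.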
## Possibly related PRs
- **#3135**: Introduces a new `environment.linux-64.pin.txt` Conda pin file for Deeptools, similar to the pin file added in this PR.
- **#3162**: This PR also introduces a new `environment.linux-64.pin.txt` file for the Bwameth tool, aligning with the main PR's focus on creating a Conda environment specification.
- **#3165**: The introduction of a new `environment.linux-64.pin.txt` file for the Nanosim tool parallels the main PR's addition of a similar file.
- **#3302**: Similar to the main PR, this PR introduces a new `environment.linux-64.pin.txt` file for the vg giraffe tool, indicating a consistent approach to managing Conda environments across different projects.
- **#3372**: This PR updates the `meta.yaml` and Snakefile for the Merqury tool, reflecting ongoing enhancements in the bioinformatics toolset.
- **#3452**: The addition of a new `environment.linux-64.pin.txt` file for the vg autoindex giraffe tool aligns with the main PR's focus on creating environment specifications.
- **#3496**: The introduction of a new `environment.linux-64.pin.txt` file for the MTNucRatioCalculator tool parallels the main PR's focus on environment specifications.
- **#3497**: The addition of a new `environment.linux-64.pin.txt` file for the Sex.detERRmine tool aligns with the main PR's focus on creating environment specifications.
## Suggested reviewers
- johanneskoester
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (2)
bio/ngsbits/sampleancestry/test/Snakefile (1)
1-13: Add rule documentation
Consider adding a docstring to describe:
- Purpose of the rule
- Expected input format
- Output format and contents
- Parameter descriptions and valid ranges
 rule test_ngsbits_sampleancestry:
+    """
+    Analyze sample ancestry from VCF file(s) using NGS-bits SampleAncestry.
+
+    Input:
+        VCF file(s) containing variant information
+    Output:
+        TSV file containing ancestry analysis results
+    Parameters:
+        -min_snps: Minimum number of SNPs required (default: 4)
+        -build: Reference genome build (default: hg19)
+    """
     input:
bio/ngsbits/sampleancestry/wrapper.py (1)
10-16: Consider adding input validation
While the setup is correct, consider adding validation for required Snakemake inputs before executing the shell command. This would provide clearer error messages if required files are missing.
Example validation:
# Validate required inputs
if not snakemake.input:
    raise ValueError("No input file specified")
🧰 Tools
🪛 Ruff (0.8.0)
12-12: Undefined name `snakemake` (F821)
16-16: Undefined name `snakemake` (F821)
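
The input validation suggested in this comment could be fleshed out roughly as follows. This is only a sketch under stated assumptions: `check_inputs` is a hypothetical helper that takes a plain list of paths in place of `snakemake.input`, and the extra check for files missing on disk goes beyond the one-line example in the review.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def check_inputs(paths: Iterable[str]) -> List[str]:
    """Fail early with a clear message if inputs are absent (hypothetical helper)."""
    checked = list(paths)
    if not checked:
        # Same condition and message as the example in the review comment.
        raise ValueError("No input file specified")
    missing = [path for path in checked if not Path(path).is_file()]
    if missing:
        raise FileNotFoundError("Missing input file(s): " + ", ".join(missing))
    return checked


# Demonstration with a temporary stand-in for a VCF file.
with tempfile.TemporaryDirectory() as tmpdir:
    vcf = Path(tmpdir) / "sample.vcf"
    vcf.write_text("##fileformat=VCFv4.2\n")
    checked_inputs = check_inputs([str(vcf)])
    print(len(checked_inputs))
```

In an actual wrapper the list would come from `snakemake.input`. Snakemake itself normally refuses to run a rule whose inputs are missing, so a check like this mainly serves to produce a friendlier error message.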
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
- bio/ngsbits/sampleancestry/environment.linux-64.pin.txt (1 hunks)
- bio/ngsbits/sampleancestry/environment.yaml (1 hunks)
- bio/ngsbits/sampleancestry/meta.yaml (1 hunks)
- bio/ngsbits/sampleancestry/test/Snakefile (1 hunks)
- bio/ngsbits/sampleancestry/test/sample.vcf (1 hunks)
- bio/ngsbits/sampleancestry/wrapper.py (1 hunks)
- test_wrappers.py (2 hunks)
✅ Files skipped from review due to trivial changes (4)
- bio/ngsbits/sampleancestry/environment.linux-64.pin.txt
- bio/ngsbits/sampleancestry/environment.yaml
- bio/ngsbits/sampleancestry/meta.yaml
- bio/ngsbits/sampleancestry/test/sample.vcf
🧰 Additional context used
📓 Path-based instructions (2)
bio/ngsbits/sampleancestry/wrapper.py (2)
Pattern `**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.
Pattern `**/wrapper.py`: Do not complain about use of undefined variable called `snakemake`.
test_wrappers.py (1)
Pattern `**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.
🔇 Additional comments (3)
bio/ngsbits/sampleancestry/wrapper.py (1)
1-9: LGTM! Well-documented file header
The file header contains all necessary metadata and documentation.
test_wrappers.py (2)
3575-3587: LGTM: Test function follows project standards
The test function for NGS-bits SampleAncestry is well-implemented:
- Uses consistent naming convention
- Follows the established test pattern
- Includes all required snakemake parameters
- Has appropriate output file verification
5963-5970: LGTM: Good test separation for VG autoindex
The split into separate test functions for giraffe and map outputs improves test granularity and maintainability.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
bio/ngsbits/sampleancestry/wrapper.py (1)
21-21: Consider removing the `:q` modifier
Since Snakemake handles input/output path validation, the `:q` modifier on the output path might be unnecessary.
-    " -out {snakemake.output:q}"
+    " -out {snakemake.output}"
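
For context on what the `:q` modifier changes in practice, the sketch below compares a quoted and an unquoted `-out` argument. It assumes that `:q` applies POSIX shell quoting in the manner of Python's `shlex.quote`; the helper `render_out_flag` and the example paths are hypothetical and not part of the wrapper.

```python
import shlex


def render_out_flag(output: str, quoted: bool) -> str:
    """Render the `-out` fragment with or without shell quoting (illustrative only)."""
    value = shlex.quote(output) if quoted else output
    return f" -out {value}"


plain_path = "results/ancestry.tsv"
spaced_path = "my results/ancestry.tsv"

# For a path made only of safe characters, quoting changes nothing.
print(render_out_flag(plain_path, quoted=True))
print(render_out_flag(plain_path, quoted=False))

# For a path containing a space, the shell sees a different number of arguments.
print(shlex.split(render_out_flag(spaced_path, quoted=True)))
print(shlex.split(render_out_flag(spaced_path, quoted=False)))
```

Under that assumption, the two forms are equivalent for ordinary paths and differ only when a path contains whitespace or shell metacharacters.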
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- bio/ngsbits/sampleancestry/wrapper.py (1 hunks)
🧰 Additional context used
📓 Learnings (1)
bio/ngsbits/sampleancestry/wrapper.py (2)
Learnt from: tdayris
PR: snakemake/snakemake-wrappers#3502
File: bio/ngsbits/sampleancestry/wrapper.py:18-23
Timestamp: 2024-11-26T14:59:03.678Z
Learning: In Snakemake wrapper scripts, Snakemake validates input and output paths, so explicit shell quoting is not necessary.
Learnt from: tdayris
PR: snakemake/snakemake-wrappers#3502
File: bio/ngsbits/sampleancestry/wrapper.py:1-23
Timestamp: 2024-11-26T15:01:13.202Z
Learning: The NGS-bits SampleAncestry wrapper in `bio/ngsbits/sampleancestry/` includes a test Snakefile, sample VCF files, and tests available in the `test/` directory.
🔇 Additional comments (3)
bio/ngsbits/sampleancestry/wrapper.py (3)
1-9: LGTM: Well-structured file header
The file header contains all necessary metadata including encoding, purpose, and author information.
10-17: LGTM: Proper setup of logging and parameters
The implementation correctly:
- Sets up logging to capture stderr
- Handles optional extra parameters
- Note: The undefined `snakemake` warnings from static analysis can be safely ignored as this is a Snakemake wrapper
18-23: 🛠️ Refactor suggestion
Add command availability check
Consider adding a check to verify that the `SampleAncestry` command is available before execution.
from snakemake.shell import shell
+from shutil import which
+
+if which("SampleAncestry") is None:
+ raise RuntimeError(
+ "SampleAncestry command not found. Please ensure NGS-bits is installed correctly."
+ )
log = snakemake.log_fmt_shell(
Likely invalid or redundant comment.
🤖 I have created a release \*beep\* \*boop\*

---

## [5.4.0](https://www.github.com/snakemake/snakemake-wrappers/compare/v5.3.0...v5.4.0) (2024-12-06)

### Features

* NGS-bits SampleAncestry ([#3502](https://www.github.com/snakemake/snakemake-wrappers/issues/3502)) ([8600d44](https://www.github.com/snakemake/snakemake-wrappers/commit/8600d44e79ae4dafa181d3b06eed6e3db0c7a2df))
* NGS-bits SampleSimilarity ([#3500](https://www.github.com/snakemake/snakemake-wrappers/issues/3500)) ([710597c](https://www.github.com/snakemake/snakemake-wrappers/commit/710597cdc4e7f518d1fda2ec246bb6a7e0e29ba9))
* NGSCheckMate make pattern ([#3499](https://www.github.com/snakemake/snakemake-wrappers/issues/3499)) ([3b96cc1](https://www.github.com/snakemake/snakemake-wrappers/commit/3b96cc18b5a7ce8643fc3f8a492333d8f339e4c5))
* Sex.DetERRmine ([#3497](https://www.github.com/snakemake/snakemake-wrappers/issues/3497)) ([3919f2e](https://www.github.com/snakemake/snakemake-wrappers/commit/3919f2e4b6fae381cf92c921e6f17086819de345))

### Performance Improvements

* autobump bio/bbtools ([#3507](https://www.github.com/snakemake/snakemake-wrappers/issues/3507)) ([19d027d](https://www.github.com/snakemake/snakemake-wrappers/commit/19d027d176bfa5da7ec2d75b9222547b0fd2b919))
* autobump bio/busco ([#3506](https://www.github.com/snakemake/snakemake-wrappers/issues/3506)) ([aad4b56](https://www.github.com/snakemake/snakemake-wrappers/commit/aad4b56df0ca427a0a12dba67bcf0edef51d545b))
* autobump bio/busco ([#3519](https://www.github.com/snakemake/snakemake-wrappers/issues/3519)) ([6af2e11](https://www.github.com/snakemake/snakemake-wrappers/commit/6af2e11535a6407a0a60eaa18f7e15548e4a4a01))
* autobump bio/encode_fastq_downloader ([#3521](https://www.github.com/snakemake/snakemake-wrappers/issues/3521)) ([cbf06d2](https://www.github.com/snakemake/snakemake-wrappers/commit/cbf06d227d2cece579b1bdff413d09036e2d976f))
* autobump bio/freebayes ([#3509](https://www.github.com/snakemake/snakemake-wrappers/issues/3509)) ([12b8b3c](https://www.github.com/snakemake/snakemake-wrappers/commit/12b8b3ce9d5be65a2165b4ca3e0403935b950237))
* autobump bio/gatk3/baserecalibrator ([#3523](https://www.github.com/snakemake/snakemake-wrappers/issues/3523)) ([7a7518e](https://www.github.com/snakemake/snakemake-wrappers/commit/7a7518e63e0c6eac8d7cb935808b8692e4e688ff))
* autobump bio/gatk3/indelrealigner ([#3525](https://www.github.com/snakemake/snakemake-wrappers/issues/3525)) ([a0d913c](https://www.github.com/snakemake/snakemake-wrappers/commit/a0d913ce81ceb45aeff0eeee8e1e92e63bda786c))
* autobump bio/gatk3/printreads ([#3524](https://www.github.com/snakemake/snakemake-wrappers/issues/3524)) ([67af9a6](https://www.github.com/snakemake/snakemake-wrappers/commit/67af9a6899872a5e2e8cabc58572aa31d51a43cc))
* autobump bio/gatk3/realignertargetcreator ([#3522](https://www.github.com/snakemake/snakemake-wrappers/issues/3522)) ([5f8ffe7](https://www.github.com/snakemake/snakemake-wrappers/commit/5f8ffe7349ab24d55d45e1811519b2b9e9985068))
* autobump bio/hifiasm ([#3510](https://www.github.com/snakemake/snakemake-wrappers/issues/3510)) ([2b1b9f2](https://www.github.com/snakemake/snakemake-wrappers/commit/2b1b9f265231a3f3bc121dd9d4f34111b15d4486))
* autobump bio/mapdamage2 ([#3526](https://www.github.com/snakemake/snakemake-wrappers/issues/3526)) ([92da252](https://www.github.com/snakemake/snakemake-wrappers/commit/92da252bcafd05a0187b635b1938593aa4268c3b))
* autobump bio/mosdepth ([#3511](https://www.github.com/snakemake/snakemake-wrappers/issues/3511)) ([762b273](https://www.github.com/snakemake/snakemake-wrappers/commit/762b273800120519ffd2bc2f670ae93be6187cac))
* autobump bio/mtnucratio ([#3512](https://www.github.com/snakemake/snakemake-wrappers/issues/3512)) ([7f6a3b0](https://www.github.com/snakemake/snakemake-wrappers/commit/7f6a3b07cc2bae16fb30c097e783b70e87ad58f1))
* autobump bio/ngsbits/sampleancestry ([#3527](https://www.github.com/snakemake/snakemake-wrappers/issues/3527)) ([2abf38c](https://www.github.com/snakemake/snakemake-wrappers/commit/2abf38c30e541dd45563ee7ec3959ccf43802fab))
* autobump bio/ngsbits/samplesimilarity ([#3529](https://www.github.com/snakemake/snakemake-wrappers/issues/3529)) ([c91ce10](https://www.github.com/snakemake/snakemake-wrappers/commit/c91ce1075792f231d833a89298e88353c96973b1))
* autobump bio/ngscheckmate/makesnvpattern ([#3528](https://www.github.com/snakemake/snakemake-wrappers/issues/3528)) ([ff9a81d](https://www.github.com/snakemake/snakemake-wrappers/commit/ff9a81d923b8efc807f0054400d03c6277777cb8))
* autobump bio/reference/ensembl-mysql-table ([#3513](https://www.github.com/snakemake/snakemake-wrappers/issues/3513)) ([6b5c545](https://www.github.com/snakemake/snakemake-wrappers/commit/6b5c5454e86cfd393e5fa55b86566e60ef43dd5c))
* autobump bio/sexdeterrmine ([#3514](https://www.github.com/snakemake/snakemake-wrappers/issues/3514)) ([2b18309](https://www.github.com/snakemake/snakemake-wrappers/commit/2b183092fd31225462490d43df5916e678ea5f83))
* autobump bio/spades/metaspades ([#3530](https://www.github.com/snakemake/snakemake-wrappers/issues/3530)) ([070b9b6](https://www.github.com/snakemake/snakemake-wrappers/commit/070b9b62af3cc79f7a4fe1500963892631f8d752))
* autobump bio/varlociraptor/call-variants ([#3533](https://www.github.com/snakemake/snakemake-wrappers/issues/3533)) ([8c563f1](https://www.github.com/snakemake/snakemake-wrappers/commit/8c563f18f02f3f06760f2f25ad85383f9d776b2d))
* autobump bio/varlociraptor/control-fdr ([#3532](https://www.github.com/snakemake/snakemake-wrappers/issues/3532)) ([ce7d1b0](https://www.github.com/snakemake/snakemake-wrappers/commit/ce7d1b00bdf6a856ffe5ed6bac7258ca970cde4a))
* autobump bio/varlociraptor/estimate-alignment-properties ([#3531](https://www.github.com/snakemake/snakemake-wrappers/issues/3531)) ([0b5ac04](https://www.github.com/snakemake/snakemake-wrappers/commit/0b5ac04792bcd54984ea6b0e6af41efa33fba126))
* autobump bio/varlociraptor/preprocess-variants ([#3534](https://www.github.com/snakemake/snakemake-wrappers/issues/3534)) ([56a8933](https://www.github.com/snakemake/snakemake-wrappers/commit/56a8933de936e20e0068bd1d8cb6bbea3826f655))
* autobump bio/vep/annotate ([#3515](https://www.github.com/snakemake/snakemake-wrappers/issues/3515)) ([2609900](https://www.github.com/snakemake/snakemake-wrappers/commit/26099008485dbf5a8054f3284a1892dd1245ac8a))
* autobump bio/vep/cache ([#3516](https://www.github.com/snakemake/snakemake-wrappers/issues/3516)) ([f46427c](https://www.github.com/snakemake/snakemake-wrappers/commit/f46427c7951ad1a2ccd142b74698aac3009b2c66))
* autobump bio/vep/plugins ([#3535](https://www.github.com/snakemake/snakemake-wrappers/issues/3535)) ([9a6ccc3](https://www.github.com/snakemake/snakemake-wrappers/commit/9a6ccc34ce5db38c419076d7f707321bf69357dc))
* autobump bio/vg/giraffe ([#3517](https://www.github.com/snakemake/snakemake-wrappers/issues/3517)) ([6fffdd6](https://www.github.com/snakemake/snakemake-wrappers/commit/6fffdd6caec48f099b6140dd113a295f7202fa63))
* autobump utils/csvtk ([#3508](https://www.github.com/snakemake/snakemake-wrappers/issues/3508)) ([41e8545](https://www.github.com/snakemake/snakemake-wrappers/commit/41e8545b5663c91d6b13b5acd2c126a6b42a9a92))
* autobump utils/csvtk ([#3520](https://www.github.com/snakemake/snakemake-wrappers/issues/3520)) ([d4a52e5](https://www.github.com/snakemake/snakemake-wrappers/commit/d4a52e5e900aa16cad66ebbea1b22fb27d3709ea))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR adds a second tool from ngs-bits.

QC

- […] `snakemake-wrappers`.

While the contributions guidelines are more extensive, please particularly ensure that:

- `test.py` was updated to call any added or updated example rules in a `Snakefile`
- `input:` and `output:` file paths in the rules can be chosen arbitrarily
- […] `input:` or `output:`)
- […] `tempfile.gettempdir()` points to […]
- `meta.yaml` contains a link to the documentation of the respective tool or command under `url:`
Summary by CodeRabbit

New Features
- […] `environment.linux-64.pin.txt` and `environment.yaml` files for easy environment setup on Linux.
- […] `NGS-bits SampleAncestry`, to estimate sample ancestry based on genetic variants.
- […] `SampleAncestry` command.

Bug Fixes
- […] `SampleAncestry` and `autoindex` workflows.

Documentation
- […] `NGS-bits SampleAncestry` tool, detailing its functionality and requirements.