feat: add support for multiple input file subcommands in seqkit wrapper (subcommands "common" and "concat") #3142
Walkthrough
The changes introduce two new rules, `seqkit_common` and `seqkit_concat`.
@fgvieira From what I could see, these two commands were not yet supported, as multiple FASTA input files were not yet handled. I think this is the minimal amount of changes to accommodate them, but if you have any better suggestions, they are always welcome.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (4)
- bio/seqkit/test/Snakefile (1 hunks)
- bio/seqkit/test/data/b.fa (1 hunks)
- bio/seqkit/wrapper.py (2 hunks)
- test.py (1 hunks)
Files skipped from review due to trivial changes (1)
- bio/seqkit/test/data/b.fa
Additional context used
Path-based instructions (2)
bio/seqkit/wrapper.py (2)
Pattern `**/*.py`:
- Do not try to improve formatting.
- Do not suggest type annotations for functions that are defined inside of functions or methods.
- Do not suggest type annotation of the `self` argument of methods.
- Do not suggest type annotation of the `cls` argument of classmethods.
- Do not suggest return type annotation if a function or method does not contain a `return` statement.
Pattern `**/wrapper.py`:
- Do not complain about use of undefined variable called `snakemake`.
test.py (1)
Pattern `**/*.py`:
- Do not try to improve formatting.
- Do not suggest type annotations for functions that are defined inside of functions or methods.
- Do not suggest type annotation of the `self` argument of methods.
- Do not suggest type annotation of the `cls` argument of classmethods.
- Do not suggest return type annotation if a function or method does not contain a `return` statement.
Ruff
bio/seqkit/wrapper.py
- 12-12: Undefined name `snakemake` (F821)
- 13-13: Undefined name `snakemake` (F821)
- 15-15: Undefined name `snakemake` (F821)
Additional comments not posted (6)
bio/seqkit/wrapper.py (2)
11-15: Approved: Handling of multiple input files.
The logic for handling multiple input files based on the command (`concat` or `common`) is correctly implemented. This change allows the script to adapt to different requirements of the seqkit subcommands.
Tools
Ruff
- 12-12: Undefined name `snakemake` (F821)
- 13-13: Undefined name `snakemake` (F821)
- 15-15: Undefined name `snakemake` (F821)
47-47: Ensure correct command construction.
The final command string construction uses the dynamically defined `input` variable. Ensure that this variable is correctly integrated into the command string, especially in the context of different seqkit subcommands.
Verification successful
Correct Integration of `input` Variable in Command String
The `input` variable is correctly integrated into the command string within `bio/seqkit/wrapper.py`. It is constructed by joining multiple input files or using a single file, and it is appropriately included in the command string. The command construction is consistent with the expected usage for different seqkit subcommands.
- `input` is defined as `" ".join(snakemake.input)` or `snakemake.input[0]`.
- The command string includes `" {input}"`.
Scripts executed
The following scripts were executed for the analysis:
Script:
    #!/bin/bash
    # Description: Verify the correct integration of the `input` variable in the command string.
    # Test: Search for the command string construction. Expect: Correct usage of the `input` variable.
    rg --type python -A 5 $'seqkit {snakemake.params.command}'
Length of output: 183
Script:
    #!/bin/bash
    # Description: Verify the correct integration of the `input` variable in the command string.
    # Test: Search for the command string construction in the bio/seqkit/wrapper.py file.
    rg --type python -A 5 'seqkit \{snakemake.params.command\}' bio/seqkit/wrapper.py
Length of output: 114
Script:
    #!/bin/bash
    # Description: Verify the correct integration of the `input` variable in the command string.
    # Test: Search for the command string construction in the bio/seqkit/wrapper.py file.
    rg -A 5 'seqkit \{snakemake.params.command\}' bio/seqkit/wrapper.py
Length of output: 244
Script:
    #!/bin/bash
    # Description: Search for the definition and usage of the `input` variable in the bio/seqkit/wrapper.py file.
    # Test: Identify how the `input` variable is defined and used.
    rg 'input' bio/seqkit/wrapper.py
Length of output: 268
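The input handling described in this review comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the wrapper's actual code: `select_input`, `build_command`, and `MULTI_INPUT_COMMANDS` are hypothetical names, and plain arguments stand in for the `snakemake.input` and `snakemake.params.command` values that Snakemake injects at runtime. Only the two branches (`" ".join(...)` versus the first input file) and the `seqkit {command} ... {input}` shape come from the review text.

```python
# Hypothetical sketch of the multi-input logic discussed in the review.
# The real wrapper reads from the injected `snakemake` object; here the
# values are passed in explicitly so the logic can be run standalone.

# Assumption: these are the subcommands that take several input files.
MULTI_INPUT_COMMANDS = {"common", "concat"}


def select_input(input_files, command):
    """Return the input part of the seqkit command line.

    Multi-file subcommands get all inputs joined by spaces;
    every other subcommand gets only the first input file.
    """
    if command in MULTI_INPUT_COMMANDS:
        return " ".join(input_files)
    return input_files[0]


def build_command(input_files, command):
    """Assemble a minimal command string (extra options and output omitted)."""
    return f"seqkit {command} {select_input(input_files, command)}"


if __name__ == "__main__":
    print(build_command(["a.fa", "b.fa"], "concat"))
    print(build_command(["a.fa", "b.fa"], "stats"))
```

Joining with spaces mirrors what the review reports; paths containing spaces would need quoting (for example with `shlex.quote`), which this sketch leaves out.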
bio/seqkit/test/Snakefile (2)
158-173: Approved: New rule `seqkit_common`.
The `seqkit_common` rule is well-defined with clear inputs, outputs, and parameters. This rule enhances the workflow by allowing the extraction of common sequences between two FASTA files.
176-191: Approved: New rule `seqkit_concat`.
The `seqkit_concat` rule is correctly implemented to handle the concatenation of sequences from two FASTA files. This addition broadens the capabilities of the bioinformatics pipeline.
test.py (2)
432-444: Test function for `seqkit common` is correctly implemented.
The function `test_seqkit_common` is well-structured and follows the conventions used throughout the test suite.
The code changes are approved.
447-459: Test function for `seqkit concat` is correctly implemented.
The function `test_seqkit_concat` is consistent with the other test functions and correctly sets up the test environment and parameters for the `seqkit concat` functionality.
The code changes are approved.
🤖 I have created a release \*beep\* \*boop\*

---

## [4.3.0](https://www.github.com/snakemake/snakemake-wrappers/compare/v4.2.0...v4.3.0) (2024-08-28)

### Features

* add support for multiple input file subcommands in seqkit wrapper (subcommands "common" and "concat") ([#3142](https://www.github.com/snakemake/snakemake-wrappers/issues/3142)) ([3b5391f](https://www.github.com/snakemake/snakemake-wrappers/commit/3b5391f619b38334829c06b8bd0526a16e19c732))
* Deeptools multibigwig summary ([#3135](https://www.github.com/snakemake/snakemake-wrappers/issues/3135)) ([df7e2bf](https://www.github.com/snakemake/snakemake-wrappers/commit/df7e2bffdd61690e56380bb1b49ca663e58a477c))
* Deeptools plot correlation ([#3137](https://www.github.com/snakemake/snakemake-wrappers/issues/3137)) ([a965bd6](https://www.github.com/snakemake/snakemake-wrappers/commit/a965bd62f13bb62722daf08201a00b1f26bef38d))
* Deeptools plot pca ([#3138](https://www.github.com/snakemake/snakemake-wrappers/issues/3138)) ([0d9862b](https://www.github.com/snakemake/snakemake-wrappers/commit/0d9862b0f91e74bb90993eb7ecb938dec80d779b))
* Rseqc bamstat ([#3139](https://www.github.com/snakemake/snakemake-wrappers/issues/3139)) ([b4267e6](https://www.github.com/snakemake/snakemake-wrappers/commit/b4267e6a0244071a96efc8a91fd6ba982a738cb5))
* Rseqc inner distance ([#3140](https://www.github.com/snakemake/snakemake-wrappers/issues/3140)) ([8ca10f3](https://www.github.com/snakemake/snakemake-wrappers/commit/8ca10f3949ca6fb1ed9f9d046c89ca10a7c32c8c))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
QC
- [...] `snakemake-wrappers`.
While the contribution guidelines are more extensive, please particularly ensure that:
- `test.py` was updated to call any added or updated example rules in a `Snakefile`
- `input:` and `output:` file paths in the rules can be chosen arbitrarily
- [...] `input:` or `output:`)
- [...] `tempfile.gettempdir()` points to [...]
- `meta.yaml` contains a link to the documentation of the respective tool or command under `url:`
Summary by CodeRabbit
New Features
- [...] `seqkit_common` for extracting common sequences and `seqkit_concat` for concatenating sequences.
- [...] (`b.fa`) containing two nucleotide sequences for analysis.
Tests
- [...] `seqkit_common` and `seqkit_concat` rules.
These updates enhance the bioinformatics pipeline's capabilities and improve testing coverage.