Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Deeptools bampefragmentsize #3596

Merged
merged 33 commits into from
Jan 30, 2025

Conversation

niekwit
Copy link
Contributor

@niekwit niekwit commented Jan 17, 2025

QC

While the contributions guidelines are more extensive, please particularly ensure that:

  • test.py was updated to call any added or updated example rules in a Snakefile
  • input: and output: file paths in the rules can be chosen arbitrarily
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:)
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to
  • the meta.yaml contains a link to the documentation of the respective tool or command under url:
  • conda environments use a minimal amount of channels and packages, in recommended ordering

This is my first Snakemake wrapper. I hope to contribute more later.

Summary by CodeRabbit

  • New Features

    • Added support for calculating fragment sizes from paired-end BAM files using DeepTools.
    • Introduced a new workflow tool for generating fragment size histograms and raw length data.
    • Created a new Conda environment configuration for setting up the DeepTools package.
    • Added a new file specifying regions for analysis.
  • Documentation

    • Created comprehensive metadata for the DeepTools fragment size analysis tool, including usage instructions and output specifications.
  • Tests

    • Added test case for the new DeepTools fragment size analysis functionality.
    • Enhanced test coverage with multiple scenarios for the fragment size analysis tool.

Copy link
Contributor

coderabbitai bot commented Jan 17, 2025

📝 Walkthrough

Walkthrough

This pull request introduces a new workflow for the DeepTools bamPEFragmentSize functionality, which calculates fragment sizes from paired-end sequencing BAM files. The implementation includes a comprehensive setup with an environment configuration, metadata description, a Snakemake rule, a Python wrapper script, and a corresponding test case. The workflow allows users to generate fragment size histograms and optionally output raw fragment length data, with support for customization through various parameters.

Changes

File Change Summary
bio/deeptools/bampefragmentsize/environment.yaml New conda environment file with conda-forge, bioconda, and nodefaults channels; specifies deeptools==3.5.5
bio/deeptools/bampefragmentsize/meta.yaml New metadata file describing bamPEFragmentSize tool, its inputs, outputs, and command-line options
bio/deeptools/bampefragmentsize/test/Snakefile Added deeptools_bampe_fragmentsize rule for processing BAM files with fragment size analysis
bio/deeptools/bampefragmentsize/wrapper.py New Python wrapper script to execute bamPEFragmentSize with configurable parameters
test_wrappers.py Added test function test_deeptools_bampe_fragmentsize(run) for the new DeepTools workflow
bio/deeptools/bampefragmentsize/environment.linux-64.pin.txt New file for specifying a reproducible conda environment for Linux 64-bit platform with explicit package URLs
bio/deeptools/bampefragmentsize/test/regions.bed Added two new region entries for testing purposes

Possibly related PRs

  • feat: Deeptools multibigwig summary #3135: The changes in this PR introduce a new environment.yaml file that specifies the same dependency on deeptools version 3.5.5 as the main PR, indicating a direct connection in terms of environment setup for the DeepTools package.
  • feat: NGS-bits SampleSimilarity #3500: This PR introduces a wrapper for the ngs-bits SampleSimilarity tool, which includes a new environment.yaml file that specifies channels and dependencies, similar to the main PR's focus on setting up a conda environment for deeptools.
✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
bio/deeptools/bampefragmentsize/test/Snakefile (1)

1-25: Enhance rule documentation for better usability.

The rule implementation is correct, but the documentation could be improved:

  1. Add more context about the blacklist file's purpose and format
  2. Document the expected format for labels (e.g., comma-separated list)
  3. Provide examples of commonly used extra parameters
 rule deeptools_bampe_fragmentsize:
     input:
         # Input BAM file(s)
         bams=["a.bam", "b.bam"],
-        # Optional blacklist file
+        # Optional blacklist file in BED format to exclude specific regions from analysis
         # blacklist="",
     output:
         # Please note that -o/hist/--histogram and --outRawFragmentLengths are exclusively defined via output files.
         # Usable output variables, their extensions and which option they implicitly call are listed here:
         # https://snakemake-wrappers.readthedocs.io/en/stable/wrappers/deeptools/bamPEFragmentSize.html.
         # Required
         hist="results/histogram.png",
         # Optional output files
         raw="results/raw.tab",
     log:
         "logs/deeptools/bampe_fragmentsize.log",
     threads: 4
     params:
         # Labels can be changed to anything
         # If left empty, the sample name will be used 
         # (without path and .bam extension)
+        # Format: comma-separated list matching the number of input BAMs
+        # Example: "sample1,sample2" or "" for automatic labels
         labels="",  
+        # Additional parameters for deeptools bamPEFragmentSize
+        # Example: --maxFragmentLength 1000 --binSize 10
         extra="--logScale",
     wrapper:
         "master/bio/deeptools/bampefragmentsize"
bio/deeptools/bampefragmentsize/wrapper.py (2)

30-35: Enhance output format validation.

The format validation could be more robust by using a set for valid formats and providing a more descriptive error message.

 # Check output format
 out_format = out_file.split(".")[-1]
-if not out_format in ["png", "pdf", "svg", "eps", "plotly"]:
+VALID_FORMATS = {"png", "pdf", "svg", "eps", "plotly"}
+if out_format not in VALID_FORMATS:
     raise ValueError(
-        "Output format must be either 'png', 'pdf', 'svg', 'eps', or 'plotly'"
+        f"Invalid output format '{out_format}'. Must be one of: {', '.join(sorted(VALID_FORMATS))}"
     )
🧰 Tools
🪛 Ruff (0.8.2)

32-32: Test for membership should be not in

Convert to not in

(E713)


45-51: Add error handling to shell command.

The shell command should include error handling to provide better feedback when the command fails.

 shell(
+    "set -e; "  # Exit on error
     "bamPEFragmentSize "
     "--numberOfProcessors {snakemake.threads} "
     "-b {bam_files} "
     "-o {out_file} "
     "{blacklist} {optional_output} {extra} {log}"
 )
test_wrappers.py (1)

2780-2785: Expand test coverage with additional test cases.

The current test only covers basic functionality. Consider adding tests for:

  1. Multiple input BAM files with custom labels
  2. Using a blacklist file
  3. Different output formats (PDF, SVG)
  4. Various extra parameters
 def test_deeptools_bampe_fragmentsize(run):
+    # Test basic functionality
     run(
         "bio/deeptools/bampefragmentsize",
         ["snakemake", "--cores", "1", "results/histogram.png", "--use-conda", "-F"],
     )
+    # Test with multiple BAMs and custom labels
+    run(
+        "bio/deeptools/bampefragmentsize",
+        ["snakemake", "--cores", "1", "results/multi_bam.png", "--config", "labels='sample1,sample2'", "--use-conda", "-F"],
+    )
+    # Test with blacklist
+    run(
+        "bio/deeptools/bampefragmentsize",
+        ["snakemake", "--cores", "1", "results/blacklist.png", "--config", "blacklist='regions.bed'", "--use-conda", "-F"],
+    )
bio/deeptools/bampefragmentsize/meta.yaml (2)

8-22: Improve table formatting for better readability.

The table formatting could be improved by:

  1. Consistent spacing
  2. Fixing typo in "occurances"
  3. Removing empty lines
-    +----------------------------+-----------------------------+---------------------+-------------+
-    | bamPEFragmentSize          | Output format               | Name of output      | Recommended |
-    |                            |                             |                     |             |
-    |                            |                             | variable to be used | extension   |
-    +============================+=============================+=====================+=============+
-    | --histogram, -hist, -o     | gzipped matrix file         | hist                | ".png"      |
-    |                            |                             |                     |             |
-    |                            |                             | (required)          |             |
-    +----------------------------+-----------------------------+---------------------+-------------+
-    | --outRawFragmentLengths    | tab-separated table of      | raw                 | ".tab"      |
-    |                            |                             |                     |             |
-    |                            | fragments lengths and       |                     |             |
-    |                            |                             |                     |             |
-    |                            | occurances                  |                     |             |
-    +----------------------------+-----------------------------+---------------------+-------------+
+    +------------------------+-------------------------+-------------------+-----------+
+    | bamPEFragmentSize      | Output format           | Output variable   | Extension |
+    +========================+=========================+===================+===========+
+    | --histogram, -hist, -o | gzipped matrix file     | hist (required)  | ".png"    |
+    +------------------------+-------------------------+-------------------+-----------+
+    | --outRawFragmentLengths| tab-separated table of  | raw             | ".tab"    |
+    |                        | fragment lengths and    |                 |           |
+    |                        | occurrences            |                 |           |
+    +------------------------+-------------------------+-------------------+-----------+

32-34: Add newline at end of file.

The file is missing a newline at the end, which is a POSIX requirement.

 params:
   - title: Title of the plot
-  - extra: Optional parameters given to computeMatrix
+  - extra: Optional parameters given to bamPEFragmentSize
+
🧰 Tools
🪛 yamllint (1.35.1)

[error] 34-34: no new line character at the end of file

(new-line-at-end-of-file)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 917ba7b and 664675f.

📒 Files selected for processing (5)
  • bio/deeptools/bampefragmentsize/environment.yaml (1 hunks)
  • bio/deeptools/bampefragmentsize/meta.yaml (1 hunks)
  • bio/deeptools/bampefragmentsize/test/Snakefile (1 hunks)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
  • test_wrappers.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • bio/deeptools/bampefragmentsize/environment.yaml
🧰 Additional context used
📓 Path-based instructions (2)
test_wrappers.py (1)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

9-9: Undefined name snakemake

(F821)


12-12: Undefined name snakemake

(F821)


13-13: Undefined name snakemake

(F821)


19-19: Undefined name snakemake

(F821)


28-28: Undefined name snakemake

(F821)


32-32: Test for membership should be not in

Convert to not in

(E713)


38-38: Undefined name snakemake

(F821)


43-43: Undefined name snakemake

(F821)

🪛 yamllint (1.35.1)
bio/deeptools/bampefragmentsize/meta.yaml

[error] 34-34: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary

bio/deeptools/bampefragmentsize/wrapper.py Outdated Show resolved Hide resolved
bio/deeptools/bampefragmentsize/meta.yaml Outdated Show resolved Hide resolved
@niekwit niekwit changed the title Deeptools bampefragmentsize feat: Deeptools bampefragmentsize Jan 20, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
bio/deeptools/bampefragmentsize/meta.yaml (2)

8-23: Fix typo and improve table formatting.

  1. Fix the typo "occurances" to "occurrences"
  2. Consider improving table readability by removing extra spaces
     | --outRawFragmentLengths    | tab-separated table of      | raw                 | ".tab"      |
     |                            |                             |                     |             |
     |                            | fragments lengths and       |                     |             |
     |                            |                             |                     |             |
-    |                            | occurances                  |                     |             |
+    |                            | occurrences                 |                     |             |

34-34: Add newline at end of file.

Add a newline character at the end of the file to comply with YAML standards.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 34-34: no new line character at the end of file

(new-line-at-end-of-file)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 664675f and 1651866.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/meta.yaml (1 hunks)
🧰 Additional context used
🪛 yamllint (1.35.1)
bio/deeptools/bampefragmentsize/meta.yaml

[error] 34-34: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary
🔇 Additional comments (1)
bio/deeptools/bampefragmentsize/meta.yaml (1)

1-7: LGTM! Well-documented tool metadata.

The metadata section is well-structured with accurate tool name, documentation links, and a clear description.

bio/deeptools/bampefragmentsize/meta.yaml Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (8)
bio/deeptools/bampefragmentsize/meta.yaml (2)

8-22: Fix table formatting and spelling.

The table content is informative but needs some refinement:

  1. Fix the spelling of "occurances" to "occurrences"
  2. Normalize spacing in table cells for better readability
     | --outRawFragmentLengths    | tab-separated table of      | raw                 | ".tab"      |
     |                            |                             |                     |             |
     |                            | fragments lengths and       |                     |             |
     |                            |                             |                     |             |
-    |                            | occurances                  |                     |             |
+    |                            | occurrences                 |                     |             |

34-34: Add newline at end of file.

Add a newline character at the end of the file to comply with YAML best practices.

   - title: Title of the plot
-  - extra: Optional parameters given to bamPEFragmentSize
+  - extra: Optional parameters given to bamPEFragmentSize
+
🧰 Tools
🪛 yamllint (1.35.1)

[error] 34-34: no new line character at the end of file

(new-line-at-end-of-file)

bio/deeptools/bampefragmentsize/wrapper.py (6)

2-2: Update copyright year to current year.

The copyright year is set to 2025, which is in the future. Please update it to the current year (2024).

-__copyright__ = "Copyright 2025, Niek Wit"
+__copyright__ = "Copyright 2024, Niek Wit"

11-16: Add input validation for BAM files.

While the input handling is correct, consider adding validation to ensure:

  1. At least one BAM file is provided
  2. All provided BAM files exist and are readable
 # Get input files
 bam_files = snakemake.input.get("bams", "")
+if not bam_files:
+    raise ValueError("At least one BAM file must be provided")
+
+# Ensure all BAM files exist and are readable
+for bam in bam_files:
+    if not os.path.isfile(bam):
+        raise ValueError(f"BAM file not found: {bam}")
+
 blacklist = snakemake.input.get("blacklist", "")
 if blacklist:
     blacklist = f"--blackListFileName {blacklist}"
🧰 Tools
🪛 Ruff (0.8.2)

12-12: Undefined name snakemake

(F821)


13-13: Undefined name snakemake

(F821)


17-26: Enhance sample labels handling with better error messages and edge case handling.

While the current implementation works, consider these improvements:

  1. Make the assertion error message more informative by including the actual counts
  2. Handle potential edge cases in label generation
 # Get/create sample labels (remove .bam extension and dir)
 # If no labels are provided, use the basename of the bam file
 sample_label = snakemake.params.get("labels", "")
 if not sample_label:
-    sample_label = [os.path.basename(bam.replace(".bam", "")) for bam in bam_files]
+    sample_label = [
+        os.path.splitext(os.path.basename(bam))[0] for bam in bam_files
+    ]
 
 # Check if the number of labels is equal to the number of bam files
-assert len(sample_label) == len(
-    bam_files
-), "Number of labels must be equal to the number of bam files"
+n_labels = len(sample_label)
+n_bams = len(bam_files)
+if n_labels != n_bams:
+    raise ValueError(
+        f"Number of labels ({n_labels}) must match number of BAM files ({n_bams})"
+    )
🧰 Tools
🪛 Ruff (0.8.2)

19-19: Undefined name snakemake

(F821)


31-35: Improve output format validation.

Consider these improvements:

  1. Use not in operator for better readability
  2. Make the format check case-insensitive to handle uppercase extensions
-out_format = out_file.split(".")[-1]
-if not out_format in ["png", "pdf", "svg", "eps", "plotly"]:
+out_format = out_file.split(".")[-1].lower()
+if out_format not in ["png", "pdf", "svg", "eps", "plotly"]:
     raise ValueError(
         "Output format must be either 'png', 'pdf', 'svg', 'eps', or 'plotly'"
     )
🧰 Tools
🪛 Ruff (0.8.2)

32-32: Test for membership should be not in

Convert to not in

(E713)


40-41: Use f-string for better readability.

Consider using an f-string instead of the format method for better readability.

 if out_raw:
-    optional_output = " --outRawFragmentLengths {out_tab} ".format(out_tab=out_raw)
+    optional_output = f" --outRawFragmentLengths {out_raw} "

46-52: Improve shell command construction.

Consider these improvements:

  1. Join command components more cleanly using a list
  2. Quote BAM files to handle spaces in filenames
-shell(
-    "bamPEFragmentSize "
-    "--numberOfProcessors {snakemake.threads} "
-    "-b {bam_files} "
-    "-o {out_file} "
-    "{blacklist} {optional_output} {extra} {log}"
-)
+# Quote BAM files to handle spaces in filenames
+quoted_bams = " ".join(f'"{bam}"' for bam in bam_files)
+
+cmd = [
+    "bamPEFragmentSize",
+    f"--numberOfProcessors {snakemake.threads}",
+    f"-b {quoted_bams}",
+    f"-o {out_file}",
+    blacklist,
+    optional_output,
+    extra,
+    log,
+]
+
+shell(" ".join(cmd))
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1651866 and f379c2b.

📒 Files selected for processing (3)
  • bio/deeptools/bampefragmentsize/environment.linux-64.pin.txt (1 hunks)
  • bio/deeptools/bampefragmentsize/meta.yaml (1 hunks)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • bio/deeptools/bampefragmentsize/environment.linux-64.pin.txt
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 yamllint (1.35.1)
bio/deeptools/bampefragmentsize/meta.yaml

[error] 34-34: no new line character at the end of file

(new-line-at-end-of-file)

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

9-9: Undefined name snakemake

(F821)


12-12: Undefined name snakemake

(F821)


13-13: Undefined name snakemake

(F821)


19-19: Undefined name snakemake

(F821)


28-28: Undefined name snakemake

(F821)


32-32: Test for membership should be not in

Convert to not in

(E713)


38-38: Undefined name snakemake

(F821)


44-44: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary
🔇 Additional comments (3)
bio/deeptools/bampefragmentsize/meta.yaml (2)

1-6: LGTM! Documentation is well-structured.

The tool identification and documentation section is properly formatted with correct tool name and helpful links to both the documentation and source code.


24-34: LGTM! Metadata structure is well-defined.

The configuration metadata follows Snakemake wrapper conventions with clear specifications for inputs, outputs, and parameters.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 34-34: no new line character at the end of file

(new-line-at-end-of-file)

bio/deeptools/bampefragmentsize/wrapper.py (1)

9-9: LGTM!

The logging setup follows Snakemake wrapper best practices.

🧰 Tools
🪛 Ruff (0.8.2)

9-9: Undefined name snakemake

(F821)

@niekwit
Copy link
Contributor Author

niekwit commented Jan 24, 2025

Hello, this is my first pull request to add a new wrapper. Is there anything else I need to do to get this added to the wrapper repository?

@fgvieira
Copy link
Collaborator

@niekwit just address/reply to the comments/suggestions. 😄

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (7)
bio/deeptools/bampefragmentsize/meta.yaml (2)

3-6: Consider using markdown-style links for better compatibility.

The RST-style links (`documentation <url>`_) might not render correctly in YAML. Consider using markdown-style links instead.

 description: >
   ``deepTools bamPEFragmentSize`` calculates the fragment sizes for read pairs given a BAM file from paired-end sequencing.
-  For usage information about ``deepTools bamPEFragmentSize``, please see the `documentation <https://deeptools.readthedocs.io/en/latest/content/tools/bamPEFragmentSize.html>`_.
-  For more information about ``deepTools``, also see the `source code <https://github.com/deeptools/deepTools>`_.
+  For usage information about ``deepTools bamPEFragmentSize``, please see the [documentation](https://deeptools.readthedocs.io/en/latest/content/tools/bamPEFragmentSize.html).
+  For more information about ``deepTools``, also see the [source code](https://github.com/deeptools/deepTools).

8-16: Consider using YAML list format instead of ASCII art table.

While the ASCII art table is readable, it can be hard to maintain. Consider using a more YAML-native format for better maintainability.

-    +------------------------+-------------------------+-------------------+-----------+
-    | bamPEFragmentSize      | Output format           | Output variable   | Extension |
-    +========================+=========================+===================+===========+
-    | --histogram, -hist, -o | gzipped matrix file     | hist (required)  | ".png"    |
-    +------------------------+-------------------------+-------------------+-----------+
-    | --outRawFragmentLengths| tab-separated table of  | raw             | ".tab"    |
-    |                        | fragment lengths and    |                 |           |
-    |                        | occurrences            |                 |           |
-    +------------------------+-------------------------+-------------------+-----------+
+options:
+  histogram:
+    flags: ["--histogram", "-hist", "-o"]
+    format: "gzipped matrix file"
+    variable: "hist"
+    extension: ".png"
+    required: true
+  raw_lengths:
+    flag: "--outRawFragmentLengths"
+    format: "tab-separated table of fragment lengths and occurrences"
+    variable: "raw"
+    extension: ".tab"
+    required: false
bio/deeptools/bampefragmentsize/wrapper.py (5)

2-2: Update copyright year.

The copyright year should be the current year (2025) or the range from the year of creation to the current year (e.g., "Copyright 2024-2025").


14-15: Improve readability of blacklist command formatting.

The f-string could be more readable by using string formatting consistent with the shell command at the end of the file.

-    blacklist = f"--blackListFileName {blacklist}"
+    blacklist = "--blackListFileName {}".format(blacklist)

23-26: Enhance the assertion error message.

The error message could be more helpful by including the actual counts.

-assert len(sample_label) == len(
-    bam_files
-), "Number of labels must be equal to the number of bam files"
+assert len(sample_label) == len(bam_files), (
+    f"Number of labels ({len(sample_label)}) must be equal to "
+    f"the number of bam files ({len(bam_files)})"
+)

31-36: Improve output format validation.

Consider these enhancements:

  1. Use not in for more Pythonic code
  2. Make format check case-insensitive to handle inputs like "PNG" or "PDF"
-out_format = out_file.split(".")[-1]
+out_format = out_file.split(".")[-1].lower()
 VALID_FORMATS = {"png", "pdf", "svg", "eps", "plotly"}
-if not out_format in VALID_FORMATS:
+if out_format not in VALID_FORMATS:
     raise ValueError(
         f"Invalid output format '{out_format}'. Must be one of: {', '.join(sorted(VALID_FORMATS))}"
     )
🧰 Tools
🪛 Ruff (0.8.2)

33-33: Test for membership should be not in

Convert to not in

(E713)


47-54: Improve shell command readability and consistency.

Consider these enhancements:

  1. Split the long command for better readability
  2. Use consistent string formatting
-shell(
-    "set -e; "  # Exit on error
-    "bamPEFragmentSize "
-    "--numberOfProcessors {snakemake.threads} "
-    "-b {bam_files} "
-    "-o {out_file} "
-    "{blacklist} {optional_output} {extra} {log}"
-)
+shell(
+    " ".join([
+        "set -e;",  # Exit on error
+        "bamPEFragmentSize",
+        "--numberOfProcessors {}".format(snakemake.threads),
+        "-b {}".format(bam_files),
+        "-o {}".format(out_file),
+        "{}".format(blacklist),
+        "{}".format(optional_output),
+        "{}".format(extra),
+        "{}".format(log)
+    ])
+)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f379c2b and b966679.

📒 Files selected for processing (5)
  • bio/deeptools/bampefragmentsize/meta.yaml (1 hunks)
  • bio/deeptools/bampefragmentsize/test/Snakefile (1 hunks)
  • bio/deeptools/bampefragmentsize/test/regions.bed (1 hunks)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
  • test_wrappers.py (5 hunks)
✅ Files skipped from review due to trivial changes (1)
  • bio/deeptools/bampefragmentsize/test/regions.bed
🚧 Files skipped from review as they are similar to previous changes (2)
  • bio/deeptools/bampefragmentsize/test/Snakefile
  • test_wrappers.py
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 YAMLlint (1.35.1)
bio/deeptools/bampefragmentsize/meta.yaml

[error] 29-29: no new line character at the end of file

(new-line-at-end-of-file)


[error] 29-29: trailing spaces

(trailing-spaces)

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

9-9: Undefined name snakemake

(F821)


12-12: Undefined name snakemake

(F821)


13-13: Undefined name snakemake

(F821)


19-19: Undefined name snakemake

(F821)


28-28: Undefined name snakemake

(F821)


33-33: Test for membership should be not in

Convert to not in

(E713)


39-39: Undefined name snakemake

(F821)


45-45: Undefined name snakemake

(F821)

🔇 Additional comments (4)
bio/deeptools/bampefragmentsize/meta.yaml (2)

1-2: LGTM! Header section is accurate.

The tool name and documentation URL are correctly specified.


20-28: Verify blacklist parameter usage.

The configuration looks good, but let's verify that the blacklist parameter is correctly handled in the wrapper implementation.

✅ Verification successful

Blacklist parameter implementation verified

The wrapper correctly handles the optional blacklist parameter by:

  • Safely retrieving it with a default empty value
  • Properly formatting it for the command line when provided
  • Correctly integrating it into the final command
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the blacklist parameter is correctly handled in the wrapper.

# Search for blacklist handling in the wrapper implementation
rg -A 5 "blacklist" bio/deeptools/bampefragmentsize/wrapper.py

# Search for blacklist usage in test files
rg -A 5 "blacklist" bio/deeptools/bampefragmentsize/test/

Length of output: 1285

bio/deeptools/bampefragmentsize/wrapper.py (2)

9-9: LGTM!

Good use of Snakemake's logging formatter to capture both stdout and stderr.

🧰 Tools
🪛 Ruff (0.8.2)

9-9: Undefined name snakemake

(F821)


20-21: Add validation for sample labels.

The current implementation doesn't validate the content of labels. Consider adding checks for:

  • Non-empty labels
  • No duplicate labels
  • No special characters that might cause shell escaping issues

bio/deeptools/bampefragmentsize/meta.yaml Outdated Show resolved Hide resolved
bio/deeptools/bampefragmentsize/wrapper.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
bio/deeptools/bampefragmentsize/meta.yaml (2)

3-5: Enhance the description with tool's purpose.

Consider adding a brief explanation of what the tool does before the documentation links. For example:

 description: >
+  bamPEFragmentSize calculates the fragment sizes for read pairs in paired-end sequencing BAM files.
+  It generates a histogram of fragment sizes and can output the raw fragment length data.
   For usage information about ``deepTools bamPEFragmentSize``, please see the [documentation](https://deeptools.readthedocs.io/en/latest/content/tools/bamPEFragmentSize.html).

29-31: Add newline at end of file.

The file is missing a newline character at the end.

   - title: Title of the plot
   - extra: Optional parameters given to bamPEFragmentSize
+
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b966679 and 618ef92.

📒 Files selected for processing (2)
  • bio/deeptools/bampefragmentsize/meta.yaml (1 hunks)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • bio/deeptools/bampefragmentsize/wrapper.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary
🔇 Additional comments (2)
bio/deeptools/bampefragmentsize/meta.yaml (2)

21-28: LGTM! Input/output specifications are clear and well-documented.

The input and output sections properly document:

  • Required and optional inputs with their file extensions
  • Expected output files matching the options configuration

8-13: Fix inconsistency in histogram format and extension.

There's a mismatch between the specified format and file extension:

  • Format states "gzipped matrix file"
  • Extension is ".png"

Please clarify whether this is a plot (.png) or a gzipped matrix file (.gz).

Let's check the DeepTools documentation:

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
bio/deeptools/bampefragmentsize/meta.yaml (3)

3-5: Enhance the description with tool functionality.

The current description only provides links to external resources. Consider adding a brief explanation of what the tool does, such as:

 description: >
+  A tool that calculates fragment sizes for read pairs in paired-end sequencing BAM files.
+  It generates fragment size histograms and can output raw fragment length data.
   For usage information about ``deepTools bamPEFragmentSize``, please see the [documentation](https://deeptools.readthedocs.io/en/latest/content/tools/bamPEFragmentSize.html).
   For more information about ``deepTools``, also see the [source code](https://github.com/deeptools/deepTools).

24-25: Enhance input descriptions for clarity.

The input descriptions could be more specific about requirements and purposes:

-  - bams: List of BAM files (.bam)
-  - blacklist: Optional BED file with regions to skip (.bed)
+  - bams: List of paired-end BAM files (.bam)
+  - blacklist: Optional BED file with regions to exclude from analysis (.bed)

32-32: Remove extra newline at end of file.

There are two consecutive newlines at the end of the file. Keep only one newline.

 params:
   - title: Title of the plot
   - extra: Optional parameters given to bamPEFragmentSize
-
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 618ef92 and a716ed9.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/meta.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary

bio/deeptools/bampefragmentsize/meta.yaml Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
bio/deeptools/bampefragmentsize/wrapper.py (4)

2-2: Update copyright year to current year.

The copyright year is set to 2025, which is in the future. Please update it to 2024.

-__copyright__ = "Copyright 2025, Niek Wit"
+__copyright__ = "Copyright 2024, Niek Wit"

19-29: Enhance readability and error messaging.

Consider these improvements for better maintainability:

  1. Make the list comprehension more readable
  2. Provide a more helpful assertion message
-sample_label = [os.path.basename(bam.replace(".bam", "")) for bam in bam_files]
+sample_label = [Path(bam).stem for bam in bam_files]

 assert len(sample_label) == len(
     bam_files
-), "Number of labels must be equal to the number of bam files"
+), f"Number of labels ({len(sample_label)}) does not match number of BAM files ({len(bam_files)})"
🧰 Tools
🪛 Ruff (0.8.2)

21-21: Undefined name snakemake

(F821)


30-36: Improve format validation.

Two improvements needed:

  1. Use not in operator for better readability
  2. Enhance error message with the actual format found
-if not out_format in VALID_FORMATS:
+if out_format not in VALID_FORMATS:
     raise ValueError(
-        f"Invalid output format '{out_format}'. Must be one of: {', '.join(sorted(VALID_FORMATS))}"
+        f"Found output format '{out_format}', but expected one of: {', '.join(sorted(VALID_FORMATS))}"
     )
🧰 Tools
🪛 Ruff (0.8.2)

31-31: Undefined name snakemake

(F821)


33-33: Test for membership should be not in

Convert to not in

(E713)


46-52: Improve shell command consistency.

Consider using consistent string formatting and spacing throughout the shell command.

 shell(
-    "bamPEFragmentSize "
-    "--numberOfProcessors {snakemake.threads} "
-    "-b {bam_files} "
-    "-o {snakemake.output.hist} "
-    "{blacklist} {out_raw} {extra} {log}"
+    f"bamPEFragmentSize "
+    f"--numberOfProcessors {snakemake.threads} "
+    f"-b {' '.join(bam_files)} "
+    f"-o {snakemake.output.hist} "
+    f"{blacklist} {out_raw} {extra} {log}"
 )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe3ca35 and 13427fb.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

10-10: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


21-21: Undefined name snakemake

(F821)


31-31: Undefined name snakemake

(F821)


33-33: Test for membership should be not in

Convert to not in

(E713)


39-39: Undefined name snakemake

(F821)


44-44: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary
🔇 Additional comments (2)
bio/deeptools/bampefragmentsize/wrapper.py (2)

12-18: LGTM! Clean input handling implementation.

The input handling is well-implemented:

  • Required BAM files are accessed directly through Snakemake
  • Optional blacklist is handled cleanly with appropriate string formatting
🧰 Tools
🪛 Ruff (0.8.2)

14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


38-42: LGTM! Clean optional output handling.

The optional output handling is well-implemented with clean string formatting.

🧰 Tools
🪛 Ruff (0.8.2)

39-39: Undefined name snakemake

(F821)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
bio/deeptools/bampefragmentsize/wrapper.py (4)

2-2: Update the copyright year.

The copyright year is set to 2025, which is in the future. Please update it to the current year (2024).

-__copyright__ = "Copyright 2025, Niek Wit"
+__copyright__ = "Copyright 2024, Niek Wit"

19-29: Enhance label handling for better readability and error messages.

Consider these improvements:

  1. Simplify the list comprehension using Path
  2. Make the assertion message more helpful by including the actual counts
-    sample_label = [os.path.basename(bam.replace(".bam", "")) for bam in bam_files]
+    sample_label = [Path(bam).stem for bam in bam_files]

-assert len(sample_label) == len(
-    bam_files
-), "Number of labels must be equal to the number of bam files"
+assert len(sample_label) == len(bam_files), (
+    f"Number of labels ({len(sample_label)}) must match "
+    f"number of BAM files ({len(bam_files)})"
+)
🧰 Tools
🪛 Ruff (0.8.2)

21-21: Undefined name snakemake

(F821)


30-36: Improve output format validation.

Two improvements needed:

  1. Use not in operator for better readability
  2. Include the dot in the format string for clarity
-if not out_format in VALID_FORMATS:
+if out_format not in VALID_FORMATS:
     raise ValueError(
-        f"Invalid output format '{out_format}'. Must be one of: {', '.join(sorted(VALID_FORMATS))}"
+        f"Invalid output format '.{out_format.lstrip('.')}'. Must be one of: {', '.join(sorted(VALID_FORMATS))}"
     )
🧰 Tools
🪛 Ruff (0.8.2)

31-31: Undefined name snakemake

(F821)


33-33: Test for membership should be not in

Convert to not in

(E713)


46-52: Improve shell command formatting for consistency.

Consider using consistent formatting for the shell command parameters.

 shell(
     "bamPEFragmentSize "
     "--numberOfProcessors {snakemake.threads} "
     "-b {bam_files} "
     "-o {snakemake.output.hist} "
-    "{blacklist} {out_raw} {extra} {log}"
+    "{blacklist} "
+    "{out_raw} "
+    "{extra} "
+    "{log}"
 )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d79f758 and 959c4cf.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

10-10: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


21-21: Undefined name snakemake

(F821)


31-31: Undefined name snakemake

(F821)


33-33: Test for membership should be not in

Convert to not in

(E713)


39-39: Undefined name snakemake

(F821)


44-44: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: testing
  • GitHub Check: docs
  • GitHub Check: Summary
🔇 Additional comments (2)
bio/deeptools/bampefragmentsize/wrapper.py (2)

12-18: LGTM! Input handling is well-implemented.

The code correctly handles both required BAM files and the optional blacklist file.

🧰 Tools
🪛 Ruff (0.8.2)

14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


38-45: LGTM! Optional output and parameter handling is well-implemented.

The code correctly handles both the optional raw output file and extra parameters.

🧰 Tools
🪛 Ruff (0.8.2)

39-39: Undefined name snakemake

(F821)


44-44: Undefined name snakemake

(F821)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
bio/deeptools/bampefragmentsize/wrapper.py (3)

2-2: Update copyright year.

The copyright year should be the current year (2025) or the range from when the code was first written to the current year.

-__copyright__ = "Copyright 2025, Niek Wit"
+__copyright__ = "Copyright 2024-2025, Niek Wit"

19-29: Enhance sample labels handling.

Consider these improvements for better readability and error messaging:

-sample_label = [os.path.basename(bam.replace(".bam", "")) for bam in bam_files]
+sample_label = [Path(bam).stem for bam in bam_files]

 assert len(sample_label) == len(
     bam_files
-), "Number of labels must be equal to the number of bam files"
+), f"Number of labels ({len(sample_label)}) must match number of BAM files ({len(bam_files)})"
🧰 Tools
🪛 Ruff (0.8.2)

21-21: Undefined name snakemake

(F821)


46-52: Improve shell command readability.

Consider using an f-string for better readability and maintainability:

-shell(
-    "bamPEFragmentSize "
-    "--numberOfProcessors {snakemake.threads} "
-    "-b {bam_files} "
-    "-o {snakemake.output.hist} "
-    "{blacklist} {out_raw} {extra} {log}"
-)
+shell(
+    f"bamPEFragmentSize "
+    f"--numberOfProcessors {snakemake.threads} "
+    f"-b {bam_files} "
+    f"-o {snakemake.output.hist} "
+    f"{blacklist} {out_raw} {extra} {log}"
+)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 959c4cf and ecb6727.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

10-10: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


21-21: Undefined name snakemake

(F821)


31-31: Undefined name snakemake

(F821)


33-33: Test for membership should be not in

Convert to not in

(E713)


39-39: Undefined name snakemake

(F821)


44-44: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: testing
  • GitHub Check: docs
  • GitHub Check: Summary
🔇 Additional comments (2)
bio/deeptools/bampefragmentsize/wrapper.py (2)

12-18: LGTM! Input handling is well implemented.

The code correctly handles both required BAM files and optional blacklist input, with proper error handling delegated to Snakemake.

🧰 Tools
🪛 Ruff (0.8.2)

14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


38-45: LGTM! Optional outputs and parameters are well handled.

The code correctly handles optional raw output and extra parameters with proper defaults.

🧰 Tools
🪛 Ruff (0.8.2)

39-39: Undefined name snakemake

(F821)


44-44: Undefined name snakemake

(F821)

bio/deeptools/bampefragmentsize/wrapper.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
bio/deeptools/bampefragmentsize/wrapper.py (1)

2-2: Update copyright year.

The copyright year should be the current year (2025) or the year when the code was written.

-__copyright__ = "Copyright 2025, Niek Wit"
+__copyright__ = "Copyright 2024-2025, Niek Wit"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ecb6727 and 678e7ec.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

10-10: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


21-21: Undefined name snakemake

(F821)


31-31: Undefined name snakemake

(F821)


33-33: Test for membership should be not in

Convert to not in

(E713)


39-39: Undefined name snakemake

(F821)


44-44: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: docs
  • GitHub Check: formatting
  • GitHub Check: testing
  • GitHub Check: linting
  • GitHub Check: Summary
🔇 Additional comments (4)
bio/deeptools/bampefragmentsize/wrapper.py (4)

12-18: LGTM! Input handling is robust.

The code correctly handles both required BAM files and optional blacklist input.

🧰 Tools
🪛 Ruff (0.8.2)

14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


43-52: LGTM! Shell command construction is well-structured.

The command properly handles all required and optional parameters with appropriate spacing.

🧰 Tools
🪛 Ruff (0.8.2)

44-44: Undefined name snakemake

(F821)


19-28: 🛠️ Refactor suggestion

Improve sample label generation for BAM files.

The current string replacement approach could modify unintended parts of filenames and doesn't validate file extensions.

-    sample_label = [os.path.basename(bam.replace(".bam", "")) for bam in bam_files]
+    sample_label = [Path(bam).name.removesuffix(".bam") for bam in bam_files if Path(bam).suffix == ".bam"]

This change:

  1. Safely removes the .bam suffix only from the end
  2. Validates that files are actually BAM files
  3. Uses Path consistently with the rest of the code

Likely invalid or redundant comment.

🧰 Tools
🪛 Ruff (0.8.2)

21-21: Undefined name snakemake

(F821)


30-36: 🛠️ Refactor suggestion

Improve output format validation.

The current implementation needs improvements in operator usage and error handling.

-out_format = Path(snakemake.output.get("hist")).suffix
+out_format = Path(snakemake.output.hist).suffix  # Let Snakemake handle missing required output
 VALID_FORMATS = {".png", ".pdf", ".svg", ".eps", ".plotly"}
-if not out_format in VALID_FORMATS:
+if out_format not in VALID_FORMATS:
     raise ValueError(
         f"Invalid output format '{out_format}'. Must be one of: {', '.join(sorted(VALID_FORMATS))}"
     )

Likely invalid or redundant comment.

🧰 Tools
🪛 Ruff (0.8.2)

31-31: Undefined name snakemake

(F821)


33-33: Test for membership should be not in

Convert to not in

(E713)

bio/deeptools/bampefragmentsize/wrapper.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
bio/deeptools/bampefragmentsize/wrapper.py (2)

2-2: Update copyright year.

The copyright year 2025 is in the future. Consider updating it to the current year.

-__copyright__ = "Copyright 2025, Niek Wit"
+__copyright__ = "Copyright 2024, Niek Wit"

6-6: Remove unused import.

The os module is imported but never used in the code.

-import os
🧰 Tools
🪛 Ruff (0.8.2)

6-6: os imported but unused

Remove unused import: os

(F401)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 678e7ec and 552a8d4.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

6-6: os imported but unused

Remove unused import: os

(F401)


10-10: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


21-21: Undefined name snakemake

(F821)


35-35: Undefined name snakemake

(F821)


37-37: Test for membership should be not in

Convert to not in

(E713)


43-43: Undefined name snakemake

(F821)


48-48: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary
🔇 Additional comments (5)
bio/deeptools/bampefragmentsize/wrapper.py (5)

10-18: LGTM! Input handling is well-implemented.

The code correctly handles both required BAM files and optional blacklist input, with proper error handling delegated to Snakemake.

🧰 Tools
🪛 Ruff (0.8.2)

10-10: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


19-33: LGTM! Sample labels handling is robust and well-validated.

The implementation:

  • Gracefully handles optional labels
  • Safely extracts filenames using Path
  • Validates BAM file extensions
  • Ensures label count matches BAM count
🧰 Tools
🪛 Ruff (0.8.2)

21-21: Undefined name snakemake

(F821)


37-37: Use 'not in' operator for better readability.

Replace not out_format in with out_format not in for better Python idioms.

-if not out_format in VALID_FORMATS:
+if out_format not in VALID_FORMATS:
🧰 Tools
🪛 Ruff (0.8.2)

37-37: Test for membership should be not in

Convert to not in

(E713)


45-45: Remove extra spaces in command-line argument.

Extra spaces in the f-string could cause command-line parsing issues.

-    out_raw = f" --outRawFragmentLengths {out_raw} "
+    out_raw = f"--outRawFragmentLengths {out_raw}"

50-56: LGTM! Shell command construction is clean and complete.

The command properly includes all necessary components:

  • Number of processors
  • Input BAM files
  • Output histogram
  • Optional blacklist and raw output
  • Extra parameters
  • Logging

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
bio/deeptools/bampefragmentsize/wrapper.py (3)

2-2: Update copyright year.

The copyright year should be the current year or the year when the code was written.

-__copyright__ = "Copyright 2025, Niek Wit"
+__copyright__ = "Copyright 2024, Niek Wit"

6-6: Remove unused import.

The os module is imported but never used in the code.

-import os
🧰 Tools
🪛 Ruff (0.8.2)

6-6: os imported but unused

Remove unused import: os

(F401)


37-37: Improve readability of membership test.

Use not in operator for better readability when testing for membership.

-if not out_format in VALID_FORMATS:
+if out_format not in VALID_FORMATS:
🧰 Tools
🪛 Ruff (0.8.2)

37-37: Test for membership should be not in

Convert to not in

(E713)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 552a8d4 and 00becbc.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

6-6: os imported but unused

Remove unused import: os

(F401)


10-10: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


21-21: Undefined name snakemake

(F821)


35-35: Undefined name snakemake

(F821)


37-37: Test for membership should be not in

Convert to not in

(E713)


43-43: Undefined name snakemake

(F821)


48-48: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: docs
  • GitHub Check: testing
  • GitHub Check: Summary
🔇 Additional comments (4)
bio/deeptools/bampefragmentsize/wrapper.py (4)

10-18: LGTM! Input handling is well-implemented.

The code correctly handles both required BAM files and the optional blacklist file, with proper error handling delegated to Snakemake.

🧰 Tools
🪛 Ruff (0.8.2)

10-10: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


15-15: Undefined name snakemake

(F821)


19-32: LGTM! Sample labels handling is robust.

The code effectively handles both user-provided labels and auto-generated labels from BAM filenames, with proper validation to ensure label count matches BAM file count.

🧰 Tools
🪛 Ruff (0.8.2)

21-21: Undefined name snakemake

(F821)


42-49: LGTM! Optional output and parameters handling is clean.

The code effectively handles both optional raw output and extra parameters.

🧰 Tools
🪛 Ruff (0.8.2)

43-43: Undefined name snakemake

(F821)


48-48: Undefined name snakemake

(F821)


50-56: LGTM! Shell command construction is well-formatted.

The command is properly constructed with all necessary parameters and appropriate spacing.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
bio/deeptools/bampefragmentsize/wrapper.py (1)

2-2: Update copyright year.

The copyright year should be the current year (2025) or the year when the code was written.

-__copyright__ = "Copyright 2025, Niek Wit"
+__copyright__ = "Copyright 2024-2025, Niek Wit"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00becbc and e0b4536.

📒 Files selected for processing (1)
  • bio/deeptools/bampefragmentsize/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
bio/deeptools/bampefragmentsize/wrapper.py (2)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.


Pattern **/wrapper.py: Do not complain about use of undefined variable called snakemake.

🪛 Ruff (0.8.2)
bio/deeptools/bampefragmentsize/wrapper.py

9-9: Undefined name snakemake

(F821)


13-13: Undefined name snakemake

(F821)


14-14: Undefined name snakemake

(F821)


20-20: Undefined name snakemake

(F821)


30-30: Undefined name snakemake

(F821)


38-38: Undefined name snakemake

(F821)


43-43: Undefined name snakemake

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: docs
  • GitHub Check: testing
  • GitHub Check: Summary
🔇 Additional comments (2)
bio/deeptools/bampefragmentsize/wrapper.py (2)

14-16: Validate blacklist file.

Consider validating the blacklist file when provided:

  1. Check if the file exists
  2. Validate file extension (typically .bed or .blacklist)
 blacklist = snakemake.input.get("blacklist", "")
 if blacklist:
+    blacklist_path = Path(blacklist)
+    if not blacklist_path.exists():
+        raise ValueError(f"Blacklist file not found: {blacklist}")
+    if blacklist_path.suffix not in {".bed", ".blacklist"}:
+        raise ValueError(f"Invalid blacklist file extension: {blacklist_path.suffix}")
     blacklist = f"--blackListFileName {blacklist}"
🧰 Tools
🪛 Ruff (0.8.2)

14-14: Undefined name snakemake

(F821)


45-51: Add input validation before shell command execution.

Consider adding these validations:

  1. Ensure number of threads is positive
  2. Verify BAM files list is not empty
+# Validate inputs before shell command
+if not bam_files:
+    raise ValueError("No BAM files provided")
+if snakemake.threads < 1:
+    raise ValueError(f"Number of threads must be positive, got: {snakemake.threads}")
+
 shell(
     "bamPEFragmentSize "
     "--numberOfProcessors {snakemake.threads} "

@fgvieira fgvieira merged commit 6e90858 into snakemake:master Jan 30, 2025
7 checks passed
@fgvieira
Copy link
Collaborator

thanks for your contribution!! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants