Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solve some Singularity issues. Updates. DSL2 Migration. #57

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

vinjana
Copy link
Member

@vinjana vinjana commented Mar 6, 2024

  • 1.3.0 (March, 2024)
    • Minor: Let Nextflow automatically create the cached Singularity image.

      NOTE: The cached image name was changed to Nextflow's default name.

    • Patch: Reuse to the simpler Dockerfile that is also used in the nf-seq-qc and nf-seq-convert workflows.
    • Patch: Bumped Dockerfile base image to miniconda3:4.12.0.
    • Patch: Bumped minimum Nextflow to 23.10.1. Version 22 uses singularity exec, while 23 uses singularity run, which impacts process isolation. This also Closes Fix Singularity container startup #52.
    • Patch: Added a Singularity.def, in case the automatic conversion by Nextflow does not work.
    • Patch: Mention Conda only for development in README.md. Otherwise, it should not be used.
    • Patch: Test-script now implements a simple backwards-compatibility test by comparing against old result files.
    • Patch: Renamed test/test1.sh to integration-tests/run.sh. Changed order of parameters.
    • Patch: Migrated to DSL2. Closes Migrate to DSL2 #56

Summary by CodeRabbit

  • Documentation
    • Updated README.md with comprehensive guidance on workflows, cautioning against Conda, and emphasizing Singularity usage.
  • New Features
    • Restructured workflow logic in main.nf for streamlined channel operations with paired and unpaired fastqs.
  • Refactor
    • Improved environment profile in test/test1.sh to use "singularity" and optimized file paths.
    • Streamlined Docker image caching in nextflow.config for better reproducibility.
    • Enhanced environment encapsulation in .circleci/config.yml for integration tests.
    • Updated exclusions and inclusions in .dockerignore and .gitignore for Docker image building.
  • Chores
    • Added a new Singularity.def file for converting Docker images to Singularity images.

Additionally, changed README and use Nextflow`s default singularity conversion mechanism and container name
@vinjana vinjana self-assigned this Mar 6, 2024
Copy link

coderabbitai bot commented Mar 6, 2024

Walkthrough

The updates focus on enhancing the workflow and environment for bioinformatics processing, emphasizing the transition from Conda to Singularity for better reproducibility and efficiency. Key changes include instructions for file conversion, guidance against Conda for production workflows, detailed steps for using and creating Singularity images, and updates to the workflow logic for streamlined processing. Additionally, fixes address issues with Singularity container startup and profile merging, ensuring smoother operations in LSF cluster environments.

Changes

Files Change Summary
README.md Updated instructions for BAM to FASTQ conversion, guidance on using Singularity over Conda, creating Singularity images, and running workflows with Singularity on LSF clusters. Clarified Nextflow's role in creating cached images.
main.nf Restructured workflow logic for simplified channel operations.
test/test1.sh Updated environmentProfile from Conda to Singularity, revised file paths, and added MD5 checksum comparison loop.
nextflow.config Adjusted Singularity profile to pull Docker images automatically, added ext.containerUrl and ext.taskEnvYaml for better environment encapsulation.

Assessment against linked issues

Objective Addressed Explanation
Fix Singularity container startup (#52)
Ensure process.container values are correctly merged when using lsf and singularity (#58)
Update workflow logic for efficiency and reproducibility
Transition from Conda to Singularity for production workflows
Address configuration order to correctly combine settings from lsf and singularity profiles While changes address the use of Singularity, there's no explicit mention of fixing the configuration order issue related to profile merging in #58.

The assessment indicates that most objectives from the linked issues have been addressed, with a notable exception regarding the specific configuration order issue for profile merging. This might require further clarification or direct inspection of the nextflow.config changes to confirm if the underlying problem has been resolved.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 3

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between eb97a4f and ecdd06b.
Files selected for processing (4)
  • README.md (4 hunks)
  • main.nf (6 hunks)
  • nextflow.config (1 hunks)
  • test/test1.sh (1 hunks)
Additional comments: 8
nextflow.config (1)
  • 64-72: The changes to the singularity profile in the nextflow.config file are well-implemented and align with the PR objectives of enhancing reproducibility and environment encapsulation. The configuration now automatically pulls the Docker image and stores it in the cache directory, which is a significant improvement for workflow execution. Additionally, the use of --no-home in singularity.runOptions is a good practice to avoid potential issues with environment setup from the user's home directory, thus ensuring a more reproducible workflow environment.
test/test1.sh (1)
  • 13-13: Changing the default value of environmentProfile from "conda" to "singularity" in test/test1.sh is consistent with the PR objectives and the broader shift towards using Singularity for environment management. This change ensures that tests run in an environment that matches the updated workflow recommendations, which is crucial for maintaining consistency and reliability in testing.
main.nf (4)
  • 179-182: The modification to the bamToFastq process to directly output tuples is a significant improvement in terms of workflow clarity and efficiency. This change simplifies the data flow between processes and reduces the complexity of channel operations, aligning with the PR objectives of enhancing the project's structure and maintainability.
  • 211-214: The changes to the nameSortUnpairedFastqs process to accept tuples directly from the bamToFastq process are well-implemented. This adjustment further streamlines the workflow by eliminating unnecessary channel creation steps, contributing to the overall goal of simplifying the workflow logic.
  • 247-250: Similarly, the modifications to the nameSortPairedFastqs process to directly accept tuples enhance the workflow's efficiency and clarity. By directly handling tuples, the process becomes more straightforward and aligns with the restructuring efforts aimed at improving the workflow's modularity and maintainability.
  • 272-303: The integration of channel operations for paired and unpaired FASTQ files into the workflow block is a commendable effort towards simplifying the workflow logic. This restructuring makes the handling of different types of FASTQ files more efficient and intuitive, which is crucial for the scalability and maintainability of the project. The use of flatMap and groupBy operations to organize FASTQ files before sorting is a good practice that enhances the workflow's logic and readability.
README.md (2)
  • 48-52: The section on Quickstart with Conda explicitly advises against using Conda for reproducibly running workflows, emphasizing its use primarily for development. This clear guidance helps set expectations for users regarding the recommended environment management tools. However, it's crucial to ensure that this stance is consistent across all documentation and support channels to avoid confusion.

The caution against using Conda for workflow execution is well-placed and contributes to setting clear expectations for users.

  • 196-200: The release notes for version 1.3.0 mention the automatic creation of cached Singularity images by Nextflow and the clarification of Conda's role for development purposes. These notes succinctly communicate the changes and their implications for users. It's important to ensure that these release notes are also reflected in the project's changelog or version history documentation for consistency.

The release notes effectively communicate important changes and their implications for users. Ensure consistency with the project's changelog or version history documentation.

README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
Signed-off-by: Philip R. Kensche <p.kensche@dkfz-heidelberg.de>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between ecdd06b and 5ecc597.
Files ignored due to path filters (120)
  • test/reference/test1_paired.bam_fastqs/default_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/default_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/default_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/default_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/default_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/default_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/default_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/default_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/default_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/default_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/default_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/default_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/default_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/default_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/default_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_R1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_R2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_S.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_U1.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_U2.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/default_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/default_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/default_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/default_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/default_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_S.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • test/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_U2.sorted.fastq.gz is excluded by: !**/*.gz
Files selected for processing (1)
  • test/test1.sh (3 hunks)
Additional comments: 4
test/test1.sh (4)
  • 13-13: The change in the default value of environmentProfile from "conda" to "singularity" aligns with the PR's objectives and is correctly implemented.
  • 81-81: The updated paths for input files, including a reference directory, are a good practice for organizing test files and are correctly implemented.

Also applies to: 101-101

  • 88-92: The addition of assertions to compare the number of reads in BAM files and output directories is a robust way to verify the correctness of the conversion process and is correctly implemented.

Also applies to: 108-112

  • 123-123: The call to testFinished at the end of the script is correctly implemented, providing a clear summary of the test outcomes.

test/test1.sh Show resolved Hide resolved
Signed-off-by: Philip R. Kensche <p.kensche@dkfz-heidelberg.de>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 5ecc597 and fed0e20.
Files selected for processing (1)
  • README.md (4 hunks)
Additional comments: 3
README.md (3)
  • 48-65: The clarification on the use of Conda primarily for development purposes and the recommendation against using it for running workflows are well articulated. This guidance helps set clear expectations for users regarding the tool's role and encourages best practices for reproducibility. No further refinements are needed here.
  • 196-201: The release notes for version 1.3.0 effectively summarize the key changes, including the automation of cached Singularity image creation and the clarification of Conda's role. These notes align well with the updates made in the README.md file and provide users with a clear understanding of the improvements and patches in this version.
  • 139-140: The instructions for manually pulling the Singularity container are clear and straightforward, providing users with an alternative method to handle the container. This addition is helpful and complements the earlier segment on running the workflow with Singularity.

README.md Outdated Show resolved Hide resolved
Signed-off-by: Philip R. Kensche <p.kensche@dkfz-heidelberg.de>
@vinjana vinjana changed the title Draft: Migrated to DSL2. Migrated to DSL2. Mar 6, 2024
@vinjana vinjana marked this pull request as draft March 6, 2024 14:38
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between fed0e20 and d89566d.
Files selected for processing (1)
  • README.md (5 hunks)
Additional comments: 3
README.md (3)
  • 38-47: The warning about the concurrency issue when downloading cached containers is important for users running multiple workflows simultaneously. However, the response from a previous comment indicates that providing a specific solution to this problem is considered beyond the scope of this project. It's understandable given the complexity of the issue, but acknowledging this limitation in the documentation could help set the right expectations for users.

It's good to see that potential concurrency issues are acknowledged. Users should be aware of this limitation and plan their workflows accordingly.

  • 48-52: The note on Conda not being recommended for reproducibly running workflows is a significant shift and aligns with the PR's objectives to promote Singularity for better reproducibility and environment encapsulation. This clarification is helpful for users and developers, setting clear expectations about the supported and recommended tools for running the workflow.

The clarification on the use of Conda primarily for development purposes and not for running workflows is well-placed and aligns with the project's goals for reproducibility.

  • 115-257: The example provided to illustrate the output directory structure and file naming convention is very informative and helps users understand what to expect from the workflow's output. This addition enhances the documentation's clarity and usability, especially for users unfamiliar with BAM and FASTQ file formats.

The inclusion of a detailed example illustrating the output directory structure and file naming convention is a valuable addition to the documentation, enhancing clarity and usability.

README.md Outdated Show resolved Hide resolved
… with the old "lsf" profile, even if additionally the "singularity" profile was used.

Signed-off-by: Philip R. Kensche <p.kensche@dkfz-heidelberg.de>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between d89566d and 1fc9ae0.
Files selected for processing (1)
  • nextflow.config (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • nextflow.config

…s from different profiles.

Signed-off-by: Philip R. Kensche <p.kensche@dkfz-heidelberg.de>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 1fc9ae0 and 76032e0.
Files ignored due to path filters (1)
  • test-environment.yml is excluded by: !**/*.yml
Files selected for processing (1)
  • nextflow.config (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • nextflow.config

@vinjana
Copy link
Member Author

vinjana commented Mar 7, 2024

The current settings seem to be correct, but Nextflow does not what I'd expect of it. With the current nextflow.config -profile conda is ignored, resulting in a CI failure. I filed a bug report.

…ion-testsd/run.sh.

Signed-off-by: Philip R. Kensche <p.kensche@dkfz-heidelberg.de>
Signed-off-by: Philip R. Kensche <p.kensche@dkfz-heidelberg.de>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 76032e0 and fb45af8.
Files ignored due to path filters (122)
  • integration-tests/reference/test1_paired.bam_fastqs/default_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/default_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/default_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/default_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/default_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run1_gerald_D1VCPACXX_1_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run2_gerald_D1VCPACXX_2_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run3_gerald_D1VCPACXX_3_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run4_gerald_D1VCPACXX_4_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_fastqs/run5_gerald_D1VCPACXX_5_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/default_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/default_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/default_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/default_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/default_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_paired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/default_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/default_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/default_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/default_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/default_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run1_gerald_D1VCPACXX_1_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run2_gerald_D1VCPACXX_2_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run3_gerald_D1VCPACXX_3_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run4_gerald_D1VCPACXX_4_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_R1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_R2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_S.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_U1.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_fastqs/run5_gerald_D1VCPACXX_5_U2.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/default_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/default_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/default_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/default_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/default_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run1_gerald_D1VCPACXX_1_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run2_gerald_D1VCPACXX_2_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run3_gerald_D1VCPACXX_3_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run4_gerald_D1VCPACXX_4_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_R1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_R2.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_S.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_U1.sorted.fastq.gz is excluded by: !**/*.gz
  • integration-tests/reference/test1_unpaired.bam_sorted_fastqs/run5_gerald_D1VCPACXX_5_U2.sorted.fastq.gz is excluded by: !**/*.gz
  • task-environment.yml is excluded by: !**/*.yml
  • test-environment.yml is excluded by: !**/*.yml
Files selected for processing (9)
  • .circleci/config.yml (2 hunks)
  • .dockerignore (1 hunks)
  • .gitignore (1 hunks)
  • README.md (5 hunks)
  • container-specs/Dockerfile (1 hunks)
  • container-specs/Singularity.def (1 hunks)
  • integration-tests/run.sh (4 hunks)
  • nextflow.config (1 hunks)
  • nf-bam2fastq.iml (1 hunks)
Files skipped from review due to trivial changes (1)
  • .gitignore
Files skipped from review as they are similar to previous changes (1)
  • nextflow.config
Additional comments: 21
.dockerignore (5)
  • 12-12: The addition of .github to the .dockerignore file is appropriate, as it ensures that GitHub-specific metadata and workflows are not included in Docker images, which is generally a good practice for keeping images lean.
  • 14-14: Adding *Env/ to the .dockerignore file is a good practice, as it likely refers to environment directories that should not be included in Docker images. This helps in avoiding unnecessary files in the build context.
  • 15-15: Ignoring integration-tests/ in Docker builds is sensible, especially if these tests are not required to run within the Docker container. This exclusion helps in reducing the size of the Docker image.
  • 16-16: Excluding .run/ directories is generally a good practice, as these directories might contain runtime data or logs that are not necessary for the Docker image.
  • 17-17: Ignoring *.log files is a good practice to keep Docker images clean from log files that might have been generated during development or testing.
container-specs/Singularity.def (1)
  • 1-15: The Singularity.def file is well-documented and correctly sets up the conversion from a Docker image to a Singularity image. Using docker-daemon as the bootstrap method allows for using locally tagged images, which is useful for development and testing. The %environment section correctly sources the Conda environment, ensuring that the necessary environment is activated when the Singularity container is executed. This setup is crucial for the workflow's proper execution and maintainability.
nf-bam2fastq.iml (2)
  • 16-16: Adding an exclusion for integration-tests/output-all is appropriate, as it prevents the IDE from indexing or otherwise interacting with output files generated by integration tests. This can improve the performance of the IDE and keep the project view clean.
  • 22-22: The addition of a library entry for groovy at the project level is a good practice, especially if Groovy is used within the project. This ensures that the IDE correctly recognizes Groovy syntax and can provide relevant code assistance.
container-specs/Dockerfile (1)
  • 1-36: The Dockerfile is well-structured and follows best practices for Docker image creation. Starting from a Miniconda base image is appropriate for Python-based workflows. The use of environment variables for proxy configuration is a good practice, especially in environments behind corporate proxies. The environment setup, including the creation of a Conda environment from task-environment.yml, ensures that all necessary dependencies are installed and isolated within the container. The entry point configuration using conda run with the -p option is a clever workaround for the NoWritableEnvsDirError, ensuring that the Conda environment is correctly activated when the container starts. Overall, this Dockerfile is well-crafted to meet the workflow's requirements.
.circleci/config.yml (3)
  • 20-20: Updating the cache key to include a checksum of tests-environment.yml is a good practice, as it ensures that the cache is invalidated and updated whenever the test environment changes. This helps in maintaining the accuracy and efficiency of CI builds.
  • 26-26: Changing the test script path to integration-tests/run.sh and specifying the output directory and nextflow environment path is appropriate. This ensures that the integration tests are executed correctly and that the results are stored in the specified directory. It's important for maintaining the accuracy of CI test results.
  • 41-41: The job name integration-tests in the workflows section correctly reflects the purpose of the job, which is to run integration tests. This naming convention is clear and helps in understanding the CI configuration.
integration-tests/run.sh (2)
  • 14-17: Setting the default environment profile to singularity and specifying the output directory and nextflow environment path are appropriate changes. These adjustments ensure that the integration tests are executed in the intended environment and that the results are stored correctly. This is important for the accuracy and reproducibility of the tests.
  • 101-127: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [76-126]

The addition of the MD5 checksum comparison loop is a significant improvement to the integration tests. This loop ensures that the output files match the expected reference files, providing a robust method for verifying the correctness of the workflow's output. This addition enhances the reliability and accuracy of the integration tests.

README.md (7)
  • 7-19: The Quickstart with Docker section provides clear instructions for running the workflow using Docker. However, it's important to ensure that the Docker images referenced are kept up-to-date and that users are directed to check for the latest versions. Additionally, consider explicitly mentioning the need for Docker to be installed and running on the user's machine as a prerequisite.
  • Ensure Docker images are up-to-date and mention Docker installation as a prerequisite.
  • 22-47: The Quickstart with Singularity section effectively introduces Singularity as an alternative to Docker for cluster environments. The warning about potential concurrency issues when downloading cached containers is particularly useful. To enhance this section:
  • Consider adding a brief explanation or link on how to install Singularity for users who might be unfamiliar with it.

  • Clarify the $repoDir variable in the code snippet for users who might not understand what it represents.

  • Add Singularity installation guidance and clarify $repoDir.

  • 48-62: The Quickstart with Conda section clearly advises against using Conda for running workflows due to reproducibility concerns, which is a good practice. However, since Conda is mentioned as an option for development:
  • It might be helpful to provide more context on why Conda is suitable for development but not for running workflows, to help users make informed decisions.

  • Ensure that the provided Conda environment file (task-environment.yml) is up-to-date and includes all necessary dependencies for development purposes.

  • Provide more context on Conda's suitability for development and ensure the Conda environment file is up-to-date.

  • 4-69: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [68-73]

The Remarks section offers valuable insights into the workflow's behavior and considerations. To further improve this section:

  • Consider providing links to external tools mentioned (e.g., biobambam2, UNIX coreutils) for users who may want to explore these tools further.

  • Clarify the implications of sorting FASTQ files by their IDs and how it affects subsequent processing steps, possibly with a brief example or scenario.

  • Provide links to external tools and clarify the implications of sorting FASTQ files.

  • 280-290: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [258-287]

The Environment and Execution section provides a comprehensive overview of Nextflow's -profile parameter and the predefined profiles. To enhance this section:

  • Clarify the distinction between "environment" and "executor" profiles for users who may not be familiar with these concepts.

  • Include an example of how to change the default storage locations for Conda environments and Singularity containers, as this could be a common task for users managing their workflow installations.

  • Clarify profile distinctions and include an example of changing default storage locations.

  • 313-355: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [299-336]

The Development section provides clear instructions for running integration tests and the container release process. To further refine this section:

  • Clarify the $profile variable in the integration test command to help users understand how to specify the environment for the tests.

  • For the container release process, consider adding a note about ensuring that the user has the necessary permissions to push containers to the Github Container Registry.

  • Clarify the $profile variable in integration tests and add a note on permissions for container release.

  • 341-351: The Release Notes section is well-organized and provides a clear changelog for the project. To improve this section:
  • Ensure consistency in the presentation of version numbers and release dates. For example, avoid using commas in dates (e.g., "March, 2024" should be "March 2024").

  • Consider linking to relevant issues or pull requests for each listed change to provide users with more context and the ability to explore specific updates in detail.

  • Ensure consistency in date formatting and consider linking to relevant issues or pull requests.

integration-tests/
.run/
*.log
*Env/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The addition of *Env/ again at the end of the .dockerignore file seems to be a duplicate entry. This redundancy does not impact the functionality but could be cleaned up for clarity and maintainability.

- *Env/

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
*Env/

@vinjana vinjana marked this pull request as ready for review March 8, 2024 16:17
@vinjana vinjana changed the title Migrated to DSL2. Solve some Singularity issues. Updates. DSL2 Migration. Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate to DSL2 Fix Singularity container startup
1 participant