Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Update scheduler script #1372

Merged
merged 47 commits into from
May 13, 2024
Merged

fix: Update scheduler script #1372

merged 47 commits into from
May 13, 2024

Conversation

ivadym
Copy link
Contributor

@ivadym ivadym commented Jan 26, 2024

Description

Refactors the scheduler script and removes the qsub option, currently non-functional in production. If we choose to reintroduce it in the future, the process will be straightforward as the code and models are easily extendable.

This guide explains the need for overwriting the Snakemake scheduler and could help in understanding the code execution flow: https://r3.pages.uni.lu/school/snakemake-tutorial/HPC#immediate_submit

Changed

  • Cluster scheduler script for immediate submit

Removed

  • SGE support

Documentation

  • N/A
  • Updated Balsamic documentation to reflect the changes as needed for this PR.
    • user_guide.rst

Tests

Feature Tests

  • N/A
  • Test successful job submission
    • TGA T/O case
      • ~ Submission* time this PR: 3m2.062s,
      • ~ Submission* time master: 2m23.637s
    • WGS T/N case
      • ~ Submission* time this PR: 3m48.636s,
      • ~ Submission* time master: 2m47.884s
    • Manual submission Balsamic command
  • Test correct generation of the slurm_jobids.yaml file (for job parsing by trailblazer)
    Screenshot 2024-02-02 at 14 17 12
  • cg workflow balsamic start <case_id>

Pipeline Integrity Tests

  • Report deliver (generation of the .hk file)
    • N/A
    • Verified
  • TGA T/O Workflow
    • N/A
    • Verified
  • TGA T/N Workflow
    • N/A
    • Verified
  • UMI T/O Workflow
    • N/A
    • Verified
  • UMI T/N Workflow
    • N/A
    • Verified
  • WGS T/O Workflow
    • N/A
    • Verified
  • WGS T/N Workflow
    • N/A
    • Verified
  • QC Workflow
    • N/A
    • Verified
  • PON Workflow
    • N/A
    • Verified

Clinical Genomics Stockholm

Documentation

  • Atlas documentation
    • N/A
    • Updated: [Link]
  • Web portal for Clinical Genomics
    • N/A
    • Updated: [Link]

User Changes

  • N/A
  • This PR affects the output files or results.
    • User feedback is considered unnecessary because [Justification].
    • Affected users have been included in the development process and given a chance to provide feedback.

Infrastructure Changes

  • Stored files in Housekeeper
    • N/A
    • Updated: [Link]
  • CG (CLI and delivered/uploaded files)
    • N/A
    • Updated: [Link]
  • Servers (configuration files on Hasta)
    • N/A
    • Updated: [Link]
  • Scout interface
    • N/A
    • Updated: [Link]

Integration Tests

  • N/A
  • Test [Description]
    • [Screenshot]

Checklist

Important

Ensure that all checkboxes below are ticked before merging.

For Developers

  • PR Description
    • Provided a comprehensive description of the PR.
    • Linked relevant user stories or issues to the PR.
  • Documentation
    • Verified and updated documentation if necessary.
  • Tests
    • Described and tested the functionality addressed in the PR.
    • Ensured integration of the new code with existing workflows.
    • Confirmed that meaningful unit tests were added for the changes introduced.
    • Checked that the PR has successfully passed all relevant code smells and coverage checks.
  • Review
    • Addressed and resolved all the feedback provided during the code review process.
    • Obtained final approval from designated reviewers.

For Reviewers

  • Code
    • Code implements the intended features or fixes the reported issue.
    • Code follows the project's coding standards and style guide.
  • Documentation
    • Pipeline changes are well-documented in the CHANGELOG and relevant documentation.
  • Tests
    • The author provided a description of their manual testing, including consideration of edge cases and boundary
      conditions where applicable, with satisfactory results.
  • Review
    • Confirmed that the developer has addressed all the comments during the code review.

Sorry, something went wrong.

@ivadym ivadym requested a review from a team as a code owner January 26, 2024 17:44
@ivadym ivadym self-assigned this Jan 26, 2024
@ivadym ivadym linked an issue Jan 26, 2024 that may be closed by this pull request
Copy link

sonarqubecloud bot commented Feb 1, 2024

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@mathiasbio mathiasbio requested a review from fevac February 2, 2024 10:04
@ivadym ivadym mentioned this pull request Feb 8, 2024
29 tasks
ivadym added 4 commits April 25, 2024 12:14
@ivadym ivadym requested a review from mathiasbio May 8, 2024 11:48
@mathiasbio
Copy link
Collaborator

mathiasbio commented May 8, 2024

Can I ask, beyond the refactoring and improvements in the code, are there any other expected effects from this change on the practical running of the analyses? For instance do we expect that it will affect the key_error in some way?

I have just run it once and the submission failed due to a key error so it doesn't look good... I hoped it was something to do with the scheduler but seems more and more a bug in how Snakemake handles finished jobs for immediate submit

How has this been resolved? Is it just that we increased the sleep? : )

Copy link
Collaborator

@mathiasbio mathiasbio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing effort 🌟 👍 👍

Do you think there will be any effects on production from the increased submission times in this update? Will it decrease the need for sleep time because it's slower?

ivadym and others added 2 commits May 13, 2024 10:01
Co-authored-by: Mathias Johansson <math.joh.bio@gmail.com>
Co-authored-by: Mathias Johansson <math.joh.bio@gmail.com>
@ivadym
Copy link
Contributor Author

ivadym commented May 13, 2024

Can I ask, beyond the refactoring and improvements in the code, are there any other expected effects from this change on the practical running of the analyses? For instance do we expect that it will affect the key_error in some way?

I have just run it once and the submission failed due to a key error so it doesn't look good... I hoped it was something to do with the scheduler but seems more and more a bug in how Snakemake handles finished jobs for immediate submit

How has this been resolved? Is it just that we increased the sleep? : )

Unfortunately nothing has changed regarding the Key Error :(

I think if we want to get rid of it we would need to fork snakemake and catch the Key Error

@ivadym
Copy link
Contributor Author

ivadym commented May 13, 2024

Amazing effort 🌟 👍 👍

Do you think there will be any effects on production from the increased submission times in this update? Will it decrease the need for sleep time because it's slower?

I think it could make it worse. Slower submission times, even by just around one minute, will increase the chances of encountering a Key Error. Should be OK since we also increased the sleep time but I will test it once more to make sure the submission times align with what I have tested before.

We can scrap this PR, but if we would like to add memory allocation in the future this PR would make it way easier

ivadym added 3 commits May 13, 2024 10:18
Copy link

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@mathiasbio
Copy link
Collaborator

Amazing effort 🌟 👍 👍
Do you think there will be any effects on production from the increased submission times in this update? Will it decrease the need for sleep time because it's slower?

I think it could make it worse. Slower submission times, even by just around one minute, will increase the chances of encountering a Key Error. Should be OK since we also increased the sleep time but I will test it once more to make sure the submission times align with what I have tested before.

We can scrap this PR, but if we would like to add memory allocation in the future this PR would make it way easier

I'm in favour of deploying it, I don't think adding a few more minutes to the sleep rule would be an issue anyway! Probably better to be able to add memory allocation!

@ivadym
Copy link
Contributor Author

ivadym commented May 13, 2024

I've tested it again, same results as the ones I included before

@ivadym ivadym merged commit d800054 into develop May 13, 2024
8 checks passed
@ivadym ivadym deleted the update-scheduler-script branch May 13, 2024 13:20
@mathiasbio mathiasbio mentioned this pull request Oct 17, 2024
15 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Intermittent KeyError Disrupts Slurm Job Submissions
2 participants