Implementation of RFC64: 'mutational signatures' in the backend #9913

@MatthijsPon MatthijsPon commented Dec 2, 2022

Implementation of RFC#64 in the backend

Describe changes proposed in this pull request:

  • Adds a standalone script that takes a study path and generates mutational signature files following RFC64.
  • Mutational matrices are constructed from a MAF file in the given study path.
  • Mutational signature contributions and p-values are computed per COSMIC signature using MSKCC's tempoSig.
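
To illustrate the matrix-construction step, here is a minimal sketch, not the code from this PR. It counts only the six pyrimidine-normalized substitution classes per sample, whereas signature tools typically use a 96-channel matrix that also needs the trinucleotide context from a reference genome. It relies on the standard MAF columns `Tumor_Sample_Barcode`, `Reference_Allele` and `Tumor_Seq_Allele2`; the function names and the example data are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict

# Complement table used to fold purine-reference changes onto the pyrimidine strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTION_CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def substitution_class(ref, alt):
    """Return the pyrimidine-normalized class (e.g. 'C>T'), or None for non-SNVs."""
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        return None
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def count_substitutions(maf_handle):
    """Count the six substitution classes per sample from an open MAF file handle."""
    counts = defaultdict(Counter)
    # MAF files may start with '#' comment lines before the header row.
    rows = (line for line in maf_handle if not line.startswith("#"))
    for record in csv.DictReader(rows, delimiter="\t"):
        cls = substitution_class(record["Reference_Allele"], record["Tumor_Seq_Allele2"])
        if cls is not None:
            counts[record["Tumor_Sample_Barcode"]][cls] += 1
    return {s: [c[k] for k in SUBSTITUTION_CLASSES] for s, c in counts.items()}


# Hypothetical in-memory MAF with only the columns this sketch needs.
example_maf = io.StringIO(
    "#version 2.4\n"
    "Tumor_Sample_Barcode\tReference_Allele\tTumor_Seq_Allele2\n"
    "S1\tC\tT\n"
    "S1\tG\tA\n"
    "S1\tA\tC\n"
    "S2\tT\tG\n"
    "S2\tAT\t-\n"  # deletion, skipped because it is not a single-base substitution
)
matrix = count_substitutions(example_maf)
print(matrix)
```

Each row of the result follows the order of `SUBSTITUTION_CLASSES`. Indels and multi-base variants are skipped here; a real implementation would need to decide how to handle them.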

TODOs:

  • Possibly implement other mutational signature algorithms
  • Convert count matrices to generic assay files and create the accompanying meta files
  • No tests created yet; validateData.py needs to be extended
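
As a rough sketch of the count-matrix conversion mentioned in the TODOs, the snippet below writes a tab-separated table with one row per entity and one column per sample. It assumes the generic assay data layout with an `ENTITY_STABLE_ID` first column and `NA` for missing values; both should be checked against the File-Formats documentation. The entity IDs and the function name are hypothetical, and the accompanying meta file is not covered.

```python
import csv
import io


def write_generic_assay_matrix(handle, entity_ids, samples, values):
    """Write values[entity_id][sample] as a tab-separated matrix; gaps become 'NA'."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Assumed header layout: entity column first, then one column per sample.
    writer.writerow(["ENTITY_STABLE_ID", *samples])
    for entity in entity_ids:
        per_sample = values.get(entity, {})
        writer.writerow([entity, *(per_sample.get(s, "NA") for s in samples)])


# Hypothetical entity IDs and counts, for illustration only.
counts = {
    "SBS_C_to_T": {"S1": 2, "S2": 0},
    "SBS_T_to_G": {"S1": 1},
}
buffer = io.StringIO()
write_generic_assay_matrix(buffer, ["SBS_C_to_T", "SBS_T_to_G"], ["S1", "S2"], counts)
print(buffer.getvalue())
```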

Checks

  • Runs on heroku
  • Has tests or has a separate issue that describes the types of test that should be created. If no test is included it should explicitly be mentioned in the PR why there is no test.
  • The commit log is comprehensible. It follows 7 rules of great commit messages. For most PRs a single commit should suffice, in some cases multiple topical commits can be useful. During review it is ok to see tiny commits (e.g. Fix reviewer comments), but right before the code gets merged to master or rc branch, any such commits should be squashed since they are useless to the other developers. Definitely avoid merge commits, use rebase instead.
  • Is this PR adding logic based on one or more clinical attributes? If yes, please make sure validation for this attribute is also present in the data validation / data loading layers (in backend repo) and documented in File-Formats Clinical data section!

@SRodenburg SRodenburg self-requested a review December 2, 2022 12:55

@SRodenburg SRodenburg left a comment

Very nice start. I proposed some minor restructuring to make the code more intuitive.

I feel like we should move tempoSig outside of the script and let the user provide an installation directory.
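
One way this suggestion could look is sketched below, using `argparse`. The `--temposig-dir` flag, the `study_path` argument name and the `parse_args` helper are hypothetical and not taken from the script under review.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the study path and a user-supplied tempoSig installation directory."""
    parser = argparse.ArgumentParser(
        description="Generate mutational signature files for a study."
    )
    parser.add_argument("study_path", type=Path, help="Path to the study directory")
    parser.add_argument(
        "--temposig-dir",  # hypothetical flag name
        type=Path,
        required=True,
        help="Directory where tempoSig is installed",
    )
    args = parser.parse_args(argv)
    # Fail early with a clear message instead of failing later inside the pipeline.
    if not args.temposig_dir.is_dir():
        parser.error(f"tempoSig directory not found: {args.temposig_dir}")
    return args


# Example invocation; "." stands in for a real installation directory.
args = parse_args(["my_study", "--temposig-dir", "."])
print(args.study_path, args.temposig_dir)
```

Keeping the tool location as an argument means the script does not have to bundle or install tempoSig itself.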

Another thing: I discourage removing directories in the script. Just make it clear in the names of the generated tmp files that they are temporary.
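
A minimal sketch of that idea: create intermediate files in a clearly named directory and leave cleanup to the user. The prefix, file name and helper name are hypothetical.

```python
import tempfile
from pathlib import Path

TMP_PREFIX = "tmp_mutational_signatures_"  # hypothetical naming convention


def make_tmp_dir(study_path):
    """Create a clearly named temporary directory inside the study directory.

    The directory is intentionally not removed by the script.
    """
    return Path(tempfile.mkdtemp(prefix=TMP_PREFIX, dir=str(study_path)))


# Stand-in for a real study directory, for illustration only.
study = Path(tempfile.mkdtemp())
tmp_dir = make_tmp_dir(study)
matrix_file = tmp_dir / "tmp_count_matrix.txt"
matrix_file.write_text("placeholder\n")
print(f"Temporary files kept in: {tmp_dir}")
```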

core/src/main/scripts/mutationalSignatures.py (10 review comments, all outdated and resolved)
@MatthijsPon MatthijsPon self-assigned this Dec 12, 2022
@MatthijsPon MatthijsPon force-pushed the rfc64_mutational_signatures_backend branch from 3d918d6 to dd4d46d Compare February 17, 2023 15:27
@MatthijsPon MatthijsPon force-pushed the rfc64_mutational_signatures_backend branch from dd4d46d to 1ef34cf Compare February 17, 2023 15:39
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 0 (rating A)
  • Coverage: no information
  • Duplication: no information

@pvannierop

@SRodenburg SRodenburg left a comment

Nice work Matthijs,

Some code suggestions and a few spots where I think we can improve on clarity, structure or efficiency, as indicated.

It would be nice to put these scripts in a subfolder, mutational_signatures, to keep them together.
Also, how can we work with the SigProfiler package in the Docker image?

core/src/main/scripts/mutationalSignatures.py (10 review comments, all resolved; 1 outdated)

MatthijsPon commented Mar 1, 2023

Implementation moved to cbioportal/datahub-study-curation-tools: PR#48

@MatthijsPon MatthijsPon closed this Mar 1, 2023
4 participants