Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Adding tmb calculation #403

Merged
merged 8 commits into from
Jun 22, 2023
Merged

feat: Adding tmb calculation #403

merged 8 commits into from
Jun 22, 2023

Conversation

giacuong171
Copy link
Contributor

No description provided.

@giacuong171 giacuong171 linked an issue Jun 16, 2023 that may be closed by this pull request
@github-actions
Copy link

  • Please format your Python code with black: make black
  • Please format your Snakemake code with snakefmt: make snakefmt
  • Please organize your imports isorts: make isort
  • Please ensure that your code passes flake8: make flake8

You can trigger all lints locally by running make lint

Copy link
Contributor

@ericblanc20 ericblanc20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is very good already, but there are a few things to be changed, see the comments below.

The code must be cleaned up to follow the formatting rules laid by Manuel. You need to use black (with line length 100) & flake8 on your code. The order of imports is checked with isort and snakefmt ensures consistent Snakefile format.

Finally, you will need to write tests for you step.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to be pedantic and ask you to change the name of the step. It should be tumor_mutational_burden, not tmb_calculation.

Likewise, the naming pattern for output files should be {mapper}.{caller}.tmb.{tumor_library} or {mapper}.{caller}.tumor_mutational_burden.{tumor_library}, but not {mapper}.{caller}.tmb_calculation_json.{tumor_library}.

(There are only two hard things in Computer Science: cache invalidation and naming things, Phil Karlton (from X11))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action is usually run. In steps with many stages (such as somatic variant calling, with the calling itself, the contamination estimation and filtering) the action can take different values. But is simple steps as tumor_mutational_burden, it should be run.

config, lookup_paths, config_paths = expand_ref("config.yaml", config)

# WorkflowImpl Object Setup ===================================================
wf = TumorMutaionalBurdenCalculationWorkflow(workflow,config,lookup_paths,config_paths,os.getcwd())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: it should be TumorMutationalBurdenCalculationWorkflow.

tools_somatic_variant_calling: [] # default to those configured for somatic_variant_calling
"""

class TumorMutaionalBurdenCalculationStepPart(BaseStepPart):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: should be TumorMutationalBurdenCalculationStepPart

"""Calculation tumor mutational burden for each sample"""
name = 'tmb_gathering'

actions = ("tmb_gathering",)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actions should be ("run",)

actions_str = ", ".join(self.actions)
error_message = f"Action '{action}' is not supported. Valid options: {actions_str}"
raise UnsupportedActionException(error_message)
mem_mb = 7 * 1024 * 2 #read
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are requesting too many resources: 1 core, 4GB & 1 hour should be enough

tpl, tumor_library=[sample_pair.tumor_sample.dna_ngs_library], **kwargs
)

def _yield_result_files_joint(self, tpl, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that you need to worry about joint callers. This method could go.

path_somatic_variant_calling: ../somatic_variant_calling # REQUIRED
tools_ngs_mapping: [] # default to those configured for ngs_mapping
tools_somatic_variant_calling: [] # default to those configured for somatic_variant_calling
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user must have control on the bed file for the regions considered for TMB. It must be part of the configuration.

#number_indels=$(bcftools view -R {snakemake.input.bed_file} -v indels -H {snakemake.input.vcf}| wc -l)
TMB=`echo "1000000*($number_variants/$total_exom_length)" | bc -l `
TMB_rounded=`printf "%.3f" $TMB`
to_json=$(cat <<EOF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simplify:

cat << EOF > {snakemake.output.json}
...
EOF

@ericblanc20 ericblanc20 changed the title Adding tmb calculation feat: Adding tmb calculation Jun 16, 2023
Copy link
Contributor

@ericblanc20 ericblanc20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comments:

  • thanks for starting to format your code according to black & flake8 rules
  • I think renaming things has improved readability for the user
  • still a bit of work around the wrapper, according to our discussion
  • writing tests is the last hurdle.

Copy link
Contributor

@ericblanc20 ericblanc20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good now.

There is a few more cosmetic changes to make black & flake8 happy. For the former, you should be able to make those changes automatically with the command black --line-length 100 ., provided that the version of black is the same that on the CI container.

As soon as all checks are passed I'll accept all changes.

@coveralls
Copy link

Coverage Status

coverage: 85.045% (+0.09%) from 84.96% when pulling 4e82fee on 402-tmb-calculation into 427803c on main.

:raises UnsupportedActionException: if action not in class defined list of valid actions.
"""
if action not in self.actions:
actions_str = ", ".join(self.actions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future reference, you may want to consider self._validate_action(action)

@ericblanc20 ericblanc20 merged commit 706a9f5 into main Jun 22, 2023
@ericblanc20 ericblanc20 deleted the 402-tmb-calculation branch June 22, 2023 10:11
@tedil tedil mentioned this pull request Jun 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TMB calculation
3 participants