-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support for somatic variant calling without normal #503
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, most of the changes are fine. I only have some comments regarding user feedback, configuration validation, wrapper parameters and some code suggestions.
With respect to the tests: it would be very nice if we had a test which covers the no-normal case, because that is the new functionality proposed in this PR.
snappy_pipeline/workflows/somatic_variant_annotation/__init__.py
Outdated
Show resolved
Hide resolved
snappy_pipeline/workflows/somatic_variant_filtration/__init__.py
Outdated
Show resolved
Hide resolved
snappy_pipeline/workflows/somatic_variant_filtration/__init__.py
Outdated
Show resolved
Hide resolved
snappy_pipeline/workflows/somatic_variant_filtration/__init__.py
Outdated
Show resolved
Hide resolved
… extraction type is not found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor changes left, otherwise: looks good and thanks for even incorporating the wrapper-replace-config-with-params-access changes!
self._write_step_config() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be removed before merging (or only done on specific log levels?)
@@ -155,6 +155,7 @@ | |||
tools_somatic_variant_calling: null # Default: use those defined in somatic_variant_calling step | |||
tools_somatic_variant_annotation: null # Default: use those defined in somatic_variant_annotation step | |||
has_annotation: True | |||
filtration_schema: "sets" # Either "sets" (old scheme- filter_sets) or "list" (new scheme- filter_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the "sets" variant is the deprecated one anyways, why not put the "list" one as default?
(Also, why is sets plural?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default is changed. sets
is because of filter_sets
, I didn't want to break the config unnecessarily. But I'll change it if you feel it's important.
normal_library = self.tumor_to_normal_library.get(wildcards["tumor_library"], None) | ||
if normal_library: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fyi, since python 3.8 this can be written as
if normal_library := self.tumor_to_normal_library.get(wildcards["tumor_library"], None):
# […]
using the "walrus" operator :=
msg = "Only one include or exclude expression is allowed in {} filter {} (configuration: {})" | ||
assert len(keywords) == 1, msg.format(self.filter_name, filter_nb, keywords) | ||
keyword = list(keywords.keys())[0] | ||
msg = 'Unknown keyword "{}" in {} filter {} (allowed values: include, exclude, configuration: {})' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads as if "configuration" was also an allowed value. Perhaps write allowed values: "include", "exclude". Configuration: {}
or something similar instead?
) | ||
assert len(keywords) == 1, msg.format(self.filter_name, filter_nb, keywords) | ||
keyword = list(keywords.keys())[0] | ||
msg = 'Unknown keyword "{}" in {} filter {} (allowed values: include, exclude, configuration: {})' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above + missing "path_bed" as a value that's allowed
parameters = parent(wildcards) | ||
filter_nb = int(wildcards["filter_nb"]) | ||
keywords = self.config["filter_list"][filter_nb - 1][self.filter_name] | ||
msg = "Only one protected region is allowed in {} filter {} (configuration: {})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it "one protected region" or "one protected region BED file"?
@@ -546,14 +637,21 @@ def _get_log_file(self, action): | |||
name_pattern = "{mapper}.{var_caller}" | |||
if self.config["has_annotation"]: | |||
name_pattern += ".{annotator}" | |||
name_pattern += ".dkfz_bias_filter.{tumor_library}" | |||
name_pattern += ".dkfz_bias_filter" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps name_pattern += f".{self.name}"
to avoid issues when someone (for whatever reason) decides to rename the filter name.
name_pattern_1 = name_pattern + ( | ||
r".dkfz_bias_filter.eb_filter.{tumor_library}.{filter_set}.{exon_list}" | ||
) | ||
name_pattern_2 = name_pattern + ( | ||
r".dkfz_bias_filter.eb_filter.{tumor_library}.{filter_set}.{exon_list}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are these different?
… set to list, config output under verbose control
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last thing, then we're good to go
This is a first attempt to support tumor-only samples in the somatic snappy branch.
Many points are still open for discussion/criticism:
The proposed overhaul/replacement of biomedsheets should alter how the samples are connected to each others. So points 2 & 3 (and possibly 4 too) may be revisited when we have a plan for the new system.
The CNV steps need complete re-write anyway. First WES & WGS should be unified, then the problem with reference creation in the panel of normals step should be addressed (WGS, excluding hard-to-map regions, connection with exome panel naming, ...). The possibility of CNV calling without normals should be included in this wider discussion.