`varfish_export_mehari_bam_qc` fails for WGS workflow due to undefined exon bed file #511

Nicolai-vKuegelgen · 2024-05-28T13:23:20Z

Describe the bug
The varfish_export_mehari_bam_qc step uses the mehari bam_qc wrapper script, which is simply a link to the varfish annotator bam_qc wrapper. This wrapper needs the samtools idx/bam/flagstats files as well as the alfreq_qc json files as input.

The function that gathers the mehari bam qc input files (see below) contains a check to see if the library name can be mapped to a library kit before it adds the alfreq_qc file to the input. This mapping fails if no bed file for coverage analysis is provided in the ngs_mapping/target_coverage_report/path_target_interval_list_mapping config section. For WGS data this section is usually empty, since coverage does not need to be restricted to specific sections. However this results in the alfred_qc not being used as input in the varfish_export_mehari_bam_qc step and thus causing a crash.

To Reproduce
Steps to reproduce the behavior:

Run latest snappy version with no matching kit defined in step_config/ngs_mapping/target_coverage_report/path_target_interval_list_mapping
Run varfish_export step
See error

Expected behavior
Preferably the mehari bam_qc input file function needs to be adapted to always include the alfred_qc file independently of kit matches.
Alternatively, an way to define a bed file for WGS data (that does not negatively impact coverage calculation) needs to implemented.

Additional context
Mehari bam qc input file function:

    @dictify
    def _get_input_files_bam_qc(self, wildcards):
        ngs_mapping = self.parent.sub_workflows["ngs_mapping"]
        # Get names of primary libraries of the selected pedigree.  The pedigree is selected
        # by the primary DNA NGS library of the index.
        pedigree = self.index_ngs_library_to_pedigree[wildcards.index_ngs_library]
        result = {"bamstats": [], "flagstats": [], "idxstats": [], "alfred_qc": []}
        for donor in pedigree.donors:
            if not donor.dna_ngs_library:
                continue
            tpl = (
                f"output/{wildcards.mapper}.{donor.dna_ngs_library.name}/report/bam_qc/"
                f"{wildcards.mapper}.{donor.dna_ngs_library.name}.bam.%s.txt"
            )
            for key in ("bamstats", "flagstats", "idxstats"):
                result[key].append(ngs_mapping(tpl % key))
            if donor.dna_ngs_library.name not in self.parent.ngs_library_to_kit:
                continue
            path = (
                f"output/{wildcards.mapper}.{donor.dna_ngs_library.name}/report/alfred_qc/"
                f"{wildcards.mapper}.{donor.dna_ngs_library.name}.alfred.json.gz"
            )
            result["alfred_qc"].append(ngs_mapping(path))
        return result

The text was updated successfully, but these errors were encountered:

Nicolai-vKuegelgen · 2024-05-28T13:25:17Z

@holtgrewe
Do you think the lines checking that a kit is defined for a given sample can safely be deleted here?
I assume with the switch to alfred_qc there should be no need, since both WES & WGS data should have the same coverage report output files (i.e. alfred_qc json), but I am not 100% sure.

Specifically removing these lines should fix the export step for WGS data:

            if donor.dna_ngs_library.name not in self.parent.ngs_library_to_kit:
                continue

…#511

…#511 (#516)

Nicolai-vKuegelgen added a commit that referenced this issue Jun 6, 2024

fix: Always include alfred qc files in input for varfish export bam QC …

39538d0

…#511

Nicolai-vKuegelgen linked a pull request Jun 6, 2024 that will close this issue

fix: Always include alfred qc files in input for varfish export bam QC #511 #516

Merged

Nicolai-vKuegelgen mentioned this issue Jun 6, 2024

fix: Always include alfred qc files in input for varfish export bam QC #511 #516

Merged

Nicolai-vKuegelgen closed this as completed in #516 Jun 6, 2024

Nicolai-vKuegelgen added a commit that referenced this issue Jun 6, 2024

fix: Always include alfred qc files in input for varfish export bam QC …

d13e595

…#511 (#516)

tedil mentioned this issue Jun 28, 2024

chore(main): release 0.1.0 #520

Merged

tedil pushed a commit that referenced this issue Aug 2, 2024

fix: Always include alfred qc files in input for varfish export bam QC …

037c200

…#511 (#516)

This was referenced Dec 9, 2024

chore(main): release 1.0.0 #571

Closed

chore(main): release 1.0.0 #572

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`varfish_export_mehari_bam_qc` fails for WGS workflow due to undefined exon bed file #511

`varfish_export_mehari_bam_qc` fails for WGS workflow due to undefined exon bed file #511

Nicolai-vKuegelgen commented May 28, 2024

Nicolai-vKuegelgen commented May 28, 2024

varfish_export_mehari_bam_qc fails for WGS workflow due to undefined exon bed file #511

varfish_export_mehari_bam_qc fails for WGS workflow due to undefined exon bed file #511

Comments

Nicolai-vKuegelgen commented May 28, 2024

Nicolai-vKuegelgen commented May 28, 2024

`varfish_export_mehari_bam_qc` fails for WGS workflow due to undefined exon bed file #511

`varfish_export_mehari_bam_qc` fails for WGS workflow due to undefined exon bed file #511