feat: display seqvars query execution results in UI (#1952) #1957

Merged
merged 6 commits into from
Oct 10, 2024
128 changes: 128 additions & 0 deletions backend/cases_import/tests/snapshots/snap_test_models_executor.py
@@ -32,3 +32,131 @@
        "sex": 2,
    },
]

snapshots["ImportCreateWithSeqvarsVcfTest::test_run external files"] = [
    {
        "available": None,
        "designation": "variant_calls",
        "file_attributes": {
            "checksum": "sha256:4042c2afa59f24a327b3852bfcd0d8d991499d9c4eb81e7a7efe8d081e66af82",
            "designation": "variant_calls",
            "genomebuild": "grch37",
            "mimetype": "text/plain+x-bgzip+x-variant-call-format",
            "variant_type": "seqvars",
        },
        "identifier_map": {"index": "NA12878-PCRF450-1"},
        "mimetype": "text/plain+x-bgzip+x-variant-call-format",
        "path": "file://cases_import/tests/data/sample-brca1.vcf.gz",
    },
    {
        "available": None,
        "designation": "variant_calls",
        "file_attributes": {
            "checksum": "sha256:6b137335b7803623c3389424e7b64d704fb1c9f3f55792db2916d312e2da27ef",
            "designation": "variant_calls",
            "genomebuild": "grch37",
            "mimetype": "application/octet-stream+x-tabix-tbi-index",
            "variant_type": "seqvars",
        },
        "identifier_map": {"index": "NA12878-PCRF450-1"},
        "mimetype": "application/octet-stream+x-tabix-tbi-index",
        "path": "file://cases_import/tests/data/sample-brca1.vcf.gz.tbi",
    },
]

snapshots["ImportCreateWithSeqvarsVcfTest::test_run internal files"] = [
    {
        "checksum": None,
        "designation": "variant_calls/seqvars/orig-copy",
        "file_attributes": {
            "checksum": "sha256:4042c2afa59f24a327b3852bfcd0d8d991499d9c4eb81e7a7efe8d081e66af82",
            "designation": "variant_calls",
            "genomebuild": "grch37",
            "mimetype": "text/plain+x-bgzip+x-variant-call-format",
            "variant_type": "seqvars",
        },
        "identifier_map": {"index": "NA12878-PCRF450-1"},
        "mimetype": "text/plain+x-bgzip+x-variant-call-format",
        "path": "case-data/7a/1d7b28-2bf8-4340-81f3-5487d86c669f/c28a70a6-1c75-40a1-8d89-216ca16cffca/seqvars/external-copy-0.vcf.gz",
    },
    {
        "checksum": None,
        "designation": "variant_calls/seqvars/ingested-vcf",
        "file_attributes": {},
        "identifier_map": {},
        "mimetype": "text/plain+x-bgzip+x-variant-call-format",
        "path": "case-data/7a/1d7b28-2bf8-4340-81f3-5487d86c669f/c28a70a6-1c75-40a1-8d89-216ca16cffca/seqvars/ingested.vcf.gz",
    },
    {
        "checksum": None,
        "designation": "variant_calls/seqvars/ingested-tbi",
        "file_attributes": {},
        "identifier_map": {},
        "mimetype": "application/octet-stream+x-tabix-tbi-index",
        "path": "case-data/7a/1d7b28-2bf8-4340-81f3-5487d86c669f/c28a70a6-1c75-40a1-8d89-216ca16cffca/seqvars/ingested.vcf.gz.tbi",
    },
]
Comment on lines +67 to +98

💡 Codebase verification

Checksum assignments not found in the codebase.

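The snapshots show that the top-level checksum field is None for every entry in ImportCreateWithSeqvarsVcfTest::test_run internal files; only the orig-copy entry carries a checksum, and only inside file_attributes, copied from the external file. In addition, a search of backend/cases_import/ found no assignment to file_attributes["checksum"], so checksums for internal files may never be computed or validated.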

Please ensure that checksum values are correctly calculated and assigned to maintain data integrity.

🔗 Analysis chain

Well-structured snapshot for internal files, with a few points to verify.

The snapshot for ImportCreateWithSeqvarsVcfTest::test_run internal files is well-organized and includes the expected files (original copy, ingested VCF, and index). The use of UUIDs in paths is a good practice for ensuring unique storage per case.

However, please verify the following:

  1. The checksum field is None for all entries. Is this intentional?
  2. The file_attributes and identifier_map are empty for ingested files. Is this expected behavior?

Re-running verification to check checksum implementations.


Further verification needed for checksum assignments.

To ensure that checksum values are being correctly calculated and assigned, please run the following script:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify if the checksum calculation is implemented for internal files
rg --type python 'def calculate_checksum|checksum =' backend/cases_import/

Length of output: 107


Script:

#!/bin/bash
# Verify if the checksum calculation is implemented for internal files by searching for 'def calculate_checksum' or 'checksum =' in Python files
rg --glob '*.py' 'def calculate_checksum|checksum =' backend/cases_import/

Length of output: 167


Script:

#!/bin/bash
# Search for assignments to file_attributes["checksum"] in Python files
rg --glob '*.py' 'file_attributes\["checksum"\]\s*=' backend/cases_import/

Length of output: 76
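
As an illustration of what the review asks for, the following is a minimal sketch of computing a checksum in the `sha256:<hexdigest>` form that the snapshots use for external files. The function name `sha256_checksum` and its `chunk_size` parameter are hypothetical; nothing here is taken from the project's code, and where such a value should be stored is left open.

```python
import hashlib
from pathlib import Path


def sha256_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the checksum of ``path`` formatted as ``sha256:<hexdigest>``.

    Hypothetical helper for illustration only; not part of the PR.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large VCF files are never held in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return f"sha256:{digest.hexdigest()}"
```

A caller could compare the result against the `file_attributes["checksum"]` value recorded for the external file to confirm that the internal copy is byte-identical.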


snapshots["ImportCreateWithStrucvarsVcfTest::test_run external files"] = [
    {
        "available": None,
        "designation": "variant_calls",
        "file_attributes": {
            "checksum": "sha256:4042c2afa59f24a327b3852bfcd0d8d991499d9c4eb81e7a7efe8d081e66af82",
            "designation": "variant_calls",
            "genomebuild": "grch37",
            "mimetype": "text/plain+x-bgzip+x-variant-call-format",
            "variant_type": "strucvars",
        },
        "identifier_map": {"index": "NA12878-PCRF450-1"},
        "mimetype": "text/plain+x-bgzip+x-variant-call-format",
        "path": "file://cases_import/tests/data/sample-brca1.vcf.gz",
    },
    {
        "available": None,
        "designation": "variant_calls",
        "file_attributes": {
            "checksum": "sha256:6b137335b7803623c3389424e7b64d704fb1c9f3f55792db2916d312e2da27ef",
            "designation": "variant_calls",
            "genomebuild": "grch37",
            "mimetype": "application/octet-stream+x-tabix-tbi-index",
            "variant_type": "strucvars",
        },
        "identifier_map": {"index": "NA12878-PCRF450-1"},
        "mimetype": "application/octet-stream+x-tabix-tbi-index",
        "path": "file://cases_import/tests/data/sample-brca1.vcf.gz.tbi",
    },
]

snapshots["ImportCreateWithStrucvarsVcfTest::test_run internal files"] = [
    {
        "checksum": None,
        "designation": "variant_calls/strucvars/orig-copy",
        "file_attributes": {
            "checksum": "sha256:4042c2afa59f24a327b3852bfcd0d8d991499d9c4eb81e7a7efe8d081e66af82",
            "designation": "variant_calls",
            "genomebuild": "grch37",
            "mimetype": "text/plain+x-bgzip+x-variant-call-format",
            "variant_type": "strucvars",
        },
        "identifier_map": {"index": "NA12878-PCRF450-1"},
        "mimetype": "text/plain+x-bgzip+x-variant-call-format",
        "path": "case-data/7a/1d7b28-2bf8-4340-81f3-5487d86c669f/c28a70a6-1c75-40a1-8d89-216ca16cffca/strucvars/external-copy-0.vcf.gz",
    },
    {
        "checksum": None,
        "designation": "variant_calls/strucvars/ingested-vcf",
        "file_attributes": {},
        "identifier_map": {},
        "mimetype": "text/plain+x-bgzip+x-variant-call-format",
        "path": "case-data/7a/1d7b28-2bf8-4340-81f3-5487d86c669f/c28a70a6-1c75-40a1-8d89-216ca16cffca/strucvars/ingested.vcf.gz",
    },
    {
        "checksum": None,
        "designation": "variant_calls/strucvars/ingested-tbi",
        "file_attributes": {},
        "identifier_map": {},
        "mimetype": "application/octet-stream+x-tabix-tbi-index",
        "path": "case-data/7a/1d7b28-2bf8-4340-81f3-5487d86c669f/c28a70a6-1c75-40a1-8d89-216ca16cffca/strucvars/ingested.vcf.gz.tbi",
    },
]
27 changes: 27 additions & 0 deletions backend/ext_gestaltmatcher/migrations/0003_auto_20241009_0639.py
@@ -0,0 +1,27 @@
# Generated by Django 3.2.25 on 2024-10-09 06:39

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("ext_gestaltmatcher", "0002_smallvariantquerypediascores"),
    ]

    operations = [
        migrations.AlterField(
            model_name="smallvariantquerygestaltmatcherscores",
            name="id",
            field=models.BigAutoField(
                auto_created=True, primary_key=True, serialize=False, verbose_name="ID"
            ),
        ),
        migrations.AlterField(
            model_name="smallvariantquerypediascores",
            name="id",
            field=models.BigAutoField(
                auto_created=True, primary_key=True, serialize=False, verbose_name="ID"
            ),
        ),
    ]
20 changes: 20 additions & 0 deletions backend/protos/seqvars/protos/output.proto
@@ -166,12 +166,32 @@ message GeneRelatedConsequences {
  repeated seqvars.pbs.query.Consequence consequences = 3;
}

// Enumerations with modes of inheritance from HPO.
enum ModeOfInheritance {
  // Unspecified mode of inheritance.
  MODE_OF_INHERITANCE_UNSPECIFIED = 0;
  // Autosomal dominant inheritance (HP:0000006).
  MODE_OF_INHERITANCE_AUTOSOMAL_DOMINANT = 1;
  // Autosomal recessive inheritance (HP:0000007).
  MODE_OF_INHERITANCE_AUTOSOMAL_RECESSIVE = 2;
  // X-linked dominant inheritance (HP:0001423).
  MODE_OF_INHERITANCE_X_LINKED_DOMINANT = 3;
  // X-linked recessive inheritance (HP:0001419).
  MODE_OF_INHERITANCE_X_LINKED_RECESSIVE = 4;
  // Y-linked inheritance (HP:0001450).
  MODE_OF_INHERITANCE_Y_LINKED = 5;
  // Mitochondrial inheritance (HP:0001427).
  MODE_OF_INHERITANCE_MITOCHONDRIAL = 6;
}

// Phenotype-related information, if any.
message GeneRelatedPhenotypes {
  // ACMG supplementary finding list.
  bool is_acmg_sf = 1;
  // Whether is a known disease gene.
  bool is_disease_gene = 2;
  // Linked modes of inheritance.
  repeated ModeOfInheritance mode_of_inheritances = 3;
}

// Gene-wise constraints.
32 changes: 32 additions & 0 deletions backend/seqvars/migrations/0010_auto_20240905_1017.py
@@ -0,0 +1,32 @@
# Generated by Django 3.2.25 on 2024-09-05 10:17

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("seqvars", "0009_alter_seqvarsresultrow_genome_release"),
    ]

    operations = [
        migrations.AlterModelOptions(
            name="seqvarsqueryexecutionbackgroundjob",
            options={"ordering": ["-pk"]},
        ),
        migrations.AlterModelOptions(
            name="seqvarsresultrow",
            options={"ordering": ["chrom_no", "pos", "ref_allele", "alt_allele"]},
        ),
        migrations.AlterModelOptions(
            name="seqvarsresultset",
            options={"ordering": ["-date_created"]},
        ),
        migrations.AlterField(
            model_name="seqvarsresultrow",
            name="genome_release",
            field=models.CharField(
                choices=[("grch37", "GRCh37"), ("grch38", "GRCh38")], max_length=32
            ),
        ),
    ]
30 changes: 26 additions & 4 deletions backend/seqvars/models/base.py
@@ -2055,13 +2055,26 @@ class GeneRelatedConsequencesPydantic(pydantic.BaseModel):
    consequences: list[SeqvarsVariantConsequenceChoice]


class SeqvarsModeOfInheritance(str, Enum):
    """Mode of inheritance gene annotation."""

    AUTOSOMAL_DOMINANT = "autosomal_dominant"
    AUTOSOMAL_RECESSIVE = "autosomal_recessive"
    X_LINKED_DOMINANT = "x_linked_dominant"
    X_LINKED_RECESSIVE = "x_linked_recessive"
    Y_LINKED = "y_linked"
    MITOCHONDRIAL = "mitochondrial"


class GeneRelatedPhenotypesPydantic(pydantic.BaseModel):
    """Phenotype-related information, if any."""

    #: ACMG supplementary finding list.
    is_acmg_sf: bool = False
    #: Whether is a known disease gene.
    is_disease_gene: bool = False
    #: Modes of inheritance.
    mode_of_inheritances: list[SeqvarsModeOfInheritance] = []


class GnomadConstraintsPydantic(pydantic.BaseModel):
@@ -2161,13 +2174,13 @@ class GeneRelatedAnnotationPydantic(pydantic.BaseModel):
     """Store gene-related annotation (always for a single gene)."""

     #: Gene ID information.
-    identity: GeneIdentityPydantic
+    identity: typing.Optional[GeneIdentityPydantic]
     #: Gene-related consequences, if any (none if intergenic).
-    consequences: GeneRelatedConsequencesPydantic
+    consequences: typing.Optional[GeneRelatedConsequencesPydantic]
     #: Gene-related phenotype information, if any.
-    phenotypes: GeneRelatedPhenotypesPydantic
+    phenotypes: typing.Optional[GeneRelatedPhenotypesPydantic]
     #: Gene-wise constraints on the gene, if any.
-    constraints: GeneRelatedConstraintsPydantic
+    constraints: typing.Optional[GeneRelatedConstraintsPydantic]


class SeqvarsNuclearFrequencyPydantic(pydantic.BaseModel):
@@ -2373,6 +2386,9 @@ def get_absolute_url(self) -> str:
    def __str__(self):
        return f"SeqvarsResultSet '{self.sodar_uuid}'"

    class Meta:
        ordering = ["-date_created"]


class SeqvarsResultRow(models.Model):
    """One entry in the result set."""
@@ -2414,6 +2430,9 @@ def __str__(self):
            f"{self.pos}-{self.ref_allele}-{self.alt_allele}'"
        )

    class Meta:
        ordering = ["chrom_no", "pos", "ref_allele", "alt_allele"]


class SeqvarsQueryExecutionBackgroundJobManager(models.Manager):
    """Custom manager class that allows to create a ``SeqvarsQueryExecutionBackgroundJob``
@@ -2478,3 +2497,6 @@ class SeqvarsQueryExecutionBackgroundJob(JobModelMessageMixin, models.Model):

    def get_human_readable_type(self):
        return self.task_desc

    class Meta:
        ordering = ["-pk"]
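
In Django, `Meta.ordering` sets the default `ORDER BY` for querysets of a model, and a leading `-` means descending. The sketch below mimics the ordering declared for `SeqvarsResultRow` in plain Python, without Django. `Row` and `default_order` are hypothetical stand-ins for illustration, and the database collation for the allele strings may differ from Python's string comparison.

```python
from typing import Iterable, NamedTuple


class Row(NamedTuple):
    """Hypothetical stand-in for the ordering-relevant fields of a result row."""

    chrom_no: int
    pos: int
    ref_allele: str
    alt_allele: str


def default_order(rows: Iterable[Row]) -> list[Row]:
    """Sort like ``ordering = ["chrom_no", "pos", "ref_allele", "alt_allele"]``."""
    # Tuples compare field by field, which matches a multi-column ascending sort.
    return sorted(rows, key=lambda r: (r.chrom_no, r.pos, r.ref_allele, r.alt_allele))


rows = [
    Row(2, 100, "A", "T"),
    Row(1, 200, "G", "C"),
    Row(1, 200, "G", "A"),
    Row(1, 50, "T", "G"),
]
for row in default_order(rows):
    print(row)
```

The descending orderings `["-date_created"]` and `["-pk"]` correspond to sorting on that single field with `reverse=True`, so the newest result set or job comes first.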
23 changes: 23 additions & 0 deletions backend/seqvars/models/protobufs.py
@@ -32,6 +32,7 @@
    SeqvarsGnomadMitochondrialFrequencySettingsPydantic,
    SeqvarsHelixMtDbFrequencyPydantic,
    SeqvarsHelixMtDbFrequencySettingsPydantic,
    SeqvarsModeOfInheritance,
    SeqvarsNuclearFrequencyPydantic,
    SeqvarsNuclearFrequencySettingsPydantic,
    SeqvarsOutputHeaderPydantic,
@@ -84,6 +85,7 @@
    GnomadConstraints,
    GnomadMitochondrialFrequency,
    HelixMtDbFrequency,
    ModeOfInheritance,
    NuclearFrequency,
    OutputHeader,
    OutputRecord,
@@ -772,10 +774,31 @@ def _consequences_from_protobuf(
    )


MODE_OF_INHERITANCE_MAPPING: dict[ModeOfInheritance.ValueType, SeqvarsModeOfInheritance] = {
    ModeOfInheritance.MODE_OF_INHERITANCE_AUTOSOMAL_DOMINANT: SeqvarsModeOfInheritance.AUTOSOMAL_DOMINANT,
    ModeOfInheritance.MODE_OF_INHERITANCE_AUTOSOMAL_RECESSIVE: SeqvarsModeOfInheritance.AUTOSOMAL_RECESSIVE,
    ModeOfInheritance.MODE_OF_INHERITANCE_X_LINKED_DOMINANT: SeqvarsModeOfInheritance.X_LINKED_DOMINANT,
    ModeOfInheritance.MODE_OF_INHERITANCE_X_LINKED_RECESSIVE: SeqvarsModeOfInheritance.X_LINKED_RECESSIVE,
    ModeOfInheritance.MODE_OF_INHERITANCE_Y_LINKED: SeqvarsModeOfInheritance.Y_LINKED,
    ModeOfInheritance.MODE_OF_INHERITANCE_MITOCHONDRIAL: SeqvarsModeOfInheritance.MITOCHONDRIAL,
}


def _mode_of_inheritance_from_protobuf(
    mode_of_inheritance: ModeOfInheritance.ValueType,
) -> SeqvarsModeOfInheritance:
    return MODE_OF_INHERITANCE_MAPPING[mode_of_inheritance]


def _phenotypes_from_protobuf(phenotypes: GeneRelatedPhenotypes) -> GeneRelatedPhenotypesPydantic:
    return GeneRelatedPhenotypesPydantic(
        is_acmg_sf=phenotypes.is_acmg_sf,
        is_disease_gene=phenotypes.is_disease_gene,
        mode_of_inheritances=list(
            map(_mode_of_inheritance_from_protobuf, phenotypes.mode_of_inheritances)
        ),
    )
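
Note that `MODE_OF_INHERITANCE_MAPPING` has no entry for `MODE_OF_INHERITANCE_UNSPECIFIED` (value 0), so the lookup would raise `KeyError` if such a value ever arrived; whether the worker can emit it is not stated in this diff. The following is a self-contained sketch of a tolerant variant, assuming that skipping unmapped values is acceptable. `ProtoModeOfInheritance` is a hypothetical stand-in for the generated protobuf enum, and `modes_from_protobuf` is not a function in the PR.

```python
from enum import Enum, IntEnum
from typing import Iterable


class ProtoModeOfInheritance(IntEnum):
    """Stand-in for the generated protobuf enum, with values from output.proto."""

    UNSPECIFIED = 0
    AUTOSOMAL_DOMINANT = 1
    AUTOSOMAL_RECESSIVE = 2
    X_LINKED_DOMINANT = 3
    X_LINKED_RECESSIVE = 4
    Y_LINKED = 5
    MITOCHONDRIAL = 6


class SeqvarsModeOfInheritance(str, Enum):
    """Same members as the enum added in backend/seqvars/models/base.py."""

    AUTOSOMAL_DOMINANT = "autosomal_dominant"
    AUTOSOMAL_RECESSIVE = "autosomal_recessive"
    X_LINKED_DOMINANT = "x_linked_dominant"
    X_LINKED_RECESSIVE = "x_linked_recessive"
    Y_LINKED = "y_linked"
    MITOCHONDRIAL = "mitochondrial"


# Built by member name, so UNSPECIFIED is deliberately left without a target.
MAPPING: dict[int, SeqvarsModeOfInheritance] = {
    ProtoModeOfInheritance[member.name]: member for member in SeqvarsModeOfInheritance
}


def modes_from_protobuf(values: Iterable[int]) -> list[SeqvarsModeOfInheritance]:
    """Convert protobuf enum values, skipping unspecified or unknown ones."""
    return [MAPPING[value] for value in values if value in MAPPING]
```

Raising on unknown values, as the PR's direct lookup does, is the stricter choice and surfaces schema drift early; the skipping variant trades that for robustness when an older or newer worker produces the data.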

