IGVF-2381-genomic-elements #1335

Merged: 13 commits, Feb 25, 2025
19 changes: 19 additions & 0 deletions src/igvfd/schemas/changelogs/reference_file.md
@@ -1,5 +1,24 @@
## Changelog for *`reference_file.json`*

### Schema version 16

* Adjust `content_type` enum list to remove `regulatory_regions`.
* Adjust `content_type` enum list to remove `regulatory_regions_genes`.
* Adjust `content_type` enum list to remove `regulatory_regions_genes_biosamples`.
* Adjust `content_type` enum list to remove `regulatory_regions_genes_biosamples_donors`.
* Adjust `content_type` enum list to remove `regulatory_regions_genes_biosamples_treatments_chebi`.
* Adjust `content_type` enum list to remove `regulatory_regions_genes_biosamples_treatments_proteins`.
* Adjust `content_type` enum list to remove `regulatory_regions_regulatory_regions`.
* Adjust `content_type` enum list to remove `variants_regulatory_regions`.
* Extend `content_type` enum list to include `genomic_elements` for admins.
Contributor: I don't think you need to say "for admins".

Contributor Author: I added it since users can't use this property.

* Extend `content_type` enum list to include `genomic_elements_genes` for admins.
* Extend `content_type` enum list to include `genomic_elements_genes_biosamples` for admins.
* Extend `content_type` enum list to include `genomic_elements_genes_biosamples_donors` for admins.
* Extend `content_type` enum list to include `genomic_elements_genes_biosamples_treatments_chebi` for admins.
* Extend `content_type` enum list to include `genomic_elements_genes_biosamples_treatments_proteins` for admins.
* Extend `content_type` enum list to include `genomic_elements_genomic_elements` for admins.
* Extend `content_type` enum list to include `variants_genomic_elements` for admins.

### Minor changes since schema version 15

* Extend `transcriptome_annotation` enum list to include `GENCODE 22`.
18 changes: 9 additions & 9 deletions src/igvfd/schemas/reference_file.json
@@ -127,7 +127,7 @@
"type": "object",
"properties": {
"schema_version": {
- "default": "15"
+ "default": "16"
},
"content_type": {
"comment": "Content Type describes the content of the file.",
@@ -172,13 +172,13 @@
"pathways_pathways",
"proteins",
"proteins_proteins",
- "regulatory_regions",
- "regulatory_regions_genes",
- "regulatory_regions_genes_biosamples",
- "regulatory_regions_genes_biosamples_donors",
- "regulatory_regions_genes_biosamples_treatments_chebi",
- "regulatory_regions_genes_biosamples_treatments_proteins",
- "regulatory_regions_regulatory_regions",
+ "genomic_elements",
+ "genomic_elements_genes",
+ "genomic_elements_genes_biosamples",
+ "genomic_elements_genes_biosamples_donors",
+ "genomic_elements_genes_biosamples_treatments_chebi",
+ "genomic_elements_genes_biosamples_treatments_proteins",
+ "genomic_elements_genomic_elements",
"sequence barcodes",
"studies",
"studies_variants",
@@ -199,7 +199,7 @@
"variants_proteins_terms",
"variants_proteins_biosamples",
"variants_proteins_phenotypes",
- "variants_regulatory_regions",
+ "variants_genomic_elements",
"variants_variants",
"vector sequences"
],
4 changes: 2 additions & 2 deletions src/igvfd/tests/data/inserts/reference_file.json
@@ -211,13 +211,13 @@
"uuid": "6d1a1811-2133-474d-b3b1-a07ad1f7e4e1",
"accession": "IGVFFI6791GUEE",
"aliases": [
- "igvf:tsv_regulatory_regions_regulatory_regions"
+ "igvf:tsv_genomic_elements_genomic_elements"
],
"lab": "j-michael-cherry",
"award": "HG012012",
"md5sum": "ee570198468d172dc018978a49b7db53",
"file_format": "tsv",
- "content_type": "regulatory_regions_regulatory_regions",
+ "content_type": "genomic_elements_genomic_elements",
"file_format_specifications": [
"igvf:file_format_specification_insert"
],
80 changes: 80 additions & 0 deletions src/igvfd/tests/fixtures/schemas/reference_file.py
@@ -155,3 +155,83 @@ def reference_file_v14(reference_file_v6):
        'external_id': 'ENCFF743WOO'
    })
    return item


@pytest.fixture
def reference_file_v15_regulatory_regions(reference_file_v6):
    item = reference_file_v6.copy()
    item.update({
        'schema_version': '15',
        'content_type': 'regulatory_regions'
    })
    return item


@pytest.fixture
def reference_file_v15_regulatory_regions_genes(reference_file_v6):
    item = reference_file_v6.copy()
    item.update({
        'schema_version': '15',
        'content_type': 'regulatory_regions_genes'
    })
    return item


@pytest.fixture
def reference_file_v15_regulatory_regions_genes_biosamples(reference_file_v6):
    item = reference_file_v6.copy()
    item.update({
        'schema_version': '15',
        'content_type': 'regulatory_regions_genes_biosamples'
    })
    return item


@pytest.fixture
def reference_file_v15_regulatory_regions_genes_biosamples_donors(reference_file_v6):
    item = reference_file_v6.copy()
    item.update({
        'schema_version': '15',
        'content_type': 'regulatory_regions_genes_biosamples_donors'
    })
    return item


@pytest.fixture
def reference_file_v15_regulatory_regions_genes_biosamples_treatments_chebi(reference_file_v6):
    item = reference_file_v6.copy()
    item.update({
        'schema_version': '15',
        'content_type': 'regulatory_regions_genes_biosamples_treatments_chebi'
    })
    return item


@pytest.fixture
def reference_file_v15_regulatory_regions_genes_biosamples_treatments_proteins(reference_file_v6):
    item = reference_file_v6.copy()
    item.update({
        'schema_version': '15',
        'content_type': 'regulatory_regions_genes_biosamples_treatments_proteins'
    })
    return item


@pytest.fixture
def reference_file_v15_regulatory_regions_regulatory_regions(reference_file_v6):
    item = reference_file_v6.copy()
    item.update({
        'schema_version': '15',
        'content_type': 'regulatory_regions_regulatory_regions'
    })
    return item


@pytest.fixture
def reference_file_v15_variants_regulatory_regions(reference_file_v6):
    item = reference_file_v6.copy()
    item.update({
        'schema_version': '15',
        'content_type': 'variants_regulatory_regions'
    })
    return item
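
The eight fixtures above differ only in the `content_type` value they set. As a rough sketch (not part of this PR), the shared copy-and-update step could live in one helper. `make_v15_item` and `BASE_ITEM` are hypothetical names, and `BASE_ITEM` is only a stand-in for the `reference_file_v6` fixture, whose real contents are not shown in this diff. The sketch uses `copy.deepcopy` rather than the shallow `.copy()` used in the fixtures, which is an assumption about what is safest for nested lists such as `aliases`.

```python
import copy

# Hypothetical stand-in for the reference_file_v6 fixture (real contents not shown in the diff).
BASE_ITEM = {'file_format': 'tsv', 'aliases': ['igvf:example_alias']}


def make_v15_item(base, content_type):
    """Return a schema-version-15 item built from base, leaving base untouched."""
    item = copy.deepcopy(base)  # deep copy so nested lists are not shared with base
    item.update({
        'schema_version': '15',
        'content_type': content_type,
    })
    return item


item = make_v15_item(BASE_ITEM, 'variants_regulatory_regions')
```

Each existing fixture body would then reduce to a single call with its own `content_type` string.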
56 changes: 56 additions & 0 deletions src/igvfd/tests/test_upgrade_reference_file.py
@@ -78,3 +78,59 @@ def test_reference_file_upgrade_14_15(upgrader, reference_file_v14):
    value = upgrader.upgrade('reference_file', reference_file_v14, current_version='14', target_version='15')
    assert 'external_id' not in value
    assert value['schema_version'] == '15'


def test_reference_file_upgrade_15_16_regulatory_regions(upgrader, reference_file_v15_regulatory_regions):
    value = upgrader.upgrade('reference_file', reference_file_v15_regulatory_regions,
                             current_version='15', target_version='16')
    assert value['content_type'] == 'genomic_elements'
    assert value['schema_version'] == '16'


def test_reference_file_upgrade_15_16_regulatory_regions_genes(upgrader, reference_file_v15_regulatory_regions_genes):
    value = upgrader.upgrade('reference_file', reference_file_v15_regulatory_regions_genes,
                             current_version='15', target_version='16')
    assert value['content_type'] == 'genomic_elements_genes'
    assert value['schema_version'] == '16'


def test_reference_file_upgrade_15_16_regulatory_regions_genes_biosamples(upgrader, reference_file_v15_regulatory_regions_genes_biosamples):
    value = upgrader.upgrade('reference_file', reference_file_v15_regulatory_regions_genes_biosamples,
                             current_version='15', target_version='16')
    assert value['content_type'] == 'genomic_elements_genes_biosamples'
    assert value['schema_version'] == '16'


def test_reference_file_upgrade_15_16_regulatory_regions_genes_biosamples_donors(upgrader, reference_file_v15_regulatory_regions_genes_biosamples_donors):
    value = upgrader.upgrade('reference_file', reference_file_v15_regulatory_regions_genes_biosamples_donors,
                             current_version='15', target_version='16')
    assert value['content_type'] == 'genomic_elements_genes_biosamples_donors'
    assert value['schema_version'] == '16'


def test_reference_file_upgrade_15_16_regulatory_regions_genes_biosamples_treatments_chebi(upgrader, reference_file_v15_regulatory_regions_genes_biosamples_treatments_chebi):
    value = upgrader.upgrade('reference_file', reference_file_v15_regulatory_regions_genes_biosamples_treatments_chebi,
                             current_version='15', target_version='16')
    assert value['content_type'] == 'genomic_elements_genes_biosamples_treatments_chebi'
    assert value['schema_version'] == '16'


def test_reference_file_upgrade_15_16_regulatory_regions_genes_biosamples_treatments_proteins(upgrader, reference_file_v15_regulatory_regions_genes_biosamples_treatments_proteins):
    value = upgrader.upgrade('reference_file', reference_file_v15_regulatory_regions_genes_biosamples_treatments_proteins,
                             current_version='15', target_version='16')
    assert value['content_type'] == 'genomic_elements_genes_biosamples_treatments_proteins'
    assert value['schema_version'] == '16'


def test_reference_file_upgrade_15_16_regulatory_regions_regulatory_regions(upgrader, reference_file_v15_regulatory_regions_regulatory_regions):
    value = upgrader.upgrade('reference_file', reference_file_v15_regulatory_regions_regulatory_regions,
                             current_version='15', target_version='16')
    assert value['content_type'] == 'genomic_elements_genomic_elements'
    assert value['schema_version'] == '16'


def test_reference_file_upgrade_15_16_variants_regulatory_regions(upgrader, reference_file_v15_variants_regulatory_regions):
    value = upgrader.upgrade('reference_file', reference_file_v15_variants_regulatory_regions,
                             current_version='15', target_version='16')
    assert value['content_type'] == 'variants_genomic_elements'
    assert value['schema_version'] == '16'
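
All eight tests above share one shape: upgrade a version 15 item, then assert the new `content_type` and `schema_version`. The sketch below (not part of this PR) expresses that as a table of cases checked in a loop. `UPGRADE_CASES`, `stand_in_upgrade`, and `check_all_cases` are hypothetical names. `stand_in_upgrade` only imitates the expected result so the sketch runs on its own; in the real suite the `upgrader` fixture would do this work, and the table could feed `pytest.mark.parametrize` instead of a loop.

```python
# (old content_type, expected content_type) pairs, as asserted by the tests above.
UPGRADE_CASES = [
    ('regulatory_regions', 'genomic_elements'),
    ('regulatory_regions_genes', 'genomic_elements_genes'),
    ('regulatory_regions_genes_biosamples', 'genomic_elements_genes_biosamples'),
    ('regulatory_regions_genes_biosamples_donors', 'genomic_elements_genes_biosamples_donors'),
    ('regulatory_regions_genes_biosamples_treatments_chebi',
     'genomic_elements_genes_biosamples_treatments_chebi'),
    ('regulatory_regions_genes_biosamples_treatments_proteins',
     'genomic_elements_genes_biosamples_treatments_proteins'),
    ('regulatory_regions_regulatory_regions', 'genomic_elements_genomic_elements'),
    ('variants_regulatory_regions', 'variants_genomic_elements'),
]


def stand_in_upgrade(item):
    """Hypothetical stand-in for upgrader.upgrade(...); not the real upgrader."""
    upgraded = dict(item)
    upgraded['content_type'] = item['content_type'].replace(
        'regulatory_regions', 'genomic_elements')
    upgraded['schema_version'] = '16'
    return upgraded


def check_all_cases(upgrade):
    """Run every case through upgrade and return the number of cases checked."""
    for old, expected in UPGRADE_CASES:
        value = upgrade({'schema_version': '15', 'content_type': old})
        assert value['content_type'] == expected, (old, value['content_type'])
        assert value['schema_version'] == '16'
    return len(UPGRADE_CASES)
```

Keeping the pairs in one list makes it harder to add a renamed value without also adding its check.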
46 changes: 46 additions & 0 deletions src/igvfd/upgrade/file.py
@@ -393,3 +393,49 @@ def sequence_file_14_15_alignment_file_12_13(value, system):
# Coerce values like 28.0 to ints.
if k in value:
value[k] = int(value[k])


@upgrade_step('reference_file', '15', '16')
def reference_file_15_16(value, system):
    # https://igvf.atlassian.net/browse/IGVF-2381
    notes = value.get('notes', '')
    if value['content_type'] == 'regulatory_regions':
        value['content_type'] = 'genomic_elements'
        notes += f' This file\'s content_type was regulatory_regions, but has been upgraded to genomic_elements.'
        if notes.strip() != '':
            value['notes'] = notes.strip()
    if value['content_type'] == 'regulatory_regions_genes':
        value['content_type'] = 'genomic_elements_genes'
        notes += f' This file\'s content_type was regulatory_regions_genes, but has been upgraded to genomic_elements_genes.'
        if notes.strip() != '':
            value['notes'] = notes.strip()
    if value['content_type'] == 'regulatory_regions_genes_biosamples':
        value['content_type'] = 'genomic_elements_genes_biosamples'
        notes += f' This file\'s content_type was regulatory_regions_genes_biosamples, but has been upgraded to genomic_elements_genes_biosamples.'
        if notes.strip() != '':
            value['notes'] = notes.strip()
    if value['content_type'] == 'regulatory_regions_genes_biosamples_donors':
        value['content_type'] = 'genomic_elements_genes_biosamples_donors'
        notes += f' This file\'s content_type was regulatory_regions_genes_biosamples_donors, but has been upgraded to genomic_elements_genes_biosamples_donors.'
        if notes.strip() != '':
            value['notes'] = notes.strip()
    if value['content_type'] == 'regulatory_regions_genes_biosamples_treatments_chebi':
        value['content_type'] = 'genomic_elements_genes_biosamples_treatments_chebi'
        notes += f' This file\'s content_type was regulatory_regions_genes_biosamples_treatments_chebi, but has been upgraded to genomic_elements_genes_biosamples_treatments_chebi.'
        if notes.strip() != '':
            value['notes'] = notes.strip()
    if value['content_type'] == 'regulatory_regions_genes_biosamples_treatments_proteins':
        value['content_type'] = 'genomic_elements_genes_biosamples_treatments_proteins'
        notes += f' This file\'s content_type was regulatory_regions_genes_biosamples_treatments_proteins, but has been upgraded to genomic_elements_genes_biosamples_treatments_proteins.'
        if notes.strip() != '':
            value['notes'] = notes.strip()
    if value['content_type'] == 'regulatory_regions_regulatory_regions':
        value['content_type'] = 'genomic_elements_genomic_elements'
        notes += f' This file\'s content_type was regulatory_regions_regulatory_regions, but has been upgraded to genomic_elements_genomic_elements.'
        if notes.strip() != '':
            value['notes'] = notes.strip()
    if value['content_type'] == 'variants_regulatory_regions':
        value['content_type'] = 'variants_genomic_elements'
        notes += f' This file\'s content_type was variants_regulatory_regions, but has been upgraded to variants_genomic_elements.'
        if notes.strip() != '':
            value['notes'] = notes.strip()
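
The upgrade step above repeats the same three actions for each of the eight renamed values: swap the `content_type`, append a sentence to `notes`, and write `notes` back. A table-driven version is one possible alternative, sketched here under the assumption that the behaviour should stay the same. `CONTENT_TYPE_RENAMES` and `rename_content_type` are hypothetical names, and the function is deliberately left without the `@upgrade_step` decorator so it runs outside the application.

```python
# Old -> new content_type values, taken from the changelog for schema version 16.
CONTENT_TYPE_RENAMES = {
    'regulatory_regions': 'genomic_elements',
    'regulatory_regions_genes': 'genomic_elements_genes',
    'regulatory_regions_genes_biosamples': 'genomic_elements_genes_biosamples',
    'regulatory_regions_genes_biosamples_donors': 'genomic_elements_genes_biosamples_donors',
    'regulatory_regions_genes_biosamples_treatments_chebi':
        'genomic_elements_genes_biosamples_treatments_chebi',
    'regulatory_regions_genes_biosamples_treatments_proteins':
        'genomic_elements_genes_biosamples_treatments_proteins',
    'regulatory_regions_regulatory_regions': 'genomic_elements_genomic_elements',
    'variants_regulatory_regions': 'variants_genomic_elements',
}


def rename_content_type(value):
    """Rename a retired content_type in place and record the change in notes."""
    old = value.get('content_type')
    new = CONTENT_TYPE_RENAMES.get(old)
    if new is None:
        # Not one of the renamed values: leave the item unchanged.
        return value
    value['content_type'] = new
    note = f"This file's content_type was {old}, but has been upgraded to {new}."
    # Append to any existing notes, separated by a single space.
    value['notes'] = f"{value.get('notes', '')} {note}".strip()
    return value
```

For example, `rename_content_type({'content_type': 'regulatory_regions'})` sets `content_type` to `genomic_elements` and creates a `notes` entry describing the change, while an item with `content_type` of `proteins` is returned as it was.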