Merge pull request #219 from nextstrain/mpox-update-2024-07
Add mpox clade I dataset
Showing 18 changed files with 33,327 additions and 18,550 deletions.
@@ -0,0 +1,3 @@
## Unreleased

Initial release of this dataset.

@@ -0,0 +1,28 @@
# Nextclade dataset for "Mpox virus (Clade I)"

| Key | Value |
| --- | --- |
| authors | [Cornelius Roemer](https://neherlab.org), [Richard Neher](https://neherlab.org), [Nextstrain](https://nextstrain.org) |
| data source | Genbank |
| workflow | [github.com/nextstrain/mpox/nextclade](https://github.com/nextstrain/mpox/nextclade) |
| nextclade dataset path | nextstrain/mpox/clade-i |
| reference | [DQ011155.1](https://www.ncbi.nlm.nih.gov/nuccore/DQ011155.1), isolate `Zaire_1979-005`, an early complete clade I sequence |
| annotation | based on [DQ011155.1](https://www.ncbi.nlm.nih.gov/nuccore/DQ011155.1), but with genes called by modern names (OPGXXX) |
| clade definitions | [github.com/mpxv-lineages/lineage-designation](https://github.com/mpxv-lineages/lineage-designation) |
| related datasets | Mpox virus (All clades): `nextstrain/mpox/all-clades`<br>Mpox virus (clade IIb) `nextstrain/mpox/clade-iib`<br>Mpox virus (Lineage B.1 within clade IIb) `nextstrain/mpox/lineage-b.1` |

## Scope of this dataset

This dataset is for Mpox viruses of clade I (Ia and Ib). A broader dataset covering clades I, IIa and IIb is available under `nextstrain/mpox/all-clades`.

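As a rough illustration of how the dataset path above is used, the sketch below assembles the two Nextclade CLI invocations that would typically download a dataset and analyse sequences with it. It only builds and prints the command lines; it does not run Nextclade. The directory and file names are hypothetical placeholders, and since this commit still tags the dataset as unreleased, `dataset get` may not resolve the name until a release is published.

```python
import shlex

# Dataset path taken from the README table above.
DATASET_NAME = "nextstrain/mpox/clade-i"


def dataset_get_cmd(output_dir: str) -> list[str]:
    """Build the command that downloads the dataset into output_dir."""
    return [
        "nextclade", "dataset", "get",
        "--name", DATASET_NAME,
        "--output-dir", output_dir,
    ]


def run_cmd(dataset_dir: str, fasta: str, out_dir: str) -> list[str]:
    """Build the command that analyses a FASTA file against the dataset."""
    return [
        "nextclade", "run",
        "--input-dataset", dataset_dir,
        "--output-all", out_dir,
        fasta,
    ]


if __name__ == "__main__":
    # Hypothetical local paths, chosen only for this example.
    for cmd in (
        dataset_get_cmd("data/mpox-clade-i"),
        run_cmd("data/mpox-clade-i", "my_sequences.fasta", "output/"),
    ):
        print(shlex.join(cmd))
```

Swapping `DATASET_NAME` for `nextstrain/mpox/all-clades` would target the broader dataset mentioned above.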
## Reference sequence and reference tree

The reference used in this dataset is [DQ011155.1](https://www.ncbi.nlm.nih.gov/nuccore/DQ011155.1), an early complete clade I sequence (isolate `Zaire_1979-005`).

In contrast, the other Nextclade mpox datasets use a clade IIb reference sequence.

The reference tree consists of all good-quality clade I sequences available in GenBank at the time of dataset creation (with each group of identical sequences deduplicated to a single representative), as well as 3 outgroup genomes (a reconstructed ancestor of all clades, and one sequence each for clade IIa and clade IIb).

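The deduplication step mentioned above can be pictured with a minimal sketch. The README does not say how the workflow actually deduplicates, so the function below is a hypothetical stand-in: it assumes that "identical" means the same nucleotide string regardless of letter case, and that the first record seen is the one kept. The sample names and sequences are made up.

```python
from typing import Iterable


def deduplicate(records: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep the first (name, sequence) record for each distinct sequence.

    Assumption: sequences are compared case-insensitively as plain strings.
    """
    seen: set[str] = set()
    unique: list[tuple[str, str]] = []
    for name, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((name, seq))
    return unique


# Toy records, invented for illustration only.
records = [("sample_A", "ACGT"), ("sample_B", "acgt"), ("sample_C", "ACGA")]
unique = deduplicate(records)
print([name for name, _ in unique])
```

Here `sample_B` is dropped because its sequence matches `sample_A` once case is ignored.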
## Further reading

Read more about Nextclade datasets in the Nextclade documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html

@@ -0,0 +1,75 @@
{
  "alignmentParams": {
    "excessBandwidth": 100,
    "terminalBandwidth": 300,
    "allowedMismatches": 8,
    "windowSize": 40,
    "minSeedCover": 0.1,
    "gapAlignmentSide": "left"
  },
  "attributes": {
    "name": "Mpox virus (Clade I)",
    "reference accession": "DQ011155.1",
    "reference name": "Zaire_1979-005"
  },
  "compatibility": {
    "cli": "3.0.0-alpha.0",
    "web": "3.0.0-alpha.0"
  },
  "deprecated": false,
  "enabled": true,
  "experimental": false,
  "files": {
    "changelog": "CHANGELOG.md",
    "examples": "sequences.fasta",
    "genomeAnnotation": "genome_annotation.gff3",
    "pathogenJson": "pathogen.json",
    "readme": "README.md",
    "reference": "reference.fasta",
    "treeJson": "tree.json"
  },
  "official": true,
  "qc": {
    "frameShifts": {
      "enabled": true,
      "ignoredFrameShifts": [],
      "scoreWeight": 20
    },
    "missingData": {
      "enabled": true,
      "missingDataThreshold": 20000,
      "scoreBias": 1000
    },
    "mixedSites": {
      "enabled": true,
      "mixedSitesThreshold": 40
    },
    "privateMutations": {
      "cutoff": 50,
      "enabled": true,
      "typical": 5,
      "weightLabeledSubstitutions": 6,
      "weightReversionSubstitutions": 6,
      "weightUnlabeledSubstitutions": 1
    },
    "snpClusters": {
      "clusterCutOff": 5,
      "enabled": true,
      "scoreWeight": 10,
      "windowSize": 1000
    },
    "stopCodons": {
      "enabled": true,
      "ignoredStopCodons": [],
      "scoreWeight": 40
    }
  },
  "schemaVersion": "3.0.0",
  "shortcuts": [],
  "version": {
    "tag": "unreleased"
  }
}
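
To give a feel for what the `missingData` thresholds in this config might mean in practice, here is a small sketch. It rests on an assumption that should be checked against the Nextclade QC documentation: that the missing-data score is zero up to `scoreBias` missing sites, then grows linearly and reaches 100 at `scoreBias + missingDataThreshold`. The function name `missing_data_score` is hypothetical and is not part of Nextclade; only the numeric values come from the config above.

```python
import json

# Excerpt of the "qc" section from the config above.
QC_EXCERPT = json.loads("""
{
  "missingData": {
    "enabled": true,
    "missingDataThreshold": 20000,
    "scoreBias": 1000
  }
}
""")


def missing_data_score(total_missing: int, cfg: dict) -> float:
    """Hypothetical re-implementation of the missing-data QC score.

    Assumption: score = max(0, (total_missing - scoreBias) * 100 / missingDataThreshold).
    """
    if not cfg["enabled"]:
        return 0.0
    excess = total_missing - cfg["scoreBias"]
    return max(0.0, excess * 100 / cfg["missingDataThreshold"])


cfg = QC_EXCERPT["missingData"]
for n in (500, 1000, 11000, 21000):
    print(n, missing_data_score(n, cfg))
```

Under that assumption, a genome with up to 1,000 missing sites would not be penalised, and one with 21,000 missing sites would reach a score of 100.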