Commit
Merge pull request #244 from nextstrain/2024-11-23_flu-update
flu: update subclades with most recent proposals
Showing 45 changed files with 10,384 additions and 3,278 deletions.
1,695 changes: 848 additions & 847 deletions
data/nextstrain/flu/h1n1pdm/ha/CY121680/sequences.fasta
Large diffs are not rendered by default.
1,788 changes: 894 additions & 894 deletions
data/nextstrain/flu/h1n1pdm/ha/MW626062/sequences.fasta
Large diffs are not rendered by default.
1,434 changes: 717 additions & 717 deletions
data/nextstrain/flu/h3n2/ha/CY163680/sequences.fasta
Large diffs are not rendered by default.
1,602 changes: 801 additions & 801 deletions
data/nextstrain/flu/h3n2/ha/EPI1857216/sequences.fasta
Large diffs are not rendered by default.
31 changes: 31 additions & 0 deletions
data_output/nextstrain/flu/h1n1pdm/ha/CY121680/unreleased/CHANGELOG.md
@@ -0,0 +1,31 @@
## Unreleased

- update reference trees
- include subclade D.5 (included as proposed clades on 2024-11-05)

## 2024-11-05T09:19:52Z

- update reference trees
- include subclade proposals

## 2024-07-03T08:29:55Z

- add representative samples from early pandemic-era clades including 1, 2, 3, 4, 6C, 7, and 8 to improve clade label annotations for older sequences

- added configuration of current and recent vaccine strains as 'reference nodes' on the reference tree, against which query sequences can be compared. This feature is in addition to the new 'compare to clade founder' feature, allowing to compare each query sequence to the most ancestral node of a clade or lineage. See Nextclade documentation for more details about 'relative mutations' functionality.

## 2024-04-19T07:50:39Z

- aliasing of C.1.1.1 as D
- addition of subclades D.1 - D.4: [D.1](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/D.1.yml), [D.2](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/D.2.yml), [D.3](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/D.3.yml), [D.4](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/D.4.yml)
- addition of subclades [C.1.8](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/C.1.8.yml) and [C.1.9](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/C.1.9.yml)
- addition of subclades [C.1.7.1](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/C.1.7.1.yml) and [C.1.7.2](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/C.1.7.2.yml)

## 2024-01-16T20:31:02Z

Initial release for Nextclade v3!

- addition of subclade [C.1.7](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/blob/main/subclades/C.1.7.yml)

Read more about Nextclade datasets in the documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html
40 changes: 40 additions & 0 deletions
data_output/nextstrain/flu/h1n1pdm/ha/CY121680/unreleased/README.md
@@ -0,0 +1,40 @@
# Influenza A(H1N1pdm) HA based on reference "A/California/07/2009"

| Key | Value |
| -------------------- | -------------------- |
| authors | [Richard Neher](https://neherlab.org), [Nextstrain](https://nextstrain.org) |
| name | Influenza A(H1N1pdm) HA |
| reference | A/California/07/2009 |
| dataset path | flu/h1n1pdm/ha/CY121680 |
| reference accession | CY121680 |
| clade definitions | [github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/) |

## Scope of this dataset
This dataset uses an older reference sequence (A/California/07/2009), and recent sequences will differ from this reference at a large number of positions.
For the analysis of currently circulating viruses, the dataset using A/Wisconsin/588/2019 as reference might be more appropriate.

## Features
This dataset supports

* Assignment to clades and subclades based on the nomenclature defined in [github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/)
* Identification of glycosylation motifs
* Sequence QC
* Phylogenetic placement

## Clades of seasonal influenza viruses

The WHO Collaborating Centers define "clades" as genetic groups of viruses with signature mutations to facilitate discussion of the circulating diversity of the viruses.
Clade demarcations do not always coincide with significantly different antigenic properties of the viruses.
Clade names are structured as _Number-Letter_ binomials separated by periods, as in `6B.1A.5a.2a.1`. These are sometimes shortened by omitting leading binomials, as in `5a.2a.1`.

In addition to these clades, "subclades" are defined to break down diversity at higher resolution and to allow following the spread of different viral groups.
These follow a Pango-like nomenclature consisting of a letter followed by numbers separated by periods, as in `C.1.2`.
The leading letter is an alias of a previous name.
Details of the nomenclature system can be found at [github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/](https://github.com/influenza-clade-nomenclature/seasonal_A-H1N1pdm_HA/).

## What is a Nextclade dataset

Read more about Nextclade datasets in the Nextclade documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html
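The README says the leading letter of a subclade is an alias of a previous name, and the changelog records one such alias (C.1.1.1 as D). A minimal sketch of how such names could be expanded and compared follows. The function names and the alias table are hypothetical and are not part of Nextclade or the nomenclature repository; the descendant check assumes that, as in Pango-style naming, a descendant's expanded name extends its ancestor's name.

```python
# Hypothetical alias table. Only the D -> C.1.1.1 entry is taken from the
# changelog in this commit; a real table would come from the nomenclature repo.
ALIASES = {"D": "C.1.1.1"}


def expand_subclade(name: str, aliases: dict[str, str] = ALIASES) -> str:
    """Replace a leading alias letter with the longer name it stands for."""
    head, sep, tail = name.partition(".")
    seen = set()  # guards against a cyclic alias table
    while head in aliases and head not in seen:
        seen.add(head)
        name = aliases[head] + sep + tail
        head, sep, tail = name.partition(".")
    return name


def is_descendant(child: str, parent: str, aliases: dict[str, str] = ALIASES) -> bool:
    """True if the expanded child name strictly extends the expanded parent name."""
    c = expand_subclade(child, aliases).split(".")
    p = expand_subclade(parent, aliases).split(".")
    return len(c) > len(p) and c[: len(p)] == p


print(expand_subclade("D.5"))        # C.1.1.1.5
print(is_descendant("D.5", "C.1"))   # True
print(is_descendant("C.1.9", "D"))   # False
```

Comparing name components rather than raw string prefixes avoids treating `C.11` as a descendant of `C.1`.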
Binary file added
+1.16 MB
data_output/nextstrain/flu/h1n1pdm/ha/CY121680/unreleased/dataset.zip
Binary file not shown.
5 changes: 5 additions & 0 deletions
data_output/nextstrain/flu/h1n1pdm/ha/CY121680/unreleased/genome_annotation.gff3
@@ -0,0 +1,5 @@
##gff-version 3
##sequence-region CY121680.1 1 1752
CY121680.1 feature gene 21 71 . + . gene_name="SigPep"
CY121680.1 feature gene 72 1052 . + . gene_name="HA1"
CY121680.1 feature gene 1053 1718 . + . gene_name="HA2"
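As a quick sanity check on the annotation above, the sketch below parses the three feature lines and confirms that each region spans a whole number of codons. It is an illustration only: `parse_features` and `cds_length` are hypothetical helpers, not Nextclade functions, and the parser splits on any whitespace because the tab separators of the original GFF3 file are not preserved in this view.

```python
# Annotation text copied from the diff above (whitespace-separated here;
# the real GFF3 file uses tabs between its nine columns).
GFF3_TEXT = """\
##gff-version 3
##sequence-region CY121680.1 1 1752
CY121680.1 feature gene 21 71 . + . gene_name="SigPep"
CY121680.1 feature gene 72 1052 . + . gene_name="HA1"
CY121680.1 feature gene 1053 1718 . + . gene_name="HA2"
"""


def parse_features(text: str) -> list[dict]:
    """Return one dict per feature line, skipping directives and blank lines."""
    features = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # Nine GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes
        seqid, _source, _type, start, end, _score, strand, _phase, attrs = line.split(None, 8)
        _key, _, value = attrs.partition("=")
        features.append(
            {"seqid": seqid, "name": value.strip('"'), "start": int(start), "end": int(end), "strand": strand}
        )
    return features


def cds_length(feature: dict) -> int:
    """Length in nucleotides; GFF3 coordinates are 1-based and inclusive."""
    return feature["end"] - feature["start"] + 1


for f in parse_features(GFF3_TEXT):
    n = cds_length(f)
    print(f["name"], n, "nt,", n // 3, "codons, in frame:", n % 3 == 0)
```

With the coordinates given, the regions are 51, 981, and 666 nucleotides long, so each is a multiple of three.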
126 changes: 126 additions & 0 deletions
data_output/nextstrain/flu/h1n1pdm/ha/CY121680/unreleased/pathogen.json
@@ -0,0 +1,126 @@
{
  "schemaVersion": "3.0.0",
  "alignmentParams": {
    "excessBandwidth": 9,
    "terminalBandwidth": 100,
    "allowedMismatches": 4,
    "gapAlignmentSide": "right",
    "minSeedCover": 0.1
  },
  "compatibility": {
    "cli": "3.0.0-alpha.0",
    "web": "3.0.0-alpha.0"
  },
  "defaultCds": "HA1",
  "files": {
    "changelog": "CHANGELOG.md",
    "examples": "sequences.fasta",
    "genomeAnnotation": "genome_annotation.gff3",
    "pathogenJson": "pathogen.json",
    "readme": "README.md",
    "reference": "reference.fasta",
    "treeJson": "tree.json"
  },
  "qc": {
    "privateMutations": {
      "enabled": true,
      "typical": 5,
      "cutoff": 15,
      "weightLabeledSubstitutions": 2,
      "weightReversionSubstitutions": 1,
      "weightUnlabeledSubstitutions": 1
    },
    "missingData": {
      "enabled": false,
      "missingDataThreshold": 100,
      "scoreBias": 10
    },
    "snpClusters": {
      "enabled": false,
      "windowSize": 100,
      "clusterCutOff": 5,
      "scoreWeight": 50
    },
    "mixedSites": {
      "enabled": true,
      "mixedSitesThreshold": 4
    },
    "frameShifts": {
      "enabled": true
    },
    "stopCodons": {
      "enabled": true,
      "ignoredStopCodons": []
    }
  },
  "cdsOrderPreference": [
    "HA1",
    "HA2"
  ],
  "maintenance": {
    "website": [
      "https://nextstrain.org",
      "https://clades.nextstrain.org"
    ],
    "documentation": [
      "https://github.com/nextstrain/seasonal-flu"
    ],
    "source code": [
      "https://github.com/nextstrain/seasonal_flu"
    ],
    "issues": [
      "https://github.com/nextstrain/seasonal_flu/issues"
    ],
    "organizations": [
      "Nextstrain"
    ],
    "authors": [
      "Nextstrain team <https://nextstrain.org>"
    ]
  },
  "nucMutLabelMap": {},
  "nucMutLabelMapReverse": {},
  "shortcuts": [
    "flu_h1n1pdm_ha_broad",
    "nextstrain/flu/h1n1pdm/ha/california-7-2009"
  ],
  "aaMotifs": [
    {
      "name": "glycosylation",
      "nameShort": "Glyc.",
      "nameFriendly": "Glycosylation",
      "description": "N-linked glycosylation motifs (N-X-S/T with X any amino acid other than P)",
      "includeCdses": [
        {
          "cds": "HA1",
          "ranges": []
        },
        {
          "cds": "HA2",
          "ranges": [
            {
              "begin": 0,
              "end": 186
            }
          ]
        }
      ],
      "motifs": [
        "N[^P][ST]"
      ]
    }
  ],
  "attributes": {
    "name": "Influenza A H1N1pdm HA",
    "segment": "ha",
    "reference accession": "CY121680",
    "reference name": "A/California/7/2009-egg"
  },
  "version": {
    "tag": "unreleased",
    "compatibility": {
      "cli": "3.0.0-alpha.0",
      "web": "3.0.0-alpha.0"
    }
  }
}
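The `aaMotifs` entry above describes N-linked glycosylation sites with the regular expression `N[^P][ST]`. The sketch below shows what that pattern matches on a made-up peptide. It is not Nextclade's implementation: `find_motifs` is a hypothetical helper, the peptide is invented, reporting overlapping sites is a choice made here, and treating `begin`/`end` as a 0-based half-open range is an assumption about how the `ranges` entries are meant to be read.

```python
import re

GLYCOSYLATION = "N[^P][ST]"  # pattern copied from the "motifs" list above


def find_motifs(peptide: str, pattern: str = GLYCOSYLATION, begin: int = 0, end=None):
    """Return (position, matched_text) pairs for every motif inside peptide[begin:end].

    A zero-width lookahead is used so that overlapping sites (e.g. NNST
    contains both NNS and NST) are all reported.
    """
    window = peptide[begin:end]
    lookahead = re.compile(f"(?=({pattern}))")
    return [(begin + m.start(), m.group(1)) for m in lookahead.finditer(window)]


toy_peptide = "MNQSANPTNNST"  # invented sequence, not taken from the dataset
print(find_motifs(toy_peptide))          # [(1, 'NQS'), (8, 'NNS'), (9, 'NST')]
print(find_motifs(toy_peptide, end=10))  # [(1, 'NQS')]
```

`NPT` at position 5 is skipped because the middle residue is proline, which is the exclusion stated in the motif description.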
2 changes: 2 additions & 0 deletions
data_output/nextstrain/flu/h1n1pdm/ha/CY121680/unreleased/reference.fasta
@@ -0,0 +1,2 @@
>CY121680.1 Influenza A virus (A/California/07/2009(H1N1)) hemagglutinin (HA) gene, complete cds
GGAAAACAAAAGCAACAAAAATGAAGGCAATACTAGTAGTTCTGCTATATACATTTGCAACCGCAAATGCAGACACATTATGTATAGGTTATCATGCGAACAATTCAACAGACACTGTAGACACAGTACTAGAAAAGAATGTAACAGTAACACACTCTGTTAACCTTCTAGAAGACAAGCATAACGGGAAACTATGCAAACTAAGAGGGGTAGCCCCATTGCATTTGGGTAAATGTAACATTGCTGGCTGGATCCTGGGAAATCCAGAGTGTGAATCACTCTCCACAGCAAGCTCATGGTCCTACATTGTGGAAACACCTAGTTCAGACAATGGAACGTGTTACCCAGGAGATTTCATCGATTATGAGGAGCTAAGAGAGCAATTGAGCTCAGTGTCATCATTTGAAAGGTTTGAGATATTCCCCAAGACAAGTTCATGGCCCAATCATGACTCGAACAAAGGTGTAACGGCAGCATGTCCTCATGCTGGAGCAAAAAGCTTCTACAAAAATTTAATATGGCTAGTTAAAAAAGGAAATTCATACCCAAAGCTCAGCAAATCCTACATTAATGATAAAGGGAAAGAAGTCCTCGTGCTATGGGGCATTCACCATCCATCTACTAGTGCTGACCAACAAAGTCTCTATCAGAATGCAGATGCATATGTTTTTGTGGGGTCATCAAGATACAGCAAGAAGTTCAAGCCGGAAATAGCAATAAGACCCAAAGTGAGGGATCGAGAAGGGAGAATGAACTATTACTGGACACTAGTAGAGCCGGGAGACAAAATAACATTCGAAGCAACTGGAAATCTAGTGGTACCGAGATATGCATTCGCAATGGAAAGAAATGCTGGATCTGGTATTATCATTTCAGATACACCAGTCCACGATTGCAATACAACTTGTCAAACACCCAAGGGTGCTATAAACACCAGCCTCCCATTTCAGAATATACATCCGATCACAATTGGAAAATGTCCAAAATATGTAAAAAGCACAAAATTGAGACTGGCCACAGGATTGAGGAATATCCCGTCTATTCAATCTAGAGGCCTATTTGGGGCCATTGCCGGTTTCATTGAAGGGGGGTGGACAGGGATGGTAGATGGATGGTACGGTTATCACCATCAAAATGAGCAGGGGTCAGGATATGCAGCCGACCTGAAGAGCACACAGAATGCCATTGACGAGATTACTAACAAAGTAAATTCTGTTATTGAAAAGATGAATACACAGTTCACAGCAGTAGGTAAAGAGTTCAACCACCTGGAAAAAAGAATAGAGAATTTAAATAAAAAAGTTGATGATGGTTTCCTGGACATTTGGACTTACAATGCCGAACTGTTGGTTCTATTGGAAAATGAAAGAACTTTGGACTACCACGATTCAAATGTGAAGAACTTATATGAAAAGGTAAGAAGCCAGCTAAAAAACAATGCCAAGGAAATTGGAAACGGCTGCTTTGAATTTTACCACAAATGCGATAACACGTGCATGGAAAGTGTCAAAAATGGGACTTATGACTACCCAAAATACTCAGAGGAAGCAAAATTAAACAGAGAAGAAATAGATGGGGTAAAGCTGGAATCAACAAGGATTTACCAGATTTTGGCGATCTATTCAACTGTCGCCAGTTCATTGGTACTGGTAGTCTCCCTGGGGGCAATCAGTTTCTGGATGTGCTCTAATGGGTCTCTACAGTGTAGAATATGTATTTAACATTAGGATTTCAGAAGCATGAGAAAAACAC |