Fixup: Add date annotations for rare genotypes
Six of the samples that are force-included in the Nextclade dataset tree have empty collection date fields in the metadata output from NCBI Datasets. This results in the samples being removed downstream by the TreeTime clock filter. This commit adds collection dates (which were manually extracted from the strain names in the NCBI metadata) for these samples so that they will be included in the Nextclade dataset tree.
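The override mechanism described above can be sketched roughly as follows. This is only an illustration, not the ingest workflow's actual code: it assumes the annotations file is tab-separated with three columns (accession, field, value) and may carry trailing `#` comments. The function names `load_annotations` and `apply_annotations` and the `accession` key are hypothetical, and the metadata records are made up for the example.

```python
import csv
import io


def load_annotations(text):
    """Parse rows of the assumed form: accession <TAB> field <TAB> value [# comment]."""
    overrides = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip blank lines, full-line comments, and malformed rows.
        if not row or row[0].startswith("#") or len(row) < 3:
            continue
        accession, field = row[0], row[1]
        # Drop any trailing "# ..." comment from the value column.
        value = row[2].split("#", 1)[0].strip()
        overrides.setdefault(accession, {})[field] = value
    return overrides


def apply_annotations(records, overrides):
    """Return copies of the records with annotated fields overriding the originals."""
    return [{**rec, **overrides.get(rec["accession"], {})} for rec in records]


annotations = (
    "# Strains with rare genotypes\n"
    "AF410989\tdate\t1987-03-09 # genotype E\n"
)
# Illustrative metadata records; the first has an empty collection date.
records = [
    {"accession": "AF410989", "date": ""},
    {"accession": "X84865", "date": "1994-XX-XX"},
]
patched = apply_annotations(records, load_annotations(annotations))
print(patched[0]["date"])  # 1987-03-09
```

With the empty date filled in, a record like this would no longer be dropped by a filter that requires a collection date.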
kimandrews committed Jun 10, 2024
1 parent dc3cd4b commit cd15009
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions ingest/defaults/annotations.tsv
@@ -146,3 +146,13 @@ U64582 date 1988-XX-XX
X84865 date 1994-XX-XX
X84872 date 1990-XX-XX
X84879 date 1971-XX-XX
#
# Strains with rare genotypes
# Dates are retrieved from epi-weeks reported within strain names on NCBI
# These are force-included in the nextclade tree to boost representation of rare genotypes
AF410989 date 1987-03-09 # genotype E
AY037009 date 2000-06-12 # genotype G2
AY037043 date 2000-04-17 # genotype H2
AY037026 date 1997-03-24 # genotype H2
AY037028 date 2000-03-13 # genotype D2
FJ668380 date 2003-02-10 # genotype D10
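
The commit does not say which week convention was used to turn the epi-weeks into dates. All six added dates fall on a Monday, which is consistent with taking the first day of an ISO week, so the sketch below assumes that convention (CDC MMWR weeks, by contrast, start on Sunday). The function name `week_to_monday` is hypothetical, and the week number in the example is inferred from the date, not read from a strain name.

```python
from datetime import date


def week_to_monday(year: int, week: int) -> date:
    """Return the Monday of the given ISO week (assumed convention, Python 3.8+)."""
    return date.fromisocalendar(year, week, 1)


# ISO week 11 of 1987 starts on the date annotated for AF410989.
print(week_to_monday(1987, 11))  # 1987-03-09
```

If the strain names instead follow a Sunday-based epi-week definition, the resulting dates would shift by one day, which has little effect at the resolution of a clock filter.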
