Wish List Nov 2024 #394

Closed
emmahodcroft opened this issue Nov 19, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@emmahodcroft
Collaborator

Some ideas for CoVariants, priority/order shouldn't be taken as set in stone!

Except this one, high priority:

  1. Stabilize website so there's no more risk of failing to build (may include storing data elsewhere)

Other ideas:

  • Improve graphs

    • Allow zooming (there's some work on this here)
    • Possibly make it easier to turn variants on/off
    • Improve the legend (currently shows all null/0 values so it often runs off page)
    • We currently have to store & then plot all null/0s or else the graphs don't plot correctly - this makes the files less efficient & leads to the annoying legends mentioned above
  • Allow website to be more customizable

    • Allow users to pick from any available variants in Shared Mutations (customize visible columns)
    • Allow users to show different genes outside of Spike for Shared Mutations (this information is currently already in clusters.py but could be copied out to different files like it is for Spike currently)
    • Allow users to flip between showing the Nextstrain name ("23I") and the Pango name ("BA.2.86") across the website - ideally some kind of toggle at the top (or that moves with scroll) so they could turn this on/off easily anywhere and menus/plots would adjust (not necessary for page text or tables to adjust - I think). These are linked in clusters.py
    • Allow users to specify what variants they'd like to show on the home page left-hand menu (lower priority)
  • Allow better defining mutations

    • I currently manually curate defining mutations for new variants I add; there are no other resources that list these that I know of. These files are found here (only on GitHub)
    • I'll continue to manually curate for variants I track as this ensures the 'big ones' are absolutely 100% correct, but this information would be cool to a) display better & b) automatically generate for all variants (again, not available anywhere that I know of, but incredibly useful)
    • Ideally we'd have auto-generated ones for all variants & this would be "overwritten" by a manual file if available (the manual file would display instead if detected)
    • Some starting work on this was here, with a very ugly preview of the idea here -- ideas to make this prettier welcome!
    • Cornelius would help us with writing the script to generate the files (he would probably generate them somewhere as part of his other workflows and we would pull them in)
  • Integrate better/expand to other pathogens?

    • There's a 'frequencies' app by Neher lab that does flu (and in the works, some other stuff) (github) -- we'd need a longer convo to talk about the differences and complexities here but in theory it might be nice to align with them so that CoV plots could eventually be shown on this page, and in theory maybe we could both expand to other viruses
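
A minimal sketch of the "manual file overrides auto-generated file" idea for defining mutations described above. The one-JSON-file-per-variant layout, the directory arguments, and the function name `load_defining_mutations` are all assumptions made for illustration, not the repository's actual structure:

```python
import json
from pathlib import Path


def load_defining_mutations(variant, manual_dir, auto_dir):
    """Return (mutations, source) for a variant.

    Hypothetical layout: each directory holds one `<variant>.json` file
    containing a list of mutation strings. The manually curated file is
    preferred; the auto-generated file is only used as a fallback.
    """
    for source, directory in (("manual", Path(manual_dir)), ("auto", Path(auto_dir))):
        path = directory / f"{variant}.json"
        if path.is_file():
            with path.open() as fh:
                return json.load(fh), source
    # Neither a curated nor a generated file exists for this variant.
    return None, "missing"
```

Returning the source alongside the list would let the page indicate whether a given set of defining mutations was hand-checked or generated.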

Backend stuff:

  • Potentially 'freeze' older data

    • CoV was tracking variants long before 'Variants of Concern' existed, so I track some individual mutations (a bit of a crazy idea nowadays) and some variants that were never 'official'. This means a lot of overhead:
      • For current variants I simply follow Nextstrain & then can benefit from simply using their classification (already done in the files I receive) to partition into variants
      • However, for older ones they don't have the correct (for me) Nextstrain classification so I need to identify them by checking lists of mutations - this is very inefficient and takes a long time
      • I don't want to change how I currently count or plot the past as a) I do think these pre-variants are potentially genuinely interesting b) many people will have already built stuff expecting this to be stable
      • But it's very unlikely at this point that people are going to be uploading enough new sequences from 2020/2021 that this majorly changes the graphs. Thus, we could 'freeze' early data and not re-calculate older years every time we re-run.
  • Improve backend efficiency generally

    • With or without the above, running faster would be nice
    • Reduce redundancy -- currently mut lists are often in 2-3 places, would be nice to have them in one place and then just get them for where they're needed
    • Currently we rely on the display_name from clusters.py far too broadly and for too many things, and this makes things very inflexible and brittle... this should be modified (more convo/digging needed!)
    • Adding a new variant could probably be streamlined and made easier
    • Set up automatic updates that go to staging?
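
As a rough illustration of the 'freeze' idea above: counts for weeks before a cutoff are taken from a stored snapshot and never recalculated, while later weeks come from the current run. The cutoff date, the `{week: {variant: count}}` data shape, and the name `merge_with_frozen` are assumptions for this sketch, not how the existing scripts work:

```python
from datetime import date

# Hypothetical cutoff: everything before this date is treated as frozen.
FREEZE_BEFORE = date(2022, 1, 1)


def merge_with_frozen(frozen, recomputed, cutoff=FREEZE_BEFORE):
    """Combine a frozen snapshot with freshly computed counts.

    Both inputs map a week (datetime.date) to {variant_name: count}.
    Weeks before `cutoff` are taken only from `frozen`; weeks on or after
    `cutoff` are taken only from `recomputed`.
    """
    merged = {week: counts for week, counts in frozen.items() if week < cutoff}
    merged.update(
        {week: counts for week, counts in recomputed.items() if week >= cutoff}
    )
    # Sort by week so downstream plotting sees a chronological series.
    return dict(sorted(merged.items()))
```

With something like this, the slow mutation-list matching for 2020/2021 sequences would only need to run once, when the snapshot is created.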
@emmahodcroft emmahodcroft added enhancement New feature or request help wanted Extra attention is needed good first issue Good for newcomers needs triage Pending maintainers' attention labels Nov 19, 2024
@AdvancedCodingMonkey AdvancedCodingMonkey removed help wanted Extra attention is needed good first issue Good for newcomers needs triage Pending maintainers' attention labels Nov 28, 2024
@AdvancedCodingMonkey
Collaborator

Split into separate issues to tackle
