Improve sequence details page's UI #1465

Open
7 of 15 tasks
chaoran-chen opened this issue Mar 25, 2024 · 7 comments
Labels: epic (a major task that should be broken down into smaller tasks), feature (feature proposal), website (tasks related to the web application)

Comments

@chaoran-chen chaoran-chen added the website Tasks related to the web application label Mar 25, 2024
@chaoran-chen chaoran-chen added this to the MVP (public) milestone Mar 25, 2024
@theosanderson
Member

Result

So I'm imagining subheadings for various things.

(unheaded) frontmatter:

  • Isolate name
  • INSDC id
  • Released at
  • Data user terms
  • Country
  • Authors

Sequence info:

  • Length
  • Nucleotide substitutions
  • Nucleotide deletions
  • Amino acid substitutions

Additional metadata:

  • host
  • patient status/etc.

Implementation

I'd be imagining the config yaml would have

detailsPage:
  frontmatter:
    - isolate_name
    - insdc_id
  Sequence_Info:
    - length

etc.
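As a rough sketch of how such a config could be consumed on the website side, the snippet below groups a flat metadata record into the configured sections. The `DetailsPageConfig` shape, the `groupBySection` helper, and the example values are assumptions for illustration, not existing Loculus code.

```typescript
// Hypothetical shape: section name -> ordered list of metadata field names.
type DetailsPageConfig = Record<string, string[]>;

type SectionEntry = { field: string; value: string };
type Section = { heading: string; entries: SectionEntry[] };

// Group a flat metadata record into the configured sections, in config order.
// Fields missing from the record are skipped; sections left empty are dropped.
function groupBySection(
  config: DetailsPageConfig,
  metadata: Record<string, string | undefined>,
): Section[] {
  const sections: Section[] = [];
  for (const heading of Object.keys(config)) {
    const entries: SectionEntry[] = [];
    for (const field of config[heading]) {
      const value = metadata[field];
      if (value !== undefined) {
        entries.push({ field, value });
      }
    }
    if (entries.length > 0) {
      sections.push({ heading, entries });
    }
  }
  return sections;
}

// Example config mirroring the YAML sketch; the metadata values are made up.
const exampleConfig: DetailsPageConfig = {
  frontmatter: ["isolate_name", "insdc_id"],
  Sequence_Info: ["length"],
};

const grouped = groupBySection(exampleConfig, {
  isolate_name: "example-isolate",
  length: "29903",
});
```

Keeping the grouping in a pure function like this would let the page component stay a thin renderer over `Section[]`.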

@chaoran-chen chaoran-chen moved this to Prioritized in Planning Mar 25, 2024
@theosanderson
Member

Also, we should

  • display the sequence by default for small sequences (with useEffect() so it doesn't delay page-load)
  • but display the sequence in a scrollable box, not taking up infinite space
  • display the sequence in FASTA format with a header
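A minimal sketch of the non-React part of this: deciding whether a sequence is small enough to show by default, and formatting it as FASTA with a header. The threshold value, the line width, and both function names are assumptions, not anything decided in this issue.

```typescript
// Hypothetical threshold below which the sequence is shown without a click.
const DEFAULT_DISPLAY_MAX_LENGTH = 50000;

// True if the sequence is short enough to display by default.
function shouldDisplayByDefault(
  sequence: string,
  maxLength: number = DEFAULT_DISPLAY_MAX_LENGTH,
): boolean {
  return sequence.length <= maxLength;
}

// Format a sequence as FASTA: a ">" header line followed by fixed-width lines.
function toFasta(header: string, sequence: string, lineWidth: number = 70): string {
  const lines: string[] = [">" + header];
  for (let i = 0; i < sequence.length; i += lineWidth) {
    lines.push(sequence.slice(i, i + lineWidth));
  }
  return lines.join("\n");
}
```

In the component, the sequence would presumably be fetched inside `useEffect()` and the result of `toFasta` rendered in a container with a fixed `max-height` and `overflow-y: auto`, so the box scrolls instead of growing with the sequence.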

@chaoran-chen
Member Author

See also #100 from July last year.

@rneher

rneher commented Mar 26, 2024

Here are a few ideas on how to improve the sequence page.

Authors

For author lists, we probably want something like journals do:
[image]

i.e. abbreviated author lists that can be expanded on click. I imagine we need the same feature for the datasets page. Authors might have an ORCID or email associated, so we need something that renders a list of structured author data with optional features like links to ORCID etc. Ingested data from NCBI is going to be messy, but a subset of these features will still work.
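To make the idea concrete, here is a small sketch of structured author data with an abbreviated view. The `Author` shape, the default of three visible authors, and the helper names are assumptions; the only external fact used is that ORCID records are addressed as `https://orcid.org/<iD>`.

```typescript
// Hypothetical structured author record; only `name` is required, so messy
// ingested data can still be rendered without the optional features.
type Author = { name: string; orcid?: string; email?: string };

type AuthorListView = { shown: Author[]; hiddenCount: number };

// Return the authors to show and how many are hidden behind "show more".
function abbreviateAuthors(
  authors: Author[],
  expanded: boolean,
  maxShown: number = 3,
): AuthorListView {
  if (expanded || authors.length <= maxShown) {
    return { shown: authors, hiddenCount: 0 };
  }
  return {
    shown: authors.slice(0, maxShown),
    hiddenCount: authors.length - maxShown,
  };
}

// Link to the author's ORCID record, if an iD is available.
function orcidUrl(author: Author): string | undefined {
  return author.orcid === undefined ? undefined : "https://orcid.org/" + author.orcid;
}
```

The click handler would then only need to flip `expanded`, and the same helper could be reused on the datasets page.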

Host

For the host, we can aggregate information like

  • Ncbi_host: Homo Sapiens
  • Ncbi_host_taxon: 9606
  • Ncbi_is_lab_host:

into a single field that looks like Homo Sapiens (9606) ({surveillance,laboratory,pool}) and links to the NCBI Taxonomy database. There is probably a dictionary to look up common names, which would be very useful (in particular if we target internationalization at some point).
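A sketch of that aggregation, assuming the three raw fields have already been mapped into a small `HostInfo` object. The type, the function names, and the Taxonomy browser URL pattern are assumptions for illustration.

```typescript
// Hypothetical aggregated host record built from the raw Ncbi_* fields.
type HostInfo = { name?: string; taxonId?: number; context?: string };

// Build a display string like "Homo Sapiens (9606) (laboratory)",
// leaving out any part that is missing.
function formatHost(host: HostInfo): string {
  const parts: string[] = [];
  if (host.name) parts.push(host.name);
  if (host.taxonId !== undefined) parts.push("(" + host.taxonId + ")");
  if (host.context) parts.push("(" + host.context + ")");
  return parts.join(" ");
}

// Assumed URL pattern for the NCBI Taxonomy browser entry of a taxon ID.
function taxonomyUrl(taxonId: number): string {
  return "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=" + taxonId;
}
```

A common-name lookup could be added later as a separate dictionary keyed by taxon ID without changing this formatting.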

Another group of fields could be on virus, lineage/clade/serotype etc.

INSDC

Yet another group of fields would be INSDC which would be based on the raw data in

  • Insdc accession base: OR084932
  • Insdc version: 1
  • INSDC accession: OR084932.1
  • NCBI_release_date: 2022-02-15
  • SRA accession
  • BioProject

I'd imagine a header INSDC and then something like

  • OR084932 (version 1, released on 2022-02-15)
  • SRA: unknown
  • BioProject: XXXXXX
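The mapping from the raw fields to those display lines could look roughly like this. The `InsdcRaw` field names and the `insdcSummary` helper are hypothetical; showing "unknown" for a missing BioProject mirrors the SRA example and is an assumption too.

```typescript
// Hypothetical record of the raw INSDC-related metadata fields.
type InsdcRaw = {
  accessionBase: string;
  version: number;
  releaseDate?: string;
  sraAccession?: string;
  bioProject?: string;
};

// Produce the display lines for the INSDC group.
function insdcSummary(raw: InsdcRaw): string[] {
  const released = raw.releaseDate ? ", released on " + raw.releaseDate : "";
  return [
    raw.accessionBase + " (version " + raw.version + released + ")",
    "SRA: " + (raw.sraAccession ?? "unknown"),
    "BioProject: " + (raw.bioProject ?? "unknown"),
  ];
}
```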

Alignment states and QC metrics

There will be several quality metrics like

  • completeness
  • mixed-sites
  • stop/frameshifts

as well as things like alignment length. The LANL HIV database, for example, includes little previews like this:
[image]

(they actually put these into the table to search and browse).

Mutations, insertions, deletions

For mutations, I would follow a similar approach to authors: truncated lists that by default only span one line. Mutations could be rendered as little badges, which makes them easier to parse than plain text like C87665T. One line could be nucleotide mutations, then one line for each gene/CDS. This way users can quickly find mutations in a particular gene (most of the time, people only care about a specific gene). Alternatively, the amino acid mutations could have a drop-down in which you select the gene of interest (with a sensible default for each pathogen).

Insertions and deletions can be handled similarly, though they are typically fewer.
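A sketch of the data side of such badges: parsing a substitution string into its parts and grouping by gene, so each gene gets its own line. The gene-prefixed form `S:N501Y` for amino acid mutations, the `"nucleotide"` group name, and all type and function names are assumptions for illustration.

```typescript
// Hypothetical parsed substitution; `gene` is absent for nucleotide mutations.
type Substitution = { gene?: string; from: string; position: number; to: string };

// Parse "C87665T" (nucleotide) or "S:N501Y" (amino acid with an assumed
// "gene:" prefix). Returns undefined if the text does not match.
function parseSubstitution(text: string): Substitution | undefined {
  const match = /^(?:([^:]+):)?([A-Za-z*-])(\d+)([A-Za-z*-])$/.exec(text);
  if (match === null) return undefined;
  return {
    gene: match[1],
    from: match[2],
    position: Number(match[3]),
    to: match[4],
  };
}

// Group substitutions by gene; nucleotide mutations go under "nucleotide".
function groupByGene(subs: Substitution[]): Record<string, Substitution[]> {
  const groups: Record<string, Substitution[]> = {};
  for (const sub of subs) {
    const key = sub.gene === undefined ? "nucleotide" : sub.gene;
    if (!groups[key]) groups[key] = [];
    groups[key].push(sub);
  }
  return groups;
}
```

Each group could then be truncated to one line in the same way as the author list, with the badge showing `from`, `position` and `to` as separate styled parts.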

@corneliusroemer
Contributor

This PR is quite a good template for similar improvements: it shows how to pipe new config options through from values.yaml (Kubernetes) to the website: https://github.com/loculus-project/loculus/pull/1442/files

@anna-parker
Contributor

anna-parker commented Apr 17, 2024

I started looking into this. From a short discussion with @corneliusroemer and @bh-ethz, it seems best to split this milestone into a couple of sub-tasks:

  • Allow the metadata to be split into subsections with (optional) subheadings and further display options for individual subsections
  • Format authors lists using ORCID
  • Add links to the NCBI and INSDC databases
  • Display sequences e.g. with alignment states and QC metrics in a scrollable form (in FASTA format with a header) with useEffect()
  • Display mutations, insertions and deletions relative to the reference sequence in a more parsable format (see Richard's suggestions: Improve sequence details page's UI #1465 (comment)). (This will potentially need to be further split into subtasks.)
  • Show originally submitted data somehow, e.g. in tooltip

Update: Added the tasks to the description to use GitHub's subtask feature.

@corneliusroemer
Contributor

Great idea to split it up into chunks! I've added an extra list item to show the originally submitted data somehow, e.g. in a tooltip. I think this is something @emmahodcroft suggested. We always process user-submitted metadata; it can stay unchanged, but in general we might reformat it, so it's good to have the original data around to make the processing transparent.

@chaoran-chen chaoran-chen added the epic A major task that should be broken down into smaller tasks label May 19, 2024
@anna-parker anna-parker moved this from Prioritized to In Progress in Planning Jun 28, 2024
@chaoran-chen chaoran-chen modified the milestones: MVP, MVP (nice to have) Jul 4, 2024
@chaoran-chen chaoran-chen moved this from In Progress to Backlog in Planning Jul 22, 2024
@chaoran-chen chaoran-chen added the feature Feature proposal label Sep 1, 2024