feat(docs): list valid nucleotide and amino acid symbols #573
fengelniederhammer committed Jan 16, 2024
1 parent 9158575 commit 35ccf58
Showing 6 changed files with 72 additions and 29 deletions.
8 changes: 6 additions & 2 deletions lapis2-docs/astro.config.mjs
@@ -51,8 +51,12 @@ export default defineConfig({
link: '/references/open-api-definition/',
},
{
-label: 'Reference Genome',
-link: '/references/reference-genome/',
+label: 'Reference Genomes',
+link: '/references/reference-genomes/',
+},
+{
+label: 'Nucleotide And Amino Acid Symbols',
+link: '/references/nucleotide-and-amino-acid-symbols/',
},
{
label: 'Database Config',
23 changes: 2 additions & 21 deletions lapis2-docs/src/content/docs/concepts/ambiguous-symbols.mdx
@@ -3,27 +3,8 @@ title: Ambiguous symbols
description: Explanation of how ambiguous reads are handled in the data
---

-The underlying sequence files in `.FASTA` format can contain any of the following symbols:
-
-| Symbol | Meaning |
-| ------ | ----------------- |
-| A | Adenine |
-| C | Cytosine |
-| G | Guanine |
-| T | Thymine |
-| - | Deletion |
-| N | failed read / any |
-| R | A or G |
-| Y | C or T |
-| S | C or G |
-| W | A or T |
-| K | G or T |
-| M | A or C |
-| B | not A |
-| D | not C |
-| H | not G |
-| V | not T |
-
+[The symbols page](/references/nucleotide-and-amino-acid-symbols)
+lists all symbols that the underlying sequence files in `.FASTA` format can contain.
The ambiguous symbols arise from imperfect reads in the sequencer.

While one mostly queries for the symbols `A`, `C`, `G`, `T` and `-` to look for specific features and mutations of a sequence,
@@ -39,7 +39,7 @@ The file must contain a JSON object with two keys:

A reference sequence is a JSON object that permits the following keys:

-| Key | Type | Required | Description |
-| -------- | ------ | -------- | ----------------------------------------- |
-| name | string | true | The name of the sequence. Must be unique. |
-| sequence | string | true | The sequence. |
+| Key | Type | Required | Description |
+| -------- | ------ | -------- | ----------------------------------------------------------------------------------------------- |
+| name | string | true | The name of the sequence. Must be unique. |
+| sequence | string | true | The sequence. See [here for allowed characters](/references/nucleotide-and-amino-acid-symbols). |
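
The reference sequence object described in this table could be mirrored by a small TypeScript type and validator. This is only a minimal sketch, not code from the LAPIS repository: the names `ReferenceSequence`, `isReferenceSequence`, and `findDuplicateNames` are hypothetical, and it assumes the sequences are available as a plain array after parsing the JSON file.

```typescript
// Hypothetical type mirroring the table: both keys are required strings.
interface ReferenceSequence {
    name: string;
    sequence: string;
}

// Type guard: accepts only non-null objects whose `name` and `sequence` are strings.
function isReferenceSequence(value: unknown): value is ReferenceSequence {
    if (typeof value !== 'object' || value === null) {
        return false;
    }
    const candidate = value as Record<string, unknown>;
    return typeof candidate.name === 'string' && typeof candidate.sequence === 'string';
}

// Returns each name that occurs more than once, because names must be unique.
function findDuplicateNames(sequences: ReferenceSequence[]): string[] {
    const counts: Record<string, number> = Object.create(null);
    const duplicates: string[] = [];
    for (let i = 0; i < sequences.length; i++) {
        const name = sequences[i].name;
        counts[name] = (counts[name] || 0) + 1;
        // Report a name once, at the moment its second occurrence is seen.
        if (counts[name] === 2) {
            duplicates.push(name);
        }
    }
    return duplicates;
}
```

For instance, `findDuplicateNames` returns `['main']` when two entries are both named `main`, and an empty array when all names differ. The sketch does not check the characters of `sequence`; that would be a separate step against the symbols page.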
@@ -0,0 +1,57 @@
---
title: Nucleotide And Amino Acid Symbols
description: A reference list of valid symbols
---

This page lists valid symbols for nucleotides and amino acids.

## Nucleotides

| Symbol | Meaning | Ambiguous Symbol |
| ------ | ----------------- | ---------------- |
| A | Adenine | |
| C | Cytosine | |
| G | Guanine | |
| T | Thymine | |
| - | Deletion | |
| N | failed read / any | |
| R | A or G | ✓ |
| Y | C or T | ✓ |
| S | C or G | ✓ |
| W | A or T | ✓ |
| K | G or T | ✓ |
| M | A or C | ✓ |
| B | not A | ✓ |
| D | not C | ✓ |
| H | not G | ✓ |
| V | not T | ✓ |

## Amino Acids

| Symbol | Meaning | Ambiguous Symbol |
| ------ | --------------------------- | ---------------- |
| - | Deletion | |
| A | Alanine | |
| C | Cysteine | |
| D | Aspartic Acid | |
| E | Glutamic Acid | |
| F | Phenylalanine | |
| G | Glycine | |
| H | Histidine | |
| I | Isoleucine | |
| K | Lysine | |
| L | Leucine | |
| M | Methionine | |
| N | Asparagine | |
| P | Proline | |
| Q | Glutamine | |
| R | Arginine | |
| S | Serine | |
| T | Threonine | |
| V | Valine | |
| W | Tryptophan | |
| Y | Tyrosine | |
| \* | Stop codon | |
| B | Aspartic Acid or Asparagine | ✓ |
| Z | Glutamine or Glutamic Acid | ✓ |
| X | Any amino acid | ✓ |
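
To illustrate how the nucleotide table on this page can be applied, the sketch below encodes each symbol as the set of bases it may stand for and uses that to validate a sequence. It is a hedged example, not LAPIS code: `NUCLEOTIDE_EXPANSIONS`, `findInvalidSymbols`, and `symbolMatchesBase` are hypothetical names, `N` is assumed to expand to all four bases, symbols are matched exactly as written in the table (uppercase only), and positions are reported 1-based by assumption.

```typescript
// Each symbol from the Nucleotides table maps to the bases it can represent.
// "not A" is written out as C, G, T, and so on. The deletion symbol maps to itself.
const NUCLEOTIDE_EXPANSIONS: Record<string, string> = {
    A: 'A',
    C: 'C',
    G: 'G',
    T: 'T',
    '-': '-',
    N: 'ACGT', // assumption: "failed read / any" is treated as any base
    R: 'AG',
    Y: 'CT',
    S: 'CG',
    W: 'AT',
    K: 'GT',
    M: 'AC',
    B: 'CGT',
    D: 'AGT',
    H: 'ACT',
    V: 'ACG',
};

// True if the symbol is one of the keys listed in the table.
function isKnownSymbol(symbol: string): boolean {
    return Object.prototype.hasOwnProperty.call(NUCLEOTIDE_EXPANSIONS, symbol);
}

// Lists every character of the sequence that is not a valid nucleotide symbol.
function findInvalidSymbols(sequence: string): { position: number; symbol: string }[] {
    const invalid: { position: number; symbol: string }[] = [];
    for (let i = 0; i < sequence.length; i++) {
        const symbol = sequence.charAt(i);
        if (!isKnownSymbol(symbol)) {
            invalid.push({ position: i + 1, symbol: symbol });
        }
    }
    return invalid;
}

// True if `symbol` could stand for the single base `base` (for example R for A).
function symbolMatchesBase(symbol: string, base: string): boolean {
    if (base.length !== 1 || !isKnownSymbol(symbol)) {
        return false;
    }
    return NUCLEOTIDE_EXPANSIONS[symbol].indexOf(base) !== -1;
}
```

With this mapping, `symbolMatchesBase('R', 'A')` is `true` and `symbolMatchesBase('B', 'A')` is `false`, matching "A or G" and "not A" in the table. An analogous map could be written for the amino acid table.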
@@ -1,5 +1,5 @@
---
-title: Reference Genome
+title: Reference Genomes
description: reference genome
---

3 changes: 2 additions & 1 deletion lapis2-docs/tests/docs.spec.ts
@@ -8,7 +8,8 @@ const referencesPages = [
'Fields',
'Filters',
'Open API / Swagger',
-'Reference Genome',
+'Reference Genomes',
+'Nucleotide And Amino Acid Symbols',
'Database Config',
];
