feat(docs): add more important parts of the arc42 docs #535
1 parent fa70ac6, commit 5377ab9
Showing 16 changed files with 195 additions and 44 deletions.
16 changes: 16 additions & 0 deletions
.../src/content/docs/architecture-and-dev-docs/02-architecture-and-constraints.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
--- | ||
title: Architecture and Constraints | ||
description: Overview of the architecture and constraints of the software. | ||
--- | ||
|
||
We identified the following constraints for our software: | ||
|
||
- Developed under an **open-source** licence. | ||
We chose the tooling such that a broad spectrum of developers can in principle work on the software. | ||
- The software is designed to be **highly configurable** so that it can be used for various organisms. | ||
Configuration files that describe the organism have to be passed to LAPIS and SILO at runtime. They determine, for example: | ||
- the reference genome | ||
- which metadata is available for the genomic data | ||
- The system is designed to have the best possible **performance**. | ||
This mostly concerns SILO, but in LAPIS, too, | ||
we have to keep in mind that we serve potentially large amounts of data to the client. |
58 changes: 58 additions & 0 deletions
lapis2-docs/src/content/docs/architecture-and-dev-docs/04-solution-strategy.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
--- | ||
title: Solution Strategy | ||
description: How LAPIS and SILO aim to solve the problem | ||
--- | ||
|
||
## Setting Up Your Own Instance | ||
|
||
We want to make it as easy as possible for you to set up your own instance of SILO-LAPIS for an organism of your | ||
choice. | ||
We address this in two ways: | ||
|
||
- **Configuration:** LAPIS and SILO are highly configurable regarding the data that they process. | ||
The available data and the reference genome can be configured to fit your needs. | ||
- **Deployment:** We provide Docker containers for LAPIS and SILO that are ready to use. | ||
You only need to provide the data and the configuration. | ||
We also provide examples and tutorials to help you get started. | ||
|
||
## Query Performance | ||
|
||
LAPIS and SILO are designed to process queries as fast as possible. | ||
One should be able to search for mutations in millions of samples in a matter of seconds. | ||
|
||
SILO contains an in-memory database that holds the data. | ||
The data is stored column-wise in bitmaps, | ||
since most queries target individual columns. | ||
|
||
Example: A common query is to search for a mutation at a certain position in the genome. | ||
SILO stores each position in the genome as a separate column, | ||
thus the filter becomes trivial (reading the respective precomputed bitmap). | ||
The bitmap is interpreted as the filter result (having a `1` in the positions of the samples that match the filter). | ||
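
The lookup described above can be sketched in a few lines of Python. This is a toy model, not SILO's implementation: Python integers stand in for bitmaps, and keying the bitmaps by `(position, symbol)` as well as the names `build_position_bitmaps` and `matching_samples` are assumptions made for illustration.

```python
# Toy model of column-wise storage: one bitmap per (position, symbol).
# Bit i of a bitmap is 1 if sample i carries that symbol at that position.
# Python integers serve as bitmaps here; this is an illustration only.

def build_position_bitmaps(sequences):
    """Return {(position, symbol): bitmap} for equally long sequences."""
    bitmaps = {}
    for sample_index, sequence in enumerate(sequences):
        for position, symbol in enumerate(sequence, start=1):
            key = (position, symbol)
            bitmaps[key] = bitmaps.get(key, 0) | (1 << sample_index)
    return bitmaps


def matching_samples(bitmap):
    """Decode a bitmap into the list of sample indices whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


sequences = ["ACGT", "ACTT", "GCGT", "ACTT"]
bitmaps = build_position_bitmaps(sequences)

# The filter "T at position 3" is a single lookup; no sequence is scanned.
t_at_3 = bitmaps.get((3, "T"), 0)
print(matching_samples(t_at_3))  # [1, 3]

# Combining filters is a bitwise operation on the precomputed bitmaps.
a_at_1 = bitmaps.get((1, "A"), 0)
print(matching_samples(t_at_3 & a_at_1))  # [1, 3]
```

All the work happens in `build_position_bitmaps`; answering the filter itself only reads a bitmap that already exists.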
|
||
### Preprocessing | ||
|
||
Precomputing the bitmaps is a time-consuming task. | ||
SILO does this ahead of time in a separate step, the preprocessing. | ||
The preprocessing is a separate part of SILO that builds the in-memory database from the input files | ||
and serializes it to disk. | ||
At runtime, SILO can then load the serialized database from disk. | ||
Having the preprocessing as a separate step has major advantages: | ||
|
||
- The preprocessing can be done on a different machine than the one that runs the queries. | ||
- The startup time of SILO is reduced, since it only needs to load the database from disk. | ||
- Scalability: several instances of SILO can be launched quickly from the same preprocessing result. | ||
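
The split between preprocessing and runtime could look roughly like the following sketch. It assumes `pickle` as a stand-in serialization format and a hypothetical file name `database.bin`; SILO's actual on-disk format and function names are not shown here.

```python
import pickle
import tempfile
from pathlib import Path


def preprocess(sequences, output_file):
    """Build the bitmap index once and serialize it to disk."""
    index = {}
    for sample_index, sequence in enumerate(sequences):
        for position, symbol in enumerate(sequence, start=1):
            key = (position, symbol)
            index[key] = index.get(key, 0) | (1 << sample_index)
    output_file.write_bytes(pickle.dumps(index))


def load_database(input_file):
    """Runtime side: load the precomputed index without rebuilding it."""
    return pickle.loads(input_file.read_bytes())


with tempfile.TemporaryDirectory() as directory:
    state_file = Path(directory) / "database.bin"  # hypothetical file name
    preprocess(["ACGT", "ACTT", "GCGT"], state_file)
    # Two independent "instances" start from the same preprocessing result.
    instance_a = load_database(state_file)
    instance_b = load_database(state_file)

print(instance_a == instance_b)  # True
print(instance_a[(3, "T")])      # 2, i.e. only sample 1 has T at position 3
```

The expensive loop runs once in `preprocess`; every instance that starts afterwards only pays for reading the file.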
|
||
## Storage Efficiency | ||
|
||
SILO uses [Roaring bitmaps](https://roaringbitmap.org/) to store the data, | ||
since they are designed to be space-efficient. | ||
Internally, Roaring bitmaps store data in chunks. | ||
SILO aims to sort sequences such that | ||
similar sequences (i.e. sequences that have similar mutations) are stored in the same chunk. | ||
The goal is to have many bitmaps that are either almost completely empty or almost completely full. | ||
This will result in a very high compression ratio. | ||
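
Why sorting helps can be seen in a small experiment. The sketch below is not SILO's sorting algorithm; it uses a toy chunk size of 4 (Roaring bitmaps partition values into chunks of 2^16), hypothetical mutation names, and counts how many chunks are "mixed", i.e. neither completely empty nor completely full.

```python
CHUNK_SIZE = 4  # toy value; Roaring bitmaps use chunks of 2**16 values


def mixed_chunks(bits, chunk_size=CHUNK_SIZE):
    """Count chunks that are neither completely empty nor completely full."""
    count = 0
    for start in range(0, len(bits), chunk_size):
        chunk = bits[start:start + chunk_size]
        if 0 < sum(chunk) < len(chunk):
            count += 1
    return count


def mutation_column(samples, mutation):
    """One bit per sample: 1 if the sample carries the mutation."""
    return [1 if mutation in sample else 0 for sample in samples]


# Hypothetical samples, each described by its set of mutations.
lineage_x = frozenset({"m1", "m2"})
lineage_y = frozenset({"m3"})
unsorted_samples = [lineage_x, lineage_y] * 4          # interleaved
sorted_samples = sorted(unsorted_samples, key=sorted)  # similar ones adjacent

for name, samples in [("unsorted", unsorted_samples), ("sorted", sorted_samples)]:
    total = sum(
        mixed_chunks(mutation_column(samples, mutation))
        for mutation in ("m1", "m2", "m3")
    )
    print(name, total)
# unsorted 6
# sorted 0
```

With interleaved samples every chunk of every column is mixed; once similar samples are adjacent, every chunk is uniform and can be represented very compactly.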
|
||
## Easy Access To Data | ||
|
||
SILO offers a rather complex query language to query the data. | ||
LAPIS aims to simplify the usage of SILO by providing a simple REST API. |
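
As a rough illustration of this simplification, the sketch below turns flat request parameters into a nested query object. The query structure (`action`, `filter`, `equals`, `and`), the parameter names, and the function `to_query` are hypothetical; they do not reproduce the actual LAPIS endpoints or the SILO query language.

```python
from urllib.parse import parse_qs


def to_query(query_string, action="aggregated"):
    """Translate flat request parameters into a nested filter plus an action."""
    parameters = parse_qs(query_string)
    conditions = [
        {"type": "equals", "column": column, "value": values[0]}
        for column, values in sorted(parameters.items())
    ]
    return {
        "action": {"type": action},
        "filter": {"type": "and", "children": conditions},
    }


# The client only writes simple key-value pairs ...
query = to_query("country=Switzerland&host=human")
# ... and the service builds the nested expression on its behalf.
print(query["filter"]["children"][0])
```

The client never has to construct the nested expression itself, which is the kind of convenience a thin REST layer can provide.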
11 changes: 11 additions & 0 deletions
lapis2-docs/src/content/docs/architecture-and-dev-docs/05-building-block-view.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
--- | ||
title: Building Block View | ||
description: A view into SILO and LAPIS | ||
--- | ||
|
||
The system consists of two artifacts: | ||
|
||
- LAPIS: A simple REST API. | ||
- SILO: A more detailed view into SILO is depicted below. | ||
|
||
![Building Block View](../../../plantuml/building-block-view.svg) |
21 changes: 21 additions & 0 deletions
lapis2-docs/src/content/docs/architecture-and-dev-docs/06-runtime-view.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
--- | ||
title: Runtime View | ||
description: Building Blocks And How They Interact At Runtime | ||
--- | ||
|
||
SILO-LAPIS consists of three main components: | ||
|
||
- **LAPIS:** A web service wrapping the SILO API. | ||
- It maps the request to a corresponding SILO query. | ||
- **SILO API:** The query engine exposed as a web service. | ||
- It accepts **SILO queries** and returns the results. A SILO query specifies | ||
- a filter expression that determines which samples are considered, | ||
- an action that determines what kind of data is returned (details, aggregated data, etc.). | ||
- The SILO API regularly checks for new serialized states of the database (the output of the preprocessing) | ||
and loads them into memory. | ||
- **SILO Preprocessing:** A command line tool that preprocesses the data for SILO. | ||
It builds a database from the input data and serializes it to disk. | ||
- The SILO Preprocessing has to be started by the maintainer of the instance (or, e.g., by a cron job). | ||
It is not a continuously running process. | ||
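
The interplay between the preprocessing output and the running API can be sketched as follows. This is a minimal model under stated assumptions: the class `RuntimeDatabase`, the `*.state` file naming, and the "lexicographically newest file wins" rule are hypothetical and only illustrate the idea of checking for new serialized states.

```python
import tempfile
from pathlib import Path


class RuntimeDatabase:
    """Holds the currently loaded state and reloads when a newer one appears."""

    def __init__(self, state_directory):
        self.state_directory = Path(state_directory)
        self.loaded_state = None  # name of the state currently in memory
        self.data = None

    def check_for_new_state(self):
        """Load the newest serialized state if it differs from the loaded one.

        Returns True if a new state was loaded. A real service would call
        this periodically, e.g. from a background thread.
        """
        candidates = sorted(self.state_directory.glob("*.state"))
        if not candidates or candidates[-1].name == self.loaded_state:
            return False
        newest = candidates[-1]
        self.data = newest.read_text()
        self.loaded_state = newest.name
        return True


with tempfile.TemporaryDirectory() as directory:
    database = RuntimeDatabase(directory)
    print(database.check_for_new_state())  # False: nothing preprocessed yet
    (Path(directory) / "0001.state").write_text("first")
    print(database.check_for_new_state())  # True: first state loaded
    print(database.check_for_new_state())  # False: already up to date
    (Path(directory) / "0002.state").write_text("second")
    print(database.check_for_new_state())  # True: newer state replaces the old one
    print(database.data)                   # second
```

The preprocessing only has to drop a new file into the directory; the running service picks it up on its next check without a restart.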
|
||
![Runtime View](../../../plantuml/runtime-view.svg) |
21 changes: 0 additions & 21 deletions
...ocs/src/content/docs/architecture-and-dev-docs/architecture-and-constraints.mdx
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
plantuml.jar |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# Generating the PlantUML diagrams | ||
|
||
Download `plantuml.jar` from the following link and place it in this directory. | ||
|
||
<https://github.com/plantuml/plantuml/releases/tag/v1.2023.13> | ||
|
||
Then run: | ||
```bash | ||
java -jar plantuml.jar -tsvg ./*.puml | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
@startuml | ||
|
||
node SILO { | ||
package "SILO API" { | ||
component "Query Engine" as query | ||
component "Runtime Database" as db | ||
"Web API" -> query | ||
query -> db | ||
} | ||
|
||
package "SILO Preprocessing" { | ||
component "Preprocessing Database" | ||
} | ||
} | ||
|
||
@enduml |