Commit 3bce471

add mermaid pipeline diagram
1 parent 31fb53e commit 3bce471

2 files changed: +113 -0 lines changed

README.md

+62
@@ -85,6 +85,68 @@ docker run -it --rm --name blobtools -p 8000:8000 -p 8001:8001 genomehubs/blobto

 ## BlobToolKit pipeline

+```mermaid
+%%{ init: { 'flowchart': { 'curve': 'step' } } }%%
+flowchart TD
+Assembly:::inputclass@{ shape: document, label: "Assembly *FASTA*"} --> windowmasker:::processclass
+windowmasker --> MaskedAssembly:::fileclass@{ shape: document, label: "Masked assembly *FASTA*"}
+MaskedAssembly --> BUSCO:::processclass
+BUSCO --> BuscoFullTable:::fileclass@{ shape: documents, label: "BUSCO full table *TSV*"}
+MaskedAssembly --> chunk_fasta:::processclass
+BuscoFullTable --> chunk_fasta
+NT:::inputclass@{ shape: database, label: "NCBI\nnt"} -.-> blastn:::processclass
+chunk_fasta --> BuscoRegions:::fileclass@{ shape: documents, label: "BUSCO regions *FASTA*"}
+BUSCO --> BuscoSequences:::fileclass@{ shape: documents, label: "BUSCO sequences *FASTA*"}
+BuscoSequences --> extract_busco_genes:::processclass
+extract_busco_genes --> BuscoGenes:::fileclass@{ shape: documents, label: "BUSCO genes *FASTA*"}
+Uniprot:::inputclass@{ shape: database, label: "UniProt\nUniRef 90"} --> blastx[diamond blastx]:::processclass
+MaskedAssembly --> minimap2:::processclass
+Reads:::inputclass@{ shape: documents, label: "Read *FASTQ*"} --> minimap2
+minimap2 --> CRAM:::fileclass@{ shape: documents, label: "mapped reads *BAM*/*CRAM*"}
+BuscoRegions --> blastx
+BuscoGenes --> blastp[diamond blastp]:::processclass
+Uniprot --> blastp
+blastx --> blastxOut:::fileclass@{ shape: document, label: "blastx results *TSV*"}
+blastp --> blastpOut:::fileclass@{ shape: document, label: "blastp results *TSV*"}
+blastxOut --> filter_chunks:::processclass
+BuscoRegions --> filter_chunks
+filter_chunks -.-> filteredChunks:::fileclass@{ shape: document, label: "no-hit regions *FASTA*"}
+blastn -.-> blastnOut:::fileclass@{ shape: document, label: "blastn results *TSV*"}
+filteredChunks -.-> blastn
+CRAM --> blobtk_depth[blobtk depth]:::processclass
+blobtk_depth --> readDepth:::fileclass@{ shape: documents, label: "read coverage depth *BED*"}
+BuscoFullTable --> count_busco_genes:::processclass
+MaskedAssembly --> fasta_windows:::processclass
+count_busco_genes --> BuscoGeneCounts:::fileclass@{ shape: document, label: "BUSCO gene counts *BED*"}
+fasta_windows --> KmerStats:::fileclass@{ shape: documents, label: "kmer stats *BED*"}
+BuscoGeneCounts --> combine_outputs:::processclass
+KmerStats --> combine_outputs
+NCBITaxonomy:::inputclass@{ shape: database, label: "NCBI taxonomy"} --> blobtools_create[blobtools create]:::processclass
+blastpOut --> blobtools_create
+blastxOut --> blobtools_add
+blastnOut --> blobtools_add
+readDepth --> combine_outputs
+combine_outputs --> kbStats:::fileclass@{ shape: document, label: "1kb assembly stats *BED*"}
+kbStats --> window_stats:::processclass
+window_stats --> windowStats:::fileclass@{ shape: documents, label: "100kb, 1Mb, 1% & 10% window stats *BED*"}
+windowStats --> blobtools_create
+blobtools_create --> BlobDir:::fileclass@{ shape: documents, label: "Initial *BlobDir*"}
+NCBITaxonomy --> blobtools_add
+BlobDir --> blobtools_add[blobtools add]:::processclass
+BlobDir --> blobtools_filter[blobtools filter --summary]:::processclass
+BlobDir --> blobtk_plot[blobtk plot]:::processclass
+blobtools_add --> FullBlobDir:::fileclass@{ shape: documents, label: "Complete *BlobDir*"}
+blobtools_filter --> FullBlobDir
+blobtk_plot --> FullBlobDir
+
+classDef inputclass fill:#f969,stroke-width:4px
+classDef processclass fill:#96f9,stroke-width:4px
+classDef fileclass fill:#6f99,stroke-width:4px
+classDef default stroke-width:4px;
+linkStyle default stroke-width:4px;
+
+```
+
 The BlobToolKit pipeline can be run by creating a YAML config file and passing environment variables to the `genomehubs/blobtoolkit` docker image.

 ```sh

pipeline.mermaid

+51
@@ -0,0 +1,51 @@
+flowchart TD
+Assembly@{ shape: document, label: "Assembly *FASTA*"} --> windowmasker
+windowmasker --> MaskedAssembly@{ shape: document, label: "Masked assembly *FASTA*"}
+MaskedAssembly --> fasta_windows
+MaskedAssembly --> BUSCO
+BUSCO --> BuscoFullTable@{ shape: documents, label: "BUSCO full table *TSV*"}
+BuscoFullTable --> chunk_fasta
+MaskedAssembly --> chunk_fasta
+NT@{ shape: database, label: "NCBI\nnt"} -.-> blastn
+chunk_fasta --> BuscoRegions@{ shape: documents, label: "BUSCO regions *FASTA*"}
+BUSCO --> BuscoSequences@{ shape: documents, label: "BUSCO sequences *FASTA*"}
+BuscoSequences --> extract_busco_genes
+extract_busco_genes --> BuscoGenes@{ shape: documents, label: "BUSCO genes *FASTA*"}
+Uniprot@{ shape: database, label: "UniProt\nUniRef 90"} --> blastx[diamond blastx]
+MaskedAssembly --> minimap2
+Reads@{ shape: documents, label: "Read *FASTQ*"} --> minimap2
+fasta_windows --> KmerStats@{ shape: documents, label: "kmer stats *BED*"}
+minimap2 --> CRAM@{ shape: documents, label: "mapped reads *BAM*/*CRAM*"}
+BuscoRegions --> blastx
+BuscoGenes --> blastp
+Uniprot --> blastp[diamond blastp]
+blastx --> blastxOut@{ shape: document, label: "blastx results *TSV*"}
+blastp --> blastpOut@{ shape: document, label: "blastp results *TSV*"}
+blastxOut --> filter_chunks
+BuscoRegions --> filter_chunks
+filter_chunks -.-> filteredChunks@{ shape: document, label: "no-hit regions *FASTA*"}
+blastn -.-> blastnOut@{ shape: document, label: "blastn results *TSV*"}
+filteredChunks -.-> blastn
+CRAM --> blobtk_depth[blobtk depth]
+blobtk_depth --> readDepth@{ shape: documents, label: "read coverage depth *BED*"}
+BuscoFullTable --> count_busco_genes
+count_busco_genes --> BuscoGeneCounts@{ shape: document, label: "BUSCO gene counts *BED*"}
+KmerStats --> combine_outputs
+NCBITaxonomy@{ shape: database, label: "NCBI taxonomy"} --> blobtools_create[blobtools create]
+blastpOut --> blobtools_create
+BuscoGeneCounts --> combine_outputs
+blastxOut --> blobtools_add
+blastnOut --> blobtools_add
+readDepth --> combine_outputs
+combine_outputs --> kbStats@{ shape: document, label: "1kb assembly stats *BED*"}
+kbStats --> window_stats
+window_stats --> windowStats@{ shape: documents, label: "100kb, 1Mb, 1% & 10% window stats *BED*"}
+windowStats --> blobtools_create
+blobtools_create --> BlobDir@{ shape: documents, label: "Initial *BlobDir*"}
+NCBITaxonomy@{ shape: database, label: "NCBI taxonomy"} --> blobtools_add
+BlobDir --> blobtools_add[blobtools add]
+BlobDir --> blobtools_filter[blobtools filter --summary]
+BlobDir --> blobtk_plot[blobtk plot]
+blobtools_add --> FullBlobDir@{ shape: documents, label: "Complete *BlobDir*"}
+blobtools_filter --> FullBlobDir
+blobtk_plot --> FullBlobDir
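
Review note: the flowchart is effectively a dependency graph, so one way to sanity-check it is to read the edges back out and confirm they form a valid ordering. The following is a minimal sketch, not part of this commit. The helper names (`strip_decorations`, `parse_edges`, `execution_order`) are hypothetical, only a handful of edges from `pipeline.mermaid` are included, and it assumes one edge per line with solid (`-->`) and dotted (`-.->`) links both treated as plain dependencies.

```python
import re
from graphlib import TopologicalSorter

# A few edges copied from pipeline.mermaid (not the full diagram).
MERMAID = """
flowchart TD
Assembly@{ shape: document, label: "Assembly *FASTA*"} --> windowmasker
windowmasker --> MaskedAssembly@{ shape: document, label: "Masked assembly *FASTA*"}
MaskedAssembly --> BUSCO
BUSCO --> BuscoFullTable@{ shape: documents, label: "BUSCO full table *TSV*"}
BuscoFullTable --> chunk_fasta
MaskedAssembly --> chunk_fasta
filteredChunks -.-> blastn
"""

# Matches "source --> target" or "source -.-> target" once decorations are gone.
EDGE = re.compile(r"^(\w+)\s*(-->|-\.->)\s*(\w+)$")


def strip_decorations(line: str) -> str:
    """Drop @{...} shape metadata, [label] text and :::class suffixes."""
    line = re.sub(r"@\{.*?\}", "", line)
    line = re.sub(r"\[.*?\]", "", line)
    line = re.sub(r":::\w+", "", line)
    return line.strip()


def parse_edges(text: str) -> list[tuple[str, str]]:
    """Return (source, target) pairs for every edge line in the flowchart."""
    edges = []
    for raw in text.splitlines():
        match = EDGE.match(strip_decorations(raw))
        if match:
            edges.append((match.group(1), match.group(3)))
    return edges


def execution_order(edges: list[tuple[str, str]]) -> list[str]:
    """Sort nodes so that each one appears after everything it depends on."""
    sorter = TopologicalSorter()
    for source, target in edges:
        sorter.add(target, source)  # target depends on source
    return list(sorter.static_order())


if __name__ == "__main__":
    for node in execution_order(parse_edges(MERMAID)):
        print(node)
```

`TopologicalSorter` raises `CycleError` if the edges contain a loop, so running this over the full edge list would also flag an accidental cycle in the diagram. The exact order printed is one of several valid orderings.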
