
Database mode

AlexanderGress edited this page Nov 10, 2020 · 5 revisions

The default configuration of StructMAn is the so-called lite mode. Alternatively, the database mode can be activated with the -d flag, in which case the pipeline stores processed data and results locally. This adds some overhead for data the pipeline has never processed before, but the overhead amortizes when the pipeline is used frequently. The following examples illustrate the differences between the lite mode and the database mode.


In the first example, we process P53 using the lite mode:
structman.py -i P53_HUMAN
The completion of this command took 10 minutes and 19 seconds.
Subsequently, we want to process one specific mutation in P53, so we call:
structman.py -i P53_HUMAN T140G
This time StructMAn returns after 8 minutes and 45 seconds.
Now, let's repeat the same examples using the database mode. Since the database mode is not compatible with single-line inputs, we have to create two input files, p53.smlf and p53_t140g.smlf, which contain the same inputs as the two single-line lite-mode examples above.
structman.py -i structman/input_data/p53.smlf -d
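The first command now took exactly 14 minutes, which is 3 minutes and 41 seconds longer than the same run in lite mode. So the database mode comes with an overhead for unseen data. But now we call the second command:
structman.py -i structman/input_data/p53_t140g.smlf -d
This call returns after 16 seconds, compared with 8 minutes and 45 seconds in lite mode. The time saved on already known data thus more than compensates for the overhead on unseen data.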
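To make the amortization concrete, the four timings reported above can be added up per mode. The following is only a small arithmetic sketch: the constant and function names are illustrative, and the numbers are the single measurements quoted on this page, so they will differ on other hardware and for other inputs.

```python
# Wall-clock times reported on this page, converted to seconds.
# Each value is a single measurement, not an average.
LITE_FULL = 10 * 60 + 19      # lite mode, full P53_HUMAN run
LITE_MUTATION = 8 * 60 + 45   # lite mode, single mutation T140G
DB_FULL = 14 * 60             # database mode, first run on unseen data
DB_MUTATION = 16              # database mode, second run on known data


def fmt(seconds: int) -> str:
    """Format a duration in seconds as 'M min SS s'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes} min {secs:02d} s"


# Extra cost of the first database-mode run compared with lite mode.
overhead = DB_FULL - LITE_FULL
# Time saved by the second database-mode run compared with lite mode.
saving = LITE_MUTATION - DB_MUTATION
# Total time for both runs in each mode.
lite_total = LITE_FULL + LITE_MUTATION
db_total = DB_FULL + DB_MUTATION

print("overhead on unseen data:", fmt(overhead))
print("saving on known data:   ", fmt(saving))
print("lite mode, both runs:   ", fmt(lite_total))
print("database mode, both runs:", fmt(db_total))
```

With these measurements, the saving on the second run is already larger than the overhead of the first, so the database mode comes out ahead after only two related runs.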
