2. Run a test
After you have installed TALLSorts properly, you should run a test to ensure it works.
-
Navigate to the root folder containing TALLSorts. Within this folder, you should see the folder tests.
-
Activate the tallsortsenv environment using either
conda activate tallsortsenv
or
source activate tallsortsenv
-
Run the following command:
TALLSorts -s tests/test_counts.csv -d test_output
This command calls TALLSorts, takes in the RNA-seq counts matrix tests/test_counts.csv, then writes the classifier results to the test_output folder, which is created in your current directory. (Note that tests/test_counts.csv is a counts matrix with samples as rows and genes as columns. Gene names are in Ensembl format, e.g. "ENSG00000000003". We will add support for gene symbols and Entrez IDs in a later update.)
-
Navigate to test_output and view the following results:
- probabilities.csv: a table containing the subtype classification probabilities for each sample
- predictions.csv: a list reporting the most likely subtype for each sample
- multi_calls.csv: for samples that have multiple predicted subtypes, this CSV reports all of the additional predictions
- prob_scatters.html: an interactive plot of probabilities.csv. A static image is given by prob_scatters.png.
- waterfalls.html: an interactive plot of predictions.csv which also shows the probability of the predicted subtypes. A static image is given by waterfalls.png.
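As a quick sanity check after the test run, the short Python sketch below reports which of the result files listed above are absent from the output folder. The file names come from that list; the helper name missing_outputs is hypothetical and is not part of TALLSorts. Whether every file is always written (for example, multi_calls.csv when no sample has multiple predicted subtypes) is not stated here, so treat a missing file as a prompt to investigate rather than as proof of failure.

```python
from pathlib import Path

# File names taken from the results list above.
EXPECTED_OUTPUTS = [
    "probabilities.csv",
    "predictions.csv",
    "multi_calls.csv",
    "prob_scatters.html",
    "prob_scatters.png",
    "waterfalls.html",
    "waterfalls.png",
]


def missing_outputs(output_dir):
    """Return the expected result files that are absent from output_dir.

    Hypothetical helper for illustration; not part of TALLSorts.
    """
    out = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (out / name).is_file()]


if __name__ == "__main__":
    # "test_output" is the folder passed to -d in the test command.
    missing = missing_outputs("test_output")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected result files are present.")
```

Run it from the same directory in which you ran the test command, so that the relative path test_output resolves correctly.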
-
RNA-seq counts used in tests/test_counts.csv were generated from publicly available RNA-seq reads published by:
Autry RJ, Paugh SW, Carter R, Shi L, Liu J, Ferguson DC, et al. Integrative genomic analyses reveal mechanisms of glucocorticoid resistance in acute lymphoblastic leukemia. Nature Cancer. 2020;1(3):329-44.
Sample names were changed to random strings to preserve anonymity.
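Before running TALLSorts on your own data, it can help to confirm that your counts matrix follows the same layout as tests/test_counts.csv (samples as rows, genes as columns, Ensembl gene IDs as column names). The Python sketch below checks only the header row. It assumes that the first column holds sample names and that an optional version suffix such as ".6" may follow the ID; both are assumptions, and the function name non_ensembl_columns is hypothetical rather than part of TALLSorts.

```python
import csv
import io
import re

# Human Ensembl gene IDs are "ENSG" followed by 11 digits.
# Tolerating an optional ".N" version suffix is an assumption made here.
ENSEMBL_GENE = re.compile(r"^ENSG\d{11}(\.\d+)?$")


def non_ensembl_columns(handle):
    """Return header names (after the first column) that are not Ensembl gene IDs.

    Assumes the first row holds gene IDs and the first column holds sample names.
    Hypothetical helper for illustration; not part of TALLSorts.
    """
    header = next(csv.reader(handle))
    return [name for name in header[1:] if not ENSEMBL_GENE.match(name)]


if __name__ == "__main__":
    # Small in-memory example; the gene symbol "TSPAN6" is flagged.
    sample = io.StringIO("sample,ENSG00000000003,TSPAN6\nS1,10,4\n")
    print(non_ensembl_columns(sample))  # prints ['TSPAN6']
```

To check a real file, pass an open file handle instead, for example open("tests/test_counts.csv", newline=""). An empty list means every gene column name matched the Ensembl pattern.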