Home
Version: 17.0.1
Create: August 4, 2017
Update: August 18, 2017
Author: Nan Zhou
Maintain: Nan Zhou
visnormsc is a graphical user interface (GUI) for normalization of single-cell RNA sequencing (RNA-seq) data. It is written in Python, so it runs on the main operating systems, including Windows™, GNU/Linux, and macOS™.
The easiest way to install Python and the dependencies visnormsc requires is to install the latest version of Anaconda for your platform. Installation of Anaconda on Windows™, GNU/Linux, and macOS™ is documented below.
Windows™
- Download the graphical installer of the latest Anaconda distribution of Python >= 3.5 for Windows™
- Install Anaconda on Windows
- In the last few steps of the installer, be sure to tick the option labeled something like "Register Anaconda as my default Python 3.6"
GNU/Linux
- Download the GNU/Linux installer of the latest Anaconda distribution of Python >= 3.5
- Install Anaconda on GNU/Linux
- If Anaconda was installed in ~/anaconda3, you can use the default way to run visnormsc
macOS™
- Download the graphical installer of the latest Anaconda distribution of Python >= 3.5 for macOS
- Install Anaconda on macOS
- If Anaconda was installed in ~/anaconda3, you can use the default way to run visnormsc
If the default way does not work for you, please use the custom way as described below.
Download or clone visnormsc from the GitHub page to a local space.
Run on Windows™
- The default way
- Go to the directory of visnormsc, i.e., "/path/to/visnormsc/" in Windows Explorer
- Go to the sub-directory "bin" in the "visnormsc" directory
- Double click "onWindows.bat" to run visnormsc
- The custom way
- Open the CMD prompt using "Start -> All Programs -> Accessories -> Command Prompt"
- Type the command
/path/to/python /path/to/visnormsc/visnormscGUI.py
to run
Run on GNU/Linux and macOS™
- The default way
- Open Terminal
- Go to the directory of visnormsc using command
cd /path/to/visnormsc
- Run command
bash onLinuxAndmacOS.sh
- The custom way
- Open Terminal
- Type the command
/path/to/python /path/to/visnormsc/visnormscGUI.py
to run
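For the custom way, the "/path/to/python" placeholder has to be replaced with the interpreter that has the dependencies installed. The following is a minimal sketch of how that path could be found: when run with the Anaconda Python, `sys.executable` holds the absolute path of that interpreter. The helper name `build_launch_command` is hypothetical and not part of visnormsc, and "/path/to/visnormsc" is the same placeholder used above.

```python
import sys
from pathlib import Path


def build_launch_command(visnormsc_dir):
    """Return the custom-way command as a list of arguments.

    Hypothetical helper, not part of visnormsc. It pairs the interpreter
    that is running this script with visnormscGUI.py in the given directory.
    """
    script = Path(visnormsc_dir) / "visnormscGUI.py"
    return [sys.executable, str(script)]


if __name__ == "__main__":
    # Replace the placeholder with the real location of visnormsc.
    print(" ".join(build_launch_command("/path/to/visnormsc")))
```

The printed line can be pasted into the CMD prompt or Terminal as the command for the custom way.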
Prepare input data
The input data should be in csv (comma separated values) format. CSV files can be edited and saved using a plain text editor, Microsoft Office Excel™, LibreOffice Calc, or other spreadsheet software.
- Single-cell RNA-seq data
- A csv file. The first row shows column names and the first column shows row names. Each column represents a cell and each row represents a gene. The remaining values form a G-by-S matrix, where G (should be > 100) is the number of genes and S is the number of single cells. This matrix should contain estimates of gene expression. Counts of this nature may be obtained from RSEM, HTSeq, Cufflinks, Salmon or a similar approach.
- A_example_input_data.csv of a 7-by-5 value matrix:
| | Cell_1 | Cell_2 | Cell_3 | Cell_4 | Cell_5 |
|---|---|---|---|---|---|
| Gene_1 | 9 | 2 | 16 | 0 | 4 |
| Gene_2 | 4.98 | 2.99 | 2.28 | 0 | 3.2 |
| Gene_3 | 0 | 0 | 0 | 0 | 0 |
| Gene_4 | 4 | 11 | 1 | 1 | 2 |
| Gene_5 | 82 | 65 | 110 | 308.52 | 71 |
| Gene_6 | 0 | 3.72 | 4.53 | 0 | 0 |
| Gene_7 | 9 | 0 | 0 | 0 | 0 |
- Cell condition data
- It is also a csv file, but it contains only a single unnamed column that gives the condition each cell in the input data belongs to. Each row corresponds to one cell in the input data (e.g. one column name in the example above). Generally the definition of condition will be obvious given the experimental setup.
- A_example_condition_file.csv corresponding to A_example_input_data.csv:
cond1 |
cond1 |
cond2 |
cond3 |
cond3 |
where Cell_1 and Cell_2 belong to cond1, Cell_3 belongs to cond2, Cell_4 and Cell_5 belong to cond3.
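Before loading the files into visnormsc, it can help to check that the two files agree with each other. The sketch below is not part of visnormsc; it uses only the Python standard library, and the function names (`load_expression_matrix`, `load_conditions`, `check_inputs`) are hypothetical. It assumes the top-left corner cell of the data file is empty and that the condition rows follow the same order as the cell columns, as in the example above.

```python
import csv
import io

# The two example files from the text, written out as CSV.
# Assumption: the top-left corner cell of the data file is empty.
EXAMPLE_DATA = """\
,Cell_1,Cell_2,Cell_3,Cell_4,Cell_5
Gene_1,9,2,16,0,4
Gene_2,4.98,2.99,2.28,0,3.2
Gene_3,0,0,0,0,0
Gene_4,4,11,1,1,2
Gene_5,82,65,110,308.52,71
Gene_6,0,3.72,4.53,0,0
Gene_7,9,0,0,0,0
"""
EXAMPLE_CONDITIONS = "cond1\ncond1\ncond2\ncond3\ncond3\n"


def load_expression_matrix(text):
    """Parse CSV text into (cell names, gene names, rows of floats)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    cell_names = header[1:]  # skip the corner cell
    gene_names, values = [], []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        gene_names.append(row[0])
        values.append([float(v) for v in row[1:]])
    return cell_names, gene_names, values


def load_conditions(text):
    """Parse the single unnamed column of the condition file."""
    return [row[0].strip() for row in csv.reader(io.StringIO(text)) if row]


def check_inputs(cell_names, values, conditions, min_genes=101):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    for index, row in enumerate(values, start=1):
        if len(row) != len(cell_names):
            problems.append(
                f"gene row {index} has {len(row)} values, "
                f"expected {len(cell_names)}"
            )
    if len(conditions) != len(cell_names):
        problems.append(
            f"{len(conditions)} condition rows for {len(cell_names)} cells"
        )
    if len(values) < min_genes:
        # The text above asks for G > 100 genes.
        problems.append(f"only {len(values)} genes, more than 100 expected")
    return problems


cells, genes, values = load_expression_matrix(EXAMPLE_DATA)
conditions = load_conditions(EXAMPLE_CONDITIONS)
for problem in check_inputs(cells, values, conditions):
    print(problem)
```

With the 7-by-5 example, the only reported problem is the gene count, since the example is deliberately smaller than a real data set.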
Check count-depth relationship in RNA-seq data
Parameters for the Check operation are:
Data: A single-cell RNA-seq data file in csv format.
Normalized data: Default NO. If YES, the input data should have been normalized either by visnormsc or other methods.
Conditions: A csv file. Conditions of cells in the data file.
Tau: The quantile for quantile regression. 0 < float < 1
Filter cell proportion: The proportion of non-zero expression estimates required to include the genes into the evaluation. 0 <= float <= 1
Filter expression: Exclude genes having median of non-zero expression below this threshold. A real number.
Number of expression groups: Split the RNA-seq data into this number of equally sized groups. An integer > 0
NCores: Number of CPU cores to be used. None or an integer > 0
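To make the two filter parameters concrete, here is a minimal sketch of how they could be applied to a single gene. This is an interpretation of the parameter descriptions above, not visnormsc's actual code: the function name `passes_filters` is hypothetical, the thresholds in the example (0.5 and 2) are illustrative rather than the program's defaults, and treating a gene with no non-zero values as excluded is an assumption.

```python
from statistics import median


def passes_filters(expression, filter_cell_proportion, filter_expression):
    """Decide whether one gene would be kept for the evaluation.

    Hypothetical helper illustrating the two filters:
    - the proportion of non-zero values must reach filter_cell_proportion
    - the median of the non-zero values must reach filter_expression
    """
    nonzero = [value for value in expression if value != 0]
    if not nonzero:
        # Assumption: a gene with no non-zero values is always excluded,
        # because its non-zero median is undefined.
        return False
    proportion = len(nonzero) / len(expression)
    return (
        proportion >= filter_cell_proportion
        and median(nonzero) >= filter_expression
    )


# Rows of the example input data shown earlier in this document.
example = {
    "Gene_1": [9, 2, 16, 0, 4],
    "Gene_2": [4.98, 2.99, 2.28, 0, 3.2],
    "Gene_3": [0, 0, 0, 0, 0],
    "Gene_4": [4, 11, 1, 1, 2],
    "Gene_5": [82, 65, 110, 308.52, 71],
    "Gene_6": [0, 3.72, 4.53, 0, 0],
    "Gene_7": [9, 0, 0, 0, 0],
}
kept = [gene for gene, row in example.items() if passes_filters(row, 0.5, 2)]
print(kept)  # ['Gene_1', 'Gene_2', 'Gene_4', 'Gene_5']
```

Gene_6 and Gene_7 are dropped because fewer than half of their values are non-zero, and Gene_3 because it has no non-zero values at all.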
Normalize single-cell RNA-seq data
Parameters for the Normalize operation are:
Data: A single-cell RNA-seq data file in csv format.
Conditions: A csv file. Conditions of cells in the data file.
Proportion of genes: The proportion of genes closest to the slope mode used for the group fitting. 0 < float < 1
Tau: The quantile for quantile regression. 0 < float < 1
Filter cell number: The number of non-zero expression values required to include the genes into model fitting. An integer > 0
K: The number of gene groups in cells of the same condition. Default is None, which means visnormsc will automatically find the best value, starting from 1. K can also be set by the user in the form "condition name: integer > 0". The condition names should match those given in the condition file. Using the example RNA-seq data and conditions above, K can be set using "cond1: 10, cond2: 6, cond3: 8", which means cond1 has 10 groups, cond2 has 6 groups and cond3 has 8 groups.
Save evaluation plots: Save figures of evaluating K.
NCores: Number of CPU cores to be used. None or an integer > 0
Filter expression: Genes having median of non-zero expression below this threshold will be excluded from the model fitting. A real number.
Thresh: A threshold used for evaluating K. A real number.
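The K setting described above is a comma-separated list of "condition name: integer" pairs. As an illustration of that format, the sketch below parses such a string and checks it against the condition names. It is an assumption about how the text could be validated, not visnormsc's own parser; the function name `parse_k_setting` is hypothetical, and whether every condition must be listed is left open here.

```python
def parse_k_setting(text, condition_names):
    """Parse a K setting such as "cond1: 10, cond2: 6, cond3: 8".

    Hypothetical helper. Returns None for the default (empty or "None"),
    otherwise a dict mapping condition name to K.
    """
    if text is None or not text.strip() or text.strip().lower() == "none":
        return None  # let the program search for K automatically
    known = set(condition_names)
    result = {}
    for item in text.split(","):
        name, separator, value = item.partition(":")
        name = name.strip()
        if not separator or not name:
            raise ValueError(f"expected 'condition: integer', got {item!r}")
        k = int(value.strip())  # raises ValueError for non-integers
        if k <= 0:
            raise ValueError(f"K for {name!r} must be an integer > 0")
        if name not in known:
            raise ValueError(f"{name!r} is not in the condition file")
        if name in result:
            raise ValueError(f"{name!r} is listed more than once")
        result[name] = k
    return result


conditions = ["cond1", "cond1", "cond2", "cond3", "cond3"]
print(parse_k_setting("cond1: 10, cond2: 6, cond3: 8", conditions))
# {'cond1': 10, 'cond2': 6, 'cond3': 8}
```

A misspelled condition name or a non-positive K raises a `ValueError`, which is the kind of mismatch worth catching before starting a long normalization run.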
The demo data is in "/path/to/visnormsc/test/testData". It contains simulated single-cell RNA-seq data (exampleData.csv) with its cell conditions (exampleDataConditions.csv), and real single-cell RNA-seq data (scH1data.csv) with its cell conditions (scH1dataConditions.csv). These data are from the study by Bacher and colleagues (Bacher et al., 2017).
The demo for simulated data can be reproduced by selecting data file exampleData.csv and condition file exampleDataConditions.csv, and keeping other settings unchanged.
The demo for real data can be reproduced by selecting data file scH1data.csv and condition file scH1dataConditions.csv, and keeping other settings unchanged.
Bacher, Rhonda, et al. "SCnorm: robust normalization of single-cell RNA-seq data." Nature Methods 14.6 (2017): 584-586.