GitHub - BioSystemsUM/bRNAsPipe: Pipeline for RNAseq analysis in Bash and scripts for microarray gene expression analysis in R

Pipeline for RNAseq and Microarray analysis

Pipeline and scripts used for raw microarray and RNAseq data analysis in "Tânia Barata, Vítor Vieira, Rúben Rodrigues, Ricardo Pires das Neves, Miguel Rocha, Reconstruction of tissue-specific genome-scale metabolic models for human cancer stem cells, Computers in Biology and Medicine, Volume 142, 2022, 105177, ISSN 0010-4825, https://doi.org/10.1016/j.compbiomed.2021.105177"

RNAseq

The RNAseq pipeline is written in Bash and uses Docker containers. Requirements: a Linux system and Podman. Ensembl annotation and Ensembl genome reference files are recommended.

Example annotation file: Homo_sapiens.GRCh38.99.gtf.gz
Example genome reference fasta: Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

Fill in Studies_RNAseq.txt with the information for your studies. Studies_RNAseq.txt is a tab-delimited file in the data directory. Columns:

Study is the study identifier.
SampleId is the sample identifier.
Reads is '1' or '2' to distinguish between forward and reverse reads; single-end studies use 'Unpaired'.
Link is the link to the fastq.gz file.
All other columns should be filled with 'NA' when there are no values.
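
Before starting any downloads, it can help to check the studies file for the rules listed above. The following is a minimal sketch, not part of the pipeline: the function name check_studies_file is hypothetical, and it assumes that the first line of the file is a header row containing the column names Reads and Link.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): validate the Reads and
# Link columns of a tab-delimited studies file.
# Assumption: the first line is a header row with columns named Reads and Link.
check_studies_file() {
    local file="$1"
    awk -F'\t' '
        NR == 1 {
            # Map each header name to its column position
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("Reads" in col) || !("Link" in col)) {
                print "header lacks a Reads or Link column"
                bad = 1
                exit
            }
            next
        }
        NF == 0 { next }   # skip empty lines
        $(col["Reads"]) != "1" && $(col["Reads"]) != "2" && $(col["Reads"]) != "Unpaired" {
            printf "line %d: unexpected Reads value \"%s\"\n", NR, $(col["Reads"])
            bad = 1
        }
        $(col["Link"]) == "" {
            printf "line %d: empty Link\n", NR
            bad = 1
        }
        END { exit bad }   # non-zero exit status when any problem was reported
    ' "$file"
}
```

Usage, from the repository root: check_studies_file data/Studies_RNAseq.txt. The function prints one message per problem and returns a non-zero status if any was found.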

Move to the RNAseq scripts folder: cd scr/bash
Edit the base folder path and the URLs of the genome and annotation files in scr/bash/Edit
Download the genome reference and annotation files with: ./Dirs.sh
Download the files of a study with: ./DownloadFiles.sh <Study>
Check whether the downloads have finished with: ps -e | grep <jobId>. To get the job ids of the downloads, cd data/<Study>/rawData and run cat PIDs
After all downloads finish, evaluate raw read quality with: ./GetQCfiles.sh <Study>
Then manually check the FastQC results, decide which contaminants/overrepresented sequences should be removed from each sample, and add them to the file Seq2RemoveFile in folder data//trimmedData so that Trimmomatic removes those sequences. If no file is provided, Trimmomatic runs without excluding any sequences. Example of Seq2RemoveFile content:

seqname ACTTTTTTTTTTTTTTTTTTT

To define specific Trimmomatic parameters for a sample, add a file named TrimParams to directory data/trimmedData, where you can set the Trimmomatic parameters for each sample, for example if the reads of that study need to be trimmed. Otherwise, the default parameters are used.
To run the rest of the analysis: ./RNAseqAnalysis.sh <Study>. Results are written to folders inside directory data/
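
The download check in the steps above (looking up each job id from the PIDs file with ps) can be automated. The following is a minimal sketch, not part of the pipeline: the function name downloads_running is hypothetical, and it assumes that the PIDs file holds one process id per line. It uses kill -0 instead of ps -e | grep, a substitution that avoids partial matches on longer process ids.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): report which download
# jobs listed in a PIDs file are still running.
# Assumption: the file holds one process id per line.
downloads_running() {
    local pid_file="$1" pid running=0
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline
    while read -r pid || [ -n "$pid" ]; do
        [ -z "$pid" ] && continue
        # kill -0 sends no signal; it succeeds only if the process exists
        # and the current user is allowed to signal it
        if kill -0 "$pid" 2>/dev/null; then
            echo "still running: $pid"
            running=$((running + 1))
        fi
    done < "$pid_file"
    echo "$running job(s) still running"
    # Return success only when every listed job has finished
    [ "$running" -eq 0 ]
}
```

Usage, from the repository root: downloads_running data/<Study>/rawData/PIDs. A zero exit status means that no listed job is still running, so the quality-control step can start.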

Microarray

To be run on Windows with R. The file with the studies info is Studies_Microarrays.xlsx. Run the script scr/R/MicroarrayNormalize.R. Paths are hardcoded at the beginning of the script.
