-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
40 lines (30 loc) · 2.06 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
rna-seq has few dependencies, depending on the genome aligner that is being used.
BWA, Bowtie, Bowtie2 or LISA can be used, but they have to be exported to the PATH variable then.
rna-seq is a tool for mapping rna reads against its genome, without knowing where the exon regions in genome initially are.
run 'make' to compile and generate the rna-seq binary
Usage: rna-seq <aligner> <genome.fa> <index_dir> <reads.fq> <result_file>
<aligner> - 'bowtie', 'bowtie2', 'bwa' or 'lisa'
<genome.fa> - fasta file containing genome
<index_dir> - prefix used for aligners for storing temporary files
<reads.fq> - fastq file with reads to be aligned
<result_file> - file in sam format containing overall alignments
options:
--cov-limit [coverage-limit-number] - potential exons with coverage limit lower than specified are discarded
--skip-building-index - when running on same genome index again no need to wait for index building
--res-cig [res.cig] - file in cig format for validating results
--sliding-window-size/-sws [size] - size of the granulation for cutting initially unmatched reads into smaller chunks - used for spliced matching
--join-exons-threshold/-jet [size] - if two exons are too close join them (default: 100)
--extend-exons-threshold/-eet [size] - how much to extend exons on right and left side for finding reads that were spliced (default:50)
--get-histogram-data - creates a hist.in file showing coverage data before spliced alignment
--num-threads [num] - number of threads for spliced alignment
/src
contains main cpp code
/scripts
contains various scripts used for testing and data preparation
all the scripts have a --help option for more information
few of them are written in cpp so compilation is necessary
/lisa
contains various lisa related binaries, currently lisa is dependant on divsufsort and gflags that need to be in LD_LIBRARY_PATH.
Lisa is being developed independently of rna-seq, more info can be found here https://github.com/Zuza/dtra
/bin
contains binaries