Pipeline APA

Author:	Ian Sudbery
Release:	0.01
Date:	21/11/16
Tags:	Python

Overview

This pipeline aims to identify cases of alternative exon usage from RNA-seq data. It uses two different approaches. The first applies the DaPars program, which builds models of read depth over final exons to identify cases of APA. The second uses DEXSeq to identify cases of alternative exon usage where the exon in question is the last exon in a transcript.

Usage

See :ref:`PipelineSettingUp` and :ref:`PipelineRunning` for general information on how to use CGAT pipelines.

Configuration

The pipeline requires a configured :file:`pipeline.ini` file.

Default configuration files can be generated by executing:

python <srcdir>/pipeline_apa.py config

By default the pipeline will try to guess the experimental design, but a different design can be specified by providing a design file called :file:`design.tsv`. The file has three columns: a column with the comparison name, and two columns with regular expressions that match the files in condition1 and condition2 respectively, e.g.:

#name    pattern1           pattern2
tissue   heart-control.+    brain-control.+
kd       heart-kd.+         heart-control.+
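As a rough illustration of how such a design file could be interpreted, the sketch below parses the three columns and groups sample names by pattern. It is not the pipeline's own code: the function names ``parse_design`` and ``assign_samples`` are hypothetical, and it assumes that patterns are matched from the start of the sample name with ``re.match``, which may differ from what the pipeline actually does.

```python
import re


def parse_design(text):
    """Parse design-file text into (name, pattern1, pattern2) tuples.

    Lines starting with '#' are treated as headers and skipped.
    Assumption: columns are separated by tabs or other whitespace.
    """
    comparisons = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, pattern1, pattern2 = line.split()
        comparisons.append((name, re.compile(pattern1), re.compile(pattern2)))
    return comparisons


def assign_samples(comparisons, samples):
    """Map each comparison name to (condition1 samples, condition2 samples)."""
    groups = {}
    for name, pattern1, pattern2 in comparisons:
        groups[name] = (
            [s for s in samples if pattern1.match(s)],
            [s for s in samples if pattern2.match(s)],
        )
    return groups


design = """#name    pattern1           pattern2
tissue   heart-control.+    brain-control.+
kd       heart-kd.+         heart-control.+
"""

# Hypothetical sample names, following the three-part naming scheme.
samples = ["heart-control-R1", "heart-control-R2",
           "heart-kd-R1", "brain-control-R1"]

groups = assign_samples(parse_design(design), samples)
for name, (condition1, condition2) in groups.items():
    print(name, condition1, condition2)
```

Each comparison ends up with two lists of samples, one per condition, which is the shape a downstream differential analysis would need.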

If a design file is not present, files with ``control`` as the second part of the file name will be used as controls for files that have the same first part but a different second part.

For example, if heart-control-r1 and heart-kd-r1 are present, the first will be used as the control for the second.
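The default rule can be sketched as follows. This is only an illustration of the rule as described here, not the pipeline's implementation; the function name ``guess_design`` is hypothetical, and it assumes every sample name has exactly three dash-separated parts.

```python
from collections import defaultdict


def guess_design(sample_names):
    """Pair each '<part1>-control' group with the other conditions
    that share the same first part.

    Returns a list of (control_prefix, treatment_prefix) tuples.
    """
    conditions = defaultdict(set)
    for name in sample_names:
        part1, part2, _replicate = name.split("-")
        conditions[part1].add(part2)

    pairs = []
    for part1 in sorted(conditions):
        if "control" not in conditions[part1]:
            # No control for this tissue/experiment: nothing to compare.
            continue
        for part2 in sorted(conditions[part1] - {"control"}):
            pairs.append((f"{part1}-control", f"{part1}-{part2}"))
    return pairs


# Hypothetical sample names; 'brain' has no control so it is skipped.
pairs = guess_design([
    "heart-control-r1", "heart-control-r2",
    "heart-kd-r1", "heart-kd-r2",
    "brain-wt-r1",
])
print(pairs)
```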

Input files

The input files are indexed BAM files with three-part names, with the parts separated by dashes. By convention, part 1 is the tissue, cell type, or experiment name; part 2 is the condition; and part 3 is the replicate, e.g.

heart-control-R1.bam

would be the heart control from replicate one.
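A minimal sketch of how such a file name can be split into its three parts is shown below. The ``Sample`` tuple and ``parse_sample`` function are hypothetical names introduced for illustration and are not part of the pipeline.

```python
import os
from collections import namedtuple

# Hypothetical container for the three parts of an input file name.
Sample = namedtuple("Sample", ["tissue", "condition", "replicate"])


def parse_sample(path):
    """Split '<tissue>-<condition>-<replicate>.bam' into its parts."""
    filename = os.path.basename(path)
    if not filename.endswith(".bam"):
        raise ValueError(f"not a BAM file: {filename}")
    parts = filename[:-len(".bam")].split("-")
    if len(parts) != 3:
        raise ValueError(f"expected three dash-separated parts: {filename}")
    return Sample(*parts)


sample = parse_sample("heart-control-R1.bam")
print(sample.tissue, sample.condition, sample.replicate)
```

Rejecting names that do not have exactly three parts makes naming mistakes visible early, before any comparison is set up.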

Requirements

The pipeline requires the results from :doc:`pipeline_annotations`. Set the configuration variables :py:data:`annotations_database` and :py:data:`annotations_dir`.

In addition to the default CGAT setup, the pipeline requires the following software to be on the path:

samtools >= 1.1
DaPars
R
DEXSeq
ExperimentR
bedtools
bgzip & tabix
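Before running, it can be useful to check that the command-line tools are visible on the path. The sketch below uses ``shutil.which`` from the Python standard library. The helper ``missing_tools`` is a hypothetical name, and the default list covers only the tools whose executable names are unambiguous. DaPars and the R packages are not checked, because how they are invoked depends on the installation, and the samtools version is not verified.

```python
import shutil

# Executable names assumed for the command-line tools listed above.
# DaPars and the R packages are deliberately left out of this check.
DEFAULT_TOOLS = ["samtools", "bedtools", "bgzip", "tabix", "R"]


def missing_tools(tools=DEFAULT_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All checked tools were found on PATH.")
```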

Pipeline output

Most of the output is in the SQLite database associated with the pipeline (csvdb by default). The last-exon chunks that DEXSeq finds to be differentially used are also exported to the export directory.
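One way to see what the pipeline has loaded is to list the tables in the database with Python's built-in ``sqlite3`` module. This is a sketch only: ``list_tables`` is a hypothetical helper, no table names are assumed, and the database is opened read-only so that a missing file raises an error rather than being created.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path="csvdb"):
    """Return the names of all tables in the SQLite database, sorted."""
    # mode=ro opens the file read-only and fails if it does not exist.
    uri = f"file:{db_path}?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]

# Usage, from the pipeline's working directory:
#     for table in list_tables("csvdb"):
#         print(table)
```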
