The RNA Galaxy workbench is a comprehensive set of analysis tools and consolidated workflows. The workbench is based on the Galaxy framework, which guarantees simple access, easy extension, flexible adaptation to personal and security needs, and sophisticated analyses independent of command-line knowledge. The workbench is described in two manuscripts published in Nucleic Acids Research (see version1, version2).
The current implementation comprises more than 50 bioinformatics tools dedicated to different research areas of RNA biology, including RNA structure analysis, RNA alignment, RNA annotation, RNA-protein interaction, ribosome profiling, RNA-Seq analysis, and RNA target prediction.
The workbench is developed by the RNA Bioinformatics Center (RBC). This center is one of the eight service units of the German Network for Bioinformatics Infrastructure, running the German ELIXIR Node.
The RNA analysis workbench implements a web server based on the Galaxy Docker platform: a dedicated Galaxy instance wrapped in a Docker container. For advanced local deployments, we recommend checking out the upstream documentation. The workbench can also be used and tested directly as an instance of usegalaxy.eu at rna.usegalaxy.eu.
To use the Galaxy RNA workbench, you only need Docker, which can be installed in different ways, depending on the type of system you're running:
- Non-Linux users are encouraged to use Kitematic, which provides a Docker installation for OSX or Windows, coupled with a user-friendly interface to run Docker containers;
- Linux users and people familiar with the command line can follow the instructions on installing Docker from its website.
The RNA workbench Docker container is rather large and is expected to grow as further tools and workflows are contributed. For users new to Docker, we list here some tweaks that can help to work around issues when first using Docker. After successfully installing Docker, we recommend configuring some settings, for example those dealing with the storage space required by containers. You can find more information here.
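Before pulling the image, a quick preflight check can confirm that the `docker` command is on your PATH and that enough disk space is free. The following is only a sketch: the helper names `require_cmd` and `free_kb`, the 20 GiB threshold, and the choice of `/` as the directory to check are assumptions for illustration, not part of the workbench. Docker's own `docker system df` reports how much space images and containers already occupy.

```shell
#!/bin/sh
# Succeed if the given command is available on PATH, fail otherwise.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print the free space (in KiB) of the filesystem that holds directory $1.
# Uses the POSIX output format of df, where column 4 is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical threshold for illustration: 20 GiB expressed in KiB.
MIN_FREE_KB=$((20 * 1024 * 1024))

if require_cmd docker; then
    echo "docker found: $(command -v docker)"
else
    echo "docker not found; please install it first"
fi

# "/" is a placeholder; adjust it to the location where Docker stores its data.
if [ "$(free_kb /)" -lt "$MIN_FREE_KB" ]; then
    echo "warning: less than 20 GiB free on /"
fi
```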
The procedure to launch the RNA workbench depends on whether you run Docker images using Kitematic or the command line interface.
Kitematic users can launch the RNA workbench directly from its interface. The following video shows how to load the docker container that is necessary to use the workbench:
For non-Kitematic users, starting the RNA workbench is analogous to starting the generic Galaxy Docker image:
$ docker run -d -p 8080:80 quay.io/bgruening/galaxy-rna-workbench
A detailed discussion of Docker's parameters is given in the Docker manual. It is really worth reading. Nevertheless, here is a quick rundown:
- `docker run` starts the image/container. In case the container is not already stored locally, Docker downloads it automatically.
- The argument `-p 8080:80` makes port 80 (inside the container) available on port 8080 on your host. Inside the container, an Apache web server is running on port 80, and that port can be bound to a local port on your host computer. With this parameter, you can access your Galaxy instance via http://localhost:8080 immediately after executing the command above.
- `quay.io/bgruening/galaxy-rna-workbench` is the image/container name, which directs Docker to the correct path in the Docker index.
- `-d` starts the Docker container in daemon mode.

For an interactive session, one executes:
$ docker run -i -t -p 8080:80 quay.io/bgruening/galaxy-rna-workbench /bin/bash
and manually invokes the `startup` script to start PostgreSQL, Apache and Galaxy.
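The services inside the container may need some time to come up, so a script that depends on the instance can poll the published port first. The following is a minimal sketch, assuming `curl` is installed on the host; the function name `wait_for_url` and its default retry values are hypothetical.

```shell
#!/bin/sh
# Poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = number of attempts (default 60), $3 = seconds between attempts (default 5)
wait_for_url() {
    url=$1
    tries=${2:-60}
    delay=${3:-5}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -s: silent, -f: treat HTTP errors as failure, -o /dev/null: discard the body
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (after starting the container with -p 8080:80):
#   wait_for_url http://localhost:8080 && echo "Galaxy is up"
```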
Docker images are "read-only". All changes during one session are lost after restart. This mode is useful to present Galaxy to your colleagues or to run workshops with it.
To install Tool Shed repositories or to save your data, you need to export the calculated data to the host computer. Fortunately, this is as easy as:
$ docker run -d -p 8080:80 -v /home/user/galaxy_storage/:/export/ quay.io/bgruening/galaxy-rna-workbench
Given the additional `-v /home/user/galaxy_storage/:/export/` parameter, Docker will mount the folder `/home/user/galaxy_storage` into the container under `/export/`. A `startup.sh` script, which usually starts Apache, PostgreSQL and Galaxy, will recognize the export directory with one of the following outcomes:
- In case of an empty `/export/` directory, it will move the PostgreSQL database, the Galaxy database directory, Shed Tools and Tool Dependencies and various configure scripts to `/export/` and symlink back to the original location.
- In case of a non-empty `/export/`, for example if you continue a previous session within the same folder, nothing will be moved, but the symlinks will be created.
This enables you to have different export folders for different sessions - meaning real separation of your different projects.
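Keeping one export folder per project can be scripted. The sketch below only prints the corresponding `docker run` commands (a dry run); the function name `workbench_cmd`, the project names, the ports, and the base directory `$HOME/galaxy_storage` are placeholders chosen for illustration.

```shell
#!/bin/sh
# Print the docker run command for one project.
# $1 = project name, $2 = host port, $3 = base directory for export folders (optional)
workbench_cmd() {
    project=$1
    port=$2
    base=${3:-$HOME/galaxy_storage}
    echo "docker run -d -p ${port}:80 -v ${base}/${project}/:/export/ quay.io/bgruening/galaxy-rna-workbench"
}

# Dry run: only print the commands. Each project gets its own host port
# and its own export folder, so the sessions do not share any state.
workbench_cmd project_a 8080
workbench_cmd project_b 8081
```

Printing the commands instead of executing them makes it easy to review them before starting any container.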
Running the container starts the Galaxy RNA workbench: a Galaxy instance is configured, launched, and populated with the needed tools. The instance will be accessible at http://localhost:8080.
For a more specific configuration, you can have a look at the documentation of the Galaxy Docker Image.
The Galaxy Admin User has the username `admin@galaxy.org` and the password `admin`.
To use certain features of Galaxy, such as the RNA structure visualization, you have to be logged in.
Installing additional tools also requires a login.
The PostgreSQL username is `galaxy`, the password is `galaxy`, and the database name is `galaxy`.
If you want to create new users, please make sure to use the `/export/` volume. Otherwise your user will be removed after your Docker session is finished.
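To verify that a running container actually has an export volume before creating users, the mount destinations reported by `docker inspect` can be checked. This is a sketch under assumptions: the helper name `has_export_mount` and the container name `rna-workbench` are hypothetical, and the check simply looks for a mount destination of `/export`.

```shell
#!/bin/sh
# Read mount destinations (one per line) on stdin and succeed if /export is among them.
has_export_mount() {
    grep -Eq '^/export/?$'
}

# Usage with a running container (the name "rna-workbench" is a placeholder):
#   docker inspect -f '{{range .Mounts}}{{println .Destination}}{{end}}' rna-workbench \
#       | has_export_mount && echo "/export is mounted; new users will persist"
```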
The RNA workbench provides the possibility to run interactive tours that illustrate how the main interface works in relation to real-life user tasks. These show many common operations, such as searching, parametrizing, and running tools, or saving a history of operations in a sharable workflow.
The following video demonstrates the main elements that compose the Galaxy user interface:
In this section we list all tools that have been integrated in the RNA workbench. The list is likely to grow as further tools and workflows are contributed. To ease readability, we divided them into categories.
Tool | Description | Reference |
---|---|---|
antaRNA | Possibility of inverse RNA structure folding and a specification of a GC value constraint | Kleinkauf et al. 2015 |
CoFold | A thermodynamics-based RNA secondary structure folding algorithm | Proctor et al. 2013 |
CMCompare | Tool to compare RNA families via covariance models | Eggenhofer et al. 2013 |
Kinwalker | Algorithm for cotranscriptional folding of RNAs to obtain the min. free energy structure | Geis et al. 2008 |
MEA | Prediction of maximum expected accuracy RNA secondary structures | Amman et al. 2013 |
RNAlien | A tool for RNA family model construction | Eggenhofer et al. 2016 |
RNAshapes | Structures to a tree-like domain of shapes, retaining adjacency and nesting of structural features | Janssen et al. 2014 |
RNAz | Predicts structurally conserved and therm. stable RNA secondary structures in mult. seq. alignments | Gruber et al. 2010 |
segmentation-fold | An application that predicts RNA 2D-structure with an extended version of the Zuker algorithm | |
ViennaRNA | A tool compilation for prediction and comparison of RNA secondary structures | Lorenz et al. 2011 |
Tool | Description | Reference |
---|---|---|
CMV | RNA family model visualisation | Eggenhofer et al. 2018 |
Compalignp | An RNA counterpart of the protein specific "Benchmark Alignment Database" | Wilm et al. 2006 |
LocARNA | A tool for multiple alignment of RNA molecules | Will et al. 2012 |
MAFFT | A multiple sequence alignment program for unix-like operating systems | Katoh and Standley 2016 |
RNAlien | A tool for RNA family model construction | Eggenhofer et al. 2016 |
Tool | Description | Reference |
---|---|---|
ARAGORN | A tool to identify tRNA and tmRNA genes | Laslett et al. 2004 |
FuMa (Fusion Matcher) | A tool to report identical fusion genes based on gene-name annotations | Hoogstrate et al. 2015 |
GotohScan | A search tool to find shorter sequences in large database sequences | Hertel et al. 2009 |
Infernal | Suite of tools for building RNA families covariance models (CMs) from structurally annotated sequence alignments | Nawrocki et al. 2013 |
RNABOB | A tool for fast pattern matching of RNA secondary structures | Gautheret et al. 1990 |
RNAcode | Predicts protein coding regions in a set of homologous nucleotide sequences | Washietl et al. 2011 |
tRNAscan | Searches for tRNA genes in genomic sequences | Lowe et al. 1997 |
RCAS | A generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments | Uyar et al. 2017 |
Tool | Description | Reference |
---|---|---|
AREsite2 | A database for AU-/GU-/U-rich elements in human and model organisms | Fallmann et al. 2015 |
doRiNA | A database of RNA interactions in post-transcriptional regulation | Blin et al. 2014 |
PARalyzer | An algorithm to generate a map of interacting RNA-binding proteins and their targets | Corcoran et al. 2011 |
Piranha | A peak-caller for CLIP- and RIP-seq data | |
Tool | Description | Reference |
---|---|---|
IntaRNA | Efficient RNA-RNA interaction prediction incorporating accessibility and seeding of interaction sites | Mann et al. 2017 |
Tool | Description | Reference |
---|---|---|
TargetFinder | A tool to predict small RNA binding sites on target transcripts from a sequence database | Fahlgren et al. 2009 |
Tool | Description | Reference |
---|---|---|
RiboTaper | An analysis pipeline for Ribo-Seq experiments, exploiting the triplet periodicity of ribosomal footprints to call translated regions | Calviello et al. 2015 |
Tool | Description | Reference |
---|---|---|
FastQC! | A quality control tool for high throughput sequence data | |
mQC | A quality control tool for ribosome profiling mapping results | Verbruggen and Menschaert 2017 |
MultiQC | A tool to create reports visualising output from multiple tools across many samples | Ewels et al. 2016 |
Trim Galore! | A tool for the automation of quality and adapter trimming on paired-end or non-paired-end sequences | |
Tool | Description | Reference |
---|---|---|
Dr. Disco | An analysis pipeline to detect genomic breakpoints in RNA-Seq data | |
FlaiMapper | A tool for computational annotation of small ncRNA-derived fragments using RNA-seq data | Hoogstrate et al. 2014 |
NASTIseq | A method that incorporates the inherent variable efficiency of generating perfectly strand-specific libraries | Li et al. 2013 |
PIPmiR | An algorithm to identify novel plant miRNA genes from a combination of deep sequencing data and genomic features | Breakfield et al. 2011 |
SortMeRNA | A tool for filtering, mapping and OTU-picking NGS reads in metatranscriptomic and -genomic data | Kopylova et al. 2011 |
Tool | Description | Reference |
---|---|---|
Bowtie2 | Fast and sensitive read alignment | Langmead et al. 2012 |
BWA | Burrows-Wheeler Aligner for mapping low-divergent sequences against a large reference genome | Li and Durbin 2010 |
BWA-MEM | Fast and accurate long-read alignment with Burrows-Wheeler transform | Li et al. 2010 |
HISAT2 | Hierarchical indexing for spliced alignment of transcripts | Kim et al. 2015 |
RNA STAR | Rapid spliced aligner for RNA-seq data | Dobin et al. 2013 |
STAR-fusion | Fast fusion gene finder | Haas et al. 2017 |
Tool | Description | Reference |
---|---|---|
Trinity | De novo transcript sequence reconstruction from RNA-Seq | Haas et al. 2013 |
Tool | Description | Reference |
---|---|---|
featureCounts | Ultrafast and accurate read summarization program | Liao et al. 2014 |
Sailfish | Rapid alignment-free quantification of isoform abundance | Patro et al. 2014 |
Salmon | Fast, accurate and bias-aware transcript quantification | Patro et al. 2017 |
Tool | Description | Reference |
---|---|---|
DESeq2 | Differential gene expression analysis based on the negative binomial distribution | Love et al. 2014 |
Tool | Description | Reference |
---|---|---|
SAMtools | Utilities for manipulating alignments in the SAM format | Li et al. 2009 |
BEDTools | Utilities for genome arithmetic | Quinlan et al. 2010 |
deepTools | A suite of tools for exploring high-throughput sequencing data (HTS), such as ChIP-seq, RNA-seq, and MNase-seq | Ramirez et al. 2016 |
To learn about RNA sequencing data analysis, we recommend having a look at the training material from the Galaxy Training Network, particularly the tutorial on Reference-based RNA-seq data analysis.
In the Galaxy RNA workbench, we also included Galaxy interactive tours to guide you through Galaxy, its tools and possibilities.
- Andrea Bagnacani
- Bérénice Batut
- Florian Eggenhofer
- Joerg Fallmann
- Bjoern Gruening
- Youri Hoogstrate
- Torsten Houwaart
- Cameron Smith
- Pavankumar Videm
- Sebastian Will
- Markus Wolfien
- Dilmurat Yusuf
The RNA-workbench community welcomes new contributions and help in any way. We have collected detailed instructions and some guidance in our CONTRIBUTING.md.
For support, questions, or feature requests, please file an issue on our issue page.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.