Note 1: replace whatever is between <> with the proper value. For example, in <VM.IP> use the IP of the Virtual Machine (VM) provided to you (something like 193.166.24.142).
Note 2: check the number of CPUs of your VM using htop (the available CPUs are displayed at the top as dynamic horizontal bars, numbered sequentially).
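For a quick, non-interactive check, the CPU count can also be read with nproc (part of GNU coreutils). This is a minimal sketch offered as an alternative to htop; it is not part of the original instructions:

```shell
# Print the number of CPUs available to the current process
n_cpus=$(nproc)
echo "CPUs available: ${n_cpus}"
```

The value can then be reused wherever a tool asks for a number of threads or jobs.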
The key for both Unix (mgmc.key) and Windows (mgmc.ppk) users is available for download; the password for accessing the Dropbox directory will be available during the course.
In your computer
- Note: The SSH key was produced in Linux using ssh-keygen -t rsa -f /home/cloud-user/mgmc.key. More information here.
Using terminal
# Give user read only permission to the SSH key
chmod 400 </path/to/provided/private/ssh/key/mgmc.key>
ssh -i </path/to/provided/private/ssh/key/mgmc.key> cloud-user@<VM.IP>
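The chmod step matters because ssh refuses a private key that other users can read. The sketch below shows how the resulting permissions could be verified. It runs on a throwaway file (demo_key is a hypothetical name) and assumes GNU stat, as found on Linux:

```shell
# Demonstrated on a throwaway file so the real key is not touched
demo_key=$(mktemp)

# Owner read-only, no access for group or others
chmod 400 "$demo_key"

# Show the permissions in octal form (GNU stat syntax)
perms=$(stat -c '%a' "$demo_key")
echo "permissions: ${perms}"

rm -f "$demo_key"
```

Running the same stat command on the real mgmc.key should print 400 once the chmod above has been applied.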
Using PuTTY
- Note 1: use putty.exe and puttygen.exe from "Alternative binary files" (Download PuTTY)
- Note 2: to paste text in PuTTY, right-click inside the PuTTY terminal
For information on how to use SSH Keys with PuTTY see here (specifically "Use Existing Public and Private Keys" and "Connect to Server with Private Key" sections)
Briefly: open PuTTY; in the Category panel on the left, go to Connection > SSH > Auth and load the .ppk key (mgmc.ppk) in "Private key file for authentication", then follow the instructions below
Connection settings (after preparing PuTTY to use SSH key):
The steps highlighted in yellow are optional: they save the session so that steps 1-3 do not have to be entered every time. Once the session is saved, connecting to the VM only requires steps 6-8.
Using terminal
From the Local computer to the VM
scp -i </path/to/provided/private/ssh/key/mgmc.key> </path/to/local/data/file> cloud-user@<VM.IP>:</path/to/data/file>
From the VM to the Local computer
scp -i </path/to/provided/private/ssh/key/mgmc.key> cloud-user@<VM.IP>:</path/to/data/file> </path/to/local/data/file>
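To keep the placeholders consistent between the two directions, the values can be stored in variables once and the commands built from them. The sketch below only prints the commands (a dry run). Every path and file name in it is a hypothetical example, and the IP is the sample one from Note 1:

```shell
# All values below are placeholders for illustration; replace them with your own
key="$HOME/keys/mgmc.key"
vm_ip="193.166.24.142"
local_file="$HOME/data/sample_1.fastq.gz"
remote_dir="/media/volume/"

# Local computer -> VM
upload_cmd="scp -i ${key} ${local_file} cloud-user@${vm_ip}:${remote_dir}"
# VM -> local computer
download_cmd="scp -i ${key} cloud-user@${vm_ip}:${remote_dir}sample_1.fastq.gz ${local_file}"

# Dry run: print the commands for review instead of executing them
echo "$upload_cmd"
echo "$download_cmd"
```

After checking the printed lines, they can be pasted into the terminal to perform the actual transfer.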
For information on how to use SSH Keys with FileZilla or WinSCP please use the following link
In the VM
Edit ~/.bashrc (using, for example, nano ~/.bashrc) and uncomment force_color_prompt=yes by removing the #. More information here.
- Note: After editing, exit with Ctrl + X; type y to save the changes; keep the file name unchanged by just pressing Enter.
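The same change could be made without opening an editor. The sketch below assumes GNU sed (as on Ubuntu) and that the line appears as #force_color_prompt=yes. It runs on a small sample file rather than on the real ~/.bashrc:

```shell
# Work on a throwaway sample file; point bashrc at ~/.bashrc to apply it for real
bashrc=$(mktemp)
printf '%s\n' '# some setting' '#force_color_prompt=yes' > "$bashrc"

# Remove the leading "#" (and any spaces after it) from the force_color_prompt line only
sed -i 's/^#[[:space:]]*force_color_prompt=yes/force_color_prompt=yes/' "$bashrc"

# Count the lines where the setting is now active
uncommented=$(grep -c '^force_color_prompt=yes' "$bashrc")
echo "uncommented lines: ${uncommented}"
```

Other commented lines are left untouched because the pattern matches only the force_color_prompt setting.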
- htop
- An interactive process viewer for Unix systems
- Allows monitoring VM activity (CPUs and memory usage, processes running, etc.)
- GNU Parallel
- Shell tool for executing jobs in parallel using one or more computers.
In the VM
sudo apt-get install -y htop parallel
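Once installed, a small test run can confirm that GNU Parallel works. This is only an illustrative sketch: the sample names A, B and C are made up, and the guard skips the run when GNU Parallel is not present:

```shell
# Run three small jobs, two at a time (-j 2), keeping output in input order (-k)
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  result=$(parallel -j 2 -k echo "processing sample_{}" ::: A B C)
else
  result="GNU Parallel is not installed"
fi
echo "$result"
```

Here {} is replaced by each value listed after :::, so one job is started per sample.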
In the VM
Give right permissions and ownership to the extra volume
sudo chmod 755 /media/volume/
sudo chown cloud-user:cloud-user /media/volume/
Prepare some folders
# Create a folder where all the tools to be used will be placed
mkdir ~/NGStools
# Create a folder to store different databases
mkdir /media/volume/DBs
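If these commands may be run more than once, mkdir -p avoids the error raised when a folder already exists. A minimal sketch, demonstrated under a temporary directory (base is a hypothetical stand-in for the real locations):

```shell
# Demonstrated under a temporary directory; on the VM the targets would be
# ~/NGStools and /media/volume/DBs
base=$(mktemp -d)
mkdir -p "$base/NGStools" "$base/DBs"

# Running it again is harmless: -p does not fail if the folder already exists
mkdir -p "$base/NGStools" "$base/DBs"
rerun_status=$?

ls "$base"
```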
What is Docker?
"Docker is a tool that can package an application and its dependencies in a virtual container that can run on any Linux server," Lyman explained. "This helps enable flexibility and portability on where the application can run, whether on premise, public cloud, private cloud, bare metal, etc." From here.
Installation
Information on how to install Docker for Ubuntu is available at this link
In the VM
sudo apt-get remove -y docker docker-engine docker.io
sudo apt-get update
sudo apt-get install -y linux-image-extra-$(uname -r) linux-image-extra-virtual
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
# Key with the fingerprint 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y docker-ce
sudo docker run hello-world
Run Docker without sudo
More information here.
sudo groupadd docker
sudo gpasswd -a $USER docker
# logout and login to activate the changes to groups (close the terminal and open it again)
docker run hello-world
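If docker run still asks for sudo, it may help to confirm that the group change is active in the current session. A minimal sketch using id, which lists the groups of the current user:

```shell
# List the groups of the current user and look for "docker" as a whole word
if id -nG | grep -qw docker; then
  in_docker_group="yes"
else
  in_docker_group="no"
fi
echo "user in docker group: ${in_docker_group}"
```

A "no" right after running gpasswd usually means the session has not been restarted yet.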
What is getSeqENA?
Download sequences from ENA/SRA databases
Install dependencies
Aspera Connect
In your computer
From the webpage:
- See all installers > Linux > Linux - Select Version > Direct download
- Copy Link Location (Direct download)
In the VM
wget http://download.asperasoft.com/download/sw/connect/3.7.4/aspera-connect-3.7.4.147727-linux-64.tar.gz
tar xf aspera-connect-3.7.4.147727-linux-64.tar.gz
bash aspera-connect-3.7.4.147727-linux-64.sh
rm aspera-connect-3.7.4.147727-linux-64.sh aspera-connect-3.7.4.147727-linux-64.tar.gz
mv ~/.aspera/ ~/NGStools/aspera/
echo "export PATH=$HOME/NGStools/aspera/connect/bin"':$PATH' >> ~/.profile
- Information on how to add a path to the PATH environment variable is available here
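The mixed quoting in the echo command is deliberate: the double-quoted part lets the shell expand $HOME immediately, while the single-quoted ':$PATH' is written literally so that it is expanded each time ~/.profile is read. The sketch below shows the resulting line, appending to a throwaway file (profile is a hypothetical stand-in for ~/.profile):

```shell
# Append to a throwaway file instead of the real ~/.profile
profile=$(mktemp)
echo "export PATH=$HOME/NGStools/aspera/connect/bin"':$PATH' >> "$profile"

# $HOME was expanded when the line was written; $PATH is still literal
written_line=$(cat "$profile")
echo "$written_line"
```

Because ~/.profile is read at login, the new PATH takes effect after logging in again or after running source ~/.profile in the current session.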
NCBI SRA Toolkit
In your computer
In the webpage:
- Ubuntu Linux 64 bit architecture
- Copy Link Location (Ubuntu Linux 64 bit architecture)
In the VM
wget https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.8.2-1/sratoolkit.2.8.2-1-ubuntu64.tar.gz
tar xf sratoolkit.2.8.2-1-ubuntu64.tar.gz
rm sratoolkit.2.8.2-1-ubuntu64.tar.gz
mv sratoolkit.2.8.2-1-ubuntu64/ ~/NGStools/
echo "export PATH=$HOME/NGStools/sratoolkit.2.8.2-1-ubuntu64/bin"':$PATH' >> ~/.profile
Install getSeqENA
In your computer
In the webpage:
- Clone or download > Copy to clipboard
In the VM
git clone https://github.com/B-UMMI/getSeqENA.git
mv getSeqENA/ ~/NGStools/
echo "export PATH=$HOME/NGStools/getSeqENA"':$PATH' >> ~/.profile
What is INNUca?
INNUENDO quality control of reads, de novo assembly and contigs quality assessment, and possible contamination detection
In your computer
From the webpage:
- Docker
In the VM
docker pull ummidock/innuca:3.1
What is ReMatCh?
Reads mapping against target sequences, checking mapping and consensus sequences production
In your computer
In the webpage:
- Clone or download > Copy to clipboard
In the VM
git clone https://github.com/B-UMMI/ReMatCh.git
mv ReMatCh/ ~/NGStools/
echo "export PATH=$HOME/NGStools/ReMatCh"':$PATH' >> ~/.profile
What is ABRicate?
Mass screening of contigs for antimicrobial resistance or virulence genes. It comes bundled with several databases: Resfinder, CARD, ARG-ANNOT, NCBI BARRGD, NCBI, EcOH, PlasmidFinder and VFDB.
In your computer
In UMMI Docker Hub webpage:
- ummidock/abricate
In the VM
docker pull ummidock/abricate:latest
What is Prokka?
Prokka is a software tool to annotate bacterial, archaeal and viral genomes quickly and produce standards-compliant output files
In your computer
In UMMI Docker Hub webpage:
- ummidock/prokka
In the VM
docker pull ummidock/prokka:1.12
What is Roary?
Roary is a high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka (Seemann, 2014)) and calculates the pan genome
In your computer
In Sanger Pathogens Docker Hub webpage:
- sangerpathogens/roary
In the VM
docker pull sangerpathogens/roary:latest
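The Docker images used so far can also be pulled in one loop. The sketch below is a dry run that only prints the commands. The image names and tags are the ones used in this guide, while the loop itself is an illustrative addition:

```shell
# Images and tags exactly as used in this guide
images="ummidock/innuca:3.1 ummidock/abricate:latest ummidock/prokka:1.12 sangerpathogens/roary:latest"

# Dry run: print each pull command; remove "echo" to actually pull
pull_plan=$(for image in $images; do echo docker pull "$image"; done)
echo "$pull_plan"
```

After a real pull, docker images lists what is available locally.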
What is Scoary?
Scoary is designed to take the gene_presence_absence.csv file from Roary as well as a traits file created by the user and calculate the associations between all genes in the accessory genome and the traits. It reports a list of genes sorted by strength of association per trait.
In the VM
sudo apt-get install python-pip
pip install scoary
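After the installations, a quick check can show which command-line tools are reachable from the PATH. This is a minimal sketch: check_tool is a hypothetical helper, and the executable names ascp (Aspera Connect) and fastq-dump (SRA Toolkit) are assumed to be the ones provided by those packages:

```shell
# Report whether an executable can be found on the PATH
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

# Executable names assumed here: ascp (Aspera Connect), fastq-dump (SRA Toolkit)
for tool in htop parallel docker ascp fastq-dump scoary; do
  check_tool "$tool"
done
```

A tool reported as missing right after installation often just needs a new login so that the updated ~/.profile is read.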