Entry points into the database

This page describes the scripts that you will often use to begin your pipelines when starting a comparative genomics analysis. They will return results in formats appropriate for piping to other ITEP scripts.

Setting up paths

Before running any ITEP scripts, source the SourceMe.sh file (in the root directory of the repository) to set up your paths. This ensures that you can run the ITEP scripts from anywhere on your machine.

$ source SourceMe.sh
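
To confirm that the paths were set up, one option is to check whether the scripts named on this page resolve on your PATH. The following is a minimal sketch, not part of ITEP itself: `check_on_path` is a hypothetical helper built on the standard shell builtin `command -v`.

```shell
# Hypothetical helper (not an ITEP script): succeeds if the named
# command can be found on the current PATH.
check_on_path() {
    command -v "$1" >/dev/null 2>&1
}

# Check the entry-point scripts mentioned on this page.
for script in db_getGenesWithAnnotation.py db_getClustersWithAnnotation.py \
              db_getAllClusterRuns.py db_getContigs.py; do
    if check_on_path "$script"; then
        echo "found: $script"
    else
        echo "not found: $script (has SourceMe.sh been sourced in this shell?)"
    fi
done
```

Because sourcing only affects the current shell session, a check like this would need to be repeated in each new terminal.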

Starting with an annotation

Getting the ITEP gene ID for an annotation

You can pull out all genes that match a search string using the following command:

$ db_getGenesWithAnnotation.py "Search_string"

Getting a list of clusters associated with an annotation

You can get a list of clusters containing genes that match a search string using the following command:

$ db_getClustersWithAnnotation.py "Search_string"

It returns a table containing the run ID, cluster ID, gene IDs and annotations for clusters that matched the search string. Note that only the genes that matched the search string are listed; if you want all of the genes in each cluster, pipe the results into db_getGenesInClusters.py.
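
Several matching genes can fall in the same cluster, so the table may list a cluster more than once. The sketch below shows one way to reduce such a table to unique run ID and cluster ID pairs using standard tools. It assumes the output is tab-delimited with the run ID in column 1 and the cluster ID in column 2 (an assumption based on the column order described above). `unique_clusters` and `sample_table` are hypothetical helpers, and the sample rows are made-up values rather than real ITEP output.

```shell
# Hypothetical helper: keep only unique (run ID, cluster ID) pairs.
# Assumes tab-delimited input, run ID in column 1, cluster ID in column 2.
unique_clusters() {
    cut -f 1,2 | sort -u
}

# Made-up sample rows standing in for the output of
# db_getClustersWithAnnotation.py (columns: run ID, cluster ID, gene ID, annotation).
sample_table() {
    printf 'run_A\t12\tgene.1\tannotation one\n'
    printf 'run_A\t12\tgene.2\tannotation two\n'
    printf 'run_A\t40\tgene.3\tannotation three\n'
}

# Two of the three rows share cluster 12, so two unique pairs remain.
sample_table | unique_clusters
```

With ITEP set up, the same filter could be placed after db_getClustersWithAnnotation.py "Search_string" in a pipeline, provided the real output matches the layout assumed here.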

Starting with a locus tag or other alias

You can search for aliases in the same manner as searching for annotations:

$ db_getGenesWithAnnotation.py "Alias"
$ db_getClustersWithAnnotation.py "Alias"

If locus tags are available in the source GenBank files, they will automatically be searchable once the files are loaded. Otherwise, if you want to be able to search for a name, add it to the aliases file ($root/aliases/aliases) and re-run setup_step1.sh.

Starting with a cluster run ID

You can get a list of cluster run IDs that are currently loaded into ITEP with the db_getAllClusterRuns.py script:

$ db_getAllClusterRuns.py

You can then choose one from the list and pipe it to, or use it in, other ITEP scripts for further analysis.
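
As a sketch of choosing a run ID, the snippet below stores the first ID that contains a given substring in a shell variable. It assumes the list has one run ID per line. `pick_run` and `sample_runs` are hypothetical helpers, and the run IDs shown are made-up placeholders.

```shell
# Hypothetical helper: print the first line on stdin that contains
# the given fixed string (no regular-expression interpretation).
pick_run() {
    grep -F -- "$1" | head -n 1
}

# Made-up run IDs standing in for the output of db_getAllClusterRuns.py.
sample_runs() {
    printf 'run_alpha\nrun_beta\nrun_beta_v2\n'
}

# Keep the selected run ID in a variable for later commands.
RUN_ID="$(sample_runs | pick_run "beta")"
echo "$RUN_ID"
```

With ITEP set up, `sample_runs` would be replaced by db_getAllClusterRuns.py, and "$RUN_ID" could then be passed wherever another script expects a run ID.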

Starting with a contig ID

You can get a list of valid ITEP contig IDs using the following command:

$ db_getContigs.py
