gturco/co-anno

Author: Gina Turco (gturco), Brent Pedersen (brentp)
Email: gturco88@gmail.com
License: MIT

Coding sequences of one species are blasted against the noncoding sequences of the other. Blastn is run with a word size of 20 and an E-value < 0.001. Blast hits that hit the same coding region are summed by length. Groups with a sum greater than 100 are recorded as a missed exon strand.
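
The grouping step described above can be sketched in a few lines of Python. This is only an illustration, not the code used in this repository: the function name `find_missed_exons`, the `(coding_region_id, hit_length)` input format and the gene names are assumptions made for the example, and strand handling is left out.

```python
from collections import defaultdict


def find_missed_exons(hits, min_total_length=100):
    """Sum blast hit lengths per coding region and keep the large groups.

    hits: iterable of (coding_region_id, hit_length) pairs (hypothetical format).
    Returns a dict mapping coding_region_id to its summed hit length, keeping
    only regions whose sum is strictly greater than min_total_length.
    """
    totals = defaultdict(int)
    for region_id, hit_length in hits:
        totals[region_id] += hit_length
    return {region: total for region, total in totals.items()
            if total > min_total_length}


# Made-up hits: two hits on the same coding region, one on another.
hits = [("Os01g01010", 60), ("Os01g01010", 55), ("Os01g01020", 80)]
missed = find_missed_exons(hits)
```

Here only the first region passes the threshold, because its two hits sum to more than 100 while the single hit on the second region does not.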

http://upload.wikimedia.org/wikipedia/commons/b/b4/Coanno.png

  • Python version >= 2.7
  • blast (download latest and run)
  • lastz (download latest .tar.gz; configure; make; make install; then adjust the path in quota.sh)
  • bpbio (svn checkout http://bpbio.googlecode.com/svn/trunk/ bpbio-read-only; then run sudo python setup.py install for biostuff, coanno and bblast)

The BCBio module is required to run:

    python scripts/gff_to_bed.py rice_v6.gff > rice_v6.bed

  • If you have access to CoGe, the fasta and bed files for each organism can be obtained using export_to_bed.pl, e.g.:

    perl scripts/export_to_bed.pl \
                          -fasta_name rice_v6.fasta \
                          -dsg 8163 \
                          -name_re "^Os\d\dg\d{5}$" > rice_v6.bed
    

    where dsg is from CoGe OrganismView and the prefix for the .bed and .fasta file must be the same (in this case rice_v6). You likely need to run this on new synteny and then copy the .bed and .fasta files to the data/ directory. The -name_re regular expression is not required, but in this case, it will prefer the readable Os01g101010 names over the names like m103430.
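
As a rough illustration of what a name-preference pattern like -name_re does, the Python sketch below picks the first name matching the pattern and falls back to the first available name otherwise. The helper `pick_name` and the example names are hypothetical; the actual selection logic inside export_to_bed.pl is not shown here and may differ.

```python
import re

# Same pattern as passed to -name_re above.
NAME_RE = re.compile(r"^Os\d\dg\d{5}$")


def pick_name(names, pattern=NAME_RE):
    """Return the first name matching pattern, else the first name given.

    names: non-empty list of candidate names for one feature (assumed input).
    """
    for name in names:
        if pattern.search(name):
            return name
    return names[0]


preferred = pick_name(["m103430", "Os01g01010"])  # readable name is preferred
fallback = pick_name(["m103430"])                 # nothing matches, keep as-is
```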

  • Once only: edit quota.sh to set the correct path for quota-alignment.
  • Edit quota.sh to set the correct ORGA, ORGB and QUOTA.
  • Run sh run.sh, which calls quota.sh. This takes a long time because it does a full blast (lastz), then all of quota align, then the CNS pipeline.
  • This creates PNGs for the dotplots. Check them to make sure the quota-blocks look correct.
  • Query and subject CNS position
  • Missing Exons from ORGA ORGB blast
  • CNS blast to RNA file
  • CNS blast to proteins file
  • CNS assigned to nearest Ortholog
