This script recovers only the 16S portion of sequences that may be much longer than the 16S gene itself. It assumes that you have already run BLAST with a single full-length 16S reference sequence (e.g. E. coli) as the query and your long sequences as the subjects, and that the BLAST results were saved as an XML file. The script parses that XML file and, for each hit, recovers the subject sequence name and the subject's aligned sequence with the gaps removed.
I originally wrote this script because many plant pathogen sequences also contained the ITS region and tRNA genes, and PyNAST would not align them.
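
To illustrate the parsing step described above, here is a minimal sketch; it is not the script's actual code. It assumes the standard BLAST XML layout, in which each `Hit` element carries a `Hit_def` (the subject name) and one or more `Hsp` elements whose `Hsp_hseq` holds the aligned subject sequence. The function names `ungapped_subject_hits` and `write_fasta` are hypothetical, and writing the results as FASTA is an assumption about the desired output format.

```python
import io
import xml.etree.ElementTree as ET


def ungapped_subject_hits(xml_source):
    """Yield (subject_name, ungapped_subject_alignment) for every HSP.

    xml_source is a path or an open file object containing BLAST XML.
    """
    tree = ET.parse(xml_source)
    for hit in tree.iter("Hit"):
        # Hit_def holds the subject's description line.
        name = hit.findtext("Hit_def", default="")
        for hsp in hit.iter("Hsp"):
            # Hsp_hseq is the aligned subject sequence, gaps shown as '-'.
            aligned = hsp.findtext("Hsp_hseq", default="")
            yield name, aligned.replace("-", "")


def write_fasta(records, handle):
    """Write (name, sequence) pairs to handle in FASTA format."""
    for name, seq in records:
        handle.write(f">{name}\n{seq}\n")
```

With a results file saved as, say, `blast_results.xml` (a placeholder name), `write_fasta(ungapped_subject_hits("blast_results.xml"), open("16S_only.fasta", "w"))` would write one record per HSP. A subject with several HSPs yields several records, so you may want to keep only the longest one per subject.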