This is a sub-project of altorf that analyzes altorf from a proteome perspective. By running peptide identification with MS-GF+ against various sequence databases, we can find peptides that are missing from the annotation but are actually translated. The PNNL library was used as the source of MS/MS data.
Follow the workflow below from top to bottom. For further information, refer to the README in each subdirectory.
-
- Organizes scattered information about the 112 species in the PNNL library.
-
- Picks the datasets to analyze, based on MS/MS type and the number of identified peptides.
-
- Downloads the picked `.mzML`, `.mzid`, and `.fasta` files from the PNNL library FTP.
-
- Extracts information on the MS-GF+ configuration for peptide identification from each `.mzid` file.
-
- Creates various sequence databases for peptide identification.
-
- Executes peptide identification using MS-GF+.
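
The configuration-extraction step above can be sketched roughly as follows. This is only an illustration, not the project's actual script: it assumes the `.mzid` files follow the mzIdentML 1.1 schema and that MS-GF+ records its search settings as `userParam` entries under `AdditionalSearchParams`. The function name `extract_search_params` and the trimmed-down sample document (including its parameter values) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of mzIdentML 1.1 (assumed to match the files in use).
MZID_NS = {"mzid": "http://psidev.info/psi/pi/mzIdentML/1.1"}

# Hypothetical, heavily trimmed .mzid content used only for illustration.
SAMPLE_MZID = """
<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1" id="sample" version="1.1.0">
  <AnalysisProtocolCollection>
    <SpectrumIdentificationProtocol id="SearchProtocol_1" analysisSoftware_ref="ID_software">
      <AdditionalSearchParams>
        <userParam name="TargetDecoyApproach" value="true"/>
        <userParam name="Instrument" value="LowRes"/>
      </AdditionalSearchParams>
    </SpectrumIdentificationProtocol>
  </AnalysisProtocolCollection>
</MzIdentML>
""".strip()


def extract_search_params(mzid_text):
    """Return the userParam name/value pairs of the search protocol as a dict."""
    root = ET.fromstring(mzid_text)
    path = (
        ".//mzid:SpectrumIdentificationProtocol"
        "/mzid:AdditionalSearchParams/mzid:userParam"
    )
    return {p.get("name"): p.get("value") for p in root.findall(path, MZID_NS)}


if __name__ == "__main__":
    for name, value in extract_search_params(SAMPLE_MZID).items():
        print(f"{name}\t{value}")
```

For a file on disk, the same lookup can be applied to `ET.parse(path).getroot()` instead of parsing a string.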
- MS-GF+
- performs peptide identification
- requires JRE >= 1.6 and >= 2 GB of main memory
- Anaconda (ver 3.X)
- BioPython
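
As a rough illustration of how the MS-GF+ step and the Java and memory requirements above fit together, the sketch below builds and runs an MS-GF+ command from Python. It uses the `-s` (spectrum file), `-d` (sequence database), and `-o` (output `.mzid`) options of MS-GF+. The helper names, the jar location `MSGFPlus.jar`, the file names, and the 2048 MB heap size (chosen to match the 2 GB minimum listed above) are assumptions, not the project's actual settings.

```python
import subprocess


def build_msgfplus_command(spectrum_file, database_file, output_file,
                           jar_path="MSGFPlus.jar", memory_mb=2048):
    """Build the argument list for one MS-GF+ search (nothing is executed here)."""
    return [
        "java",
        f"-Xmx{memory_mb}M",   # JVM heap size; assumed value, adjust to the machine
        "-jar", jar_path,      # assumed location of the MS-GF+ jar
        "-s", spectrum_file,   # input spectra (.mzML)
        "-d", database_file,   # sequence database (.fasta)
        "-o", output_file,     # identification results (.mzid)
    ]


def run_msgfplus(spectrum_file, database_file, output_file, **kwargs):
    """Run MS-GF+ and raise CalledProcessError if it exits with an error."""
    cmd = build_msgfplus_command(spectrum_file, database_file, output_file, **kwargs)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(build_msgfplus_command("sample.mzML", "custom_db.fasta", "sample.mzid")))
```

Keeping command construction separate from execution makes it possible to log or inspect the exact command for each dataset and database before launching a long search.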