msfragger pepxml reader support for c-term modifications #85

Merged
merged 4 commits into from
May 19, 2023

Contributor

@yangkl96 yangkl96 commented Feb 7, 2023

Thanks for waiting for these changes. I added 'Methyl@E' to psm_reader.yaml to test that fragment m/z calculations are correct for C-term mods. It can be deleted, but I am wondering why only a handful of PTMs are listed there, rather than all the entries in modification.tsv?

Tested using pepXML files generated from an MSFragger search of data from PXD014879, with C-terminal methylation as a variable mod. I ran the code below from the Python terminal in the PyCharm IDE to make sure it works:

```python
from alphabase.psm_reader import *
from alphabase.peptide import *

# Read the MSFragger pepXML and keep only PSMs that carry a methyl mod
psm_reader = psm_reader_provider.get_reader("msfragger_pepxml")
msf_df = psm_reader.import_file("20190131_QExHFX3_Ogris_MFPL_gel_PP2A_EV_90p.pepXML")
methyl_df = msf_df[msf_df['mods'].str.contains('Methyl')].copy()
# Compute fragment m/z values, including modloss ions, for those PSMs
fragment.create_fragment_mz_dataframe_by_sort_precursor(methyl_df, ['b_z1', 'y_z1', 'b_z2', 'y_z2', 'b_modloss_z1', 'y_modloss_z1', 'b_modloss_z2', 'y_modloss_z2'])
```

```python
if site < cterm_position:
    mod_mass = mod_mass - AA_ASCII_MASS[ord(sequence[site-1])]
else:
    mod_mass -= (MASS_H + MASS_O + MASS_PROTON)
```
Collaborator

Removing a proton does not make sense here, as only neutral masses are involved

Contributor Author

I double checked my results with this file:
20190131_QExHFX3_Ogris_MFPL_gel_PP2A_EV_90p.zip

With the current code from this pull request, the peptide RTPDYFL with Methyl@cterm gives the fragment m/z values below. Both the b ions (which do not carry the C-term mod) and the y ions (which do) match the fragment masses calculated by http://db.systemsbiology.net/proteomicsToolkit/FragIonServlet?sequence=RTPDYFL&massType=monoRB&charge=1&bCB=1&yCB=1&nterm=0&cterm=14.015650&addModifType=&addModifVal=

b_z1           y_z1           b_z2           y_z2  
157.108387     769.376683     79.057832      385.191980
258.156066     668.329005     129.581671     334.668140
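
The values in this table can be cross-checked independently of alphabase. The following is a minimal sketch, assuming standard monoisotopic residue masses typed in by hand; the helper `by_ions` is a hypothetical name for illustration and is not part of alphabase or MSFragger.

```python
# Monoisotopic residue masses (Da) for the residues in RTPDYFL.
# These are standard values entered by hand, not read from alphabase.
RESIDUE_MASS = {
    "R": 156.101111, "T": 101.047679, "P": 97.052764, "D": 115.026943,
    "Y": 163.063329, "F": 147.068414, "L": 113.084064,
}
MASS_PROTON = 1.007276
MASS_H2O = 18.010565
METHYL = 14.015650  # delta mass of the C-terminal methylation


def by_ions(sequence, cterm_mod=0.0, charge=1):
    """Return (b, y) m/z lists, ordered by cleavage site from the N-terminus.

    Hypothetical helper: b ions carry no C-term mod, y ions carry it.
    """
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    total = sum(masses) + MASS_H2O + cterm_mod  # neutral peptide mass
    b_ions, y_ions = [], []
    prefix = 0.0
    for m in masses[:-1]:
        prefix += m
        b_neutral = prefix          # N-terminal residues only
        y_neutral = total - prefix  # C-terminal residues + H2O + C-term mod
        b_ions.append((b_neutral + charge * MASS_PROTON) / charge)
        y_ions.append((y_neutral + charge * MASS_PROTON) / charge)
    return b_ions, y_ions


b_z1, y_z1 = by_ions("RTPDYFL", cterm_mod=METHYL, charge=1)
b_z2, y_z2 = by_ions("RTPDYFL", cterm_mod=METHYL, charge=2)
for row in zip(b_z1[:2], y_z1[:2], b_z2[:2], y_z2[:2]):
    print("  ".join(f"{v:.6f}" for v in row))
```

Under these assumptions the first two rows agree with the table to within rounding of the input masses.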

The same convention appears in the MSFragger source code, which adds an extra proton mass when reporting the mass of a C-term mod, but not of an N-term mod. It is also evident in the pepXML entry:

<search_hit peptide="RTPDYFL" massdiff="0.00225830078125" calc_neutral_pep_mass="924.47046" peptide_next_aa="-" num_missed_cleavages="1" num_tol_term="2" protein_descr="Serine/threonine-protein phosphatase 2A catalytic subunit beta isoform OS=Mus musculus OX=10090 GN=Ppp2cb PE=1 SV=1" num_tot_proteins="2" tot_num_ions="12" hit_rank="1" num_matched_ions="7" protein="sp|P62715|PP2AB_MOUSE" peptide_prev_aa="R" is_rejected="0">
<alternative_protein protein_descr="Serine/threonine-protein phosphatase 2A catalytic subunit alpha isoform OS=Mus musculus OX=10090 GN=Ppp2ca PE=1 SV=1" protein="sp|P63330|PP2AA_MOUSE" peptide_prev_aa="R" peptide_next_aa="-" num_tol_term="2"/>
<modification_info mod_cterm_mass="32.025665" modified_peptide="RTPDYFLc[32]">
</modification_info>

where mod_cterm_mass is the sum of the methylation mass (14.02 Da) and the masses of hydrogen, oxygen, and a proton.
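
As a quick check of that arithmetic, here is a minimal sketch that rebuilds the reported value from its parts and compares it with the value obtained when a second hydrogen atom is used instead of a proton. The constants are standard monoisotopic masses entered by hand; they are assumptions of this sketch, not values read from alphabase or MSFragger.

```python
# Standard monoisotopic masses in Da (entered by hand for this sketch).
MASS_H = 1.007825
MASS_O = 15.994915
MASS_PROTON = 1.007276
MASS_ELECTRON = 0.000549
METHYL = 14.015650

# Convention seen in the pepXML entry: methyl + H + O + proton
with_proton = METHYL + MASS_H + MASS_O + MASS_PROTON
# Alternative convention: methyl + H2O (two neutral hydrogens and one oxygen)
with_water = METHYL + 2 * MASS_H + MASS_O

print(f"methyl + H + O + proton: {with_proton:.6f}")
print(f"methyl + H2O:            {with_water:.6f}")
print(f"difference:              {with_water - with_proton:.6f}")
```

The two conventions differ by roughly one electron mass, which is why they are hard to tell apart from the reported value alone.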

Can I provide any other tests to show that the mass of the proton should be included here?

Collaborator

I will try it myself. If this is indeed the case, I will merge this PR.

PS: it still seems weird to me; removing a proton would retain the electron, leading to a negative charge...

Collaborator

Hi @yangkl96, I think this should be H2O, not HO + proton. Please double-check this and then I will merge this PR. Thanks.

Collaborator

@jalew188 jalew188 May 16, 2023

I highly doubt that a proton should be subtracted here, as that would result in a negative ion. The mod_cterm_mass value in pepXML depends on its definition in the pepXML schema, which should state whether this value is a residue mass, a residue mass plus H2O, or something plus a proton. However, I cannot find the definition.

Contributor Author

Thanks for the explanation @jalew188. I talked to Fengchao and we agree that it was weird that our pepXML writer was using the proton mass instead of a second hydrogen mass. We may change this in future MSFragger code, but it should work fine now since the ppm difference is so small. I have pushed the correction.
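
To put a rough number on "so small", the sketch below subtracts H2O from the reported mod_cterm_mass and expresses the leftover error relative to the peptide mass from the pepXML entry quoted earlier. The constants are standard monoisotopic masses entered by hand, and measuring the error against the precursor mass is an assumption of this sketch (the relative error on a small fragment would be larger).

```python
MASS_H2O = 18.010565
METHYL = 14.015650          # expected delta mass of the modification
MOD_CTERM_MASS = 32.025665  # mod_cterm_mass from the pepXML entry
PEPTIDE_MASS = 924.47046    # calc_neutral_pep_mass from the pepXML entry

# Delta mass recovered when H2O is subtracted from the reported value
recovered = MOD_CTERM_MASS - MASS_H2O
error_da = recovered - METHYL
error_ppm = error_da / PEPTIDE_MASS * 1e6

print(f"recovered mod mass: {recovered:.6f} Da")
print(f"error: {error_da:.6f} Da ({error_ppm:.2f} ppm of the peptide mass)")
```

Under these assumptions the error stays below 1 ppm of the peptide mass.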

Collaborator

Yes, I agree. I just wanted to make sure the subtracted value is correct and not confusing.

```python
if mod_name.endswith('C-term'):
    _mod = mod_name
else:
    _mod = mod_name.split('@')[0] + '@Any C-term'  # what if only Protein C-term is listed?
```
Collaborator

We can check whether modname@Any C-term or modname@Protein C-term is in the MOD_MASS dict
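
One way this check could look is sketched below. The function name `resolve_cterm_mod` and the `known_mods` dictionary are hypothetical stand-ins for illustration; `known_mods` plays the role of the MOD_MASS dict, and its two entries are example values, not a copy of alphabase's table.

```python
def resolve_cterm_mod(mod_name, known_mods):
    """Map a modification name to a C-terminal entry present in known_mods.

    Hypothetical helper: tries the name as given (if it already ends with
    'C-term'), then '<mod>@Any C-term', then '<mod>@Protein C-term'.
    Returns None when no C-terminal entry is found.
    """
    if mod_name.endswith("C-term") and mod_name in known_mods:
        return mod_name
    base = mod_name.split("@")[0]
    for site in ("Any C-term", "Protein C-term"):
        candidate = f"{base}@{site}"
        if candidate in known_mods:
            return candidate
    return None


# Illustrative stand-in for the MOD_MASS dict (example entries only).
known_mods = {
    "Methyl@Any C-term": 14.015650,
    "Amidated@Protein C-term": -0.984016,
}

print(resolve_cterm_mod("Methyl@E", known_mods))
print(resolve_cterm_mod("Amidated@Any C-term", known_mods))
print(resolve_cterm_mod("Unknown@K", known_mods))
```

Returning None for an unmatched name leaves it to the caller to decide whether to skip the PSM or raise an error.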

@jalew188 jalew188 added this to the 1.1.0 milestone May 11, 2023
@jalew188 jalew188 merged commit 2bee677 into MannLabs:development May 19, 2023
@yangkl96 yangkl96 deleted the development branch May 19, 2023 15:04
@yangkl96 yangkl96 restored the development branch May 19, 2023 15:04
@yangkl96 yangkl96 deleted the development branch May 19, 2023 15:07