
calc_precursor_mz leads to unexpected behavior #92

Closed
GeorgWa opened this issue Mar 8, 2023 · 2 comments

@GeorgWa
Collaborator

GeorgWa commented Mar 8, 2023

Hi Feng, I observed the following bug when using AB to create multiplexed DIA libraries. It's technically not a bug but rather an incompatibility between functions that might confuse users. Let me know what you think; I'm happy to make a PR.

Describe the bug
When calling calc_precursor_mz on a spectral library, clip_by_precursor_mz_ is always invoked, which messes up the precursor fragment mapping.

To Reproduce

  1. Create a library with a precursor just below the upper m/z limit:
import pandas as pd
from alphabase.spectral_library.base import SpecLibBase

precursor_df = pd.DataFrame([
    {'sequence': 'AGHCEWQMKPERGWWWWWPPWWWWGGGGAGGAG', 'mods': 'Dimethyl@Any N-term;Dimethyl@K', 'mod_sites': '0;8', 'charge': 2},
    {'sequence': 'AGHCEWQMKPE', 'mods': 'Dimethyl@Any N-term;Dimethyl@K', 'mod_sites': '0;8', 'charge': 3},
    {'sequence': 'AGHCEWQMKPERGWWWWWPPWWWWGGGGAGGAGG', 'mods': '', 'mod_sites': '', 'charge': 2},
])

spec_lib = SpecLibBase(
    ['b_z1','b_z2','y_z1','y_z2'],
    decoy='pseudo_reverse',
    precursor_mz_max=2000,
)
spec_lib._precursor_df = precursor_df
spec_lib.calc_precursor_mz()
spec_lib.calc_fragment_mz_df()

spec_lib.precursor_df
| sequence | mods | mod_sites | charge | nAA | precursor_mz | frag_start_idx | frag_stop_idx |
|---|---|---|---|---|---|---|---|
| AGHCEWQMKPE | Dimethyl@Any N-term;Dimethyl@K | 0;8 | 3 | 11 | 457.877652 | 0 | 10 |
| AGHCEWQMKPERGWWWWWPPWWWWGGGGAGGAG | Dimethyl@Any N-term;Dimethyl@K | 0;8 | 2 | 33 | 1997.896037 | 10 | 42 |
| AGHCEWQMKPERGWWWWWPPWWWWGGGGAGGAGG | | | 2 | 34 | 1998.375468 | 42 | 75 |
  2. Modify the sequences or modifications so that the m/z of some precursors changes, then recalculate the precursor m/z:
spec_lib._precursor_df['mods'] = spec_lib._precursor_df['mods'].str.replace('Dimethyl', 'Dimethyl:2H(6)13C(2)')
spec_lib.calc_precursor_mz()
| sequence | mods | mod_sites | charge | nAA | precursor_mz | frag_start_idx | frag_stop_idx |
|---|---|---|---|---|---|---|---|
| AGHCEWQMKPE | Dimethyl:2H(6)13C(2)@Any N-term;Dimethyl:2H(6)... | 0;8 | 3 | 11 | 463.240566 | 0 | 10 |
| AGHCEWQMKPERGWWWWWPPWWWWGGGGAGGAGG | | | 2 | 34 | 1998.375468 | 42 | 75 |
  3. Recalculate the fragment masses:
    spec_lib.calc_fragment_mz_df()
File "/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3378, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/var/folders/lc/9594t94d5b5_gn0y04w1jh980000gn/T/ipykernel_6587/3999293303.py", line 1, in <module>
    spec_lib.calc_fragment_mz_df()
  File "/Users/georgwallmann/Documents/git/alphabase/alphabase/spectral_library/base.py", line 361, in calc_fragment_mz_df
    )
  File "/Users/georgwallmann/Documents/git/alphabase/alphabase/peptide/fragment.py", line 818, in create_fragment_mz_dataframe
  File "/Users/georgwallmann/Documents/git/alphabase/alphabase/peptide/fragment.py", line 876, in create_fragment_mz_dataframe
    '''
  File "/Users/georgwallmann/Documents/git/alphabase/alphabase/peptide/fragment.py", line 435, in mask_fragments_for_charge_greater_than_precursor_charge
    elif frag_type == 'y':
  File "/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/pandas/core/indexing.py", line 815, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/pandas/core/indexing.py", line 698, in _get_setitem_indexer
    return self._convert_tuple(key)
  File "/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/pandas/core/indexing.py", line 897, in _convert_tuple
    keyidx = [self._convert_to_indexer(k, axis=i) for i, k in enumerate(key)]
  File "/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/pandas/core/indexing.py", line 897, in <listcomp>
    keyidx = [self._convert_to_indexer(k, axis=i) for i, k in enumerate(key)]
  File "/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/pandas/core/indexing.py", line 1394, in _convert_to_indexer
    key = check_bool_indexer(labels, key)
  File "/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/pandas/core/indexing.py", line 2567, in check_bool_indexer
    return check_array_indexer(index, result)
  File "/Users/georgwallmann/miniconda3/envs/alpha/lib/python3.9/site-packages/pandas/core/indexers/utils.py", line 553, in check_array_indexer
    raise IndexError(
IndexError: Boolean index has wrong length: 43 instead of 75

Expected behavior
calc_fragment_mz_df requires continuous frag_start_idx and frag_stop_idx ranges, which is not compatible with the way clip_by_precursor_mz_ works. It is also not immediately clear that clip_by_precursor_mz_ has been called and that precursors were removed.
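
To make the mismatch concrete, here is a toy model of the index ranges from the tables above. It is not alphabase code: `is_contiguous` and `rows_needed` are hypothetical helpers, and it assumes one fragment row per cleavage site (nAA - 1), which is what the reported frag_start_idx/frag_stop_idx values suggest. Under that assumption the numbers line up with the 43 vs. 75 in the IndexError.

```python
# Toy model of the precursor table from the report (plain Python, no alphabase).
precursors = [
    {"nAA": 11, "frag_start_idx": 0, "frag_stop_idx": 10},
    {"nAA": 33, "frag_start_idx": 10, "frag_stop_idx": 42},
    {"nAA": 34, "frag_start_idx": 42, "frag_stop_idx": 75},
]


def is_contiguous(rows):
    """True if ranges start at 0 and each one begins where the previous ended."""
    expected_start = 0
    for row in rows:
        if row["frag_start_idx"] != expected_start:
            return False
        expected_start = row["frag_stop_idx"]
    return True


def rows_needed(rows):
    """Fragment rows implied by the precursors (assumption: nAA - 1 each)."""
    return sum(row["nAA"] - 1 for row in rows)


# Rows in the fragment table that was built before clipping.
existing_fragment_rows = precursors[-1]["frag_stop_idx"]

# Clipping drops the middle precursor but leaves the indices untouched.
clipped = [precursors[0], precursors[2]]

print(is_contiguous(precursors), rows_needed(precursors))
print(is_contiguous(clipped), rows_needed(clipped), existing_fragment_rows)
```

After clipping, the remaining precursors imply fewer fragment rows than the existing fragment table holds, and the second range no longer starts where the first one ends.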

Possible Solutions

  • don't call clip_by_precursor_mz_ by default within calc_precursor_mz
  • issue a warning: n precursors were removed because they were outside the mz limits
  • call spec_lib.remove_unused_fragments() right after clip_by_precursor_mz_
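
As a rough sketch of how the second and third suggestions could fit together, the snippet below clips a toy precursor list, warns about the number of removed entries, and rebuilds the fragment ranges so they stay continuous. Everything here is an assumption for illustration: `clip_and_compact` is a hypothetical helper on plain lists and dicts, not the alphabase implementation of clip_by_precursor_mz_ or remove_unused_fragments, and the m/z of the clipped precursor is a made-up placeholder.

```python
import warnings


def clip_and_compact(precursors, fragments, mz_min, mz_max):
    """Hypothetical helper: drop precursors outside [mz_min, mz_max],
    warn about how many were dropped, and re-pack the fragment rows so
    the kept precursors point at continuous index ranges."""
    kept_precursors = []
    kept_fragments = []
    n_removed = 0
    for p in precursors:
        if not (mz_min <= p["precursor_mz"] <= mz_max):
            n_removed += 1
            continue
        # Copy this precursor's fragment rows to the end of the new table.
        block = fragments[p["frag_start_idx"]:p["frag_stop_idx"]]
        new_start = len(kept_fragments)
        kept_fragments.extend(block)
        kept_precursors.append(
            {**p, "frag_start_idx": new_start,
             "frag_stop_idx": new_start + len(block)}
        )
    if n_removed:
        warnings.warn(
            f"{n_removed} precursors were removed because they were "
            "outside the mz limits"
        )
    return kept_precursors, kept_fragments


# Toy data shaped like the report; 2005.0 is a placeholder value.
precursors = [
    {"precursor_mz": 463.240566, "frag_start_idx": 0, "frag_stop_idx": 10},
    {"precursor_mz": 2005.0, "frag_start_idx": 10, "frag_stop_idx": 42},
    {"precursor_mz": 1998.375468, "frag_start_idx": 42, "frag_stop_idx": 75},
]
fragments = list(range(75))  # stand-in for 75 fragment rows

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept, compacted = clip_and_compact(precursors, fragments, 0, 2000)

print([(p["frag_start_idx"], p["frag_stop_idx"]) for p in kept])
print(len(compacted), [str(w.message) for w in caught])
```

Re-packing keeps the fragment table and the index ranges in sync at the cost of a copy; simply not clipping by default avoids that work but leaves out-of-range precursors in the library.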
@GeorgWa GeorgWa added the bug Something isn't working label Mar 8, 2023
@jalew188
Collaborator

jalew188 commented Mar 8, 2023

Good catch! And good suggestions!

@GeorgWa
Collaborator Author

GeorgWa commented Mar 8, 2023

Thanks, will open a PR later!

@GeorgWa GeorgWa added this to the 1.1.0 milestone Apr 17, 2023
@GeorgWa GeorgWa closed this as completed Jul 27, 2023