Parse Annotated MGFs #12
Thank you for this very nice contribution! I'll be happy to merge it after some minor wrinkles are ironed out, which I am happy to assist with. I have pushed a commit where I addressed most of the things I noticed, the main one being that Apart from that, I have just a couple of questions left:
|
This is probably minor, but the changes do break previously pickled objects since it changes the signature of |
Thank you for the comment @mobiusklein. I'm not sure I like the idea of permanently changing the code for a one-time reason, so it got me thinking if I could write a one-off script to convert the pickled data based on the latest version of the code: basically copy the old definitions into the script, patch them onto the actual module, read in the data, then dump it to a new file. I'm not sure what to do about plain Here's a version that only deals with |
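To make the patch-load-dump idea concrete, here is a minimal sketch. It is not the script from this comment. The module name `demo_lib`, the class `Record`, and the `unit` argument are all hypothetical stand-ins, since the thread does not show which signature actually changed.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the real library module (an assumption for this demo).
demo = types.ModuleType("demo_lib")
sys.modules["demo_lib"] = demo


class OldRecord:
    """Old definition: the constructor takes a single argument."""

    def __init__(self, value):
        self.value = value

    def __reduce__(self):
        return (type(self), (self.value,))


class NewRecord:
    """New definition: the constructor gained a required argument."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __reduce__(self):
        return (type(self), (self.value, self.unit))


# Both classes present themselves to pickle as demo_lib.Record.
for cls in (OldRecord, NewRecord):
    cls.__module__ = "demo_lib"
    cls.__name__ = cls.__qualname__ = "Record"

# Data written while the old definition was installed.
demo.Record = OldRecord
old_blob = pickle.dumps([demo.Record(1.5), demo.Record(2.0)])

# After the upgrade, the same pickle no longer loads.
demo.Record = NewRecord
try:
    pickle.loads(old_blob)
    old_pickle_loads = True
except TypeError:
    old_pickle_loads = False


def convert(blob, default_unit):
    """Load with the old definition patched in, then re-dump as new objects."""
    current = demo.Record
    demo.Record = OldRecord  # temporarily patch the old class onto the module
    try:
        old_objects = pickle.loads(blob)
    finally:
        demo.Record = current  # always restore the current definition
    new_objects = [current(obj.value, default_unit) for obj in old_objects]
    return pickle.dumps(new_objects)


new_blob = convert(old_blob, default_unit="Th")
converted = pickle.loads(new_blob)
```

The patch is restored in a `finally` block so that a failed load does not leave the old class installed on the module.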
Thank you very much for your comments! I implemented the changes you suggested, including those by @mobiusklein (you can decide if you want to keep them). As you already figured, @levitsky, pDeep has annotations for all its peaks, so masked arrays aren't necessary. I haven't seen any files where the annotations were only partially present. |
Thank you, that fixes my concern. |
OK, thank you all! |
Thanks. No, this is not a problem for my use case either. |
This PR extends the MGF module so that MGFs with annotated mass spectra can be parsed (for an example, see test_annotated.mgf).
These files are output by, for example, the MS/MS prediction tool pDeep (https://github.com/pFindStudio/pDeep).
The new feature can be switched on by setting read_ions=True in the mgf.read() routine.
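To illustrate what reading ion annotations involves, here is a minimal standalone sketch that collects an optional third column from each peak line. It is not the parser added by this PR. The function name, the dictionary keys, and the sample spectrum are assumptions made for illustration, and the exact layout of pDeep output may differ.

```python
def parse_annotated_mgf(lines):
    """Yield one dict per spectrum from an iterable of MGF text lines."""
    spectrum = None
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            spectrum = {"params": {}, "mz": [], "intensity": [], "ions": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is None or not line:
            # Skip blank lines and anything outside a spectrum block.
            continue
        elif "=" in line and not line[0].isdigit():
            # Parameter line such as TITLE=... or CHARGE=...
            key, _, value = line.partition("=")
            spectrum["params"][key.lower()] = value
        else:
            # Peak line: m/z, intensity, and an optional ion annotation.
            fields = line.split()
            spectrum["mz"].append(float(fields[0]))
            spectrum["intensity"].append(float(fields[1]))
            spectrum["ions"].append(fields[2] if len(fields) > 2 else None)


# Hypothetical sample data, not taken from test_annotated.mgf.
SAMPLE = """\
BEGIN IONS
TITLE=hypothetical peptide
CHARGE=2+
175.119 0.25 y1+
276.155 1.0 b3+
END IONS
"""

spectra = list(parse_annotated_mgf(SAMPLE.splitlines()))
```

Peaks without an annotation get `None` in this sketch, so the three lists stay the same length.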