readMgfData speed-up #319
Comments
Alternatively one could use the |
Yes, initialising S4 objects is costly. The problem with your approach is that the object doesn't get validated. I am not keen on mgf data; my general advice here would be to read the data from the I'll still look into it, and check that the spectra aren't checked twice, once upon initialisation, as you show above, and again when the |
I would love to use mzML instead of mgf. The problem is that my identification pipeline is using mgf as input and then needs the spectrum titles, which are not available through readMSData, at least not to my knowledge. And yes, it could well be that the validity is checked multiple times. |
I will have a fix later today. |
Thanks :-) |
Let me know if you see the desired improvement. By the way, it should be possible to provide a way to add a custom spectrum title to the mgf file. This would allow you to use the mzML file as main working data and export parts of it to mgf with a title of your choice. |
Using MSnbase:::Spectrum2_mz_sorted had the desired effect. ProteoWizard seems to allow specifying the title in the mgf files and we will look into basing the analysis on mzML in the future. |
It looks like most of the time is spent on initialization of the Spectrum2 objects (see screenshot below).
As a dirty workaround, I substituted (in extractMgfSpectrum2Info):
by
where dummy_sp was defined as an empty global Spectrum2 object.
As a result, reading an mgf file got much faster (50% less time spent in total; much less when iterating through the spectra). Things might be missing though.
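The general idea behind this workaround can be sketched with a toy S4 class. This is only an illustration under stated assumptions: ToySpectrum, make_validated, make_unvalidated and prototype_sp are hypothetical names made up for this example, and the class is not the MSnbase Spectrum2 definition. It shows why copying an empty prototype and assigning slots directly is cheaper than calling new() with slot values (no initialize() or validity call per object), and also why the result is not validated.

```r
library(methods)

# Hypothetical toy class standing in for Spectrum2 (NOT the MSnbase definition).
setClass("ToySpectrum",
         representation(mz = "numeric", intensity = "numeric", rt = "numeric"),
         validity = function(object) {
             if (length(object@mz) != length(object@intensity))
                 return("mz and intensity must have the same length")
             if (is.unsorted(object@mz))
                 return("mz values must be sorted")
             TRUE
         })

# Regular path: new() with slot values runs initialize() and the validity check.
make_validated <- function(mz, intensity, rt)
    new("ToySpectrum", mz = mz, intensity = intensity, rt = rt)

# Workaround path: create one empty object up front, then copy it and
# assign the slots directly. Slot assignment checks the slot class only,
# so the validity function is never called.
prototype_sp <- new("ToySpectrum")

make_unvalidated <- function(mz, intensity, rt) {
    sp <- prototype_sp
    sp@mz <- mz
    sp@intensity <- intensity
    sp@rt <- rt
    sp
}

# Both paths produce the same slots for well-formed input ...
a <- make_validated(c(100, 200), c(10, 20), 1.5)
b <- make_unvalidated(c(100, 200), c(10, 20), 1.5)

# ... but the workaround also accepts input that the validity check rejects.
bad <- make_unvalidated(c(200, 100), c(10, 20), 1.5)
problems <- validObject(bad, test = TRUE)  # character vector describing the issue
```

If the speed of the second path is wanted without giving up the checks entirely, one option is to call validObject() once on a sample of the objects, or once at the end, rather than for every spectrum.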