Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable spectra_ref in mzTab #240

Merged
merged 10 commits into from
Nov 8, 2022
Merged

enable spectra_ref in mzTab #240

merged 10 commits into from
Nov 8, 2022

Conversation

WangHong007
Copy link
Contributor

No description provided.

@github-actions
Copy link

github-actions bot commented Nov 3, 2022

nf-core lint overall result: Passed ✅

Posted for pipeline commit 2333525

+| ✅ 155 tests passed       |+
#| ❔   3 tests were ignored |#

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.6
  • Run at 2022-11-08 10:32:47

@ypriverol
Copy link
Member

@WangHong007 the PythonBlack is failing can you fix that error.

@WangHong007
Copy link
Contributor Author

got it @ypriverol

@ypriverol
Copy link
Member

Copy link
Collaborator

@jpfeuffer jpfeuffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I saw that you are using chunking to load the mzml info CSV.
Is it really that big or slow? I think we should:

  • speed up the code. Use less appends, less for loops, less string data
  • write one file per mzml or even per mzml+level

@@ -20,17 +21,19 @@ def parse_mzml(file_name, file_columns):
name = os.path.split(file_name)[1]
id = i.getNativeID()
MSLevel = i.getMSLevel()
rt = i.getRT() if i.getRT() else np.nan
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about MS3 ?

Copy link
Contributor Author

@WangHong007 WangHong007 Nov 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* speed up the code. Use less appends,  less for loops, less string data

This is about mzml_statistics.py or diann_convert.py?

* write one file per mzml or even per mzml+level

Will do.

What about MS3 ?

Does MS3 have the same attributes as MS2? If so, it can be included in the statement block of MS2 when determining the MSLevel.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first comment was about the reading of mzml_info in pmultiqc.

You will usually only have MS3 in TMT. It is similar to MS2 but often has multiple precursors. Not sure if there are important QC metrics on them. We can do it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we only count the spectrum id and retention time of MS3 (same as MS1). It will also be recorded in the corresponding *_mzml_info.tsv



def mzml_dataframe(mzml_folder):

file_columns = ["File_Name", "SpectrumID", "MSLevel", "Charge", "MS2_peaks", "Base_Peak_Intensity"]
file_columns = ["File_Name", "SpectrumID", "MSLevel", "Charge", "MS2_peaks", "Base_Peak_Intensity", "Retention_Time", "Exp_Mass_To_Charge"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if that works without an update in pmultiqc.

@jpfeuffer
Copy link
Collaborator

Comments are basically addressed but I think pmultiqc needs to be updated now to make the tests pass?

@ypriverol
Copy link
Member

ypriverol commented Nov 7, 2022

#241 I guess is better I close my PR @WangHong007 and you update yours. I did the changes in your branch @WangHong007 .

@WangHong007
Copy link
Contributor Author

@ypriverol got it.

@WangHong007 WangHong007 requested a review from jpfeuffer November 8, 2022 12:07
@ypriverol ypriverol self-requested a review November 8, 2022 12:54
@ypriverol ypriverol merged commit a2b2be6 into bigbio:dev Nov 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants