enable mzTab for DIA-NN #205

WangHong007 · 2022-07-04T09:25:56Z

No description provided.


github-actions · 2022-07-04T09:27:08Z

`nf-core lint` overall result: Passed ✅

Posted for pipeline commit 3e3a524

+| ✅ 146 tests passed       |+
#| ❔   3 tests were ignored |#

❔ Tests ignored:

files_exist - File is ignored: conf/igenomes.config
files_exist - File is ignored: conf/test_full.config
files_exist - File is ignored: conf/test.config

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-quantms_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: docs/images/nf-core-quantms_logo_light.png
files_exist - File found: docs/images/nf-core-quantms_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreSchema.groovy
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: lib/WorkflowQuantms.groovy
files_exist - File found: modules.json
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-quantms_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.show_hidden_params
nextflow_config - Config variable found: params.schema_ignore_params
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.version
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: '1.1dev'
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-quantms_logo_light.png matches the template
files_unchanged - docs/images/nf-core-quantms_logo_light.png matches the template
files_unchanged - docs/images/nf-core-quantms_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
files_unchanged - lib/NfcoreSchema.groovy matches the template
files_unchanged - lib/NfcoreTemplate.groovy matches the template
files_unchanged - .gitignore matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 21.10.3, Config: 21.10.3
readme - README Nextflow minimum version in Quick Start section matched config. README: 21.10.3, Config: 21.10.3
pipeline_todos - No TODO strings found
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (170 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_params - Schema matched params returned from nextflow config
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
multiqc_config - 'assets/multiqc_config.yml' contains a matching 'report_comment'.
multiqc_config - 'assets/multiqc_config.yml' contains 'export_plots: true'.

Run details

nf-core/tools version 2.4.1
Run at 2022-08-03 11:51:56

pull fixed

WangHong007 · 2022-07-12T06:26:16Z

Hi Yasset! @ypriverol It seems the DIA container should be updated: this script additionally uses the Biopython (`Bio`) package.

jpfeuffer · 2022-07-12T09:08:50Z

What do you need from the Bio package? Can we do without?

Is there an example output for the mzTab that is created?

WangHong007 · 2022-07-12T12:53:49Z

An example is here: quantms/out.mztab
We use the Bio package to calculate protein coverage and Calculate.Precursor.Mz; we can set these to "null" for now.
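
For reference, both quantities can be computed without Biopython. The following is a minimal pure-Python sketch, not the code in `bin/diann_convert.py`; the function names and the choice of a neutral monoisotopic mass as input are assumptions made for illustration.

```python
PROTON_MASS = 1.007276466  # mass of a proton in Da


def protein_coverage(protein_seq, peptides):
    """Fraction of residues in protein_seq covered by at least one peptide."""
    covered = [False] * len(protein_seq)
    for pep in peptides:
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein_seq.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein_seq.find(pep, start + 1)
    return sum(covered) / len(protein_seq) if protein_seq else 0.0


def precursor_mz(monoisotopic_mass, charge):
    """m/z of a peptide with the given neutral monoisotopic mass and charge."""
    return (monoisotopic_mass + charge * PROTON_MASS) / charge
```

Coverage computed this way is a single value per protein, which matters for the validation issue discussed later in the thread.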

ypriverol · 2022-07-27T11:01:44Z

@WangHong007 Protein group errors:

I validated the mzTab file out.mzTab and found the following errors:

[Error-1019] line 103: Column "protein_coverage" value "0.024;0.025" is not a valid Double value.
[Error-1019] line 171: Column "protein_coverage" value "0.032;0.038" is not a valid Double value.
[Error-1019] line 285: Column "protein_coverage" value "0.029;0.035;0.036" is not a valid Double value.
[Error-1019] line 515: Column "protein_coverage" value "0.257;0.257" is not a valid Double value.
[Error-1019] line 884: Column "protein_coverage" value "0.069;0.069" is not a valid Double value.
[Error-1019] line 919: Column "protein_coverage" value "0.047;0.047" is not a valid Double value.
[Error-1019] line 1320: Column "protein_coverage" value "0.027;0.032" is not a valid Double value.
[Error-1019] line 1375: Column "protein_coverage" value "0.052;0.052" is not a valid Double value.
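
All of these errors involve semicolon-joined values in a column that must hold a single double. A quick pre-check along the following lines can flag such rows before running the validator. It is a hypothetical helper, not part of the mzTab validator, and it assumes tab-separated rows with a `PRH` header line that names the columns.

```python
def find_invalid_coverage(mztab_lines):
    """Return (line_number, value) pairs for PRT rows whose
    protein_coverage is neither 'null' nor parseable as a float."""
    bad = []
    col = None
    for lineno, line in enumerate(mztab_lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "PRH" and "protein_coverage" in fields:
            # Remember where the coverage column sits in the protein table.
            col = fields.index("protein_coverage")
        elif fields[0] == "PRT" and col is not None and col < len(fields):
            value = fields[col]
            if value == "null":
                continue
            try:
                float(value)
            except ValueError:
                bad.append((lineno, value))
    return bad
```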

This is mainly because your script writes each protein group as a single row:

PRT	P02919;P02919-2	Penicillin-binding protein 1B	1.26E+06	1.12E+06	1.24E+06	1.34E+06	1.15E+06	1.31E+06	ref_ecoli_k12_ups1_combined	null	null	null	null	null	null	null	0.024;0.025	P02919;P02919-2	0.000861326	null	1178004.286	null	null	1296266.875	null	null	single_protein

This is not valid in mzTab.

The OpenMS approach (@timosachsenberg) is to write the following for each protein group:

1- An indistinguishable_protein_group row. In this case, select the first protein of the protein group (P02919 for your group) as the accession. For example:
PRT MAPHEAD100010850 null null null mgm-proteins-decoy null null 5.524861878453039e-03 MAPHEAD100010850,MAPHEAD100677610 null 0.061311661311661 2.0 2.0 null null null null null indistinguishable_protein_group
2- A protein_details row for each protein. @timosachsenberg has chosen to write each member of the group as a single entry with the optional column set to protein_details. For example:
PRT MAPHEAD100010850 null null null mgm-proteins-decoy null null 2.173913043478261e-03 null null 0.072727272727273 null null null null null 2 0 protein_details

That way, you can write the sequence coverage of each member of the group as a double, and each row will be a valid protein entry.

Here is an example:

PXD020692-Sample-12.sdrf_openms_design_openms.mzTab.zip
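
The row layout described in this comment can be sketched as follows. This is an illustration only, not the OpenMS or `diann_convert.py` implementation: the function name and dictionary keys are hypothetical, and giving the group row the coverage of its first member is an assumption.

```python
def expand_protein_group(accessions, coverages):
    """Split semicolon-joined accessions and coverages into one row per
    mzTab PRT entry, so that every protein_coverage is a single float."""
    accs = accessions.split(";")
    covs = [float(c) for c in coverages.split(";")]
    if len(accs) != len(covs):
        raise ValueError("accessions and coverages differ in length")
    if len(accs) == 1:
        return [{"accession": accs[0], "protein_coverage": covs[0],
                 "opt_global_result_type": "single_protein"}]
    # Group row: the first accession represents the group.
    # Assumption: it carries the coverage of that first member.
    rows = [{"accession": accs[0], "protein_coverage": covs[0],
             "opt_global_result_type": "indistinguishable_protein_group"}]
    # One protein_details row per member, each with its own coverage.
    for acc, cov in zip(accs, covs):
        rows.append({"accession": acc, "protein_coverage": cov,
                     "opt_global_result_type": "protein_details"})
    return rows
```

With the failing row above, `expand_protein_group("P02919;P02919-2", "0.024;0.025")` yields three rows, each with a single numeric coverage.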

Replace Bio with pyopenms， disable unique genes matrix and some small fixs

WangHong007 · 2022-07-30T09:53:38Z

Hi! @vdemichev @ypriverol
How can we identify the protein result type (opt_global_result_type in mzTab) from the DIA-NN main report and matrices?

For now, I'm classifying by the number of protein IDs, separated by semicolons, in the Protein.Group and Protein.Ids columns of the aforementioned result files.
e.g.
Protein.Group="P09152;P19319", Protein.Ids="P09152;P19319" --> protein_details
Protein.Group="P09152", Protein.Ids="P09152;P19319" --> indistinguishable_protein_group
Protein.Group="P09152", Protein.Ids="P09152" --> single_protein
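
The heuristic described in this comment can be sketched as below. It reflects only the mapping proposed here, which a later reply in the thread revises; the function name is hypothetical, and treating any remaining combination as protein_details is an assumption.

```python
def classify_result_type(protein_group, protein_ids):
    """Classify a DIA-NN row by how many semicolon-separated IDs appear
    in its Protein.Group and Protein.Ids values."""
    n_group = len(protein_group.split(";"))
    n_ids = len(protein_ids.split(";"))
    if n_group == 1 and n_ids == 1:
        return "single_protein"  # one-to-one
    if n_group == 1 and n_ids > 1:
        return "indistinguishable_protein_group"  # one-to-many
    return "protein_details"  # many-to-many (and anything else)
```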

WangHong007 · 2022-07-30T10:03:26Z

jmztab online: mztabvalidator
An example is here: diatest_out.mztab

Correct protein coverage and unique

ypriverol · 2022-08-02T11:04:45Z

> Hi ! @vdemichev @ypriverol How can we identify protein type(opt_global_result_type in mzTab) according to Dia-NN main report and matrixs?
>
> For now, I'm classifying by the number of protein ID separated by semicolons in the columns Protein.Group and Protein.Ids in aforementioned result files. e.g.
> Protein.Group="P09152;P19319", Protein.Ids="P09152;P19319" --> protein_details
> Protein.Group="P09152", Protein.Ids="P09152;P19319" --> indistinguishable_protein_group
> Protein.Group="P09152", Protein.Ids="P09152" --> single_protein

@WangHong007 I don't understand the question here.

WangHong007 · 2022-08-02T11:30:03Z

> 1- indistinguishable_protein_group: PRT MAPHEAD100010850 null null null mgm-proteins-decoy null null 5.524861878453039e-03 MAPHEAD100010850,MAPHEAD100677610 null 0.061311661311661 2.0 2.0 null null null null null indistinguishable_protein_group In this case select the first protein of the protein group for your protein P02919.
> 2- Write the two proteins as protein_details. @timosachsenberg has selected to write each member of the group as a single entry with the optional column protein_details. For example: PRT MAPHEAD100010850 null null null mgm-proteins-decoy null null 2.173913043478261e-03 null null 0.072727272727273 null null null null null 2 0 protein_details

The question is how to determine the type of protein identification result in the protein subtable: how can we get the three result types (single_protein, protein_details, indistinguishable_protein_group) from the DIA-NN main report and matrix file?

For now, I use the number of values (one-to-one, many-to-many, or one-to-many) in the two columns Protein.Group and Protein.Ids in the DIA-NN result files.

e.g.
Protein.Group="P09152", Protein.Ids="P09152" --> single_protein
Protein.Group="P09152;P19319", Protein.Ids="P09152;P19319" --> protein_details
Protein.Group="P09152", Protein.Ids="P09152;P19319" --> indistinguishable_protein_group

ypriverol · 2022-08-02T21:42:06Z

@WangHong007, here is how those cases should be annotated:

Protein.Group="P09152", Protein.Ids="P09152" --> single_protein. This is a single protein, and no protein_details entry needs to be annotated.
Protein.Group="P09152;P19319", Protein.Ids="P09152;P19319" should be annotated as:
- Select the first protein accession, P09152, as the anchor of the group; the others are ambiguity_members.
- The opt_global_result_type can be annotated as indistinguishable_protein_group.
- Each protein of the group (P09152; P19319) should also be annotated as protein_details, where you add the coverage, score, etc. for each protein.
Protein.Group="P09152", Protein.Ids="P09152;P19319" should be annotated as:
- P09152 is the anchor of the group and the others are ambiguity_members.
- The opt_global_result_type can be annotated as indistinguishable_protein_group.
- Each protein of the group (P09152; P19319) should also be annotated as protein_details, where you add the coverage, score, etc. for each protein.
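
A minimal sketch of the annotation rules listed in this comment follows. The function name and dictionary keys are hypothetical, and merging the two columns into one ordered member list is an assumption about how to treat both multi-protein cases uniformly; it is not the code that was merged.

```python
def annotate_protein_group(protein_group, protein_ids):
    """Return the rows to write for one DIA-NN protein group."""
    # Collect unique accessions, Protein.Group first so its first ID is the anchor.
    members = []
    for acc in protein_group.split(";") + protein_ids.split(";"):
        if acc and acc not in members:
            members.append(acc)
    if not members:
        raise ValueError("no protein accession given")
    anchor = members[0]
    if len(members) == 1:
        # A single protein needs no protein_details rows.
        return [{"accession": anchor, "ambiguity_members": [],
                 "result_type": "single_protein"}]
    # Group row: anchor plus the other accessions as ambiguity members.
    rows = [{"accession": anchor, "ambiguity_members": members[1:],
             "result_type": "indistinguishable_protein_group"}]
    # One protein_details row per member (coverage, score, etc. go here).
    for acc in members:
        rows.append({"accession": acc, "ambiguity_members": [],
                     "result_type": "protein_details"})
    return rows
```

Under these assumptions, both `("P09152;P19319", "P09152;P19319")` and `("P09152", "P09152;P19319")` produce one group row anchored at P09152 plus two protein_details rows.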

ypriverol · 2022-08-03T08:21:59Z

@WangHong007 The problem you have there is that the pyopenms dependency is not installed when you run the script using Docker.

bin/diann_convert.py

modules/local/diannconvert/main.nf

WangHong007 added 5 commits June 30, 2022 09:14

enable mzTab for DIA-NN

72f77af

Update diann_convert.py

abbd2c1

abandon "openms.tsv"

430ae79

Revert "abandon "openms.tsv""

7a76203

This reverts commit 430ae79.

abandon "openms.tsv"

2be46e6

WangHong007 added 2 commits July 4, 2022 18:55

Update main.nf

6cefcff

Merge pull request #1 from daichengxin/dev

b70e851

pull fixed

disable Bio

b010156

WangHong007 added 2 commits July 30, 2022 17:03

Update diann_convert.py

7899201

Replace Bio with pyopenms， disable unique genes matrix and some small fixs

Merge branch 'bigbio:dev' into dev

2ce758d

Use pyopenms container

98ec03f

Correct protein coverage and unique

WangHong007 and others added 2 commits August 2, 2022 20:06

Update main.nf

a82c4cc

Merge branch 'dev' into dev

e4a5c81

Merge branch 'dev' into dev

8148fcc

This was linked to issues Aug 3, 2022

mzTab for DIA-NN #119

Closed

DIANN changes in the current pipeline #164

Closed

Experimental mass to charge is missing in DIA output report #202

Closed

WangHong007 added 2 commits August 3, 2022 18:54

Fix

d4839c8

Change container

3e3a524

ypriverol requested review from ypriverol and jpfeuffer August 3, 2022 15:30

ypriverol approved these changes Aug 3, 2022


ypriverol removed the request for review from jpfeuffer August 3, 2022 15:32

ypriverol approved these changes Aug 3, 2022


ypriverol merged commit 4141176 into bigbio:dev Aug 3, 2022
