feature/skip preliminary analysis on dia #335

jspaezp · 2024-01-09T17:06:01Z

This PR adds an option to skip the preliminary steps of the DIA analysis, so that the pipeline runs only a single individual analysis and a single consensus analysis.

(please squash on merge ...)

github-actions · 2024-01-09T17:07:22Z

`nf-core lint` overall result: Passed ✅

Posted for pipeline commit 639e507

✅ 160 tests passed
❔ 4 tests were ignored

❔ Tests ignored:

files_exist - File is ignored: conf/igenomes.config
files_exist - File is ignored: conf/test_full.config
files_exist - File is ignored: conf/test.config
files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-quantms_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: docs/images/nf-core-quantms_logo_light.png
files_exist - File found: docs/images/nf-core-quantms_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: lib/WorkflowQuantms.groovy
files_exist - File found: modules.json
files_exist - File found: pyproject.toml
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-quantms_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 1.3.0dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
nextflow_config - nextflow.config contains configuration profile test
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-quantms_logo_light.png matches the template
files_unchanged - docs/images/nf-core-quantms_logo_light.png matches the template
files_unchanged - docs/images/nf-core-quantms_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
files_unchanged - lib/NfcoreTemplate.groovy matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
files_unchanged - pyproject.toml matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
readme - README Zenodo placeholder was replaced with DOI.
pipeline_todos - No TODO strings found
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (194 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/tsv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: release-announcments.yml
actions_schema_validation - Workflow validation passed: awstest.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - 'assets/multiqc_config.yml' contains report_section_order
multiqc_config - 'assets/multiqc_config.yml' contains export_plots
multiqc_config - 'assets/multiqc_config.yml' contains report_comment
multiqc_config - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
multiqc_config - 'assets/multiqc_config.yml' contains a matching 'report_comment'.
multiqc_config - 'assets/multiqc_config.yml' contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'

Run details

nf-core/tools version 2.11.1
Run at 2024-01-09 19:46:13

ypriverol · 2024-01-09T17:40:32Z

Before reviewing the PR in detail, @jspaezp, can you explain the impact of skipping the preliminary step? I thought the idea of the preliminary step was to generate the library for the final analysis, @daichengxin?

jspaezp · 2024-01-09T18:25:47Z

@ypriverol Absolutely!

The idea behind the feature is to allow two-stage runs of the pipeline, where a subset of files is used to generate the empirical library and the extraction is then done on all files or on the rest.
This is especially relevant because the empirical library generation stage requires staging all the files (all .d/mzML + all .quant + FASTA + predicted library) in the same compute environment/disk, whereas the final quant stage does not: it needs the .quant files and the library, but not the .d/mzML files. (Relevant discussion: https://twitter.com/J_my_sci/status/1744152837247095086)
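To make the staging difference concrete, here is a minimal Python sketch of which files each stage would need on disk, following the description above. The function name `files_to_stage`, its arguments, and the example file names are hypothetical and are not part of the pipeline; this only illustrates the reasoning, not how the workflow is implemented.

```python
def files_to_stage(raw_files, quant_files, fasta, predicted_lib, empirical_lib):
    """Return the files each stage needs staged together (illustrative only).

    All names here are hypothetical; they mirror the description in this
    thread rather than any actual pipeline code.
    """
    return {
        # Empirical library generation: all raw files (.d/mzML), all .quant
        # files, the FASTA, and the predicted library must be co-located.
        "library_generation": sorted(raw_files)
        + sorted(quant_files)
        + [fasta, predicted_lib],
        # Final quantification: only the .quant files and the empirical
        # library are needed; the raw files are not.
        "final_quant": sorted(quant_files) + [empirical_lib],
    }


plan = files_to_stage(
    raw_files=["run_b.d", "run_a.d"],
    quant_files=["run_b.quant", "run_a.quant"],
    fasta="proteome.fasta",
    predicted_lib="predicted.speclib",
    empirical_lib="empirical.speclib",
)
for stage, files in plan.items():
    print(stage, files)
```

The point of the split is that the second list contains no raw files, so the final stage can run in an environment that never holds the .d/mzML data.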

jspaezp · 2024-01-09T18:33:12Z

By the way, I don't believe any of the errors in the CI/CD checks are caused by my changes. I see a couple of files missing upstream, and mamba is not able to generate environments.

ypriverol · 2024-01-09T19:03:54Z

We have to solve that. It is a work in progress because we have to move some files from the current server to PRIDE.

jspaezp · 2024-01-10T15:58:48Z

Thanks @daichengxin for the review!

ypriverol · 2024-01-10T16:05:22Z

@jspaezp do you know what impact sub-selecting a group of files, compared to using all the files, can have on the final results?

Another small question: do you think the selection of these files could be done based on the technical and biological replicates plus the factor value?

jspaezp · 2024-01-10T16:13:29Z

I have not tested this systematically, so I cannot be sure, but I would assume that (1) you could miss peptides that show up specifically in one of the conditions/files not used in the library construction, and (2) you would have a slightly worse estimate of your FDR due to the smaller sample size.

I am not sure what public dataset could be used to test this hypothesis. I am also assuming that datasets with more variability will be more prone to change depending on the analysis workflow. (I would be surprised if a 'cell line' + treatment dataset of 500 files looked any different with the library built from all 500 files or from 100.)

ypriverol · 2024-01-10T16:16:51Z

> I have not tested this systematically, so I cannot be sure, but I would assume that (1) you could miss peptides that show up specifically in one of the conditions/files not used in the library construction, and (2) you would have a slightly worse estimate of your FDR due to the smaller sample size.
>
> I am not sure what public dataset could be used to test this hypothesis. I am also assuming that datasets with more variability will be more prone to change depending on the analysis workflow. (I would be surprised if a 'cell line' + treatment dataset of 500 files looked any different with the library built from all 500 files or from 100.)

Do you think in the logic we can use some of the SDRF information to do this selection?
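As a rough illustration of what an SDRF-driven selection could look like, the sketch below picks a fixed number of files per factor value so that every condition is represented in the library-building subset. This is only an assumption about one possible approach: the function `select_library_files`, the column names used as defaults, and the example rows are hypothetical and do not come from the pipeline.

```python
from collections import defaultdict


def select_library_files(
    rows,
    per_group=1,
    factor_column="factor value[treatment]",  # assumed column name
    file_column="comment[data file]",         # assumed column name
):
    """Pick `per_group` files from each factor-value group (illustrative only).

    `rows` is a list of dicts, one per SDRF-like row. Selection is
    deterministic: groups and files are both taken in sorted order.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor_column]].append(row[file_column])

    selected = []
    for factor in sorted(groups):
        selected.extend(sorted(groups[factor])[:per_group])
    return selected


# Hypothetical example rows, not taken from any real dataset.
rows = [
    {"comment[data file]": "t2.mzML", "factor value[treatment]": "treated"},
    {"comment[data file]": "c1.mzML", "factor value[treatment]": "control"},
    {"comment[data file]": "t1.mzML", "factor value[treatment]": "treated"},
    {"comment[data file]": "c2.mzML", "factor value[treatment]": "control"},
]
selected = select_library_files(rows, per_group=1)
print(selected)
```

Grouping by factor value addresses the first concern raised above (peptides specific to a condition that is absent from the library subset); extending the grouping key to include biological or technical replicate columns would be a natural variation.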

jspaezp · 2024-01-10T16:32:50Z

I was thinking about this for a while and there might be a way, but it would certainly require a lot more Nextflow plumbing than I really want to or can afford to devote right now. In addition, I am not sure what CV param could be used to denote that those files should be used for the library ... 1002752, maybe?

We could certainly have it as an open issue to implement the feature in the future (we could also discuss the right way to do it in the issue).

In other words, that is a much more complex feature than this PR attempts to be and I believe this feature by itself is complementary to that one.

jspaezp and others added 10 commits September 15, 2023 15:15

stuff

ec28eab

first branch commit

b1cee70

added params

0bbfaaf

fixed unused reference

0c12076

trailing comma

2ab21a3

stuff

4217740

added node modules to gitignore

4b0badc

merged branch onto bigbio dev

8237a29

rolled back sdrf parsing

94e6d16

Merge branch 'dev' into bb_feature/skip_prelim

66a0f7a

ypriverol requested review from ypriverol and daichengxin January 9, 2024 19:06

removed accidental file

639e507

ypriverol mentioned this pull request Jan 9, 2024

Phospho data for the CI/CD is missing #336

Closed

daichengxin approved these changes Jan 10, 2024

ypriverol approved these changes Jan 10, 2024

ypriverol mentioned this pull request Jan 10, 2024

Improve the skip pre-analysis parameter logic #337

Closed

ypriverol merged commit eb6985f into bigbio:dev Jan 10, 2024
14 of 17 checks passed

This was referenced Jan 11, 2024

Benchmark skip pre-analysis vs full preanalysis. #338

Closed

quantms failing with a big dataset 15K files #339

Closed
