inject study_type in EBI and improvements to current automatic processing pipeline #3023
Codecov Report
@@           Coverage Diff           @@
##              dev    #3023   +/-   ##
=======================================
  Coverage   94.94%   94.94%
=======================================
  Files          74       74
  Lines       14263    14263
=======================================
  Hits        13542    13542
  Misses        721      721
Changes look good, overall. Are the modifications to the automatic processing pipeline a result of failed runs, or what motivated those changes?
qiita_ware/ebi.py
Outdated
@@ -356,6 +356,11 @@ def generate_study_xml(self):
study_title = ET.SubElement(descriptor, 'STUDY_TITLE')
study_title.text = escape(clean_whitespace(self.study_title))

# study type is depricated and not displayed anywhere on EBI-ENA;
- # study type is depricated and not displayed anywhere on EBI-ENA;
+ # study type is deprecated and not displayed anywhere on EBI-ENA;
scripts/qiita-auto-processing
Outdated
# getting all jobs, includen hiddens, in case the job failed
jobs = a.jobs(cmd=cmd['command'], show_hidden=True)
params = [j.parameters.values for j in jobs]
params = [{k: str(v) for k, v in j.parameters.values.items()
Why is the string conversion needed here?
Excellent question; for some reason, the parameters of old jobs are not stored as strings, while the parameters here are defined as strings; the easiest fix is to make sure that everything is a string.
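The mismatch described in this reply can be illustrated with a minimal sketch. The parameter names and values below are hypothetical and are not taken from Qiita; the only assumption is that an older job stored native Python types while the pipeline definition uses strings.

```python
# Hypothetical parameter sets; names and values are illustrative only.
old_job_params = {'threads': 5, 'trim': True}        # older job: native types
expected_params = {'threads': '5', 'trim': 'True'}   # definition: all strings

# A direct comparison fails because 5 != '5' and True != 'True'.
direct_match = old_job_params == expected_params

# Converting every value with str() makes the two dictionaries comparable.
normalized = {k: str(v) for k, v in old_job_params.items()}
normalized_match = normalized == expected_params

print(direct_match, normalized_match)
```

Without the conversion, a job that already ran with equivalent settings would presumably not be recognized as a match.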
scripts/qiita-auto-processing
Outdated
k: str(v)
for k, v in cpp.values.items()
if k not in cmd['ignore_parameters']}
This comprehension pattern is repeated a couple of times; any chance this can be put in a single function, or something like that?
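One way the repeated comprehension could be factored out is sketched below. The helper name `clean_parameters` and the sample data are hypothetical; this is not the function that ended up in the script, only an illustration of the suggestion, assuming each call site filters by an ignore list and converts values with `str()`.

```python
def clean_parameters(values, ignore_parameters=()):
    """Return a copy of `values` with string values and ignored keys removed.

    `values` is any mapping of parameter name -> value;
    `ignore_parameters` is a collection of names to drop.
    """
    return {k: str(v) for k, v in values.items()
            if k not in ignore_parameters}


# Hypothetical data standing in for j.parameters.values / cpp.values.
job_values = {'reference': 1, 'threads': 5, 'input_data': 10}
cleaned = clean_parameters(job_values, ignore_parameters=['input_data'])
print(cleaned)
```

Each call site would then reduce to a single call, which keeps the string conversion and the ignore list in one place.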
Good question! The changes are basically because (1) we needed to reprocess WGS with the new pipeline (had to change versions), while the first time we only did target gene (thus, we were missing some parameters for WGS); (2) we realized that we can have the same command with multiple core parameters (so we need the ignore parameters, and some of the first versions did not convert everything to strings); and (3) some general cleanup.
Thanks @antgonza
* inject study_type in EBI and improvements to current automatic processing pipeline (#3023)
* inject study_type in ebi and improvements to current automatic proecssing pipeline
* addressing @ElDeveloper comments
* some general fixes/additions for next release (#3026)
* some general fixes/additions for next release
* adding test for not None job.release_validator_job
* fix #2839
* fix #2868 (#3028)
* fix #2868
* 2nd round
* fix errors
* more changes
* fix errors
* fix ProcessingJobTest
* fix PY_PATCH
* add missing TRN.add
* encapsulated_query -> perform_as_transaction
* fix #3022 (#3030)
* fix #3022
* adding tests
* fix #2320 (#3031)
* fix #2320
* adding prints to debug
* children -> 1
* APIArtifactHandlerTest -> APIArtifactHandlerTests
* configure_biom
* qdb.util.activate_or_update_plugins
* improving code
* almost there
* add values.template
* fix filepaths
* filepaths -> files
* fixing errors
* add prep.artifact insertion
* addressing @ElDeveloper comments
* fix artifact_definition active command
* != -> ==
* Added three tutorial sections to the Qiita documentation (#3032)
* Added three tutorial sections to the Qiita documentation: 'Retrieving Public Data for Own Analysis' and 'Processing public data retrieved with redbiom' to the redbiom tab, and 'Statistical Analysis to Justify Clinical Trial Sample Size Tutorial' to the analyzing samples tab.
* Update redbiom.rst
* Update redbiom.rst
* Update redbiom.rst
* Further updates to redbiom.rst and the Stats tutorial.
* update redbiom.rst
* Finished proof-reading
* Placed all three tutorials/sections together under Introduction to the download and analysis of public Qiita data
* added a new introduction, with links to the three sections
* Added figures to stats tutorial and contexts explanation
* Added figures to stats tutorial and contexts explanation
* Apply suggestions from code review [skip ci]
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
Co-authored-by: Antonio Gonzalez <antgonza@gmail.com>
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
* 092020 (#3033)
* 092020
* connect artifact with job
* rm INSERT qiita.artifact_processing_job
* Apply suggestions from code review [skip ci]
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
Co-authored-by: Daniel McDonald <danielmcdonald@ucsd.edu>
Co-authored-by: Mirte Kuijpers <67341505+mcmk3@users.noreply.github.com>
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
* Version 092020 (#3034)
* [commit list for #3023 through #3033, identical to the merge commit above]
* fix #3036
Co-authored-by: Daniel McDonald <danielmcdonald@ucsd.edu>
Co-authored-by: Mirte Kuijpers <67341505+mcmk3@users.noreply.github.com>
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>

…3040)
* Version 092020 (#3034)
* [commit list for #3023 through #3033, identical to the merge commit above]
* adding job submit ENVIRONMENT and minor improvement to runWorkflow
* fix test
Co-authored-by: Daniel McDonald <danielmcdonald@ucsd.edu>
Co-authored-by: Mirte Kuijpers <67341505+mcmk3@users.noreply.github.com>
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>

* Version 092020 (#3034)
* [commit list for #3023 through #3033, identical to the merge commit above]
* fix #2920
* fix test
Co-authored-by: Daniel McDonald <danielmcdonald@ucsd.edu>
Co-authored-by: Mirte Kuijpers <67341505+mcmk3@users.noreply.github.com>
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>

* Version 092020 (#3034)
* [commit list for #3023 through #3033, identical to the merge commit above]
* rm create_qiime_mapping_file
* fixing some tests
* fixing more tests
* fix even more tests
* rm npt.assert_warns
* qiime-map -> sample-file
* update not_merged_samples.txt
* adding @ElDeveloper changes
Co-authored-by: Daniel McDonald <danielmcdonald@ucsd.edu>
Co-authored-by: Mirte Kuijpers <67341505+mcmk3@users.noreply.github.com>
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>

* Version 092020 (#3034)
* [commit list for #3023 through #3033, identical to the merge commit above]
* adding qiita.study autoloaded column
* cleaning tests
* fix redbiom test
* avoiding clog submissions
* Apply suggestions from code review
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
Co-authored-by: Daniel McDonald <danielmcdonald@ucsd.edu>
Co-authored-by: Mirte Kuijpers <67341505+mcmk3@users.noreply.github.com>
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
…