Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fairly significant changes to check_protocol_fields #531

Merged
merged 2 commits into from
Mar 12, 2024

Conversation

ptth222
Copy link

@ptth222 ptth222 commented Feb 5, 2024

I started editing this function because of the "Only one protocol reference should be used in a Protocol REF column." message(s), but I found some other issues to address as well.

This function raises some valuable questions.

  1. Are different protocols allowed in the same Protocol REF column?
    I think this was pretty much answered as 'Yes' in Different Protocol Names In Study Sequence Cause An Error #501, but it popped up again here.

  2. Do all protocols in the same Protocol REF column have to have the same type?
    This function and the structure of the config files suggest so, but is that actually correct?

  3. Can a cell in a Protocol REF column be blank?
    I can think of at least one example for this. Let's say you collect 2 different types of tissue from the same source. The first step of collection is the same for both, but one tissue type has an extra step as well. This could result in a file with 2 Protocol REF columns, but the second column would only have the protocol for the extra step of 1 of the tissue types. The type without the extra step would be blank.

Some of the changes I made to this function:

  1. Removed the assumption that the same protocol must be in the Protocol REF column.
  2. Removed the return value. (Only 2 checks return a value that is actually used, and this isn't one of those.)
  3. Changed some of the messaging to be a little clearer.
  4. Added a warning to the 'validator' object. Previously it was only printing to the log.
  5. Added a way to bypass config checks if config is malformed or incomplete.

@proccaserra proccaserra requested a review from terazus February 6, 2024 20:51
@proccaserra proccaserra self-assigned this Feb 6, 2024
@proccaserra
Copy link
Member

I started editing this function because of the "Only one protocol reference should be used in a Protocol REF column." message(s), but I found some other issues to address as well.

Brief answer:
-The ISA specifications allow more than on distinct Protocols to be used assuming they have the same protocol_type (e.g. "sample collection").
-The ISA-Tab parser implementation seems to be 'narrower' which explain the rule mentioned above

This function raises some valuable questions.

  1. Are different protocols allowed in the same Protocol REF column?
    I think this was pretty much answered as 'Yes' in Different Protocol Names In Study Sequence Cause An Error #501, but it popped up again here.
  2. Do all protocols in the same Protocol REF column have to have the same type?
    This function and the structure of the config files suggest so, but is that actually correct?

Here, the protocol_type is checked. there is a bit of flex afforded by the tolerance for synonyms which can be specified in a config file found under `isatools/resources/config/yaml/protocol-types.xml but not in a dynamic way so not ideal.

  1. Can a cell in a Protocol REF column be blank?
    I can think of at least one example for this. Let's say you collect 2 different types of tissue from the same source. The first step of collection is the same for both, but one tissue type has an extra step as well. This could result in a file with 2 Protocol REF columns, but the second column would only have the protocol for the extra step of 1 of the tissue types. The type without the extra step would be blank.

Answer:

  • `ISA(-Tab) specifications' allow this implicit coding, in a chain of protocol applications, where some of the steps are omitted.
  • ISA-API implementation may not form the desired graph structure is output Nodes are not explicitly declared, but Protocol REF chained (e.g. Material Node ->Protocol REF->Protocol REF-> Material Node
    vs Material Node ->Protocol REF->Material Node->Protocol REF-> Material Node

These use-cases need more scenario but this would cover the 'optional step' (e.g. is a labeling event happening or not)

Some of the changes I made to this function:

  1. Removed the assumption that the same protocol must be in the Protocol REF column.
  2. Removed the return value. (Only 2 checks return a value that is actually used, and this isn't one of those.)
  3. Changed some of the messaging to be a little clearer.
  4. Added a warning to the 'validator' object. Previously it was only printing to the log.
  5. Added a way to bypass config checks if config is malformed or incomplete.

Thanks @ptth222 ! @terazus or @proccaserra will run the tests.
As discussed last week, we are also working at addressing other issues in the ISA_Tab serializer.
The ISA-JSON is easier to deal with due to the better design.

ptth222 added 2 commits March 11, 2024 12:54
I started editing this function because of the "Only one protocol reference should be used in a Protocol REF column." message(s), but I found some other issues to address as well.
@ptth222 ptth222 force-pushed the check-protocol-fields-update branch from 7b57bdf to 733b74f Compare March 11, 2024 19:27
@ptth222 ptth222 changed the base branch from develop to issue-511 March 11, 2024 19:27
@ptth222
Copy link
Author

ptth222 commented Mar 11, 2024

The tests pass now since rebasing to issue-511.

@coveralls
Copy link

Coverage Status

coverage: 81.257% (-0.03%) from 81.282%
when pulling 733b74f on ptth222:check-protocol-fields-update
into 16ccc00 on ISA-tools:issue-511.

@ptth222
Copy link
Author

ptth222 commented Mar 12, 2024

The coverage difference is from turning a warning that was just printed to log to adding it to the returned error/warning dictionary.

new_coverage_40xx
original_coverage_40xx

@terazus terazus merged commit 7a6ee24 into ISA-tools:issue-511 Mar 12, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

4 participants