Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

G4.76 Step 2 --Remove validator TraitImportValues and utilize the 3 newly added validators by the Traits Importer #79

Merged
merged 33 commits into from
Jul 25, 2024

Conversation

carolyncaron
Copy link
Contributor

@carolyncaron carolyncaron commented May 27, 2024

Issue #76

Motivation

After spending time thinking about/discussing how to organize plugin instances for validating the phenotype share importer, we realized that the plugin instance to validate columns in the trait importer could be split up into 3 separate instances. These instances were implemented in PR #77 as EmptyCell, ValueInList, and DuplicateTraits.

The Traits Importer is currently still calling the initial TraitImportValues validator and needs to use the new validators instead. This is expected to break some tests, so those will need to be fixed as well in this PR.

Current Status

  • Fix the failing tests caused by recent merging of PRs into 4.x
    • The same exception was being thrown in multiple tests. This exception was introduced in PR G4.78 fix trait service getters #81:
      Exception: No genus has been set. See setting a genus in the Traits Service and make sure to use a configured genus. To configure a genus or see all configured genus, go to /admin/tripal/extension/tripal-cultivate/phenotypes/ontology.
    • commit 1de6455 fixed some out of sync variable names in DuplicateTraits and updated annotation accordingly
    • commit 7fcb9a3 updated the example files to be case sensitive for the Type column since that change was introduced late in Step 1 PR
    • commit a984cb5 set the genus in DuplicateTraits since the Service Traits needed to be accessed within that validator. Also, the Trait Importer now declares the validation arrays so that they always exist even if validation input types get skipped.
  • Temporarily add the context arrays for the new FILE_ROW validators as a temporary public property in the validator classes. In the importer you will now use $instance->context = $context to set it and in the validator::validateRow() method, remove the 2nd parameter and add a $context = $this->context; line at the top of the method for now.
  • Move the code in formValidate line 214-305 into a new configureValidators method. Create the validator instances just as you had done previously with the scope info but then put them into the validators array described in the design under their new inputType. Specifically, GENUS => metadata, FILE => file, HEADERS => header-row, new FILE_ROW validators become data-row.
  • Update the loops in the formValidate to use the new inputType keys in the array returned from the configureValidators method.

What does this PR do?

  1. Substantially modifies the formValidate() method of TripalCultivatePhenotypesTraitsImporter.php. It now iterates through the input file and calls each existing validator at the correct point (ie. metadata first, header-row for the first line of file, and data-row validators for each additional row of the the file)
    • The validation status array for each data-row validator is now set within formValidate() after iterating through the file, since otherwise a new validation status is returned for each time the validator is called (ie. for each row in the file).
  2. Updates testTraitFormValidation() by adding 2 additional scenarios
    • Scenario 2: 2nd row has an empty value for column "Method Short Name"
    • Scenario 3: First row has an invalid value for column "Type"
  3. Removes the TraitImportValues validator plugin instance and its test function testTraitImportValuePluginValidator() from PluginValidatorTest.php
  4. Created a new configureValidators() method to create each validator instance and to configure using loadAssets (will be deprecated in issue G4.93 Develop a number of setter traits for use with configuring validators. #93) or set the context arrays of the newer validators.

Testing

Automated Testing

Additional scenarios were added to the data provider provideFilesForValidation() in class TraitImporterFormValidateTest:

  1. correct_header_emptycell_method.tsv: A file with a correct header, 1st row of correct data, 2nd row has an empty cell for the required 'Method Short Name' column
  2. correct_header_invalid_datatype.tsv: A file with a correct header, 1st row of data contains an invalid value for the column 'Type' and the 2nd row contains valid data

Manual Testing

  1. Build a docker image + container off of this branch.
  2. Go to localhost (or whichever port you're using for this container) and sign in
  3. Add a new organism by navigating to Tripal > Content > Add Tripal Content and select Organism from the list
  4. Configure the module by going to Tripal > Extensions > Tripal Cultivate Phenotypes Configuration
  5. Go to Tripal > Data Loaders > Tripal Cultivate: Phenotypic Trait Importer
  6. Upload a file! You can find examples in trpcultivate_phenotypes/tests/src/Fixtures/TraitImporterFiles; these are the files used in automated testing of the fromValidate() method.
    NOTE: simple_example.txt is the only file that will pass validation and thus, get uploaded into the system. If you try this one, or modify any file to completely pass validation, keep in mind that existing traits in the database can impact validation later on (in this case, the duplicate traits validator)

@carolyncaron carolyncaron self-assigned this May 27, 2024
@laceysanderson laceysanderson changed the title G4.76 Remove validator TraitImportValues and utilize the 3 newly added validators by the Traits Importer G4.76 Step 2 --Remove validator TraitImportValues and utilize the 3 newly added validators by the Traits Importer Jun 11, 2024
@laceysanderson
Copy link
Member

laceysanderson commented Jun 26, 2024

My thoughts on next steps for this PR based on current status and discussion in #82.

  1. Fix the failing tests caused by the changes to the trait service getters.
  2. Move the code in formValidate line 214-305 into a new configureValidators method. Create the validator instances just as you had done previously with the scope info but then put them into the validators array described in the design under their new inputType. Specifically, GENUS => metadata, FILE => file, HEADERS => header-row, new FILE_ROW validators become data-row.
  3. Temporarily add the context arrays for the new FILE_ROW validators as an optional param to load assets and set a temporary context property in the validator classes. In the validateRow method, remove the 2nd parameter and add a $context = $this->context; line for now.
  4. Update the loops in the formValidate to use the new inputType keys in the array returned from the configureValidators method.

On second thought, do not worry about return value changes in this PR. They will instead be handled in issue #86

@carolyncaron carolyncaron marked this pull request as ready for review July 17, 2024 21:49
@reynoldtan
Copy link
Contributor

reynoldtan commented Jul 19, 2024

Performed the manual test and test the scenarios described.

  • Upload simple example (plain text)
    Trait, method and unit records were saved in the correct cvs and relevant relationships were created

  • Re-uploaded the same file with a different file extension (tsv)
    Frontend reported a duplicate trait.
    image

A comment on the message referring to the failed entry as row. I think line # ie. line 30 or line 2 is more suitable word for pointing out item in a file as with many text editor software and ancestors of this phenotype module.

  • Tested the 2 scenarios invalid type and blank method short name
    The former tested great, frontend captured what went wrong
    image

but the latter gave this error:
image

Lastly, when I upload a file with just the header row, it will proceed to create a Tripal job.

Copy link
Member

@laceysanderson laceysanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Code looks exactly as I expected :-)
✅ Core functionality was verified by @reynoldtan
☑️ Potential bug that Reynold found has been documented in Issue #97 and is not core to this PR so can be ignored here.
✅ Automated tests look good and pass

Ready to merge!

@laceysanderson laceysanderson merged commit 4b70fee into 4.x Jul 25, 2024
8 checks passed
@laceysanderson laceysanderson deleted the g4.76-removeTraitImportValues branch July 25, 2024 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants