HCR terms are duplicated but have 'hcr' in the slot name #664

Open
mslarae13 opened this issue Oct 18, 2023 · 14 comments
Labels
1-TermUpdate Update suggestion for existing term, including bugs. Issues from "cig-bug" label moved here.

Comments

@mslarae13
Contributor

Describe the bug
A clear and concise description of what the bug is.

Extensions should NOT have their own slots/terms for every metadata field. For example, when measuring the temperature of the sample, you should use the term 'temp', not make an extension-specific term like hcr_term.

And with the expansion of LinkML, if there is a need to say "This is specific to this extension in this way", you use slot usage.

Expected behavior
A clear and concise description of what you expected to happen.

Slot usage should be implemented and slots should be evaluated if they're repeated.
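
To illustrate the slot usage idea, here is a minimal Python sketch that models a schema as plain dictionaries. It is not the LinkML runtime. The class name `HydrocarbonResourcesCores` appears later in this thread, but whether it uses `temp`, the `description` and `comments` values, and the `effective_slot` helper are all assumptions made up for illustration.

```python
# Simplified model of a LinkML-style schema as plain dicts (not the LinkML API).
schema = {
    "slots": {
        # One shared slot, defined once for every extension.
        "temp": {"title": "temperature", "description": "Hypothetical shared description"},
    },
    "classes": {
        "HydrocarbonResourcesCores": {
            "slots": ["temp"],
            # Extension-specific refinement lives here, not in a new hcr_* slot.
            "slot_usage": {
                "temp": {"comments": ["Hypothetical note specific to this extension"]},
            },
        },
    },
}


def effective_slot(schema, class_name, slot_name):
    """Return the global slot definition overlaid with the class's slot_usage."""
    cls = schema["classes"][class_name]
    if slot_name not in cls["slots"]:
        raise KeyError(f"{class_name} does not use slot {slot_name}")
    merged = dict(schema["slots"][slot_name])  # copy so the global slot is untouched
    merged.update(cls.get("slot_usage", {}).get(slot_name, {}))
    return merged


resolved = effective_slot(schema, "HydrocarbonResourcesCores", "temp")
print(resolved)
```

The point of the sketch is that the extension-specific detail is attached to the class, while the slot name stays the shared `temp`.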

@mslarae13 mslarae13 added the bug label Oct 18, 2023
@mslarae13
Contributor Author

not sure if this is a bug, but seemed right? Cuz it's not 1 slot specific.. I just referenced 1 slot.

@turbomam
Member

turbomam commented Oct 18, 2023

@mslarae13 I think you meant _temp, not _term in your initial comment

not make an extension specific term like hcr_temp .

@cmungall
Contributor

I agree in principle across all terms, but it needs to be clear whether a property refers to a sample or some aspect of the environment in which the sample was collected. These may not be the same.

However, this should not be done by prefixing with the extension or something specific like hcr, rather it should be consistent use of prefixes like sample_ and environmental_

@mslarae13
Contributor Author

mslarae13 commented Oct 19, 2023

Other slots to review for redundancy

samp_transport_cond vs samp_transport_temp

  • Both of these have temp... 1 has temp + duration

  • samp_transport_temp
    -- FoodAnimalAndAnimalFeed
    -- FoodFoodProductionFacility
    -- FoodHumanFoods

  • samp_transport_cond
    -- HydrocarbonResourcesCores
    -- HydrocarbonResourcesFluidsSwabs


@only1chunts only1chunts added the 3-CIG Issues that should be handled by the CIG label Oct 19, 2023
@only1chunts
Member

Other slots to review for redundancy

samp_transport_cond vs samp_transport_temp

Can we keep one ticket to one bug/suggested update? I think this ticket started out as "remove the duplicate 'HCR_temp' and replace it with 'temp' in the relevant extensions".
I believe that's a good call and should be done. I've added the CIG review label.

@turbomam
Member

we can always make grouping issues, where the first comment contains a checklist like

  • wake up
  • drink favorite morning beverage
  • solve world's problems

Then each of those items will have a bullseye-like button to the right that converts it into an individual issue.

implementation:

- [ ] wake up
- [ ] drink favorite morning beverage
- [ ] solve world's problems

@lschriml
Member

lschriml commented Oct 19, 2023 via email

@turbomam
Member

@lschriml thanks for the feedback on this particular issue. Can you please help address the more general issue raised by @cmungall above?

@turbomam
Member

In my humble opinion, worrying about backwards compatibility with hcr_temp may not be justified. It does not appear anywhere in LBL's July 2023 SQL dump of NCBI Biosample, as an attribute_name or a harmonized_name.

I think this is an equivalent search through the NCBI Biosample web interface, but I am not an expert on that: https://www.ncbi.nlm.nih.gov/biosample/?term=hcr_temp%5BAttribute%5D

Are there other databases that I should be looking in?
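
For anyone who wants to repeat that kind of check, here is a rough sketch using Python's built-in `sqlite3` module against an in-memory stand-in. The column names `attribute_name` and `harmonized_name` come from the comment above. The table name `attribute`, the `biosample_id` column, and every row are invented, and the real dump may use a different database engine and layout. The zero count below reflects only the toy rows, not the actual dump.

```python
import sqlite3

# In-memory stand-in for the real dump; table name and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attribute (biosample_id TEXT, attribute_name TEXT, harmonized_name TEXT)"
)
conn.executemany(
    "INSERT INTO attribute VALUES (?, ?, ?)",
    [
        ("EXAMPLE1", "temperature", "temp"),
        ("EXAMPLE2", "sample storage temperature", "samp_store_temp"),
    ],
)


def count_term(conn, term):
    """Count rows where the term appears as attribute_name or harmonized_name."""
    row = conn.execute(
        "SELECT COUNT(*) FROM attribute WHERE attribute_name = ? OR harmonized_name = ?",
        (term, term),
    ).fetchone()
    return row[0]


hcr_temp_hits = count_term(conn, "hcr_temp")
temp_hits = count_term(conn, "temp")
print(hcr_temp_hits, temp_hits)
```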

@lschriml
Member

lschriml commented Oct 19, 2023 via email

@lschriml
Member

lschriml commented Oct 19, 2023 via email

@turbomam
Member

@mslarae13 I forgot to say that I've created some tools that might simplify researching an issue like this. I haven't advertised them much because they're not totally ready for prime time.

  1. download mixs_derived_class_term_schemasheet.tsv
  2. open in a spreadsheet editor
  3. remove schema-sheets specific rows 2-4
  4. turn on your spreadsheet's auto-filtering mode
  5. filter the keywords column (S) on the text "temperature"
  6. you will get a report that should theoretically include all slots whose titles include the word temperature, or something synonymous
  • air_temp
  • air_temp_regm
  • annual_temp
  • avg_temp
  • cons_food_stor_temp
  • ferm_temp
  • hcr_temp
  • host_body_temp
  • samp_store_temp
  • samp_transport_temp
  • season_temp
  • soil_temp
  • study_inc_temp
  • surf_temp
  • temp
  • temp_out
  • tvdss_of_hcr_temp
  • water_temp_regm
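
The spreadsheet steps above could also be scripted. The following is only a sketch with Python's standard `csv` module, run on a tiny invented stand-in for mixs_derived_class_term_schemasheet.tsv. The `keywords` column name follows the steps above; the `slot` column name, the placeholder content of rows 2-4, and the sample rows are assumptions.

```python
import csv
import io

# Tiny stand-in for the real TSV; the contents are invented for illustration.
sample_tsv = (
    "slot\tkeywords\n"
    ">placeholder\t>placeholder\n"  # rows 2-4 stand in for schemasheets-specific rows
    ">placeholder\t\n"
    ">placeholder\t\n"
    "temp\ttemperature\n"
    "hcr_temp\ttemperature\n"
    "depth\tdepth\n"
)


def slots_with_keyword(tsv_text, keyword, skip_rows=(2, 3, 4)):
    """Return slot names whose keywords column contains the keyword."""
    lines = tsv_text.splitlines()
    # Drop the schemasheets-specific rows (1-based row numbers; the header is row 1).
    kept = [line for number, line in enumerate(lines, start=1) if number not in skip_rows]
    reader = csv.DictReader(io.StringIO("\n".join(kept)), delimiter="\t")
    return [row["slot"] for row in reader if keyword in (row["keywords"] or "").lower()]


matches = slots_with_keyword(sample_tsv, "temperature")
print(matches)
```

With the real file, `sample_tsv` would be replaced by the file's text, and the result should correspond to the filtered list above, subject to the keyword caveats.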

One part of "not completely ready for prime time" is that I added the keywords with a text-mining/human curation approach. I have suggested that we should have a discussion about ongoing keyword maintenance.

@turbomam
Member

I think most or all of those slots should have a see_also to temp and a comment that explains how they are different from temp

And I think we should follow that pattern for all other clusters of similar slots.
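
A small check along these lines could flag which slots in a cluster still lack the cross-reference or the explanatory comment. This is a hypothetical sketch over plain dictionaries, not the LinkML API: the `see_also` and `comments` values are placeholders, the `missing_cross_references` helper is made up, and bare slot names are used where a real schema would normally hold URIs or CURIEs in `see_also`.

```python
# Toy slot definitions; the see_also and comments values are placeholders.
slots = {
    "temp": {},
    "air_temp": {"see_also": ["temp"], "comments": ["placeholder: how this differs from temp"]},
    "soil_temp": {"see_also": ["temp"]},
    "hcr_temp": {},
}


def missing_cross_references(slots, anchor="temp"):
    """Map each slot (other than the anchor) to the fields it still lacks."""
    problems = {}
    for name, definition in slots.items():
        if name == anchor:
            continue
        missing = []
        if anchor not in definition.get("see_also", []):
            missing.append("see_also")
        if not definition.get("comments"):
            missing.append("comments")
        if missing:
            problems[name] = missing
    return problems


report = missing_cross_references(slots)
print(report)
```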

@turbomam
Member

And I think they should all follow the same validation pattern in the absence of some traceable reason.
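
One way to spot slots that break from a shared validation pattern is to compare each slot's pattern against the most common one in its cluster. The sketch below assumes each slot carries a single regular-expression pattern. The expressions shown are invented, not the real MIxS patterns, and `pattern_outliers` is a hypothetical helper.

```python
from collections import Counter

# Toy patterns; the regular expressions are invented, not the real MIxS ones.
cluster = {
    "temp": r"^-?\d+(\.\d+)? \S+$",
    "air_temp": r"^-?\d+(\.\d+)? \S+$",
    "soil_temp": r"^-?\d+(\.\d+)? \S+$",
    "hcr_temp": r"^.+$",
}


def pattern_outliers(cluster):
    """Return slots whose pattern differs from the most common one in the cluster."""
    most_common_pattern, _ = Counter(cluster.values()).most_common(1)[0]
    return sorted(name for name, pattern in cluster.items() if pattern != most_common_pattern)


outliers = pattern_outliers(cluster)
print(outliers)
```

Any slot reported this way would then need either a fix or the traceable reason mentioned above.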

@ramonawalls ramonawalls added 1-TermUpdate Update suggestion for existing term, including bugs. Issues from "cig-bug" label moved here. and removed 3-CIG Issues that should be handled by the CIG cig-bug labels Dec 28, 2023
6 participants