minor? x-bte refactoring: API-level x-bte info #745

Open
colleenXu opened this issue Oct 19, 2023 · 1 comment
colleenXu commented Oct 19, 2023

[EDIT: this was the original motivation. The next post explains addressing this and more using a new API-level x-bte field]

This is a less important and probably easy part of "x-bte refactoring". It was previously brought up in #656 (comment) under "idea from @tokebe".


Right now, we have a query-handler config that tells BTE to consider differing edge-attribute values (during hashing) when deciding whether to create separate edges.

This has been helpful for Multiomics KPs, which may have associations that they want represented as separate edges even though the associations share the same subject-predicate-qualifiers-object (see #407, #647).

However, this would be easier to contribute to and maintain if the config were included in each KP's SmartAPI yaml and imported into BTE with the rest of the x-bte annotation information.
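
To make the hashing behaviour described above concrete, here is a minimal sketch. It is not BTE's actual implementation: the function name `edge_hash`, the record layout, the node IDs, and the age-range values are all assumptions made up for illustration. It only shows the idea that attribute types listed in the config contribute their values to the hash, so two records with the same subject-predicate-qualifiers-object stay separate.

```python
import hashlib
import json


def edge_hash(edge, fields_for_unique_edges=()):
    """Hash an edge on subject/predicate/qualifiers/object,
    plus the values of any attribute types listed in the config."""
    wanted = set(fields_for_unique_edges)
    core = {
        "subject": edge["subject"],
        "predicate": edge["predicate"],
        "qualifiers": sorted(edge.get("qualifiers", {}).items()),
        "object": edge["object"],
        # Only attributes named in the config contribute to the hash.
        "attributes": sorted(
            (attr["attribute_type_id"], json.dumps(attr["value"], sort_keys=True))
            for attr in edge.get("attributes", [])
            if attr["attribute_type_id"] in wanted
        ),
    }
    payload = json.dumps(core, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical records: same subject/predicate/object, different age range.
younger = {
    "subject": "CHEBI:example",
    "predicate": "correlated_with",
    "object": "CAS:example",
    "attributes": [{"attribute_type_id": "UMLS CUI:C5418925", "value": "18-40"}],
}
older = dict(
    younger,
    attributes=[{"attribute_type_id": "UMLS CUI:C5418925", "value": "41-65"}],
)

config = ["UMLS CUI:C5418925"]  # study age range
same_without_config = edge_hash(younger) == edge_hash(older)
same_with_config = edge_hash(younger, config) == edge_hash(older, config)
```

Without the config the two records collapse to one hash (and so would be merged); with the config they hash differently and remain separate edges.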

Rough draft of what x-bte may look like

In this rough draft, I'm putting this at the operation-by-operation level, since that's how x-bte annotation is mostly set up right now. However, this may really act at the API level...

this is a modified snippet of the Multiomics Wellness KP yaml:

  • the operation has a new field fieldsForUniqueEdges
  • use $ref to a separate components section, since that's more compact
  x-bte-fields-for-edge-hash:
    for_multiomics_wellness:    ## 2023-05-25: may later change?
      - "MeSH:D005260"           ## gender female
      - "MeSH:D008297"           ## gender male
      - "UMLS CUI:C0001948"      ## alcohol consumption?
      - "UMLS CUI:C0005680"      ## black population?
      - "UMLS CUI:C0043157"      ## population white
      - "UMLS CUI:C0086409"      ## hispanic population?
      - "UMLS CUI:C0425379"      ## other race?
      - "UMLS CUI:C0453995"      ## tobacco use and exposure?
      - "UMLS CUI:C1515945"      ## American Indian or Alaska Native?
      - "UMLS CUI:C1519427"      ## south asian people?
      - "UMLS CUI:C2229974"      ## children
      - "UMLS CUI:C2698217"      ## middle eastern?
      - "UMLS CUI:C4316909"      ## Marijuana Use?
      - "UMLS CUI:C5205795"      ## east asian people
      - "UMLS CUI:C5418925"      ## study age range
    ## - "NCIT:C61594"    ## bonferroni p-value: would maybe work? but Gwênlyn said not needed
  x-bte-kgs-operations:
    CAS-CHEBI-Rev:
      - fieldsForUniqueEdges: 
          $ref: '#/components/x-bte-fields-for-edge-hash/for_multiomics_wellness'
        useTemplating: True
        outputs:
          - id: CAS
            semantic: SmallMolecule
        requestBodyType: object
        inputs:
          - id: CHEBI
            semantic: SmallMolecule
        parameters:
          fields: >-
            subject.CAS,
            association.attributes,
            association.sources,
            subject.name,
            object.name
          size: 1000
        predicate: correlated_with
        response_mapping:
          $ref: '#/components/x-bte-response-mapping/CAS-rev'
        supportBatch: True
        requestBody:
          body: >-
            {"q": [ {{ queryInputs | replPrefix('CHEBI') | wrap( '["', '","biolink:SmallMolecule"]' ) }} ],
            "scopes": ["object.CHEBI", "subject.type"]}


Side note: it's unclear to me whether the "consideration of differing edge-attributes"/"edge hash" feature works for KPs that don't use the edge-attributes keyword in their x-bte-response-mapping (all Multiomics KPs use the edge-attributes keyword). I get the sense that it'd be helpful for other KPs (see the unresolved "edge-merging" issue with PharmGKB, #556 (comment)).
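
For the `$ref` indirection used in the rough draft above, a minimal sketch of how a local reference could be resolved against the parsed yaml. This is an illustration only: `resolve_ref` is a hypothetical helper rather than anything BTE or SmartAPI provides, the yaml is assumed to be already parsed into nested dicts, and the attribute list is shortened to two entries.

```python
def resolve_ref(document, ref):
    """Follow a local '#/a/b/c' reference inside an already-parsed document."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled here: {ref!r}")
    node = document
    for key in ref[2:].split("/"):
        # JSON Pointer escapes: '~1' means '/', '~0' means '~'.
        key = key.replace("~1", "/").replace("~0", "~")
        node = node[key]
    return node


# Shortened stand-in for the parsed Multiomics Wellness yaml (draft layout).
spec = {
    "components": {
        "x-bte-fields-for-edge-hash": {
            "for_multiomics_wellness": ["MeSH:D005260", "MeSH:D008297"],
        },
        "x-bte-kgs-operations": {
            "CAS-CHEBI-Rev": [
                {
                    "fieldsForUniqueEdges": {
                        "$ref": "#/components/x-bte-fields-for-edge-hash/for_multiomics_wellness"
                    },
                    "predicate": "correlated_with",
                }
            ]
        },
    }
}

operation = spec["components"]["x-bte-kgs-operations"]["CAS-CHEBI-Rev"][0]
fields = resolve_ref(spec, operation["fieldsForUniqueEdges"]["$ref"])
```

Every operation that points at the same components entry would resolve to the same list, which is what keeps the per-operation draft compact.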

@colleenXu colleenXu added enhancement New feature or request x-bte labels Oct 19, 2023
@colleenXu colleenXu changed the title minor? x-bte refactoring: import edge-attributes for hashing minor? x-bte refactoring: include edge-attributes for hashing Oct 19, 2023
@colleenXu colleenXu changed the title minor? x-bte refactoring: include edge-attributes for hashing minor? x-bte refactoring: include edge-attributes used for edge hash (unique edges) Oct 19, 2023
@colleenXu colleenXu changed the title minor? x-bte refactoring: include edge-attributes used for edge hash (unique edges) minor? x-bte refactoring: API-level x-bte info Oct 21, 2023
colleenXu commented Oct 21, 2023

During discussion with Jackson (@tokebe) today, we agreed:

  • it doesn't make sense to attach the "edge-attributes list" to every operation, which is what I showed in my rough draft above
  • instead, this is an example of "API-level" information that BTE needs

So we now propose creating an optional x-bte field for "API-level" information under the info section (same level as x-translator and x-trapi).


My thinking after the discussion today:

A. Previously-discussed "API-level" info that could go into that new x-bte field (as optional fields):

B. Here's a rough draft example for the first two kinds of info, using Multiomics Wellness and the field name x-bte-info

this is a modified snippet of the Multiomics Wellness KP yaml:

---
info:
  x-bte-info:
    batch-size: 100
    fields-for-unique-edges:    ## 2023-05-25: may later change?
      - "MeSH:D005260"           ## gender female
      - "MeSH:D008297"           ## gender male
      - "UMLS CUI:C0001948"      ## alcohol consumption?
      - "UMLS CUI:C0005680"      ## black population?
      - "UMLS CUI:C0043157"      ## population white
      - "UMLS CUI:C0086409"      ## hispanic population?
      - "UMLS CUI:C0425379"      ## other race?
      - "UMLS CUI:C0453995"      ## tobacco use and exposure?
      - "UMLS CUI:C1515945"      ## American Indian or Alaska Native?
      - "UMLS CUI:C1519427"      ## south asian people?
      - "UMLS CUI:C2229974"      ## children
      - "UMLS CUI:C2698217"      ## middle eastern?
      - "UMLS CUI:C4316909"      ## Marijuana Use?
      - "UMLS CUI:C5205795"      ## east asian people
      - "UMLS CUI:C5418925"      ## study age range
    ## - "NCIT:C61594"    ## bonferroni p-value: would maybe work? but Gwênlyn said not needed
  x-translator:
    infores: "infores:biothings-multiomics-wellness"
    component: KP
    biolink-version: '3.1.1'
    team:
      - Multiomics Provider
      - Service Provider
  version: '1.7'
  title: Multiomics Wellness KP API
  termsOfService: https://biothings.io/about
  contact:
    email: gglusman@isbscience.org
    name: Gwenlyn Glusman
    x-role: responsible developers
    x-id: https://github.com/biothings
  description: Documentation of the BioThings API for Translator Multiomics Team's  Wellness KP.
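
A minimal sketch of how the proposed API-level block could be read once the yaml is parsed. Everything here is an assumption for illustration: `read_api_level_info` is a hypothetical helper, the field names follow the rough draft above (`x-bte-info`, `batch-size`, `fields-for-unique-edges`), and treating a missing field as "not specified" is one possible way to keep the block optional.

```python
def read_api_level_info(spec):
    """Pull the optional API-level x-bte fields out of a parsed SmartAPI spec.

    Missing fields are returned as None / empty list so callers can fall
    back to whatever behaviour they already have.
    """
    info = spec.get("info") or {}
    x_bte_info = info.get("x-bte-info") or {}
    return {
        "batch_size": x_bte_info.get("batch-size"),
        "fields_for_unique_edges": list(
            x_bte_info.get("fields-for-unique-edges") or []
        ),
    }


# Shortened stand-in for the parsed draft yaml.
with_block = {
    "info": {
        "x-bte-info": {
            "batch-size": 100,
            "fields-for-unique-edges": ["MeSH:D005260", "MeSH:D008297"],
        },
        "title": "Multiomics Wellness KP API",
    }
}
without_block = {"info": {"title": "Some other KP API"}}

wellness = read_api_level_info(with_block)
other = read_api_level_info(without_block)
```

Because the block is read once per API rather than once per operation, the edge-attribute list does not need to be repeated or referenced in each x-bte operation.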

C. I'm still unclear on whether the "edge hashing" works for all scenarios. See my "side note" at the bottom of the previous post
