Refactor: Conform variable names to standardized vocabulary (pass#1) #93

tokebe · 2022-03-04T18:54:07Z

This PR is a first-pass attempt to address biothings/biothings_explorer/issues/379.

No behavior has been changed. Only variable names have changed, so that they follow a standardized vocabulary for internal data structures (which should better reflect their purpose and their relationship to external data structures).

Because some data structures are passed between modules, this PR requires biothings/call-apis.js/pull/47.

This PR will need additional review and discussion to confirm that no further changes are needed.

ariutta · 2022-03-04T19:19:08Z

src/query_results.js

      // [
-      //   {"inputPrimaryID": "NCBIGene:3630", "outputPrimaryID", "MONDO:0005068"},
-      //   {"inputPrimaryID": "MONDO:0005068", "outputPrimaryID", "PUBCHEM.COMPOUND:43815"}
+      //   {"inputprimaryCurie": "NCBIGene:3630", "outputprimaryCurie", "MONDO:0005068"},


Any reason for lowercase p in primary here?

Also, more generally, is it clear to everyone that inputPrimaryCurie and outputPrimaryCurie refer to nodes? We specify edge and node for QEdge and QNode but not for inputPrimaryCurie. Maybe it's obvious because it says input and output? Not saying we should change it, but just wanted to check.

I'll fix the lowercase. It felt relatively clear to me that input and output are nodes on an edge; however, I'd like to hear @colleenXu and @marcodarko's opinions on that. Part of the goal here is to make variable names more obvious, so if anyone thinks this should be made more obvious (e.g. inputNodePrimaryCurie), then I'm all for it.

I'd like to hear @marcodarko's view.

In general, I find the use of input/output confusing.

Are they nouns (nodes) or adjectives (descriptive)?

What's the perspective?

Users give TRAPI qGraphs with directed qEdges, and those qEdges often have to be "reversed" in actual execution, which makes it confusing to say what is "input" and "output" in relation to those qEdges.

I think apiEdges / records are a bit clearer about what is "input" and "output" (it's the ID you give the API and the concept ID you get back in the API response).

Do we want to distinguish the two different things above?

There may be more ideas related to input/output that I'm not thinking of (e.g. the biomedical ID resolver's step of giving an "input" to the SRI ID resolver and receiving an "output"; maybe that's a thing?).

The definition of I/O is a bit dynamic in my experience: it depends on the direction taken, based on the edge's subject/object IDs, so it's hard to come up with a single name for them. I was similarly confused when I first started working on the edge manager and wanted to refer to them as subject/object instead. It ended up making sense once I thought of it as a definition that changes with the direction taken: I --> O by default, but it can flip to O <-- I.

@marcodarko

My questions below might not be helpful....(I'm a bit confused here)

It sounds like you're using subject/object for the qNodes based on the qEdge's direction?

And then you use I/O for the execution of sub-queries / records, where "I" corresponds to subject and "O" to object? Does "I --> O / O <-- I" refer to different ways of executing the qEdges (reversal), or to the directionality of records, or...?

To execution, the part I'm more familiar with. I believe, however, that the same check happens with the query results: Anders checks the reversal to get the I/O from the correct node.

I just don't see a clear way to keep the naming consistent there, unless after everything is done we "reset" everything back to the original query directionality, maybe? E.g. (totally hypothetical), after all edges are executed the manager might have had to execute A <-- B --> C <-- D, but the original graph was A --> B --> C --> D. We could reset it to that to keep the context of I/O unchanged for subsequent processes.
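
One way to picture the "reset" idea above is a small mapping from execution-time input/output back to the query's subject/object. This is only a sketch under assumptions: the record shape (`input`, `output`, `reversed`) and the function name `toQueryDirection` are hypothetical and are not taken from the codebase.

```javascript
// Hypothetical record shape (an assumption for illustration):
//   input    - the ID that was given to the API
//   output   - the ID that came back from the API
//   reversed - true if the qEdge was executed object -> subject
function toQueryDirection(record) {
  const { input, output, reversed } = record;
  // When the qEdge was executed in reverse, the API input was the
  // query's object, so input/output swap relative to subject/object.
  return reversed
    ? { subject: output, object: input }
    : { subject: input, object: output };
}

// The qEdge A --> B, executed forward and then in reverse.
const forward = toQueryDirection({ input: "A", output: "B", reversed: false });
const flipped = toQueryDirection({ input: "B", output: "A", reversed: true });

console.log(forward, flipped);
```

Under these assumptions both calls describe the same query-level edge, so anything downstream could keep reading subject/object without needing to know which direction was actually executed.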

I feel that this may need to be a semi-separate issue to clarify in a later pass.

Leaving this conversation unresolved for ease of reference in the future.

src/query_results.js

ariutta

Overall, this looks good. I left some comments on specific lines with a few questions and observations. We'll want to update the in-line code comments and any JSDocs to be consistent with the new vocab.

I find variable names like consolidatedResult easier to understand than cResult, but that could just be a personal preference. However, if we do use cResult, then I'd change all forms of consolidatedResult, consolidatedResults, etc. throughout the codebase to be consistent.

@marcodarko has worked with quite a bit of this code and may have feedback too. If @colleenXu and Marco are good with this, then I am as well.

tokebe · 2022-03-04T20:23:25Z

I've responded to some of these comments where further discussion is in order. I'll be working on fixing spots I missed, comments, jsdocs, etc.

The main point seems to be preresult vs unconsolidatedResult and whether to use long-form, short-form, or both -- If the general opinion is to use only one, then I would prefer the long form for clarity. I used a combination so that the function names can adequately hint at what the short-form means, while the short-forms, which appear more often, are slightly quicker to read.

I've used this convention in a couple of places in the code (such as QueryExecutionEdge as a class definition, followed by qXEdge for short-form variable names). I think this works as a convention for both clarity and readability; however, I'm open to changing it if people agree that it's probably more confusing than helpful.
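
A minimal illustration of that long-form/short-form convention. The constructor fields and the edge IDs here are assumptions made up for the example; only the names QueryExecutionEdge and qXEdge come from the discussion.

```javascript
// The long-form class name spells the concept out once.
class QueryExecutionEdge {
  // Constructor fields are hypothetical, for illustration only.
  constructor(id, reversed = false) {
    this.id = id;
    this.reversed = reversed;
  }
}

// The short form (qXEdge) is used for the variables that hold instances,
// which appear far more often than the class name itself.
const qXEdges = ["e01", "e02"].map((id) => new QueryExecutionEdge(id));

for (const qXEdge of qXEdges) {
  console.log(qXEdge.id, qXEdge.reversed);
}
```

The class definition acts as the glossary entry for the abbreviation, so a reader who meets qXEdge first can find the long form by searching for the class.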

colleenXu · 2022-03-08T01:22:25Z

I've made some comments above, sorry for being late >.<

colleenXu · 2022-03-08T01:48:32Z

I checked some queries and it looks like the code still works as expected.

I think it's helpful to check behavior to make sure we didn't miss something that would create a bug...

tokebe · 2022-03-08T15:32:32Z

I did some limited testing to make sure no basic execution was broken; however, I definitely think that once we declare the changes 'done', more thorough testing will be in order.

tokebe · 2022-03-08T21:15:57Z

I've made changes according to our discussions. Please let me know how it looks now. I'll get back to writing up the list of changed vocab, and will also make another pass to check comments etc. in various places.

colleenXu · 2022-03-09T06:38:48Z

I checked some queries and it looks like the code still works as expected.

FYI I haven't fully reviewed the vocab yet; I've only been chiming in when Anders or Marco bring something up. In general, I'm trusting Jackson's process since I think the vocab depends a lot on this internal code (and I talk/work with the "higher level" stuff). I'm fine with changing the vocab I use to reflect changes here.

src/query_results.js

ariutta · 2022-03-09T18:10:41Z

Not saying we should change this now, but the QueryResult class would more accurately be named something like QueryResultsAssembler or TrapiResultsAssembler. I think plural TrapiResultsAssembler makes sense instead of singular TrapiResultAssembler because the output is an array. Maybe we can think about it for a future update?

src/query_results.js

tokebe · 2022-03-09T18:39:44Z

> Not saying we should change this now, but the QueryResult class would more accurately be named something like QueryResultsAssembler or TrapiResultsAssembler. I think plural TrapiResultsAssembler makes sense instead of singular TrapiResultAssembler because the output is an array. Maybe we can think about it for a future update?

@ariutta Honestly I think if there's a time to do it, it's now in this PR, and I agree that it would be a good idea, so I'll push that along with changes addressing your other comments.

tokebe · 2022-03-09T19:30:47Z

@ariutta has stated that this PR is ready as far as he is concerned. Next steps are to test more extensively to ensure nothing has been broken and to compile the finalized list of vocab.

colleenXu · 2022-03-14T04:59:35Z

@tokebe I'm noting a bunch of failed automated tests when I run "npm test". I have the vocab-refactor branch for call-apis active as well.

Console output:

➜ query_graph_handler git:(vocab-refactor) ✗ npm test

@biothings-explorer/query_graph_handler@1.18.0 test
jest --env=node

PASS __test__/integration/QueryNode.test.js
PASS __test__/integration/QueryEdge.test.js
FAIL __test__/integration/graph/graph.test.js
● Test graph class › A single query result is correctly updated.

expect(received).toEqual(expected) // deep equality

Expected: "outputPrimaryID"
Received: undefined

   99 |         expect(g.nodes).toHaveProperty("outputPrimaryID-qg2");
  100 |         expect(g.nodes).toHaveProperty("inputPrimaryID-qg1");
> 101 |         expect(g.nodes["outputPrimaryID-qg2"]._primaryID).toEqual("outputPrimaryID");
      |                                                           ^
  102 |         expect(g.nodes["outputPrimaryID-qg2"]._qgID).toEqual("qg2");
  103 |         expect(Array.from(g.nodes["outputPrimaryID-qg2"]._sourceNodes)).toEqual(['inputPrimaryID-qg1']);
  104 |         expect(Array.from(g.nodes["outputPrimaryID-qg2"]._sourceQGNodes)).toEqual(['qg1']);

  at Object.<anonymous> (__test__/integration/graph/graph.test.js:101:59)

● Test graph class › Multiple query results are correctly updated for two edges having same input, predicate and output

expect(received).toEqual(expected) // deep equality

Expected: "outputPrimaryID"
Received: undefined

  119 |         expect(g.nodes).toHaveProperty("outputPrimaryID-qg2");
  120 |         expect(g.nodes).toHaveProperty("inputPrimaryID-qg1");
> 121 |         expect(g.nodes["outputPrimaryID-qg2"]._primaryID).toEqual("outputPrimaryID");
      |                                                           ^
  122 |         expect(g.nodes["outputPrimaryID-qg2"]._qgID).toEqual("qg2");
  123 |         expect(Array.from(g.nodes["outputPrimaryID-qg2"]._sourceNodes)).toEqual(['inputPrimaryID-qg1']);
  124 |         expect(Array.from(g.nodes["outputPrimaryID-qg2"]._sourceQGNodes)).toEqual(['qg1']);

  at Object.<anonymous> (__test__/integration/graph/graph.test.js:121:59)

● Test graph class › Multiple query results for different edges are correctly updated

expect(received).toEqual(expected) // deep equality

Expected: "outputPrimaryID"
Received: undefined

  146 |         expect(g.nodes).toHaveProperty("outputPrimaryID-qg2");
  147 |         expect(g.nodes).toHaveProperty("inputPrimaryID-qg1");
> 148 |         expect(g.nodes["outputPrimaryID-qg2"]._primaryID).toEqual("outputPrimaryID");
      |                                                           ^
  149 |         expect(g.nodes["outputPrimaryID-qg2"]._qgID).toEqual("qg2");
  150 |         expect(Array.from(g.nodes["outputPrimaryID-qg2"]._sourceNodes)).toEqual(['inputPrimaryID-qg1']);
  151 |         expect(Array.from(g.nodes["outputPrimaryID-qg2"]._sourceQGNodes)).toEqual(['qg1']);

  at Object.<anonymous> (__test__/integration/graph/graph.test.js:148:59)

PASS __test__/integration/biolink.test.js
FAIL __test__/integration/KnowledgeGraph.test.js
● Testing KnowledgeGraph Module › Testing _createNode function › test creating node

TypeError: undefined is not iterable (cannot read property Symbol(Symbol.iterator))
    at Function.from (<anonymous>)

  46 |         {
  47 |           attribute_type_id: 'source_qg_nodes',
> 48 |           value: Array.from(kgNode._sourceQNodeIDs),
     |                        ^
  49 |           //value_type_id: 'bts:source_qg_nodes',
  50 |         },
  51 |         {

  at KnowledgeGraph._createNode (src/graph/knowledge_graph.js:48:24)
  at Object.<anonymous> (__test__/integration/KnowledgeGraph.test.js:131:28)

PASS __test__/unittest/QueryEdge.test.js
FAIL __test__/unittest/helper.test.js
● Test helper moduler › Test _getInputID function › If edge is reversed, should return the primary ID of the output

TypeError: helper._getInputID is not a function

  136 |                 },
  137 |             }
> 138 |             const res = helper._getInputID(record);
      |                                ^
  139 |             expect(res).toEqual('output')
  140 |         })
  141 |

  at Object.<anonymous> (__test__/unittest/helper.test.js:138:32)

● Test helper moduler › Test _getInputID function › If edge is not reversed, should return the node ID of the subject

TypeError: helper._getInputID is not a function

  162 |                 },
  163 |             }
> 164 |             const res = helper._getInputID(record);
      |                                ^
  165 |             expect(res).toEqual('input')
  166 |         })
  167 |     })

  at Object.<anonymous> (__test__/unittest/helper.test.js:164:32)

● Test helper moduler › Test _getOutputID function › If edge is reversed, should return the node ID of the subject

TypeError: helper._getOutputID is not a function

  191 |                 },
  192 |             }
> 193 |             const res = helper._getOutputID(record);
      |                                ^
  194 |             expect(res).toEqual('input')
  195 |         })
  196 |     })

  at Object.<anonymous> (__test__/unittest/helper.test.js:193:32)

● Test helper moduler › If edge is not reversed, should return the node ID of the object

TypeError: helper._getOutputID is not a function

  218 |             },
  219 |         }
> 220 |         const res = helper._getOutputID(record);
      |                            ^
  221 |         expect(res).toEqual('output')
  222 |     })
  223 |

  at Object.<anonymous> (__test__/unittest/helper.test.js:220:28)

● Test helper moduler › Test _getKGEdgeID function › encountered a declaration exception

TypeError: helper._getKGEdgeID is not a function

  312 |             },
  313 |         }
> 314 |         const res = helper._getKGEdgeID(record);
      |                            ^
  315 |         expect(res).toEqual('b052708d75d94d55916ffce9f0ea3458')
  316 |     })
  317 |

  at Suite.<anonymous> (__test__/unittest/helper.test.js:314:28)
  at Suite.<anonymous> (__test__/unittest/helper.test.js:291:5)
  at Object.<anonymous> (__test__/unittest/helper.test.js:3:1)

● Test helper moduler › Test _getInputEquivalentIdentifiers function › If edge is reversed, should return the curies of the output

TypeError: helper._getInputEquivalentIds is not a function

  618 |                 },
  619 |             }
> 620 |             const res = helper._getInputEquivalentIds(record);
      |                                ^
  621 |             expect(res).toEqual(['789'])
  622 |         })
  623 |

  at Object.<anonymous> (__test__/unittest/helper.test.js:620:32)

● Test helper moduler › Test _getInputEquivalentIdentifiers function › If error occurred, return null

TypeError: helper._getInputEquivalentIds is not a function

  650 |                 },
  651 |             }
> 652 |             const res = helper._getInputEquivalentIds(record);
      |                                ^
  653 |             expect(res).toBeNull;
  654 |         })
  655 |

  at Object.<anonymous> (__test__/unittest/helper.test.js:652:32)

● Test helper moduler › Test _getInputEquivalentIdentifiers function › If edge is not reversed, should return the curies of the subject

TypeError: helper._getInputEquivalentIds is not a function

  682 |                 },
  683 |             }
> 684 |             const res = helper._getInputEquivalentIds(record);
      |                                ^
  685 |             expect(res).toEqual(['123', '456'])
  686 |         })
  687 |     })

  at Object.<anonymous> (__test__/unittest/helper.test.js:684:32)

PASS __test__/unittest/utils.test.js
PASS __test__/unittest/LogEntry.test.js
PASS __test__/integration/QueryGraphHandler.test.js
PASS __test__/integration/QueryResult.test.js
PASS __test__/integration/BatchEdgeQueryHandler.test.js
PASS __test__/integration/QEdge2BTEEdgeHandler.test.js
PASS __test__/unittest/redisClient.test.js
PASS __test__/integration/integrity.test.js (52.412 s)
PASS __test__/integration/TRAPIQueryHandler.test.js (52.806 s)

Test Suites: 3 failed, 1 skipped, 13 passed, 16 of 17 total
Tests: 12 failed, 3 skipped, 137 passed, 152 total
Snapshots: 0 total
Time: 54.561 s
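
The failures above look consistent with tests that still reference pre-refactor names. A minimal sketch of both failure modes follows; the post-rename names used here (`_primaryCurie`, `_getInputCurie`) are assumptions for illustration and are not necessarily the names introduced in this PR.

```javascript
// Hypothetical post-rename objects; the new names are assumptions.
const kgNode = { _primaryCurie: "outputPrimaryCurie" };
const helper = { _getInputCurie: (record) => record.input };

// A test that still reads the old property name gets undefined,
// which matches the "Received: undefined" failures in graph.test.js.
const staleProperty = kgNode._primaryID;

// A test that still calls the old method name throws a TypeError,
// which matches the "is not a function" failures in helper.test.js.
let staleCallError = null;
try {
  helper._getInputID({ input: "input" });
} catch (err) {
  staleCallError = err;
}

console.log(staleProperty, staleCallError instanceof TypeError);
```

If that reading is right, updating the test files to the new vocabulary should clear both kinds of failure without touching the source.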

tokebe added 4 commits March 4, 2022 12:44

refactor: vocab normalization first pass

e196a14

refactor: merge latest from main & normalize vocab

ca1838d

fix: typos

f4aa6e6

style: formatting

4e2cc4d

tokebe mentioned this pull request Mar 4, 2022

Refactor: Conform variable names to standardized vocabulary (pass#1) biothings/call-apis.js#47

Merged

tokebe requested review from ariutta and colleenXu March 4, 2022 18:55

ariutta reviewed Mar 4, 2022

ariutta approved these changes Mar 4, 2022

tokebe requested a review from marcodarko March 4, 2022 19:46

tokebe added 2 commits March 8, 2022 16:06

fix: reference error

05d795a

refactor: additional vocab changes from discussion, fix comments, etc

e5420ad

ariutta reviewed Mar 8, 2022

tokebe added 2 commits March 9, 2022 11:04

refactor: further standardization and typos

9627954

refactor: JSDoc

563874e

ariutta reviewed Mar 9, 2022

refactor: rm comment

639a44d

ariutta reviewed Mar 9, 2022

refactor: comments, QueryResults->TrapiResultsAssembler

6e020c5

tokebe added 8 commits March 9, 2022 16:22

refactor: clarify KG -> metaKG

70ac841

refactor: more edge clarifications

ac3c863

chore: rm old TODOs

e925723

refactor: recordEdgeHash is the same as recordHash

f3e6e61

refactor: qExeEdge -> qXEdge

dcf2e0f

refactor: clarification: currentEdge -> currentQXEdge

c468804

refactor: kg refers to results by default

98d5e76

refactor: conform TRAPI logs to new standard

4ebfa3f

tokebe added 2 commits March 14, 2022 13:24

refactor: log clarification

32c7277

test: update tests with new terms

d8c66cb

tokebe mentioned this pull request Mar 14, 2022

Infer qNode categories and add to user-defined categories where appropriate #94

Merged

tokebe merged commit b07e62a into main Mar 15, 2022

colleenXu mentioned this pull request Oct 12, 2022

Combine qedge #126

Closed
