
Synonym lookup super slow? How to fix? #2367

Closed
edeutsch opened this issue Sep 7, 2024 · 7 comments
edeutsch (Collaborator) commented Sep 7, 2024

I've noticed this for a while but am only posting now: has anyone else noticed that the Synonym lookup through the ARAX GUI is super slow? Try searching for metformin or ibuprofen or anything reasonably common; my CPU fans start groaning and it takes 15+ seconds for anything to appear. I assume this is either because so much data is returned or because rendering the graph is so expensive, or something else? Does anyone else have this issue, and does anyone have ideas on how best to solve it? Return less data? Don't render the graph unless asked? This service was great when answers came back within a second, but now it's painful to use.

ideas?

amykglen (Member) commented Sep 8, 2024

yes, this started happening after we started using the SRI Node Normalizer's drug_chemical_conflate parameter, which made the clusters for certain drugs really big.

I definitely think it's the 'match graph' that's causing the issue (I think the acetaminophen graph has tens of thousands of edges now) - I wonder if we could just not display the graph if it has more than some reasonable number of edges? not sure if there's an existing way to determine the number of edges without actually having to load all of them...

isbluis (Member) commented Sep 17, 2024

As a quick test in devLM, looking up metformin results in the following rough timings:

  • 13 seconds to receive the JSON response (>77 MB)
  • 35 seconds to render the full table, without displaying the Concept Graph
  • 55 seconds to display everything, including the Concept Graph

edeutsch (Collaborator, Author) commented:

oof, thanks. Yeah, I think we should put some effort into slimming down the response first somehow. And then maybe something on the front end.

amykglen (Member) commented:

ok, per discussion with @edeutsch and others today - I've added an optional max_synonyms parameter to the NodeSynonymizer's get_normalizer_results() (in master), which you can use like this:

synonymizer.get_normalizer_results(entities="DOID:14330", max_synonyms=2)

and which produces a truncated cluster like this one (I haven't shown the full knowledge_graph below, but it is also truncated to two nodes and only edges that connect those two nodes):

{
  "DOID:14330": {
    "id": {
      "identifier": "MONDO:0005180",
      "name": "Parkinson disease",
      "category": "biolink:Disease",
      "SRI_normalizer_name": "Parkinson disease",
      "SRI_normalizer_category": "biolink:Disease",
      "SRI_normalizer_curie": "MONDO:0005180"
    },
    "total_synonyms": 18,
    "categories": {
      "biolink:Disease": 18
    },
    "nodes": [
      {
        "identifier": "DOID:14330",
        "category": "biolink:Disease",
        "label": "Parkinson's disease",
        "major_branch": "DiseaseOrPhenotypicFeature",
        "in_sri": true,
        "name_sri": "Parkinson's disease",
        "category_sri": "biolink:Disease",
        "in_kg2pre": true,
        "name_kg2pre": "Parkinson's disease",
        "category_kg2pre": "biolink:Disease"
      },
      {
        "identifier": "MONDO:0005180",
        "category": "biolink:Disease",
        "label": "Parkinson disease",
        "major_branch": "DiseaseOrPhenotypicFeature",
        "in_sri": true,
        "name_sri": "Parkinson disease",
        "category_sri": "biolink:Disease",
        "in_kg2pre": true,
        "name_kg2pre": "Parkinson disease",
        "category_kg2pre": "biolink:Disease"
      }
    ],
    "knowledge_graph": {
      "nodes": {
        ...

so we were thinking the UI could decide how many nodes it is reasonable to display in one cluster (e.g., 200?) and then call get_normalizer_results() with that number as max_synonyms. and maybe also provide a dropdown or the like that lets a user increase max_synonyms.

note that the top-level "categories" slot shown above reports node counts by category for the full (untruncated) cluster, and I also added a top-level "total_synonyms" slot to make it easy to report how many nodes are in the full cluster.
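
For illustration, a minimal sketch of how a caller might combine max_synonyms with the new total_synonyms slot to detect truncation (this assumes the import path below and that the call returns the dict shown above, keyed by the input curie; only get_normalizer_results(), max_synonyms, total_synonyms, and nodes are confirmed here):

from node_synonymizer import NodeSynonymizer  # assumed import path

MAX_DISPLAY_NODES = 200  # the UI's chosen cap on nodes per cluster

synonymizer = NodeSynonymizer()
results = synonymizer.get_normalizer_results(entities="DOID:14330",
                                             max_synonyms=MAX_DISPLAY_NODES)

cluster = results["DOID:14330"]
num_shown = len(cluster["nodes"])        # truncated to at most MAX_DISPLAY_NODES
num_total = cluster["total_synonyms"]    # size of the full, untruncated cluster

if num_shown < num_total:
    # e.g., show a warning and offer a control that raises max_synonyms
    print(f"Showing {num_shown} of {num_total} synonyms; increase max_synonyms to see more.")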

let me know if I can do anything else!

edeutsch (Collaborator, Author) commented:

back end now supports max_synonyms=N in a POSTed dict.
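
For illustration only, a hedged sketch of such a POST: the endpoint URL and the name of the field carrying the search term are placeholders, not confirmed in this thread; only the max_synonyms key in the POSTed dict is confirmed above.

import requests

# Placeholder: the actual entity-lookup endpoint path is not given in this thread.
ENTITY_ENDPOINT = "https://arax.ncats.io/devLM/api/<entity-endpoint>"

payload = {
    "term": "ibuprofen",   # assumed field name for the lookup term
    "max_synonyms": 50,    # confirmed: cap on synonyms returned per cluster
}

response = requests.post(ENTITY_ENDPOINT, json=payload, timeout=60)
response.raise_for_status()
results = response.json()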

isbluis (Member) commented Oct 31, 2024

The parameter has been added to the Settings pane of the UI (in devLM), and a warning is given when the output is truncated:
https://arax.ncats.io/devLM/index.html?term=aspirin

[screenshot of the Settings pane and truncation warning]

isbluis added a commit that referenced this issue Nov 1, 2024
…[q] (#2400)

- Add new user setting for max number of nodes to list in Synonyms [default=50] (#2367)
- Tidy layout of Settings
- Add "months" to text for long-running Queries in System Activity
- Update ARAXi DSL helper JSON
edeutsch (Collaborator, Author) commented Nov 1, 2024

Fixed in master by limiting the number of nodes.

Although I now belatedly wonder if that was really the right way to fix it. What if I really want to see the 300 synonyms of ibuprofen without crashing my browser? Surely a table of 300 things isn't so bad. Maybe it would have been better to just suppress the knowledge graph; would that have fixed it?
