[Feature Request]: Integrating External KG with an Ontology #1815

@gutihernandez

Description

Do you need to file an issue?

  • I have searched the existing issues and this feature is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.

Is your feature request related to a problem? Please describe.

I would like to know whether it is possible to integrate an external KG that has an ontology and contains encyclopedic knowledge. My KG is large, so running indexing on it would be prohibitively expensive, and there is no need to build a graph because I already have correct, encyclopedic triples.

For instance, in a medical use case, let's say a hospital indexes its own data. This data would include patient records with clear definitions of treatments, diseases, prescribed drugs, and patient biomarkers.

It would be very useful if this data could be connected to:

  • Gene Ontology
  • DrugBank
  • KEGG

The list above names only a few of the publicly available graphs.

Each of these graphs clearly defines its node types and contains curated, encyclopedic knowledge.

Describe the solution you'd like

Option A: Graph Federation

  • Maintain both graphs separately but create a federation layer
  • Implement a specific querying strategy that handles both graphs
  • Develop a result integrator that merges the context data returned by both sources
  • This option seems the easiest to integrate, but it has its own problems because a federation layer is required.
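
To make the federation idea concrete, here is a minimal sketch of a result integrator. Everything in it is hypothetical: `ContextItem`, `federated_search`, and the two stub retrievers are illustrative names, not part of GraphRAG or any external KG client, and the sketch assumes both sources return relevance scores on a comparable scale.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class ContextItem:
    source: str   # which graph produced the item
    text: str     # context snippet that would be passed to the LLM
    score: float  # relevance score, ASSUMED comparable across sources


# A retriever is any callable that maps a query string to context items.
Retriever = Callable[[str], Iterable[ContextItem]]


def federated_search(query: str, retrievers: List[Retriever], top_k: int = 5) -> List[ContextItem]:
    """Query every source, drop duplicate snippets, and rank by score."""
    merged = {}
    for retrieve in retrievers:
        for item in retrieve(query):
            key = item.text.strip().lower()
            best = merged.get(key)
            # Keep the highest-scoring copy of a duplicated snippet.
            if best is None or item.score > best.score:
                merged[key] = item
    return sorted(merged.values(), key=lambda i: i.score, reverse=True)[:top_k]


# Hypothetical stand-ins for a GraphRAG query and an external KG lookup.
def local_retriever(query: str) -> List[ContextItem]:
    return [ContextItem("local", "Patient 17 was prescribed metformin.", 0.9)]


def external_retriever(query: str) -> List[ContextItem]:
    return [
        ContextItem("external", "Metformin is a biguanide antidiabetic drug.", 0.8),
        ContextItem("external", "patient 17 was prescribed metformin.", 0.4),
    ]


results = federated_search("metformin", [local_retriever, external_retriever])
for item in results:
    print(f"[{item.source}] {item.score:.1f} {item.text}")
```

In practice the hard part is the assumption flagged above: scores from two different retrieval systems are rarely comparable without some normalization or re-ranking step.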

Option B: Knowledge Graph Linking

  • Create explicit links between entities in both graphs (GraphRAG vs. external KG)
  • Implement "same-as" or equivalent relationships
  • This option seems the best trade-off. It would require entity matching between the two graphs.
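
As a rough illustration of the entity-matching step, the sketch below links local entity names to external KG identifiers by normalized label, falling back to fuzzy string similarity from Python's standard `difflib`. The function names, the `EXT:` identifiers, and the 0.9 threshold are assumptions made for the example; a real system would more likely use curated synonym tables or embedding similarity.

```python
import difflib
from typing import Dict, List, Tuple


def normalize(label: str) -> str:
    """Lowercase, turn hyphens into spaces, and collapse whitespace."""
    return " ".join(label.lower().replace("-", " ").split())


def link_entities(
    local_entities: List[str],
    external_entities: Dict[str, List[str]],
    threshold: float = 0.9,
) -> List[Tuple[str, str, str, float]]:
    """Return (local_name, "same_as", external_id, confidence) links."""
    # Index every external label and synonym by its normalized form.
    index: Dict[str, str] = {}
    for ext_id, labels in external_entities.items():
        for label in labels:
            index.setdefault(normalize(label), ext_id)

    links = []
    for name in local_entities:
        key = normalize(name)
        if key in index:
            links.append((name, "same_as", index[key], 1.0))
            continue
        # Fuzzy fallback: best label at or above the similarity threshold.
        close = difflib.get_close_matches(key, list(index), n=1, cutoff=threshold)
        if close:
            score = difflib.SequenceMatcher(None, key, close[0]).ratio()
            links.append((name, "same_as", index[close[0]], round(score, 2)))
    return links


# Placeholder identifiers, NOT real DrugBank or KEGG IDs.
external = {
    "EXT:0001": ["metformin"],
    "EXT:0002": ["aspirin", "acetylsalicylic acid"],
}
local = ["Metformin", "acetyl-salicylic acid", "unknown biomarker X"]

links = link_entities(local, external)
for link in links:
    print(link)
```

Entities with no sufficiently close match are simply left unlinked, which keeps false "same-as" edges out of the combined graph at the cost of lower coverage.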

Option C: Graph Integration

  • Export external encyclopedic graph data in a format compatible with GraphRAG (e.g. text or csv filetype)
  • Use GraphRAG's indexing to import external KG's entities and relationships
  • This option does not seem feasible: the external graph can be huge, with millions of triples that might dilute the focus of querying, and it would be very costly to index.
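
For reference, the export step could look something like the sketch below, which writes triples as short sentences into a CSV. To keep the size concern above in view, it only exports triples that touch a set of seed entities (for example, entities already present in the local data). That filter is an assumption of this example, as are the function name and the `id`/`text` column names, which are not taken from any GraphRAG input schema.

```python
import csv
import io
from typing import Iterable, TextIO, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def export_subset(triples: Iterable[Triple], seed_entities: Iterable[str], out: TextIO) -> int:
    """Write triples that mention a seed entity as one sentence per CSV row."""
    seeds = {s.lower() for s in seed_entities}
    writer = csv.writer(out)
    writer.writerow(["id", "text"])  # hypothetical column names
    count = 0
    for subj, pred, obj in triples:
        # Skip triples unrelated to the local data to bound indexing cost.
        if subj.lower() in seeds or obj.lower() in seeds:
            sentence = f"{subj} {pred.replace('_', ' ')} {obj}."
            writer.writerow([count, sentence])
            count += 1
    return count


# Toy triples for illustration only.
triples = [
    ("metformin", "treats", "type 2 diabetes"),
    ("aspirin", "inhibits", "COX-1"),
    ("insulin", "regulates", "glucose"),
]

buffer = io.StringIO()
written = export_subset(triples, ["Metformin", "glucose"], buffer)
rows = buffer.getvalue().splitlines()
print(written, rows)
```

Even with such a filter, every exported sentence still goes through LLM-based extraction during indexing, so the cost objection applies to whatever subset is chosen.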

Embeddings of External Graph:

  • This aspect also needs to be considered, but it might be a separate issue.

Additional context

If this data can be connected to GraphRAG, many features, such as new insights and treatments, could be unlocked.

I believe this problem can be tackled in many different ways. Depending on the graph merging strategy, the querying strategy would also need to be adapted. I do not want to limit the discussion to a specific use case, and I am curious how this brilliant community would brainstorm on this issue :)

Metadata

Labels: enhancement (New feature or request)
