Text Mining Provider

karafecho edited this page Sep 13, 2023 · 18 revisions

Brief Overview

The Text Mining Provider KG contains subject-predicate-object assertions derived from the application of natural language processing (NLP) algorithms to the PubMed Central open-access collection of publications plus additional titles and abstracts from PubMed.

Caveat: Text-mined assertions must be interpreted with caution, as NLP algorithms may introduce false assertions.

Example Edge: image

License/Restrictions: None.

URL: http://smart-api.info/registry?q=978fe380a147a8641caf72320862697b; http://smart-api.info/registry?q=71fa2e0f0f1fe1ec67f4ddb719db5ef3

Detailed Description

The Text Mining Provider aims to provide up-to-date, Biolink-compatible knowledge graphs (KGs) composed of assertions mined from the available biomedical literature. Two flavors of KG are provided:

  1. A concept co-occurrence KG, where nodes are ontology concepts and edges reflect the co-occurrence of two concepts in text, e.g., in the same sentence or abstract. Edges are scored using the Normalized Google Distance metric.
  2. A KG composed of text-mined assertions, where nodes are ontology concepts and edges represent explicitly defined Biolink relations between the two concepts.
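
To illustrate the edge scoring mentioned for the co-occurrence KG, the following is a minimal sketch of the standard Normalized Google Distance formula. It is not taken from the Text Mining Provider code base: the function name, parameter names, and the choice of "documents" as the counting unit are assumptions made for illustration, and the provider may count at the sentence, abstract, or another level.

```python
import math


def normalized_google_distance(count_x, count_y, count_xy, total_docs):
    """Standard NGD between two concepts, given occurrence counts.

    count_x, count_y: number of text units mentioning each concept
    count_xy:         number of text units mentioning both concepts
    total_docs:       total number of text units in the corpus
    All names here are hypothetical; they are not the provider's API.
    """
    if count_x <= 0 or count_y <= 0 or total_docs <= 0:
        raise ValueError("counts and corpus size must be positive")
    if count_xy == 0:
        # Concepts never co-occur: the distance is unbounded.
        return math.inf

    log_x, log_y = math.log(count_x), math.log(count_y)
    log_xy, log_n = math.log(count_xy), math.log(total_docs)

    denominator = log_n - min(log_x, log_y)
    if denominator == 0:
        # Both concepts appear in every text unit, so they always co-occur.
        return 0.0
    return (max(log_x, log_y) - log_xy) / denominator


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(normalized_google_distance(1000, 100, 10, 1_000_000))
```

Lower scores indicate concepts that co-occur more often than their individual frequencies would suggest; a score of 0 means the two concepts always appear together.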

Team Contact:

Bill Baumgartner

Interfaces

See Available Knowledge Graphs.

Knowledge Sources

Source Code

See Associated Code Repositories.

External Documentation

See the Text Mining Provider roadmap for details on the development status and implementation plans for the NCATS Translator Text Mining Provider. The referenced repository includes, among others:
