DICE: Data Integration, Cleaning, and Extraction Benchmark

Knowledge graphs (KGs) are a core component of applications ranging from search to personal assistants. Learning good representations for KG entities remains a challenging problem, and progress on it has implications for a wide range of knowledge graph construction and reasoning tasks.

We introduce the Data Integration, Cleaning, and Extraction (DICE) Benchmark: a collection of resources for developing and studying knowledge graph representations in multi-task settings. DICE consists of 12 tasks over 3 KG datasets, spanning a range of task types (regression, classification, retrieval).
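To make the three task types concrete, here is a minimal sketch of how one metric per type could be computed in plain Python. The metric choices (mean squared error, accuracy, mean reciprocal rank) and all function names are illustrative assumptions; they are not taken from the DICE code, which may use different metrics and interfaces.

```python
from typing import Callable, Dict, Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Regression: average squared difference between targets and predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Classification: fraction of predictions that match the label exactly."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_reciprocal_rank(targets: Sequence, rankings: Sequence[Sequence]) -> float:
    """Retrieval: average of 1/rank of the correct item (0 if it is not ranked)."""
    if not targets or len(targets) != len(rankings):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for target, ranked in zip(targets, rankings):
        if target in ranked:
            total += 1.0 / (list(ranked).index(target) + 1)
    return total / len(targets)


# Hypothetical mapping from task type to metric (an assumption, not DICE's API).
METRICS: Dict[str, Callable] = {
    "regression": mean_squared_error,
    "classification": accuracy,
    "retrieval": mean_reciprocal_rank,
}


def evaluate(task_type: str, y_true, y_pred) -> float:
    """Dispatch to the metric registered for the given task type."""
    return METRICS[task_type](y_true, y_pred)
```

A dispatch table like this keeps a multi-task evaluation loop uniform: each task only needs to declare its type, and the same `evaluate` call works across all of them.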

This repository contains:

- Scripts for downloading the DICE datasets
- Supporting code for loading datasets and evaluating task performance
- Example notebooks with baseline results for each of the DICE tasks

To download the datasets, simply run:

python download_data.py
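If you prefer to trigger the download from inside a notebook or another Python script, a small wrapper along the following lines may be convenient. This is a sketch only: the helper names `build_download_command` and `download_dice` are hypothetical, and it assumes `download_data.py` takes no arguments, as the command above suggests.

```python
import subprocess
import sys
from pathlib import Path


def build_download_command(script: str = "download_data.py") -> list:
    """Build the command that runs the script with the current interpreter."""
    return [sys.executable, str(Path(script))]


def download_dice(script: str = "download_data.py") -> None:
    """Run the download script, raising if it is missing or exits with an error."""
    if not Path(script).is_file():
        raise FileNotFoundError(f"{script} not found; run this from the repository root")
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(build_download_command(script), check=True)
```

Using `sys.executable` rather than a bare `python` ensures the script runs in the same environment as the calling notebook.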

For more information about DICE, refer to https://neelguha.github.io/dice/index.html
