This project implements:
- A process to convert relational databases to property graph databases.
- A process to translate SQL queries into Cypher queries by tokenizing and parsing each SQL query and converting the result to Cypher, adapted from TabularSemanticParsing.
- Cy-Spider: Semantic Parsing Corpus and Baseline Models for a Property Graph.
The main requirements are:
- Python 3.6+
- Neo4j Community Edition
We recommend using virtual environments to run this code:
python -m virtualenv venv
source venv/bin/activate
Clone the repository and install the Python packages:
git clone https://github.com/22842219/SemanticParser4Graph.git
pip install torch torchvision
python3 -m pip install -r requirements.txt
or
pip install -r requirements.txt
Text-to-SQL benchmarks, e.g., Spider, KaggleDBQA, and BIRD. We use the Spider benchmark to illustrate the process.
Download the pre-processed data release, unzip the folder, and put the data into application/data/spider.
Note: If you would like to preprocess the Spider dataset yourself, please refer to Salesforce TabularSemanticParsing.
- Set up the application/config.ini file.
  - Create an application/.env file:
    GRAPH_PASSWORD=<your-neo4j-password>
  - The settings for the application to be run are defined in the config.ini file:
    [FILENAMES]
    root = <path-to->/SemanticParser4Graph
    benchmark = Spider
    neo4j_import_folder = <path-to->/neo4j-community-4.4.11/import
    neo4j_uri = http://localhost:7474/browser/
    neo4j_user = neo4j
    neo4j_password = <your-neo4j-password>
  - Also configure the Neo4j export path. Open application/ConverDB.py and, in class ConvertDB, set:
    _neo4j_export_path = '<path-to->/neo4j-community-4.4.11/import'
- Configure the application/conf/db.ini file:
  spider_path = <path-to->/SemanticParser4Graph/application/data/spider/database
  database = musical
  [neo4j]
  port = 7687
  host = localhost
  username = neo4j
  password = <your-neo4j-password>
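To make the role of the [neo4j] section concrete, here is a minimal sketch of how such settings could be read with Python's standard configparser module. It is an illustration only, not the project's own loading code: the function name load_neo4j_settings, the sample values, and the choice to build a bolt:// URI from host and port are all assumptions.

```python
import configparser

# Hypothetical sample mirroring the [neo4j] section shown above; the real
# file is application/conf/db.ini and holds your own values.
SAMPLE_DB_INI = """
[neo4j]
port = 7687
host = localhost
username = neo4j
password = secret
"""


def load_neo4j_settings(ini_text):
    """Parse the [neo4j] section and return connection settings as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["neo4j"]
    return {
        # Assumption: the port is Neo4j's Bolt port, so we build a bolt:// URI.
        "uri": f"bolt://{section['host']}:{section.getint('port')}",
        "user": section["username"],
        "password": section["password"],
    }


settings = load_neo4j_settings(SAMPLE_DB_INI)
print(settings["uri"])  # bolt://localhost:7687
```

Reading the values once and passing the resulting dictionary around keeps the password out of the source code and makes it easy to point the same scripts at a different Neo4j instance.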
- Run Neo4j:
  cd <path-to-neo4j-bin>
  ./neo4j start
- Construct a property graph database directly from an arbitrary relational database schema:
  cd application/rel_db2kg
  python schema2graph.py --<benchmark_dataset-name> --cased
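As a rough illustration of the idea behind this step, the sketch below reads a SQLite schema and lists candidate node labels (one per table) and candidate relationships (one per foreign key). This is a common relational-to-graph convention and is only assumed here; it is not the logic of schema2graph.py. The function name plan_graph and the two-table toy schema are hypothetical.

```python
import sqlite3


def plan_graph(conn):
    """Return (node_labels, relationships) derived from a SQLite schema.

    Each relationship is (source_table, source_column,
    target_table, target_column), taken from a foreign key.
    """
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    relationships = []
    for table in tables:
        # Each row of PRAGMA foreign_key_list is
        # (id, seq, referenced_table, from_column, to_column, ...).
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            relationships.append((table, fk[3], fk[2], fk[4]))
    return tables, relationships


# Toy schema for illustration only (not copied from the benchmark).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE musical (Musical_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE actor (
        Actor_ID INTEGER PRIMARY KEY,
        Name TEXT,
        Musical_ID INTEGER REFERENCES musical(Musical_ID)
    );
""")
labels, rels = plan_graph(conn)
print(labels)  # ['actor', 'musical']
print(rels)    # [('actor', 'Musical_ID', 'musical', 'Musical_ID')]
```

Under this convention every row of actor would become an actor node, and the foreign key would become an edge from each actor node to the musical node it references.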
- Translate SQL queries to Cypher queries:
  cd application/rel_db2kg
  python sql2cypher.py
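To give a feel for what an SQL-to-Cypher translation looks like, here is a deliberately tiny sketch that handles only single-table SELECT queries with an optional one-condition WHERE clause. It is not how sql2cypher.py works, since the project parses and tokenizes full SQL. The function name toy_sql_to_cypher, the regular expression, and the example table and column names are hypothetical.

```python
import re

# Matches only: SELECT col[, col...] FROM table [WHERE col op value]
SIMPLE_SELECT = re.compile(
    r"^\s*SELECT\s+(?P<cols>[\w\s,]+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*(?P<op>=|<>|<=|>=|<|>)\s*(?P<val>.+?))?"
    r"\s*;?\s*$",
    re.IGNORECASE,
)


def toy_sql_to_cypher(sql, alias="n"):
    """Translate a very small SQL subset into an equivalent Cypher query."""
    match = SIMPLE_SELECT.match(sql)
    if match is None:
        raise ValueError("only single-table SELECT ... [WHERE ...] is supported")
    columns = [c.strip() for c in match.group("cols").split(",")]
    # The table name becomes the node label; columns become node properties.
    cypher = f"MATCH ({alias}:{match.group('table')})"
    if match.group("col"):
        cypher += (f" WHERE {alias}.{match.group('col')} "
                   f"{match.group('op')} {match.group('val')}")
    cypher += " RETURN " + ", ".join(f"{alias}.{c}" for c in columns)
    return cypher


print(toy_sql_to_cypher("SELECT Name FROM actor WHERE age > 20"))
# MATCH (n:actor) WHERE n.age > 20 RETURN n.Name
```

Joins, aggregation, and nested queries are exactly where a regular expression stops being enough and a real parser is needed; the sketch raises ValueError for anything outside its subset.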
- Run the interface:
  cd application
  python interface --web_ui
cd semantic_parser
bash experiment-text2cypher.sh
If you find the resources in this repository helpful, please cite:
@article{zhao2023rel2graph,
title={Rel2Graph: Automated Mapping From Relational Databases to a Unified Property Knowledge Graph},
author={Zhao, Ziyu and Liu, Wei and French, Tim and Stewart, Michael},
journal={arXiv preprint arXiv:2310.01080},
year={2023}
}
@inproceedings{zhao2023cyspider,
title={CySpider: A Neural Semantic Parsing Corpus with Baseline Models for Property Graphs},
author={Zhao, Ziyu and Liu, Wei and French, Tim and Stewart, Michael},
booktitle={Australasian Joint Conference on Artificial Intelligence},
pages={120--132},
year={2023},
organization={Springer}
}
The web interface is adapted from UNSW SQL2Cypher: https://github.com/UNSW-database/SQL2Cypher.