Skip to content

Automated Data and Query Migration from RDBMS to GDBMS

License

Notifications You must be signed in to change notification settings

UNSW-database/SQL2Cypher

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

59 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

License License passing

Reference

The sql2cypher project is based on 'SQL2Cypher: Automated Data and Query Migration from RDBMS to GDBMS' paper.

Autho list: Shunyang Li, Zhengyi Yang, Xianhang Zhang, Xuemin Lin and Wenjie Zhang.

Corresponding author contact: sli@cse.unsw.edu.au

Attention

Before convert sql to cypher make sure the programming has permission to write files. You can execute the following code to get enough permission for python3:

sudo chmod 777 /var/lib/neo4j/import

Configuration and usage

Usage:

python3 sql2cypher.py --help

Config

Before execute the code, please make sure done the config:

cd conf
vim db.ini

Please complete the config then it works.

Convert SQL to Cypher

There are some examples which come from neo4j

SELECT p.*
FROM products as p;

->

MATCH (p:Product)
RETURN p;

Some more complex examples:

SELECT p.ProductName, p.UnitPrice
FROM products as p
ORDER BY p.UnitPrice DESC
LIMIT 10;

->

MATCH (p:Product)
RETURN p.productName, p.unitPrice
ORDER BY p.unitPrice DESC
LIMIT 10;

In cypher, when we create node in cypher we need node name and label name like CREATE (p:Product). So when execute the sql query, it should like this format: SELECT * FROM Product as p (we need to known the label name).

How to use it

python3 sql2cypher.py -s < sql.txt
# or you can type sql by yourself
python3 sql2cypher.py

In sql.txt you can find more examples. For test whether the syntax correct, you can import the data to MySQL and NEO4J. You can read how to import

Required packages

  1. sqlparse

    Sample example:

    import sqlparse
    sql = 'select * from mytable where id = 1'
    res = sqlparse.parse(sql)
    print(res[0].tokens)
    
    """
    [<DML 'select' at 0x7FA0A944CE20>, <Whitespace ' ' at 0x7FA0A944CE80>, <Wildcard '*' at 0x7FA0A944CEE0>, <Whitespace ' ' at 0x7FA0A944CF40>, <Keyword 'from' at 0x7FA0A944CFA0>, <Whitespace ' ' at 0x7FA0A9454040>, <Identifier 'somesc...' at 0x7FA0A9445CF0>, <Whitespace ' ' at 0x7FA0A94541C0>, <Where 'where ...' at 0x7FA0A9445C80>]
    """
  2. cypher This is the cypher tutorial.

  3. moz-sql-parser It can parser the SQL into a JSON format

    '{"select": [{"value": "p.ProductName"}, {"value": "p.UnitPrice"}], "from": {"value": "products", "name": "p"}, "orderby": {"value": "p.UnitPrice", "sort": "desc"}, "limit": 10}'
  4. py2neo similar with neomodel

How to add relation for tables

Firstly, you need to choose 2. Convert the whole database to cypher and then it will display all the tables which in your database. Then you need to add relation like this format: Table->Table: Relation | Table<->Table: Relation (eg: A->B: Working). -> means one to one relation from a to b, <-> a has relation to b and b has relation to a.

Get relation between tables

SELECT `TABLE_NAME`,  `REFERENCED_TABLE_NAME`, `REFERENCED_COLUMN_NAME`
FROM `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE` 
WHERE `TABLE_SCHEMA` = SCHEMA() 
    AND `REFERENCED_TABLE_NAME` IS NOT NULL;

Show table schema:

SELECT COLUMN_NAME FROM   INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'table_name';

It was stored as dict format and the name of table is key, to is the relation table, on is the foreign key, relation is the relationship between two tables. If the table does not have any relation the value will be None. And it looks like:

[
   {
      "departments":{
         "to":"dept_manager",
         "on":"dept_no",
         "relation":"departments_dept_manager"
      
      }
  },
  {
      "employee": None
  }
]

About

Automated Data and Query Migration from RDBMS to GDBMS

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published