Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for non-numeric column type parameters #171

Closed
blazewicz opened this issue Nov 12, 2022 · 3 comments
Closed

Support for non-numeric column type parameters #171

blazewicz opened this issue Nov 12, 2022 · 3 comments
Assignees

Comments

@blazewicz
Copy link

Parser fails on parametrized column type definitions using non-numeric parameters, type parameter is always assumed to be a size associated with the type and parser expects it to be an int.

PostGIS, a pupular PostgreSQL extension for spatial data adds some extra types, for example Geometry, which are parametrized with a subtype (docs).

Example table definition:

CREATE TABLE t1 (
    p Geometry(MultiPolygon, 26918)
);

Tool fails with following traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../site-packages/simple_ddl_parser/parser.py", line 345, in run
    self.tables = self.parse_data()
  File ".../site-packages/simple_ddl_parser/parser.py", line 257, in parse_data
    self.process_line(num != len(lines) - 1)
  File ".../site-packages/simple_ddl_parser/parser.py", line 287, in process_line
    self.process_statement()
  File ".../site-packages/simple_ddl_parser/parser.py", line 292, in process_statement
    self.parse_statement()
  File ".../site-packages/simple_ddl_parser/parser.py", line 300, in parse_statement
    _parse_result = yacc.parse(self.statement)
  File ".../site-packages/ply/yacc.py", line 333, in parse
    return self.parseopt_notrack(input, lexer, debug, tracking, tokenfunc)
  File ".../site-packages/ply/yacc.py", line 1120, in parseopt_notrack
    p.callable(pslice)
  File ".../site-packages/simple_ddl_parser/dialects/sql.py", line 283, in p_column
    self.set_column_size(p_list, p)
  File ".../site-packages/simple_ddl_parser/dialects/sql.py", line 291, in set_column_size
    p[0]["size"] = self.get_size(p_list)
  File ".../site-packages/simple_ddl_parser/dialects/sql.py", line 249, in get_size
    value_0 = int(p_list[-3])
ValueError: invalid literal for int() with base 10: 'MultiPolygon'

Describe the solution you'd like

The assumption about size being the only parameter should be replaced with a list of types for which this is known to be true. In all cases the parser could return an additional field with a list of the parameters associated with the type. This shouldn't be a breaking change.

Example output:

        "tables": [
            {
                "columns": [
                    {
                        "name": "p",
                        "type": "geometry",
                        "type_params": ["MultiPolygon", 26918],
                        "size": None,
                        "references": None,
                        "unique": False,
                        "nullable": True,
                        "default": None,
                        "check": None,
                        "on_update": None,
                    }
                ],
                ...
            }
        ],
@blazewicz
Copy link
Author

Here's a sample test case:

from simple_ddl_parser import DDLParser


def test_postgis_geometry():
    ddl = """CREATE TABLE t1 (
        p Geometry (MultiPolygon, 26918)
    );"""
    result = DDLParser(ddl).run(group_by_type=True)
    expected = {
        "tables": [
            {
                "columns": [
                    {
                        "name": "p",
                        "type": "Geometry",
                        "type_params": ["MultiPolygon", 26918],
                        "size": None,
                        "references": None,
                        "unique": False,
                        "nullable": True,
                        "default": None,
                        "check": None,
                    },
                ],
                "primary_key": [],
                "alter": {},
                "checks": [],
                "index": [],
                "partitioned_by": [],
                "tablespace": None,
                "schema": None,
                "table_name": "t1",
            }
        ],
        "types": [],
        "sequences": [],
        "domains": [],
        "schemas": [],
        "ddl_properties": [],
    }
    assert expected == result

and the simplest POC that would make it work:

diff --git a/simple_ddl_parser/dialects/sql.py b/simple_ddl_parser/dialects/sql.py
index 9166bd8..017c8a1 100644
--- a/simple_ddl_parser/dialects/sql.py
+++ b/simple_ddl_parser/dialects/sql.py
@@ -288,7 +288,14 @@ class Column:
             and bool(re.match(r"[0-9]+", p_list[-1]))
             or p_list[-1] == "max"
         ):
-            p[0]["size"] = self.get_size(p_list)
+            try:
+                p[0]["size"] = self.get_size(p_list)
+            except ValueError:
+                p[0]["type_params"] = [
+                    int(param) if param.isnumeric() else param
+                    for param in p_list[2:]
+                    if param != ","
+                ]

     @staticmethod
     def set_property(p: List) -> List:

@xnuinside
Copy link
Owner

@blazewicz you can just open PR with same changes - feel free ) just check that all unit tests are passed & add new test case - I will merge it

@xnuinside xnuinside self-assigned this Nov 20, 2022
@xnuinside
Copy link
Owner

@blazewicz hi, I released version 0.29.0 (https://pypi.org/project/simple-ddl-parser/) with support non-numeric column type parameters https://github.com/xnuinside/simple-ddl-parser/blob/main/tests/test_simple_ddl_parser.py#L3139 - here is a test case for your query. If will be needed anything else - feel free to open new issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants