Redshift serverless ingestion fails with unexpected token in v0.13.3rc1 #11109

Closed
AndreasHegerNuritas opened this issue Aug 7, 2024 · 0 comments · Fixed by #11111
Labels
bug Bug report

Comments

@AndreasHegerNuritas
Contributor

**Describe the bug**

I am using a Docker quickstart deployment of version 0.13.3rc1 and ingesting from AWS Redshift Serverless.

The error is:

```
...
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/redshift/redshift.py", line 448, in get_workunits_internal
    yield from self.extract_lineage_v2(
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/redshift/redshift.py", line 1017, in extract_lineage_v2
    lineage_extractor.build(
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/redshift/lineage_v2.py", line 126, in build
    table_renames, _ = self._lineage_v1._process_table_renames(
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/redshift/lineage.py", line 851, in _process_table_renames
    schema, prev_name, new_name = parse_alter_table_rename(
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/datahub/ingestion/source/redshift/lineage.py", line 131, in parse_alter_table_rename
    parsed_query = sqlglot.parse_one(query, dialect="redshift")
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/sqlglot/__init__.py", line 139, in parse_one
    result = dialect.parse(sql, **opts)
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/sqlglot/dialects/dialect.py", line 512, in parse
    return self.parser(**opts).parse(self.tokenize(sql), sql)
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/sqlglot/parser.py", line 1245, in parse
    return self._parse(
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/sqlglot/parser.py", line 1317, in _parse
    self.raise_error("Invalid expression / Unexpected token")
  File "/tmp/datahub/ingest/venv-redshift-03575587e416950c/lib/python3.10/site-packages/sqlglot/parser.py", line 1358, in raise_error
    raise error
sqlglot.errors.ParseError: Invalid expression / Unexpected token. Line 1, Col: 151.
  _name": "nuritas", "target_name": "prod", "node_id": "model.warehouse.stg_import_monday__boards"} */\nalter table "analytics"."prod_staging"."stg_import_monday__boards" rename to "stg_import_monday__bo
```

**To Reproduce**
Steps to reproduce the behavior:
1. Go to 'Ingestion' and set up a Redshift Serverless instance. Note that you will need to do this with YAML, as the `is_serverless` parameter is not exposed in the UI.
2. See error

**Expected behavior**
A successful ingestion.

**Desktop (please complete the following information):**
 - OS: Linux
 - Browser: Firefox
 - Version: NA

The solution is to add the following at `datahub/ingestion/source/redshift/redshift_schema.py:491`:

```
query_text=row[field_names.index("query_text")].replace(r"\n", "\n")
```
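
For context, here is a minimal sketch of what that replacement does. It assumes the query text arrives with the two-character sequence `\n` (backslash followed by `n`) instead of a real newline, as the `*/\nalter table` fragment in the error message suggests. The helper name `unescape_newlines` and the sample row are hypothetical and not part of DataHub:

```python
def unescape_newlines(query_text: str) -> str:
    """Replace each literal backslash-n pair with a real newline character."""
    return query_text.replace(r"\n", "\n")


# Hypothetical row, shaped like the query in the error message above:
# a dbt-style comment followed by a literal "\n" and an ALTER TABLE statement.
field_names = ["query_text"]
row = [
    '/* {"node_id": "model.example"} */\\n'
    'alter table "db"."schema"."old_name" rename to "new_name"'
]

raw = row[field_names.index("query_text")]
fixed = unescape_newlines(raw)

print(repr(raw))    # the backslash-n pair is still two characters here
print(repr(fixed))  # the pair has become a single newline character
```

With the real newline in place, the comment and the `alter table` statement sit on separate lines, which is presumably what the parser needs in order to get past the token at column 151.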

I will submit a PR.
@AndreasHegerNuritas AndreasHegerNuritas added the bug Bug report label Aug 7, 2024
@AndreasHegerNuritas AndreasHegerNuritas changed the title A short description of the bug Redshift serverless ingestion fails with unexpected token in v0.13.3rc1 Aug 7, 2024