Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL table incremental escape $ in column name for cursor_path #2061

Closed
mikevanloo opened this issue Nov 13, 2024 · 2 comments · Fixed by #2077
Closed

SQL table incremental escape $ in column name for cursor_path #2061

mikevanloo opened this issue Nov 13, 2024 · 2 comments · Fixed by #2077
Assignees
Labels
bug Something isn't working

Comments

@mikevanloo
Copy link

dlt version

1.3.0

Describe the problem

Slack feed:
https://dlthub-community.slack.com/archives/C04DQA7JJN6/p1731449028965839

Performing incremental ingest of SQL Server table with cursor_path column set to __$start_lsn. When trying to use this column the following error occurs due to parse():

cursor_path="__$start_lsn"

Error:
JsonPathParserError: Parse error at 1:2 near token $ ($)

Attempted to escape the $ by applying single quotes around the column name:

cursor_path="'__$start_lsn'"

This results in the column not being able to be found:

ResourceExtractionError: In processing pipe dbo_t_eventData_CT: extraction of resource dbo_t_eventData_CT in generator table_rows caused an exception: "Cursor column ''__$start_lsn'' does not exist in table 'dbo_t_eventData_CT'"

Expected behavior

Able to escape $ for cursor_path within incremental pipeline for sql.

Steps to reproduce

source = sql_table(
credentials=connection,
schema="cdc",
table="table_a",
incremental=dlt.sources.incremental(
cursor_path="__$start_lsn"
)
)

Operating system

Linux

Runtime environment

Virtual Machine

Python version

3.11

dlt data source

SQL Server

dlt destination

Filesystem & buckets

Other deployment details

No response

Additional information

https://github.com/dlt-hub/dlt/blob/devel/dlt/sources/sql_database/helpers.py#L63-L69

Was able to get it to work by adding .replace("'", "") just to check that this is the source of the bug and the pipline worked once the ' were rermoved.


            try:
                self.cursor_column = table.c[incremental.cursor_path.replace("'","")]
            except KeyError as e:
                raise KeyError(
                    f"Cursor column '{incremental.cursor_path}' does not exist in table aaaaaaaaa"
                    f" '{table.name}'"
                ) from e
@burnash
Copy link
Collaborator

burnash commented Nov 18, 2024

Thanks for reporting this @mikevanloo, I can confirm this is a bug on our end as https://github.com/h2non/jsonpath-ng handles escaping correctly when wrapping the path in quotes. Also it looks like this bug is specific to sql_database source. Will follow-up with a fix.

@mikevanloo
Copy link
Author

@burnash Great thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants