Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-638845: ValueError: unsupported format character with version 1.4.0 #327

Closed
jonashaag opened this issue Jul 28, 2022 · 15 comments · Fixed by #355
Closed

SNOW-638845: ValueError: unsupported format character with version 1.4.0 #327

jonashaag opened this issue Jul 28, 2022 · 15 comments · Fixed by #355
Labels
bug Something isn't working needs triage

Comments

@jonashaag
Copy link

Environment:

arrow-cpp                             7.0.0  py310hc7d0c5d_8_cpu  ...     Cached
...
python                               3.10.5  h582c2e5_0_cpython   ...     Cached
...
snowflake-connector-python            2.7.9  py310hbc1ed79_0      ...     Cached
snowflake-sqlalchemy                  1.4.0  pyhd8ed1ab_0         ...     Cached
...
sqlalchemy                           1.4.39  py310h5764c6d_0      ...     Cached

Running some unit tests, excerpt from the logs:

2022-07-28T10:41:44.0418249Z         if self._connection.is_pyformat:
2022-07-28T10:41:44.0418388Z             # pyformat/format paramstyle
2022-07-28T10:41:44.0418503Z             # client side binding
2022-07-28T10:41:44.0418726Z             processed_params = self._connection._process_params_pyformat(params, self)
2022-07-28T10:41:44.0419125Z             # SNOW-513061 collect telemetry for empty sequence usage before we make the breaking change announcement
2022-07-28T10:41:44.0419295Z             if params is not None and len(params) == 0:
2022-07-28T10:41:44.0419604Z                 self._log_telemetry_job_data(
2022-07-28T10:41:44.0419918Z                     TelemetryField.EMPTY_SEQ_INTERPOLATION,
2022-07-28T10:41:44.0420130Z                     TelemetryData.TRUE
2022-07-28T10:41:44.0420454Z                     if self.connection._interpolate_empty_sequences
2022-07-28T10:41:44.0420715Z                     else TelemetryData.FALSE,
2022-07-28T10:41:44.0420905Z                 )
2022-07-28T10:41:44.0421279Z             if logger.getEffectiveLevel() <= logging.DEBUG:
2022-07-28T10:41:44.0421455Z                 logger.debug(
2022-07-28T10:41:44.0421668Z                     f"binding: [***self._format_query_for_log(command)***] "
2022-07-28T10:41:44.0421818Z                     f"with input=[***params***], "
2022-07-28T10:41:44.0421992Z                     f"processed=[***processed_params***]",
2022-07-28T10:41:44.0422086Z                 )
2022-07-28T10:41:44.0422189Z             if (
2022-07-28T10:41:44.0422369Z                 self.connection._interpolate_empty_sequences
2022-07-28T10:41:44.0422520Z                 and processed_params is not None
2022-07-28T10:41:44.0422622Z             ) or (
2022-07-28T10:41:44.0422807Z                 not self.connection._interpolate_empty_sequences
2022-07-28T10:41:44.0422954Z                 and len(processed_params) > 0
2022-07-28T10:41:44.0423054Z             ):
2022-07-28T10:41:44.0423205Z >               query = command % processed_params
2022-07-28T10:41:44.0423664Z E               ValueError: unsupported format character '_' (0x5f) at index 125

Not happening with < 1.4.0.

@jonashaag jonashaag added bug Something isn't working needs triage labels Jul 28, 2022
@sfc-gh-mkeller
Copy link
Collaborator

recreate jira

@github-actions github-actions bot changed the title ValueError: unsupported format character with version 1.4.0 SNOW-638845: ValueError: unsupported format character with version 1.4.0 Aug 2, 2022
@sfc-gh-aling
Copy link
Collaborator

hey @jonashaag , do you mind sharing unit tests with us so that we could investigate and fix. In 1.4.0 we were doing trying to address the double percent escape by doing string interpolation which is not done in versions < 1.4.0

@jonashaag
Copy link
Author

Unfortunately I don't have a reproducing examle that I can share

@sfc-gh-aling
Copy link
Collaborator

from the error info ValueError: unsupported format character '_' (0x5f) at index 125, empty string interpolation like "%_somestr" % {} would trigger the error as compared to the behavior in < 1.4.0 which would not do empty string interpolation. But that leaves the question why the percent sign in "%_somestr" is not escaped at the beginning.

I feel this could be related to special naming like percent sign in table/col names. @jonashaag , could you share the exception stack trace so that we can track how the command is generated.

@jonashaag
Copy link
Author

/home/runner/.conda/envs/X/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1365: in execute
    return self._exec_driver_sql(
/home/runner/.conda/envs/X/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1669: in _exec_driver_sql
    ret = self._execute_context(
/home/runner/.conda/envs/X/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1943: in _execute_context
    self._handle_dbapi_exception(
/home/runner/.conda/envs/X/lib/python3.10/site-packages/sqlalchemy/engine/base.py:2128: in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
/home/runner/.conda/envs/X/lib/python3.10/site-packages/sqlalchemy/util/compat.py:208: in raise_
    raise exception
/home/runner/.conda/envs/X/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1900: in _execute_context
    self.dialect.do_execute(
/home/runner/.conda/envs/X/lib/python3.10/site-packages/sqlalchemy/engine/default.py:732: in do_execute
    cursor.execute(statement, parameters)
/home/runner/.conda/envs/X/lib/python3.10/site-packages/snowflake/connector/cursor.py:681: in execute
    query = command % processed_params
E   ValueError: unsupported format character '_' (0x5f) at index 125

@sfc-gh-aling
Copy link
Collaborator

thanks @jonashaag , I have tried multiple test cases but still failed to reproduce, I will keep trying.

in the meanwhile, if you happen to know which exact case is failing in your unit tests, please share it with me. thanks!

@aniket-jain9
Copy link

aniket-jain9 commented Oct 19, 2022

hi @sfc-gh-aling , I'm also facing the same issue:
ValueError: unsupported format character 'T' (0x54) at index 838
this is beacuse:
when an sql query is passed from python to snowflake to fetch the result in pandas dataframe, it throws error if sql query contains like '%TU%' in my case.

for eg:
I'm writing a sample code:

url = URL(
user = sfuser,
password = sfpassword,
account = sfaccount,
warehouse = sfwarehouse,
database = sfdatabase,
role = sfrole
)
engine = create_engine(url)
connection = engine.connect()

def fetch_data(sql_query):
df_out = pd.read_sql(sql_query, connection)
return df_out

q1 = r"""SELECT id as loan, lead_id, sanctioned_date where lead_id not like '%TU%' and
lead_id <> '' and lead_id is not null
and sanctioned_date between dateadd('day', -7 , current_date()) and current_date;"""
df = fetch_data(q1)

you can create your own database and check.

This query works fine in snowflake and returns the desired output but while fetching from python, it throws this error.
This seems because of compatibility issues with pyarrow.

PS: I'm not getting this error if i remove (lead_id not like '%TU%') and I'm using snowflake-sqlalchemy 1.4.43 (most updated version)

Thanks
Aniket Jain

screenshots below for your quick reference:
image

@sfc-gh-aling
Copy link
Collaborator

thanks for sharing me with your test case @aniket-jain9 ! I will dig into it and let you guys know my findings.

@sfc-gh-aling
Copy link
Collaborator

thanks for your patience, here're some updates from my side:

I believe the issue is that in some cases (e.g., the pandas sample), the operation to unset the flag connection. _interpolate_empty_sequences in post_exc was not executed.

From what I have observed with the pandas sample, pd.read_sql is composed of two separate sqls:

First, it would send a query to check whether this is a table using the exactly query SELECT id as loan, lead_id, sanctioned_date where lead_id not like '%TU%' ... which will raise an error if table does not exist leading the sqlalchemy post_exc not being executed.

The sql will be compiled by sqlalchemy first, so that in the pre_exc, we will set the connection. _interpolate_empty_sequences to true.

Since post_exc is not executed, the flag remains true. This will introduces the problem if the following sqls are not compile -- because percents are not escaped, however, we're still trying to interpolate empty sequences.

I have already make a PR for the fix to rely on the compiled status of sql instead of post_exc to set the flag: #355

@jtilly
Copy link

jtilly commented Nov 4, 2022

I have another example for this issue: I could pin down a specific version of sqlalchemy (not snowflake-sqlalchemy) that introduced this.

Take this file and call it t.py:

import os

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(
    "snowflake://{user}:{password}@{account}/{database}?warehouse={warehouse}&role={role}".format(
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        database=os.environ["SNOWFLAKE_DATABASE"],
        warehouse=os.environ["SNOWFLAKE_WAREHOUSE"],
        role=os.environ["SNOWFLAKE_ROLE"],
    )
)
print(pd.read_sql("SELECT 1600 % 400 AS a, 1599 % 400 as b;", engine))

And then create two environments with different versions of sqlalchemy:

mamba create -n sqlalchemy1423 snowflake-sqlalchemy pandas 'sqlalchemy=1.4.23' -y
mamba create -n sqlalchemy1424 snowflake-sqlalchemy pandas 'sqlalchemy=1.4.24' -y

For sqlalchemy=1.4.23, I get:

→ conda run -n sqlalchemy1423 python t.py
   a    b
0  0  399

For sqlalchemy=1.4.24, I get:

→ conda run -n sqlalchemy1424 python t.py
Traceback (most recent call last):
  File "/tmp/t.py", line 16, in <module>
    print(pd.read_sql("SELECT 1600 % 400 AS a, 1599 % 400 as b;", engine))
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/pandas/io/sql.py", line 590, in read_sql
    return pandas_sql.read_query(
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/pandas/io/sql.py", line 1560, in read_query
    result = self.execute(*args)
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/pandas/io/sql.py", line 1405, in execute
    return self.connectable.execution_options().execute(*args, **kwargs)
  File "<string>", line 2, in execute
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/util/deprecations.py", line 390, in warned
    return fn(*args, **kwargs)
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 3136, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1271, in execute
    return self._exec_driver_sql(
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1575, in _exec_driver_sql
    ret = self._execute_context(
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1842, in _execute_context
    self._handle_dbapi_exception(
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2027, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1799, in _execute_context
    self.dialect.do_execute(
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 717, in do_execute
    cursor.execute(statement, parameters)
  File "/opt/conda/envs/sqlalchemy1424/lib/python3.9/site-packages/snowflake/connector/cursor.py", line 680, in execute
    query = command % processed_params
ValueError: unsupported format character ' ' (0x20) at index 17

ERROR conda.cli.main_run:execute(49): `conda run python t.py` failed. (See above for error)

For 1.4.24, this can be fixed by using %% instead of %. That will then break 1.4.23.

Maybe this helps you to track down the origin of this? Not sure if this is ultimately a snowflake-sqlalchemy issue or a sqlalchemy issue.

@jtilly
Copy link

jtilly commented Nov 7, 2022

Adding to the post above: for recent versions of sqlalchemy, behavior did in fact change with the 1.4.x releases of snowflake-sqlalchemy.

Take the Python snippet from above and create the following environments:

mamba create -y -n snowflake-sqlalchemy134 snowflake-sqlalchemy=1.3.4 sqlalchemy pandas
mamba create -y -n snowflake-sqlalchemy140 snowflake-sqlalchemy=1.4.0 sqlalchemy pandas
mamba create -y -n snowflake-sqlalchemy141 snowflake-sqlalchemy=1.4.1 sqlalchemy pandas

with snowflake-sqlalchemy=1.3.4

→ conda run -n snowflake-sqlalchemy134 python t.py
   a    b
0  0  399

with snowflake-sqlalchemy=1.4.0 (yanked)

 → conda run -n snowflake-sqlalchemy140 python t.py
Traceback (most recent call last):
  File "/tmp/t.py", line 16, in <module>
    print(pd.read_sql("SELECT 1600 % 400 AS a, 1599 % 400 as b;", engine))
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/pandas/io/sql.py", line 590, in read_sql
    return pandas_sql.read_query(
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/pandas/io/sql.py", line 1560, in read_query
    result = self.execute(*args)
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/pandas/io/sql.py", line 1405, in execute
    return self.connectable.execution_options().execute(*args, **kwargs)
  File "<string>", line 2, in execute
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 402, in warned
    return fn(*args, **kwargs)
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3257, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1365, in execute
    return self._exec_driver_sql(
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1669, in _exec_driver_sql
    ret = self._execute_context(
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
    self._handle_dbapi_exception(
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2128, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
    self.dialect.do_execute(
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
  File "/opt/conda/envs/snowflake-sqlalchemy140/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 680, in execute
    query = command % processed_params
ValueError: unsupported format character ' ' (0x20) at index 17

ERROR conda.cli.main_run:execute(49): `conda run python t.py` failed. (See above for error)

with snowflake-sqlalchemy=1.4.1 (and more recent versions)

→ conda run -n snowflake-sqlalchemy141 python t.py
Traceback (most recent call last):
  File "/tmp/t.py", line 16, in <module>
    print(pd.read_sql("SELECT 1600 % 400 AS a, 1599 % 400 as b;", engine))
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/pandas/io/sql.py", line 590, in read_sql
    return pandas_sql.read_query(
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/pandas/io/sql.py", line 1560, in read_query
    result = self.execute(*args)
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/pandas/io/sql.py", line 1405, in execute
    return self.connectable.execution_options().execute(*args, **kwargs)
  File "<string>", line 2, in execute
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 402, in warned
    return fn(*args, **kwargs)
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3257, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1365, in execute
    return self._exec_driver_sql(
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1669, in _exec_driver_sql
    ret = self._execute_context(
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
    self._handle_dbapi_exception(
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2128, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
    self.dialect.do_execute(
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
  File "/opt/conda/envs/snowflake-sqlalchemy141/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 680, in execute
    query = command % processed_params
ValueError: unsupported format character ' ' (0x20) at index 17

ERROR conda.cli.main_run:execute(49): `conda run python t.py` failed. (See above for error)

@sfc-gh-aling
Copy link
Collaborator

@jtilly thanks for reaching out and sharing us with the reproducing scripts.
I've updated our test suites to include yours, ran the fix against your scripts and the fix worked as expected.
We'll do a bug fix release soon.

@sfc-gh-aling
Copy link
Collaborator

hey folks, we have released 1.4.4 including the fix, could you guys take a try on the latest version to see if it fixes the issue for you?

you can upgrade by pip install -U snowflake-sqlalchemy
thanks for your contribution and patience!

@jtilly
Copy link

jtilly commented Nov 25, 2022

1.4.4 fixes the issue for me. Thank you!

@sfc-gh-aling
Copy link
Collaborator

closing the issue as it was fixed in 1.4.4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs triage
Projects
None yet
5 participants