Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sqla): make text clause escaping optional #17641

Merged
merged 1 commit into from
Dec 3, 2021

Conversation

villebro
Copy link
Member

@villebro villebro commented Dec 3, 2021

SUMMARY

A recent PR #17419 enabled global escaping of the colon character in queries, as the majority of SQLAlchemy dialects support escaping the colon character (see #17098 why this is needed). Escaping has been verified to work correctly on Postgres, MySql, MSSQL, Sqlite, BigQuery, Druid, Hive and Redshift, so it appears safe to assume escaping is supported on the majority of dialects. However, PyAthena does not, and adds the escape character to the final query, which breaks any query containing a colon character.

To support literal colons in queries, but fix the regression for Athena, this PR adds a flag allows_escaped_colons to BaseEngineSpec, which defaults to True, and is set to False in PyAthenaSpec. Relevant unit tests are added to the BaseEngineSpec and AthenaEngineSpec. In addition, all athena integration tests are converted to unit tests in pytest format.

BEFORE

Currently adding a virtual table with the query select '123:456' as test_str and then adding a Custom SQL query test_str = '123:456' generates the following query on Athena (notice the backslash before the colon characters):

SELECT "test_str" AS "test_str"
FROM
  (select '123\:456' as test_str) AS "virtual_table"
WHERE ((test_str = '123\:456'))
LIMIT 1000

For comparison, the same virtual table + filter produces the following on Postgres + all other tested databases:

SELECT test_str AS test_str
FROM
  (select '123:456' as test_str) AS virtual_table
WHERE ((test_str = '123:456'))
LIMIT 1000

TESTING INSTRUCTIONS

  1. create a virtual table select '123:456' as test_str referencing an Athena database
  2. Create a chart based on the virtual table and add a filter test_str = '123:456'
  3. Notice how the rendered query has escaped both the subquery and where clause

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov
Copy link

codecov bot commented Dec 3, 2021

Codecov Report

Merging #17641 (829e01e) into master (73e7928) will decrease coverage by 0.18%.
The diff coverage is 88.23%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17641      +/-   ##
==========================================
- Coverage   68.75%   68.56%   -0.19%     
==========================================
  Files        1595     1595              
  Lines       65184    65191       +7     
  Branches     6945     6945              
==========================================
- Hits        44814    44697     -117     
- Misses      18488    18612     +124     
  Partials     1882     1882              
Flag Coverage Δ
hive ?
mysql 81.80% <88.23%> (+<0.01%) ⬆️
postgres 81.80% <88.23%> (-0.01%) ⬇️
presto ?
python 81.89% <88.23%> (-0.42%) ⬇️
sqlite 81.50% <88.23%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/connectors/sqla/models.py 86.85% <77.77%> (-1.58%) ⬇️
superset/db_engine_specs/athena.py 89.65% <100.00%> (+0.36%) ⬆️
superset/db_engine_specs/base.py 88.29% <100.00%> (-0.26%) ⬇️
superset/db_engines/hive.py 0.00% <0.00%> (-85.19%) ⬇️
superset/db_engine_specs/hive.py 70.27% <0.00%> (-16.99%) ⬇️
superset/db_engine_specs/presto.py 83.50% <0.00%> (-6.89%) ⬇️
superset/reports/commands/log_prune.py 85.71% <0.00%> (-3.58%) ⬇️
superset/commands/importers/v1/utils.py 89.13% <0.00%> (-2.18%) ⬇️
superset/views/database/mixins.py 81.03% <0.00%> (-1.73%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 73e7928...829e01e. Read the comment docs.

@villebro villebro merged commit b2ffa26 into apache:master Dec 3, 2021
@villebro villebro deleted the villebro/fix-colon-quote-3 branch December 3, 2021 10:35
jinghua-qa pushed a commit to preset-io/superset that referenced this pull request Dec 3, 2021
@jinghua-qa
Copy link
Member

🏷️ 2021.46

@joaomsan
Copy link

@villebro do you know if this pr is planned to be included in the next releases?, without this pr, filters that use DateTime columns are failing using aws athena as backend.

Thank you.

@villebro
Copy link
Member Author

@joaomsan Yes, absolutely, it will be included in the 1.5 release

yangfei4913438 pushed a commit to choice-form/superset that referenced this pull request Apr 2, 2022
@mistercrunch mistercrunch added 🍒 1.4.2 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.5.0 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels preset:2021.46 preset-io size/L v1.4.2 🍒 1.4.2 🚢 1.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants