Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(Trino): create PrestoBaseEngineSpec base class to share common code between Trino and Presto #21066

Merged
merged 9 commits into from
Aug 29, 2022

Conversation

dungdm93
Copy link
Contributor

@dungdm93 dungdm93 commented Aug 12, 2022

SUMMARY

#20152 Making the TrinoEngineSpec derive from the PrestoEngineSpec to reusing code. However, underlay driver of Trino is different from Presto - PyHive which lead to some issues. (See discussion in #20729)

This PR is rework on that PR by create PrestoBaseEngineSpec to share common functions between Presto variants.

Fix:

Suppress:

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

Unit tests

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@dungdm93 dungdm93 changed the title Base presto and trino fix(Trino): create PrestoBaseEngineSpec base class to share common code between Trino and Presto Aug 12, 2022
@dungdm93
Copy link
Contributor Author

cc @villebro, @john-bodley
Could you guys help me listing issues, or features in Presto u like to have in Trino.

@dungdm93 dungdm93 force-pushed the base-presto-and-trino branch 4 times, most recently from 4b86bf3 to 38d6438 Compare August 12, 2022 07:29
@codecov
Copy link

codecov bot commented Aug 12, 2022

Codecov Report

Merging #21066 (dedcacf) into master (5113b01) will decrease coverage by 0.06%.
The diff coverage is 80.00%.

@@            Coverage Diff             @@
##           master   #21066      +/-   ##
==========================================
- Coverage   66.28%   66.21%   -0.07%     
==========================================
  Files        1770     1770              
  Lines       67522    67574      +52     
  Branches     7177     7177              
==========================================
- Hits        44754    44747       -7     
- Misses      20934    20993      +59     
  Partials     1834     1834              
Flag Coverage Δ
hive 53.14% <31.76%> (-0.03%) ⬇️
mysql 80.97% <80.00%> (+0.01%) ⬆️
postgres ?
presto 53.04% <44.70%> (-0.03%) ⬇️
python 81.34% <80.00%> (-0.16%) ⬇️
sqlite ?
unit 50.71% <38.82%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/db_engine_specs/hive.py 87.10% <ø> (+0.23%) ⬆️
superset/db_engine_specs/trino.py 69.02% <43.47%> (-5.74%) ⬇️
superset/db_engine_specs/presto.py 87.95% <93.54%> (-0.03%) ⬇️
superset/sql_validators/postgres.py 50.00% <0.00%> (-50.00%) ⬇️
superset/sql_validators/__init__.py 80.00% <0.00%> (-20.00%) ⬇️
superset/sqllab/sql_json_executer.py 64.63% <0.00%> (-9.76%) ⬇️
superset/sqllab/execution_context_convertor.py 81.48% <0.00%> (-7.41%) ⬇️
superset/db_engine_specs/postgres.py 90.67% <0.00%> (-5.94%) ⬇️
superset/db_engine_specs/sqlite.py 91.89% <0.00%> (-5.41%) ⬇️
superset/reports/commands/log_prune.py 85.71% <0.00%> (-3.58%) ⬇️
... and 31 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@dungdm93 dungdm93 force-pushed the base-presto-and-trino branch 2 times, most recently from 441ab0a to 6c4da3a Compare August 12, 2022 08:18
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Tested to work as expected on Trino with a few comments, LMKWYT.

from sqlalchemy.orm import Session
from sqlalchemy.sql.expression import ColumnClause, Select

from superset import cache_manager, is_feature_enabled
from superset.common.db_query_status import QueryStatus
from superset.databases.utils import make_url_safe
from superset.db_engine_specs.base import BaseEngineSpec, ColumnTypeMapping
from superset.db_engine_specs.base import ColumnTypeMapping
from superset.db_engine_specs.presto_base import PrestoBaseEngineSpec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could just as well keep PrestoBaseEngineSpec in superset.db_engine_specs.presto, like we do for PostgresBaseEngineSpec, as it kinda keeps the list of modules in db_engine_specs shorter. Not strongly opinionated here, but let's at least keep it consistent; if we do decide to break out the bases, then I'd prefer to do it in a follow-up refactor PR and doing the same for other specs, too.

Comment on lines -92 to -117
@classmethod
def get_table_names(
cls,
database: Database,
inspector: Inspector,
schema: Optional[str],
) -> List[str]:
return BaseEngineSpec.get_table_names(
database=database,
inspector=inspector,
schema=schema,
)

@classmethod
def get_view_names(
cls,
database: Database,
inspector: Inspector,
schema: Optional[str],
) -> List[str]:
return BaseEngineSpec.get_view_names(
database=database,
inspector=inspector,
schema=schema,
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also remove has_implicit_cancel, as it's no longer needed (it returns the same as BaseEngineSpec which isn't overridden in PrestoBaseEngineSpec)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has_implicit_cancel is removed

@dungdm93 dungdm93 requested review from villebro and removed request for betodealmeida, john-bodley and nytai August 24, 2022 11:39
engine = "trino"
engine_aliases = {"trinonative"} # Required for backwards compatibility.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trinonative in setup.py already removed since #20152

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming CI doesn't turn up any goblins

…s between Presto and Trino

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>
Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>
Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>
Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>
Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>
Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>
@villebro villebro merged commit ccb293a into apache:master Aug 29, 2022
@dungdm93 dungdm93 deleted the base-presto-and-trino branch August 29, 2022 07:23
john-bodley pushed a commit to airbnb/superset-fork that referenced this pull request Sep 2, 2022
…code between Trino and Presto (apache#21066)

* chore: create `PrestoBaseEngineSpec` class that share common functions between Presto and Trino

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>

* feat(Trino): support CertificateAuthentication

* chore(Presto): move `get_function_names` to `PrestoBaseEngineSpec`

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>

* chores(Presto): remove `is_readonly_query`

* feat(Trino): implement `extra_table_metadata`

* feat(Trino): specify `User-Agent`

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>

* fix: pylint

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>

* chores(Presto): move `PrestoBaseEngineSpec` to `presto.py`

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>

* fix(Presto): typing annotations

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>

Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>
(cherry picked from commit ccb293a)
@john-bodley
Copy link
Member

john-bodley commented Dec 15, 2022

@dungdm93 and @villebro the issue I see with this approach is that it introduced several regressions (from a feature parity perspective; especially given that a migration path from Presto to Trino is somewhat likely—see here for details), i.e., the trino.latest_partition, trino.latest_sub_partition macros no longer work as this functionality is part of the PrestoEngineSpec rather than the PrestoBaseEngineSpec.

Are there plans to either:

  1. Add this functionality to the PrestoBaseEngineSpec?
  2. Port this functionality to the TrinoEngineSpec?

@villebro
Copy link
Member

@john-bodley the problem is that PyHive and the new Trino connector are diverging at a rapid pace right now, so extending the Trino spec from the old Presto spec is IMO not a good idea (that actually created a few regressions of its own when we did that). I'm fine moving everything that's known to work on both to PrestoBaseEngineSpec (I can chip in with adding tests when I have some spare time).

@john-bodley
Copy link
Member

@villebro understood. I'm 50/50 about whether option 1 or 2 is the preferred solution. Option 1 might be easiest for now (conditional on how coupled the logic is with the DB-API), but long term I sense option 2 is likely preferred.

@rusackas rusackas added the v2.0.2 label Feb 2, 2023
@lilykuang lilykuang added the v2.0 label Feb 23, 2023
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 and removed 🚢 2.1.3 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L 🚢 2.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants