-
Notifications
You must be signed in to change notification settings - Fork 16.4k
[DRAFT] Support trino operator #46836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
|
ea9f3b9 to
21206a6
Compare
potiuk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should not replace the old ways. Maybe you can add new ways and say create "NativeTrinoOperator" to do things differently. There is a good reason why we have the common sql abstraction and we want all databases to have more or less the same behviour and this is the way we want to promote it with our users: this allows them to easily switch between different datbases - mostly by changing connections.
Happy to review it when it is changed to be "addition".
There was a dedicated TrinoOperator, which was deprecated in favor of SQLExecuteQueryOperator (Issue #25259). However, SQLExecuteQueryOperator does not fully support Trino’s client protocol. Trino requires the client to fetch results before finalizing query execution. Without this, Trino cancels the query with a USER_CANCELED error, even when data modifications have been applied. This leads to unreliable data modification handling. I implemented TrinoOperator as a subclass of SQLExecuteQueryOperator since the required change is minimal, it ensures results are fetched. This approach is consistent with how other databases with unique behaviors are handled in Airflow. For example, TeradataOperator and SnowflakeOperator also inherit from SQLExecuteQueryOperator while implementing their own specifics. |
|
I think there is a much simpler solution. The condition here needs to be corrected, to pass the handler, even if the |
|
Yes. I would rather avoid creating specific operators if it can be done using built-in common.sql operators. Note there are many more operators in common.sql that provide much richer functionality and you would have to subclass them alll to have similar usage - also things like GenericTransfer work by jus providing two connection_ids that create hooks and they are not using "specific" operators - so making surethe "generic" interface handles also Trino in the way that is consistent with others is way better approach. Snowflake and Teradata operators should not be there, really I think it's an overlook. |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
A dedicated TrinoOperator that directly handles Trino's client protocol.
Instead of relying on XComs as relevant for different SQL databases & engines, TrinoOperator fetches all results in default, allowing developers to decide how to use them.
Trino client protocol [https://trino.io/docs/current/client/client-protocol.html#client-protocol]
closes: #46808
related: #46808