-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix update-common-sql-api-stubs
pre-commit check
#38915
Conversation
Contex: The main reason for this pre-commit was to detect (and see in PR) when someone modified common.sql interface in a breaking way. It's been implemented experimentally, and I think it's been a failed attempt (in the sense that a) it was broken for a long time b) nobody knew about it. Discussion here: https://lists.apache.org/thread/1by8ko8jrrp1xwxt5781bwn2tokxjodl I think we should find a better way of doing that check and flag if somoene does a potentiall breaking change in common.sql (which should not happen) and also I think we should add a little better policy on bumping all common.sql dependencies to latest minor version - to avoid cases that somoene uses features from latest version but older common.sql is installed - but then I do not really want ot to be again the person who proposes and leads this alone - so maybe one of you would like to take a lead on it? |
One of the problem that if some contributor want to add some feature into the the specific dialect in half cases it add something into the How to prevent breaking changes? In most cases it could be done by make DBApiHook as Interface/Protocol only with bare implementation and move specific implementation into the particular hooks, right now a lot of sits into DBApiHook which might make think that something would work with any of the 24 Hooks but it is far away from the reality. |
Maybe also run provider's DBApiHooks implementations agains non-main version. (New special tests!), when we run tests against min supported and max released version of the provider instead of code from the I'm not sure that we ready for that, but it might help to find this kind of breaking changes. WDYT @potiuk |
Yes. I think we have to be simply careful with those - like in #38715 . The stub was mostly about the .pyi interface changes, but of course behavioural changes are only detectable (currently) by review. cc: @dabla - maybe good to get comments from a recent "fresh" contributor to those.
To be perfectly honest - I think For the context - mostly it was about DML/DDL queries and making sure all the SQL-related operators that are implemented in common.sql work and their logic is not copied in all 24 providers (before making them comon we had several 10s of copies of those operators accross the 24 providers"
But if you think we can make it less vulnerable for coupling problems by minimising implementation of hooks and possibly copying the code to hooks, yes, I think it's a good idea.
Yes I thought about it - but then we have 12 MINOR versions of common.sql and several providers have different >= DIFFERENT version, So it's hard to make a simple and meanigful set of tests there - especially that just "import" is not really enough, we should also run actual tests, and we currently run tests only in main. This is why I proposed >= LAST_MAIN for all providers - then we will be generally forced to fix any incompatibilties in common.sql (which we do anyway BTW). |
The problem here, that we talk about implementation about SQL, and there is no one database which implements all features from the modern SQL Standard (SQL ANSI 92 outdated decades ago). And even if 2 databases implements one part from the standard it might do it differently and even implements partially, because SQL:Standard know as collection of outdated or not applicable in to the real word examples/types/statements. The truth as usual somewhere in between "copy all into the generic" and "keep it all in providers".
I guess, I would like to propose look how it resolved in SQLAlchemy, as a better example. And it resolved on feature on the dialect, if some particular feature doesn't supported into the specific Dialect you get an error, rather than generic implementation. So we might want to get this approach, e.g. assign specific class as class Attribute/Variable and we could check is it supported and how it should support:
class DBApiHook:
...
features: dict[str, Literal["not-implemented", "generic", "implemented"]] # or TypedDict
@final
def _generic_awesome_feature(self, foo=foo, bar=bar):
return self._generic_awesome_feature(foo=foo, bar=bar)
@final
def awesome_feature(self, *, foo, bar) -> SomeAwesomeAnnotation:
if (feature_flag := self.features.get("awesome", "not-implemented")) == "not-implemented":
raise NotImplementedError(f"You {type(self).__name__} not implements/support this awesome feature")
elif feature_flag == "implemented":
return self.provider_awesome_feature(foo=foo, bar=bar)
elif feature_flag != "generic":
raise RuntimeError(f"Unknown option {feature_flag} ")
return self._generic_awesome_feature(foo=foo, bar=bar)
def provider_awesome_feature(self, *, foo, bar) -> SomeAwesomeAnnotation:
raise NotImplementedError("You should really implements this one")
class SomeProviderHook(DBApiHook):
@override # PEP 698
def provider_awesome_feature(self, *, foo, bar):
return foo, foo
In the other hand we have a one copy which might not works correctly because originally it copied from the Postgres or MySQL and some of the stuff won't work correctly into the another or give a hope that it might work.
Small nit 24 Hooks, not providers. Some of them implements multiple hooks within the one provider |
I don't think we disagree here. The attempt so far in the DBAPiHook and common.sql was to make a "common" interface out of "not-that common" DBAPI + SQL - and then make it possible for openlineage integration to make use of those common cases. Initially the assumption was that the "common" part of the API in common.sql relies on the "common" parts of both. Not trying to implement all features and dialects, but implement a subset of simple functionality which should be "same" for both. While we succeed in most cases, we failed in some others (maybe). We should likely identifiy where we failed and either make it common or turn it into the "feature" approach you described. But I am not sure if we should aim to have a complete replacement for all the awesome features here in the DBApiHook - I think it was never a goal to have a "complete DB set of features" there - just very basic ones that should be possible to be common. I have no good solution there how to limit future changes, but I feel that the "awesome fetures" if they are implemented, should be implemented in each hook sepearately, but if we feel that we should add more to the DBApiHook, then - I agree we should have some way you describe. I don't think there is anything to stop us doing what you descrine now - if we want to add somethign "new" to to DBApiHook. |
Maybe we could enforce (dunno if it's possible) that when a Pull Request is created it is always a draft by default, and once review is done and changes/enhancements look okay/accepted to be merged, we first convert the pull request from draft to non-draft and then run those tests would be run on the pull request itself instead of in the main where it's actually to late. Dunno if it's feasible that only revieuwers could be allowed to put a PR out of draft and not the author (unless he's a reviewer ofc). That could be a way to enforce all test to be run before the PR is actually merged? |
It's not about |
(cherry picked from commit 4f169bd)
This pre-commit check doesn't run due to wrong path and seems like it doesn't run for a long time.
I'm not sure that do we need this stub files, since provider drop support of Airflow 2.4 for a long time, and there is no any subclassing resolution required but this more for the future discussion.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.