-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use NOT EXISTS
subquery instead of tuple_not_in_condition
#33527
Conversation
f752ec0
to
5fbe461
Compare
airflow/utils/sqlalchemy.py
Outdated
:meta private: | ||
""" | ||
if settings.engine.dialect.name != "mssql": | ||
warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe better remove this function 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to deprecate it, this has :meta private:
so no user code should depend on it. I feel it’s fine to keep this though, it’s a mirror to tuple_in_condition
and might be useful at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope we also could get rid of tuple_in_condition
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, I would remove deprecation note
a6767ae
to
f403d43
Compare
if num_to_keep <= 0: | ||
return | ||
|
||
from airflow.models.dagrun import DagRun |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t think this actually matters, at this point it’s likely DagRun is imported anyway and the import would be basically free. But either way works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side comment. I've learned recently to use more and more local imports. I think basically that when not really needed, the advice of "use top level imports by default" is less and less relevant for "heavy" imports.
The lazy import PEP is years away and it does not seem that it will be enabled by default - you will have to specify an option when you launch interpreter, and I think it's simply inevitable that people will use more and more local imports and the tooling (static checks IDE support etc.) will start supporting it better.
That's my bet actually that rather than switching to lazy impors we will switch to "local imports for heavy things by default". You can see it all over Airflow's code already.
The only reason why you would like to use top-level imports for "heavy" packages is when you use the same import in multiple methods in the same module - but this also can be solved by simply local importing that partiular module.
I bet some tooling for "detect heavy top-level imports" will soon show up in one of the static check tooling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and of course the problem is with type annotations and TYPE_CHECKING - would also be nice to solve that one as the TYPE_CHECKING is really a hack as opposed to local imports)
Some of the recent sqlalchemy changes are not working with minimum version of sqlalchemy of ours - for example `where` syntax does not allow moe than one clause and we are already passing more in _do_delete_old_records (added in apache#33527). This syntax however was added in SQL Alchemy 1.4.28 and our minimum version was 1.4.27. This change bumps the minimum SQLAlchemy version to 1.4.28 but it also adds a special test job that only runs on Postgres that downgrades the SQLAlchemy to the minimum supported version (retrieved from setup.cfg). This way, we will be able to detect such incompatible changes at the PR time. This is a new flag `--downgrade-sqlalchemy` on test command that works similar to earlier `--upgrade-boto`. We also enable the `--upgrade-boto` and `--downgrade-sqlalchemy` flags to be used for `breeze shell` command - thanks to that we can easily test both flags with `breeze shell` command.
…4293) Some of the recent sqlalchemy changes are not working with minimum version of sqlalchemy of ours - for example `where` syntax does not allow moe than one clause and we are already passing more in _do_delete_old_records (added in #33527). This syntax however was added in SQL Alchemy 1.4.28 and our minimum version was 1.4.27. This change bumps the minimum SQLAlchemy version to 1.4.28 but it also adds a special test job that only runs on Postgres that downgrades the SQLAlchemy to the minimum supported version (retrieved from setup.cfg). This way, we will be able to detect such incompatible changes at the PR time. This is a new flag `--downgrade-sqlalchemy` on test command that works similar to earlier `--upgrade-boto`. We also enable the `--upgrade-boto` and `--downgrade-sqlalchemy` flags to be used for `breeze shell` command - thanks to that we can easily test both flags with `breeze shell` command.
…4293) Some of the recent sqlalchemy changes are not working with minimum version of sqlalchemy of ours - for example `where` syntax does not allow moe than one clause and we are already passing more in _do_delete_old_records (added in #33527). This syntax however was added in SQL Alchemy 1.4.28 and our minimum version was 1.4.27. This change bumps the minimum SQLAlchemy version to 1.4.28 but it also adds a special test job that only runs on Postgres that downgrades the SQLAlchemy to the minimum supported version (retrieved from setup.cfg). This way, we will be able to detect such incompatible changes at the PR time. This is a new flag `--downgrade-sqlalchemy` on test command that works similar to earlier `--upgrade-boto`. We also enable the `--upgrade-boto` and `--downgrade-sqlalchemy` flags to be used for `breeze shell` command - thanks to that we can easily test both flags with `breeze shell` command. (cherry picked from commit efbead9)
Replace tuple-not-in-collection by pure NOT EXISTS statement, which should be supported by all modern RDBMS.
This should improve performance for such of queries in MSSQL, and I do not expect any noticeable changes for other backend
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.