
Improve modules import in AWS provider by moving some of them into a type-checking block #33780

Merged (3 commits, Aug 28, 2023)

Conversation

hussein-awala (Member)



Comment on lines +28 to +29
if TYPE_CHECKING:
from datetime import datetime
Contributor

I don't think there is any benefit to putting datetime from the stdlib under type-checking.

Member Author

Yes, but I will add a pre-commit hook to check whether the imported methods/classes are used at runtime or only for type checking, so if we want to keep it we will need to add an ignore annotation.
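A minimal stdlib-only sketch of the pattern under discussion (the function name is illustrative, not from the PR): with `from __future__ import annotations`, a name imported only under `TYPE_CHECKING` can still appear in annotations, because annotations stay as plain strings at runtime:

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers and IDEs evaluate this block;
    # at runtime TYPE_CHECKING is False and the import is skipped.
    from datetime import datetime


def day_of(ts: datetime) -> str:
    # The annotation is never evaluated at runtime, so `datetime`
    # does not need to exist in this namespace.
    return ts.strftime("%Y-%m-%d")


# With the __future__ import, the annotation is the literal string:
print(day_of.__annotations__["ts"])  # -> datetime
```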

Contributor

I would be very cautious with this pre-commit validation, because it needs to handle a lot of cases. A nice example is pydantic, which uses annotations at runtime; putting them under an if TYPE_CHECKING block might lead to serious problems.

For example, let's use the example from the main page of the pydantic documentation.

This works fine:

from datetime import datetime
from typing import Tuple

from pydantic import BaseModel


class Delivery(BaseModel):
    timestamp: datetime
    dimensions: Tuple[int, int]


m = Delivery(timestamp='2020-01-02T03:04:05Z', dimensions=['10', '20'])

and this one fails at model definition:

from typing import Tuple, TYPE_CHECKING

from pydantic import BaseModel

if TYPE_CHECKING:
    from datetime import datetime


class Delivery(BaseModel):
    timestamp: datetime
    dimensions: Tuple[int, int]
/Users/taragolis/.pyenv/versions/3.9.10/envs/narrative/bin/python /Users/taragolis/Library/Application Support/JetBrains/PyCharm2023.2/scratches/pydantic_sample.py 
Traceback (most recent call last):
  File "/Users/taragolis/Library/Application Support/JetBrains/PyCharm2023.2/scratches/pydantic_sample.py", line 9, in <module>
    class Delivery(BaseModel):
  File "/Users/taragolis/Library/Application Support/JetBrains/PyCharm2023.2/scratches/pydantic_sample.py", line 10, in Delivery
    timestamp: datetime
NameError: name 'datetime' is not defined

Process finished with exit code 1

Contributor
@Taragolis Taragolis, Aug 27, 2023

I would rather create a pre-commit around a deny-list than one that covers everything. if TYPE_CHECKING is a hack, and putting everything under this block might add more pain than it solves, especially in a project with 100k+ lines of code.

Member
@uranusjr uranusjr, Aug 28, 2023

The above only fails because you didn’t provide from __future__ import annotations. You probably meant to illustrate something like this instead:

from __future__ import annotations
from typing import Tuple, TYPE_CHECKING
from pydantic import BaseModel

if TYPE_CHECKING:
    from datetime import datetime

class Delivery(BaseModel):
    timestamp: datetime
    dimensions: Tuple[int, int]

Delivery(timestamp="2023-08-28", dimentions=(2, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    Delivery(timestamp="2023-08-28", dimentions=(2, 3))
  File "pydantic/main.py", line 404, in __init__
    values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
  File "pydantic/main.py", line 1040, in validate_model
    v_, errors_ = field.validate(value, values, loc=field.alias, cls=cls_)
  File "pydantic/fields.py", line 699, in validate
    raise ConfigError(
pydantic.errors.ConfigError: field "timestamp" not yet prepared so type is still a ForwardRef, you might need to call Delivery.update_forward_refs().
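The same failure mode can be reproduced with only the stdlib (a sketch, not from the thread): any consumer that resolves annotations at runtime, as pydantic does, hits the missing name:

```python
from __future__ import annotations

from typing import TYPE_CHECKING, get_type_hints

if TYPE_CHECKING:
    from datetime import datetime


class Delivery:
    timestamp: datetime


# A type checker sees `datetime`, but resolving the annotation at
# runtime evaluates the string "datetime" in module scope, where the
# name was never imported:
resolution_error = None
try:
    get_type_hints(Delivery)
except NameError as exc:
    resolution_error = str(exc)

print(resolution_error)  # -> name 'datetime' is not defined
```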

Comment on lines +33 to +34
if TYPE_CHECKING:
from unittest.mock import MagicMock
Contributor

The same applies here: from unittest import mock also loads MagicMock, as does @mock.patch.object(SageMakerHook, "start_pipeline").

@hussein-awala (Member Author)

I would be very cautious with this pre-commit validation, because it needs to handle a lot of cases. A nice example is pydantic, which uses annotations at runtime; putting them under an if TYPE_CHECKING block might lead to serious problems.

@Taragolis Recently the ruff community implemented flake8-type-checking in ruff, and I would say that it covers all your concerns.

There is a config option to define the models we want to exempt from being moved into type-checking blocks. For example, this is commonly used:

[tool.ruff.flake8-type-checking]
runtime-evaluated-base-classes = ["pydantic.BaseModel", "sqlalchemy.orm.DeclarativeBase"]
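Assuming ruff's documented settings, the rule selection might sit alongside that exemption like this (a sketch, not a snippet from the PR; TCH001/TCH002 cover first- and third-party imports):

```toml
[tool.ruff]
# TCH001: move first-party imports into TYPE_CHECKING blocks
# TCH002: move third-party imports into TYPE_CHECKING blocks
# (TCH003, the stdlib variant, deliberately not selected)
extend-select = ["TCH001", "TCH002"]

[tool.ruff.flake8-type-checking]
runtime-evaluated-base-classes = ["pydantic.BaseModel", "sqlalchemy.orm.DeclarativeBase"]
```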

I will open a PR to add the pre-commit hook soon, and I will mention you to discuss its configuration.

For this PR (and the others), I will re-check whether we have a similar case. There is also an option to keep standard modules in place if moving them blocks merging, but I don't think we need to exclude them.

@Taragolis (Contributor)

Recently the ruff community implemented flake8-type-checking in ruff, and I would say that it covers all your concerns.

I hope it has become better; previously it required a lot of additional time to fix its behaviour.

If we want to enable it in the future, and since the rules are auto-fixable, might it be better to enable TCH001 and TCH002 (list of all rules), run them against the airflow repo, and create a PR, rather than doing it manually?

Personally I don't think we need TCH003 (Move standard library import {} into a type-checking block).

@potiuk (Member) commented Aug 27, 2023

Recently the ruff community implemented flake8-type-checking in ruff, and I would say that it covers all your concerns.

I hope it has become better; previously it required a lot of additional time to fix its behaviour.

If we want to enable it in the future, and since the rules are auto-fixable, might it be better to enable TCH001 and TCH002 (list of all rules), run them against the airflow repo, and create a PR, rather than doing it manually?

Personally I don't think we need TCH003 (Move standard library import {} into a type-checking block).

Yes. Ruff rules are great for that. Yes. Agree we do not need TCH003.

I think for cherry-pickability reasons we should combine both: automatic conversion, plus manually splitting the "core" changes into smaller PRs; see #33755. It makes it really hard for the release manager to cherry-pick a change that has conflicts when you modify a file that was earlier modified by a PR changing 180 files in core. Basically the release manager will either have to resolve the conflict manually, or cherry-pick the refactor that caused it. The former is always prone to errors; the latter is only possible if you have 80 localized PRs rather than 1 huge PR.

Comment on lines +33 to +34
if TYPE_CHECKING:
from unittest.mock import MagicMock
Member

This is honestly somewhat awkward; we could just write mock.MagicMock without any additional imports…

Contributor

I guess the root issue why this import even exists is that PyCharm (and maybe other IDEs) can't correctly infer the type from @mock.patch and @mock.patch.object, so you need to set it explicitly if you want autosuggestion/autocompletion in the IDE.

I found for myself that using fixtures does not require manually annotating mock.patch for the IDE:

import pytest
from unittest import mock


@pytest.fixture
def awesome_fixture():
    with mock.patch("foo.bar.spam.Egg") as m:
        yield m
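A self-contained sketch of the same idea using only unittest.mock (the patched target platform.python_version is chosen purely for illustration): the object mock.patch yields is already a MagicMock, so no separate MagicMock import is needed:

```python
from unittest import mock

import platform


def current_python() -> str:
    # stand-in for production code that calls a dependency we patch in tests
    return platform.python_version()


with mock.patch("platform.python_version", return_value="3.9.10") as m:
    # the yielded object is a mock.MagicMock without any extra import
    assert isinstance(m, mock.MagicMock)
    assert current_python() == "3.9.10"

# outside the context manager the patch is reverted, but the mock
# still records how it was used
m.assert_called_once()
```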

@hussein-awala hussein-awala merged commit 667ab8c into apache:main Aug 28, 2023
42 checks passed