Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update azure-storage-blob version #25426

Merged
merged 1 commit into from
Oct 13, 2022
Merged

Update azure-storage-blob version #25426

merged 1 commit into from
Oct 13, 2022

Conversation

eladkal
Copy link
Contributor

@eladkal eladkal commented Jul 31, 2022

We restricted version in #18443 due to error in static checks #17068 (comment)
Tests pass locally with the updated version


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@uranusjr
Copy link
Member

uranusjr commented Aug 1, 2022

Any reason versions between 12.9 and 12.13 need to be excluded?

@eladkal
Copy link
Contributor Author

eladkal commented Aug 2, 2022

Any reason versions between 12.9 and 12.13 need to be excluded?

I'll check if I can relax the version

@eladkal
Copy link
Contributor Author

eladkal commented Aug 3, 2022

mmm We have

airflow/providers/microsoft/azure/hooks/wasb.py:302: error: Argument 1 to
"write" of "BufferedWriter" has incompatible type "Union[bytes, str]"; expected
"Union[bytes, Union[bytearray, memoryview, array[Any], mmap, _CData]]" 

which seems to be related to Azure/azure-sdk-for-python#24661 and will be fixed only in 12.14.0b1 so we can not relax the version

@eladkal
Copy link
Contributor Author

eladkal commented Aug 3, 2022

As for the second static failure it's kinda strange.
mypy fails with:

airflow/providers/microsoft/azure/hooks/wasb.py:232: error: Argument
"delimiter" to "walk_blobs" of "ContainerClient" has incompatible type
"Optional[str]"; expected "str"  [arg-type]
    ...me_starts_with=prefix, include=include, delimiter=delimiter, **kwargs)

Code line:

delimiter: Optional[str] = '/',

but the types did not change between the versions:
https://github.com/Azure/azure-sdk-for-python/blob/azure-storage-blob_12.8.1/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client.py#L779

https://github.com/Azure/azure-sdk-for-python/blob/azure-storage-blob_12.13.0/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client.py#L780

In 12.8.1 (when mypy pass) it’s the same as in 12.13.0 which mypy fails!

@josh-fell
Copy link
Contributor

As for the second static failure it's kinda strange. mypy fails with:

airflow/providers/microsoft/azure/hooks/wasb.py:232: error: Argument
"delimiter" to "walk_blobs" of "ContainerClient" has incompatible type
"Optional[str]"; expected "str"  [arg-type]
    ...me_starts_with=prefix, include=include, delimiter=delimiter, **kwargs)

Code line:

delimiter: Optional[str] = '/',

but the types did not change between the versions: https://github.com/Azure/azure-sdk-for-python/blob/azure-storage-blob_12.8.1/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client.py#L779

https://github.com/Azure/azure-sdk-for-python/blob/azure-storage-blob_12.13.0/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client.py#L780

In 12.8.1 (when mypy pass) it’s the same as in 12.13.0 which mypy fails!

The delimiter typing should really be delimiter: str = '/', though. I think we've been getting away with this in mypy for a while 🙂

@eladkal
Copy link
Contributor Author

eladkal commented Aug 3, 2022

The delimiter typing should really be delimiter: str = '/', though. I think we've been getting away with this in mypy for a while 🙂

Yes this is what i did as it also required param in the upstream lib but I'm not sure I get why mypy behaves differently on the same function signature and typing? a mystery.

@eladkal
Copy link
Contributor Author

eladkal commented Aug 3, 2022

OK delimiter is solved.
So now :

  1. Test failures are not related to this PR. It will be fixed in Limit Flask to <2.3 in the wake of 2.2 breaking our tests #25511 - I'll rebase after it's merged.
  2. write function typing issue has been fixed upstream we must wait to 12.14.0

after that we should get a green build

@potiuk
Copy link
Member

potiuk commented Aug 3, 2022

Merged . Rebasing this one on top :)

@potiuk
Copy link
Member

potiuk commented Aug 4, 2022

Just one static-check left!

@eladkal
Copy link
Contributor Author

eladkal commented Aug 4, 2022

Just one static-check left!

Yes. This is the one that we must wait to 12.14.0
The issue already fixed upstream but they didn't release it yet. I wrote to them about it:
Azure/azure-sdk-for-python#24661 (comment)

Once 12.14.0 I'll modify the PR to use that version as minimum

@eladkal eladkal marked this pull request as ready for review August 5, 2022 07:13
@eladkal
Copy link
Contributor Author

eladkal commented Aug 5, 2022

Still failing with

Run mypy for providers.................................................................Failed
- hook id: run-mypy
- exit code: 1

airflow/providers/microsoft/azure/hooks/wasb.py:305: error: Incompatible types
in assignment (expression has type "Union[bytes, str]", variable has type
"bytes")  [assignment]
                content: bytes = stream.readall()

Lets see if bb1c545 resolves it

@uranusjr
Copy link
Member

uranusjr commented Aug 5, 2022

I think Mypy is too smart and can tell that open(..., "wb") does not accept str, so that would also fail. This is probably the solution with the least overhead

content = stream.readall()
assert isinstance(content, bytes)
fileblob.write(content)

@eladkal
Copy link
Contributor Author

eladkal commented Aug 6, 2022

So it solved the static issue but created problems for the unit tests.
I think the best option here is just to wait for upstream to release 12.14.0 should be in a month+-

@eladkal eladkal marked this pull request as draft August 6, 2022 08:40
@josh-fell
Copy link
Contributor

So it solved the static issue but created problems for the unit tests.

I think the best option here is just to wait for upstream to release 12.14.0 should be in a month+-

Maybe if the assertion was under TYPE_CHECKING scope the unit test wouldn't fail? Just an idea since the assert statement is to hopefully make mypy happy.

@potiuk
Copy link
Member

potiuk commented Aug 6, 2022

Maybe if the assertion was under TYPE_CHECKING scope the unit test wouldn't fail? Just an idea since the assert statement is to hopefully make mypy happy.

I think not really - the idea is to make it consistent, I think :). It's going to change when the fixed version of liubrary is out, so it will change then and someone might rely on it being bytes.

@uranusjr
Copy link
Member

uranusjr commented Aug 8, 2022

Yeah let’s wait for upstream.

@eladkal
Copy link
Contributor Author

eladkal commented Aug 31, 2022

Testing with 12.14.0b2

Run mypy for providers.................................................................Failed
- hook id: run-mypy
- exit code: 1

airflow/providers/microsoft/azure/hooks/wasb.py:364: error: Argument "offset"
to "download_blob" of "BlobClient" has incompatible type "Optional[int]";
expected "int"  [arg-type]
            return blob_client.download_blob(offset=offset, length=length,...
                                                    ^
airflow/providers/microsoft/azure/hooks/wasb.py:364: error: Argument "length"
to "download_blob" of "BlobClient" has incompatible type "Optional[int]";
expected "int"  [arg-type]
    ...turn blob_client.download_blob(offset=offset, length=length, **kwargs)
                                                            ^
Found 2 errors in 1 file (checked 951 source files)

Looks like Azure/azure-sdk-for-python@8c24354 changed also type hints for another function.

@josh-fell
Copy link
Contributor

Testing with 12.14.0b2

Run mypy for providers.................................................................Failed
- hook id: run-mypy
- exit code: 1

airflow/providers/microsoft/azure/hooks/wasb.py:364: error: Argument "offset"
to "download_blob" of "BlobClient" has incompatible type "Optional[int]";
expected "int"  [arg-type]
            return blob_client.download_blob(offset=offset, length=length,...
                                                    ^
airflow/providers/microsoft/azure/hooks/wasb.py:364: error: Argument "length"
to "download_blob" of "BlobClient" has incompatible type "Optional[int]";
expected "int"  [arg-type]
    ...turn blob_client.download_blob(offset=offset, length=length, **kwargs)
                                                            ^
Found 2 errors in 1 file (checked 951 source files)

Looks like Azure/azure-sdk-for-python@8c24354 changed also type hints for another function.

def download_blob(
            self, offset: int = None,
            length: int = None,
...

^ I'm by no means a mypy expert, but the typing for offset and length are incorrect?

@eladkal
Copy link
Contributor Author

eladkal commented Oct 6, 2022

New release of azure-storage-blob is expected next week:
Azure/azure-sdk-for-python#24661 (comment)
then this PR will be green :)

We restricted version in apache#18443
due to error in static checks apache#17068 (comment)
Tests pass locally with the updated version
@eladkal eladkal marked this pull request as ready for review October 13, 2022 10:12
@eladkal
Copy link
Contributor Author

eladkal commented Oct 13, 2022

azure-storage-blob 12.14.0 was released so this PR should be ready

@eladkal eladkal merged commit 59cba36 into apache:main Oct 13, 2022
@eladkal eladkal deleted the azure branch October 13, 2022 11:20
@malthe
Copy link
Contributor

malthe commented Dec 8, 2022

This change broke the wasb log task handler.

The download method now requires offset and length arguments, but these are not provided from the task handler code. Also, how would it provide anything else than "read the whole file".

@potiuk
Copy link
Member

potiuk commented Dec 8, 2022

This change broke the wasb log task handler.

The download method now requires offset and length arguments, but these are not provided from the task handler code. Also, how would it provide anything else than "read the whole file".

I think you shoudl open an issue (or better PR :) for that.

potiuk referenced this pull request Dec 30, 2022
* Make arguments 'offset' and 'length' not required

* Disable implicit optional for azure-storage (mypy)
@pierrejeambrun pierrejeambrun added this to the Airflow 2.5.1 milestone Jan 9, 2023
@pierrejeambrun pierrejeambrun added the type:misc/internal Changelog: Misc changes that should appear in change log label Jan 9, 2023
@pierrejeambrun pierrejeambrun removed this from the Airflow 2.5.1 milestone Jan 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers provider:microsoft-azure Azure-related issues type:misc/internal Changelog: Misc changes that should appear in change log
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants