[Bug]: GCSIO delete_batch breaks dataflow pipeline in 2.53 #30166
Comments
Thanks for the report and investigation. Yes, this is because Beam 2.53.0 migrated to Google Cloud's official GCS client (before that it used a client generated by apitools, which was long deprecated). CC: @shunping
I checked the code in Beam 2.52 and 2.53. Here is what I found out.
In other words, neither function signature has been modified. I think the way it worked before is not the right way to call
From the API, the correct way to call it is:
blobs = client.list_prefix(self.resultio.path)
client.delete_batch(list(blobs.keys()))
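A minimal sketch of why the conversion to a list matters. It uses a plain dict as a stand-in for the result of list_prefix, assumed here to map object paths to sizes. The paths, sizes, and the helper slice_in_batches are hypothetical illustrations, not Beam code.

```python
def slice_in_batches(paths, batch_size):
    """Split `paths` into batches using index slicing (requires a sequence)."""
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


# Stand-in for a path -> size mapping; the values are made up.
blobs = {"gs://bucket/a": 1, "gs://bucket/b": 2, "gs://bucket/c": 3}

# A list of keys is a sequence, so slicing works.
batches = slice_in_batches(list(blobs.keys()), 2)

# Passing the dict itself fails, because the slice is treated as a key.
try:
    slice_in_batches(blobs, 2)
    dict_error = None
except (TypeError, KeyError) as exc:
    dict_error = exc
```

Both TypeError and KeyError are caught because the exact exception depends on whether the running Python version can hash a slice object.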
That's interesting. But I would imagine feeding
Ok. SGTM
Great, thanks for the fast fix!
What happened?
I am upgrading Beam from version 2.52 to 2.53 but it looks like I am running into an issue introduced in #25676.
When trying to call delete_batch inside beam.Map, something like this:
I am seeing the following error with Beam 2.53.0, while Beam 2.52.0 works as expected:
I am running the pipeline on Dataflow using Python 3.11.
Beam 2.52.0 used to chunk the paths using itertools.islice, which avoids having to explicitly use slicing:
beam/sdks/python/apache_beam/io/gcp/gcsio.py, line 304 in 7c8a997
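For illustration, islice-based chunking could look roughly like the sketch below. This is not the Beam source referenced above. The helper iter_batches and the sample paths are hypothetical, and the batch size is arbitrary.

```python
from itertools import islice


def iter_batches(paths, batch_size):
    """Yield lists of at most `batch_size` items from any iterable."""
    iterator = iter(paths)
    while True:
        # islice consumes the next `batch_size` items without indexing.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a path -> size mapping; iterating a dict yields its keys.
blobs = {"gs://bucket/a": 1, "gs://bucket/b": 2, "gs://bucket/c": 3}
batches = list(iter_batches(blobs, 2))

# Generators work too, since no slicing or len() is needed.
gen_batches = list(iter_batches((p for p in blobs), 2))
```

Because this approach only iterates over its input, it accepts dicts, key views, and generators as well as lists.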
@BjornPrime I guess something similar here would fix the above issue, or am I missing something?
Potentially the current code might already work in Python 3.12 since slices will be hashable, but as far as I know Beam doesn't yet provide Python 3.12 containers.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components