Skip to content

Comments

Return destination GCS URIs from ADLSToGCSOperator#61463

Open
Abhishekmishra2808 wants to merge 4 commits intoapache:mainfrom
Abhishekmishra2808:fix/adls-to-gcs-return-destination-uris
Open

Return destination GCS URIs from ADLSToGCSOperator#61463
Abhishekmishra2808 wants to merge 4 commits intoapache:mainfrom
Abhishekmishra2808:fix/adls-to-gcs-return-destination-uris

Conversation

@Abhishekmishra2808
Copy link
Contributor

Related: #11323

  • Return value: execute() now returns a list[str] of destination GCS URIs (gs://bucket/object) for uploaded files.
  • The return type remains list[str]; only the returned values are updated to reflect the destination in GCS.

@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Feb 4, 2026
Copy link
Contributor

@SameerMesiah97 SameerMesiah97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. But this would benefit from some more polish.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like dest_gcs_bucket, dest_gcs_prefix = _parse_gcs_url(self.dest_gcs) will return the same value regardless of the iteration as self.dest_gcs is constant. Any reason why it can't be moved outside the loop?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest using posixpath.join(dest_gcs_prefix, obj) as the behavior of os.path.join is OS dependent. On windows, it can result in the \\ separator being used during construction, resulting in an invalid path. Now, it won't cause issues in 99% of cases but for the odd contributor running this code locally on their windows machine, os.path.join can definitely cause issues. This is more of a nit though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit for the nit: I assumed you meant PosixPath from Pathlib :)
The odd contributor won't be odd if #10388 gets resolved (not that I think it will happen anytime soon)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit for the nit: I assumed you meant PosixPath from Pathlib :) The odd contributor won't be odd if #10388 gets resolved (not that I think it will happen anytime soon)

Yes. My point is that path construction should be OS agnostic. If the function you mentioned is the acceptable one, then that is the one the author should use. Thank you for calling that out.

Copy link
Contributor

@shahar1 shahar1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please resolve comment made by @SameerMesiah97 (great review!),
and I think that it will be good to go.
I recall that we should run the system tests before merging this PR to ensure that they don't break, please let us know if you're able to do it - otherwise Google team or someone else will have to do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit for the nit: I assumed you meant PosixPath from Pathlib :)
The odd contributor won't be odd if #10388 gets resolved (not that I think it will happen anytime soon)

@Abhishekmishra2808 Abhishekmishra2808 force-pushed the fix/adls-to-gcs-return-destination-uris branch from a6488b4 to 3583457 Compare February 6, 2026 13:28
@Abhishekmishra2808 Abhishekmishra2808 force-pushed the fix/adls-to-gcs-return-destination-uris branch from 3583457 to aef396e Compare February 6, 2026 13:31
@Abhishekmishra2808
Copy link
Contributor Author

@shahar1 @SameerMesiah97 Could you please check now, and tell if something is missing on my end.

Copy link
Contributor

@shahar1 shahar1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!
We need to run the system tests to ensure that nothing breaks (+attach screenshots).

@yuseok89 - if you could help with it, it would be great.

CC: @VladaZakharova @MaksYermak

@yuseok89
Copy link
Contributor

@shahar1
While running the system tests, it looks like this should be using ADLS Gen2 rather than the older Gen1 path.
If I’m mistaken or missing context here, please correct me.

Also, it seems this change may already be reflected in main, so my guess is that resolving the conflict (merge main) and re-running the system tests could be the next step.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers provider:google Google (including GCP) related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants