Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix gcs walker relative key for import #5114

Merged
merged 1 commit into from
Jan 29, 2023
Merged

Conversation

nopcoder
Copy link
Contributor

Fix #5104
Fix #5102
Fix #5113

@nopcoder nopcoder added bug Something isn't working area/block-adapter include-changelog PR description should be included in next release changelog labels Jan 26, 2023
@nopcoder nopcoder requested a review from itaiad200 January 26, 2023 16:24
@nopcoder nopcoder self-assigned this Jan 26, 2023
@@ -23,6 +23,10 @@ func NewGCSWalker(client *storage.Client) *gcsWalker {

func (w *gcsWalker) Walk(ctx context.Context, storageURI *url.URL, op WalkOptions, walkFn func(e ObjectStoreEntry) error) error {
prefix := strings.TrimLeft(storageURI.Path, "/")
var basePath string
if idx := strings.LastIndex(prefix, "/"); idx != -1 {
basePath = prefix[:idx+1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what we want? I asked to import gs://foo/bar and you'll import gs://foo/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it supports the case of a single file and directory

@@ -51,7 +54,7 @@ func (w *gcsWalker) Walk(ctx context.Context, storageURI *url.URL, op WalkOption
}
if err := walkFn(ObjectStoreEntry{
FullKey: attrs.Name,
RelativeKey: strings.TrimPrefix(attrs.Name, prefix),
RelativeKey: strings.TrimPrefix(attrs.Name, basePath),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we okay with this being empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty string does nothing to the original value

@nopcoder nopcoder requested a review from itaiad200 January 26, 2023 16:40
Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get why it works for the bugs we have opened. I'm afraid of 2 things:

  1. Regression in some other flow. It doesn't seem like our ingest/import tests cover the slashes bug well enough.
  2. Inconsistency with other block adapters. Minor, but still.

Approving to unblock, please consider addressing these in followup issues.

@nopcoder
Copy link
Contributor Author

Thanks @itaiad200 - tested from the UI and lakectl

  • The above is based on our S3 walker implementation.
  • Want to add test to cover this change.

@nopcoder nopcoder merged commit 3e8f279 into master Jan 29, 2023
@nopcoder nopcoder deleted the fix/gcs-walker-basedir branch January 29, 2023 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/block-adapter bug Something isn't working include-changelog PR description should be included in next release changelog
Projects
None yet
2 participants