Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Copy object mtime #8291

Merged
merged 2 commits into from
Oct 22, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion pkg/api/controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -352,7 +352,6 @@ func (c *Controller) CompletePresignMultipartUpload(w http.ResponseWriter, r *ht
return
}

writeTime := time.Now()
physicalAddress, addressType := normalizePhysicalAddress(repo.StorageNamespace, body.PhysicalAddress)
if addressType != catalog.AddressTypeRelative {
writeError(w, r, http.StatusBadRequest, "physical address must be relative to the storage namespace")
Expand Down Expand Up @@ -383,6 +382,7 @@ func (c *Controller) CompletePresignMultipartUpload(w http.ResponseWriter, r *ht
return
}

writeTime := time.Now()
checksum := httputil.StripQuotesAndSpaces(mpuResp.ETag)
entryBuilder := catalog.NewDBEntryBuilder().
CommonLevel(false).
Expand Down
6 changes: 5 additions & 1 deletion pkg/catalog/catalog.go
Original file line number Diff line number Diff line change
Expand Up @@ -2715,7 +2715,6 @@ func (c *Catalog) CopyEntry(ctx context.Context, srcRepository, srcRef, srcPath,

// copy data to a new physical address
dstEntry := *srcEntry
dstEntry.CreationDate = time.Now()
dstEntry.Path = destPath
dstEntry.AddressType = AddressTypeRelative
dstEntry.PhysicalAddress = c.PathProvider.NewPath()
Expand All @@ -2741,6 +2740,11 @@ func (c *Catalog) CopyEntry(ctx context.Context, srcRepository, srcRef, srcPath,
return nil, err
}

// Update creation date only after actual copy!!!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Update creation date only after actual copy!!!
// Update creation date only after actual copy to prevent potential discrepancy between lakefs and the underlying blockstore mtimes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is incorrect, it does not prevent any discrepancy

Copy link
Contributor

@yonipeleg33 yonipeleg33 Oct 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, this is kind of what I understood from reading the issue description. Maybe "prevent" isn't accurate, as this fix aims to minimise the time difference rather than making it equal, but maybe I got it all wrong 😄
In any case, my point is to add a why to this comment, not only what.

// The actual file upload can take a while and depend on many factors so we would like
// The mtime (creationDate) in lakeFS to be as close as possible to the mtime in the underlying storage
dstEntry.CreationDate = time.Now()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder - shouldn't we use the blockstore's mtime instead of time.Now? Is this even possible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope for this fix. Your suggestion is a valid question which has implications that we are currently evaluating since it requires another call to the underlying storage


// create entry for the final copy
err = c.CreateEntry(ctx, destRepository, destBranch, dstEntry, opts...)
if err != nil {
Expand Down
Loading