etag, oc-etag, ctag, sync-token and the future #3782

Open
5 tasks
butonic opened this issue May 12, 2022 · 8 comments
Labels
Category:Enhancement Add new functionality

Comments

@butonic
Member

butonic commented May 12, 2022

When a reverse proxy changes the content encoding, e.g. by compressing a plain-text stream with gzip, it also changes the ETag: Bug 63932 - Content compression breaks contract of ETag

While the desktop client prefers the OC-ETag, it has learned to strip -gzip and maybe the W/ prefix from the regular ETag as a fallback: owncloud/client#3946 (comment)
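
The fallback described above could look roughly like the following. This is only a sketch: the function names `normalize_etag` and `effective_etag` are made up for illustration, and it assumes the proxy appends `-gzip` inside the quotes, as in the linked bug report. It is not the desktop client's actual code.

```python
from typing import Mapping, Optional


def normalize_etag(raw: str) -> str:
    """Return the opaque tag without W/ prefix, quotes, or a -gzip suffix."""
    tag = raw.strip()
    # Drop the weak-validator prefix if present.
    if tag.startswith("W/"):
        tag = tag[2:]
    # Drop the surrounding double quotes.
    if len(tag) >= 2 and tag[0] == '"' and tag[-1] == '"':
        tag = tag[1:-1]
    # Drop the suffix a compressing proxy may have appended.
    if tag.endswith("-gzip"):
        tag = tag[: -len("-gzip")]
    return tag


def effective_etag(headers: Mapping[str, str]) -> Optional[str]:
    """Prefer OC-ETag; fall back to the regular ETag header."""
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("oc-etag") or lowered.get("etag")
    return normalize_etag(raw) if raw is not None else None
```

With this, `effective_etag({"ETag": 'W/"abc-gzip"'})` and `effective_etag({"OC-ETag": '"abc"'})` both yield the same opaque value, which is what a sync client needs for change detection.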

However, https://datatracker.ietf.org/doc/html/rfc7232#section-2.1 states:

[...] Likewise, a validator is weak if it is shared by two or more
representations of a given resource at the same time, unless those
representations have identical representation data. For example, if
the origin server sends the same validator for a representation with
a gzip content coding applied as it does for a representation with no
content coding, then that validator is weak. [...]
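
For reference, the weak and strong comparison functions from RFC 7232 section 2.3.2 can be sketched like this. The helper names are hypothetical and the parsing is deliberately naive (it does not validate the quoted-string syntax):

```python
from typing import Tuple


def parse_etag(value: str) -> Tuple[bool, str]:
    """Split an entity-tag into (is_weak, opaque_tag_including_quotes)."""
    value = value.strip()
    weak = value.startswith("W/")
    return weak, value[2:] if weak else value


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags must be strong and identical."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: opaque tags must match, the W/ prefix is ignored."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```

A server that sends the same validator for the gzip and the identity representation would have to mark it with `W/`, so only `weak_match` could succeed for it.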

@janackermann noticed that the owncloud-sdk is not yet prepared for this kind of etag handling. Please link your PR here.
owncloud/owncloud-sdk#1067
owncloud/web#6952

@michaelstingl I wonder if iOS and android handle this somehow.

owncloud

The core issue that led to OC-ETag is owncloud/core#9005, which explains why we are now using our own OC-ETag header.

I'm still not 100% sure we are using etags correctly. AFAICT we should be using a ctag (content tag) to implement change detection in collections, as google recommends: https://developers.google.com/calendar/caldav/v2/guide

CTag and DAV:sync-token

However, the caldav-ctag-03 specification was deprecated in 2015:

IMPORTANT: The feature defined by this specification is now
deprecated in favor of support for the WebDAV Sync REPORT as defined
by RFC6578. Clients MUST NOT rely on this feature to detect
changes to collections, instead they MUST support the WebDAV Sync
REPORT. Servers MUST support the WebDAV Sync REPORT to allow clients
to efficiently synchronize calendar collections. Whilst most modern
clients do support the WebDAV Sync REPORT, servers MAY continue to
support this specification by simply using the DAV:sync-token
property value for the getctag property value, in order to provide
backwards compatibility with old clients.

https://sabre.io/dav/building-a-caldav-client/ shows exactly what a PROPFIND with both cs:getctag and DAV:sync-token looks like, especially what form of URL to expect in the `sync-token`.
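
A request body asking for both properties could be built as follows. This is a minimal sketch using only the Python standard library; `build_propfind_body` is a hypothetical helper, and the `d` and `cs` prefixes are a conventional but arbitrary choice:

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
CS_NS = "http://calendarserver.org/ns/"  # namespace of the getctag property


def build_propfind_body() -> bytes:
    """Build a PROPFIND body requesting cs:getctag and DAV:sync-token."""
    # Register readable prefixes so the output does not use ns0/ns1.
    ET.register_namespace("d", DAV_NS)
    ET.register_namespace("cs", CS_NS)
    propfind = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(propfind, f"{{{DAV_NS}}}prop")
    ET.SubElement(prop, f"{{{CS_NS}}}getctag")
    ET.SubElement(prop, f"{{{DAV_NS}}}sync-token")
    return ET.tostring(propfind, encoding="utf-8", xml_declaration=True)
```

The body would be sent with a `PROPFIND` request and a `Depth: 0` header against the collection; whether ocdav would answer both properties is exactly what this issue is about.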

MS Graph

MS Graph has similar concepts; it just uses JSON:

| Property | Type | Description |
|----------|------|-------------|
| cTag | String | An eTag for the content of the item. This eTag is not changed if only the metadata is changed. Note: This property is not returned if the item is a folder. Read-only. |
| eTag | String | eTag for the entire item (metadata + content). Read-only. |

Note: The eTag and cTag properties work differently on containers (folders). The cTag value is modified when content or metadata of any descendant of the folder is changed. The eTag value is only modified when the folder's properties are changed, except for properties that are derived from descendants (like childCount or lastModifiedDateTime).

But similar to the WebDAV sync, it has a /delta endpoint for token-based sync: https://docs.microsoft.com/en-us/onedrive/developer/rest-api/api/driveitem_delta?view=odsp-graph-online
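
From a client's point of view, token-based sync against such an endpoint could be sketched as below. The `@odata.nextLink` and `@odata.deltaLink` keys are the ones MS Graph uses in delta responses; `sync_delta` and the injected `fetch` callable are hypothetical and stand in for a real authenticated HTTP client:

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Page = Dict[str, Any]


def sync_delta(
    fetch: Callable[[str], Page], start_url: str
) -> Tuple[List[Page], Optional[str]]:
    """Follow nextLink pages and return (changed items, deltaLink for next time)."""
    items: List[Page] = []
    url: Optional[str] = start_url
    while url:
        page = fetch(url)
        items.extend(page.get("value", []))
        if "@odata.nextLink" in page:
            # More pages belong to the current change set.
            url = page["@odata.nextLink"]
        else:
            # Last page: keep the deltaLink as the token for the next sync run.
            return items, page.get("@odata.deltaLink")
    return items, None
```

The client stores the returned delta link and passes it as `start_url` on the next run, so it only receives what changed since the last sync.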

Future

Both protocols indicate that having a dedicated property to detect recursive changes makes sense. IMO we should:

  • introduce a ctag in the CS3 api
  • expose the CS3 ctag as DAV:sync-token in ocdav
  • deprecate OC-ETag in the clients
  • expose the CS3 ctag as ctag in the graph api
  • implement the /delta endpoint
  • not invest time in the sync-collection for ocdav

Related:

cernbox/smashbox#46

@labkode
Contributor

labkode commented May 16, 2022

@micbar
Contributor

micbar commented Jul 13, 2022

after GA issue

@tbsbdr
Contributor

tbsbdr commented Jan 25, 2023

Discussion with @dragotin @micbar @michaelstingl @felix-schwarz
Discussion status:

  • etag: Always change etag for metadata and content changes
  • Checksum: Every client should (no need!) save a content-checksum and compare it for files
  • m-tag: Introduce m-tag (new) for metadata changes on folders (future)

@dragotin @felix-schwarz please continue the discussion

@butonic
Member Author

butonic commented Aug 15, 2023

We will have cTag and eTag on the graph api anyway as described in the MS Graph section above. No need to invent an m-tag. eTag is for the entire item (metadata + content). cTag would be new and would be content only.

@michaelstingl
Contributor

eTag is for the entire item (metadata + content).

Perfect! 😻

cTag would be new and would be content only.

@TheOneRing @felix-schwarz do we have a requirement for propagation of content-only changes? I don't think so…

@TheOneRing
Member

I don't think so.

@felix-schwarz

felix-schwarz commented Aug 16, 2023

Currently, the ETag is used to signal:

  • for files: changes of file contents
  • for folders:
    • changes to folder contents (file added/removed/renamed)
    • changes to file metadata (incl. filename; but excl. other metadata like favorite status) of files contained in the folder

If the ETag of files also changes for metadata changes, old clients take that as a signal that the file contents changed and re-download the file (unnecessarily, in the case of metadata-only changes).

Therefore a CTag is essential for clients to be able to put an ETag-change into context - and to distinguish between a mere metadata change and an actual file content change to determine if a file should be re-downloaded.

Client updates will be needed for them to take advantage of the CTag to avoid unnecessary transfers.
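
How an updated client might put an ETag change into context can be sketched as follows. This assumes the semantics quoted from MS Graph earlier in this thread for files (eTag covers metadata + content, cTag covers content only); the `Tags` and `Change` types and the `classify` function are hypothetical:

```python
from dataclasses import dataclass
from enum import Enum


class Change(Enum):
    NONE = "none"
    METADATA = "metadata"  # refresh metadata, no transfer needed
    CONTENT = "content"    # file has to be downloaded again


@dataclass(frozen=True)
class Tags:
    etag: str  # changes on metadata or content changes
    ctag: str  # changes on content changes only


def classify(old: Tags, new: Tags) -> Change:
    """Decide what kind of change happened to a file between two syncs."""
    if new.ctag != old.ctag:
        return Change.CONTENT
    if new.etag != old.etag:
        return Change.METADATA
    return Change.NONE
```

An old client that only knows the ETag cannot make this distinction and would treat the `METADATA` case like `CONTENT`.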

The idea for a metadata-only MTag came from trying to find a backward-compatible way to signal metadata changes, where existing clients continue to work as expected - and updated clients can take advantage of metadata change propagation via the MTag.

Regarding other WebDAV-based clients or sync solutions, I have no insight about how they interpret an ETag change. But if they interpret an ETag change as "file contents changed", using the ETag to propagate both metadata and content changes might cause unwanted effects there.

@michaelstingl
Contributor

The current file checksum could be used the same way as the cTag. In the oC10 world, however, not all files have a checksum.

7 participants