Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define WebKit's underlying file changes behavior #47

Open
annevk opened this issue Aug 15, 2016 · 14 comments · May be fixed by #154
Open

Define WebKit's underlying file changes behavior #47

annevk opened this issue Aug 15, 2016 · 14 comments · May be fixed by #154

Comments

@annevk
Copy link
Member

annevk commented Aug 15, 2016

See https://bugzilla.mozilla.org/show_bug.cgi?id=660148 for details.

@bakulf
Copy link

bakulf commented May 16, 2018

If the underlying file size changes, we must be sure that we maintain the run-to-completion semantics.
This means that file.size == file.size must always be true.

Plus, if the underlying file changes, maybe also the type changes. Should file.type changes too? The same is for its name.

If we really want to support this feature, I suggest to introduce a 'changed' event and, only when triggered, the DOM File attributes should be updated.

@mkruisselbrink
Copy link
Collaborator

If I understand this issue it is more about better defining what happens if the snapshot state of a blob/file no longer matches the file on disk, rather than trying to actually support files that change. I.e. the spec currently sort of hand-wavily describes that attempting to read such a blob should throw a NotReadableError, but that isn't actually reflected in the reading algorithms.

Not sure if changing attributes of the File/Blob objects themselves makes sense... Unfortunately there are cases in blink were currently we don't really deal with snapshot states correctly, and can have file.size change from under you, but I would consider all of those things bugs.

Not sure if I'm a fan of introducing "changed" events either... I like the conceptual simplicity of a Blob/File object being immutable. Sure, reading might fail because the snapshot state became invalid, but otherwise a blob/file always represents the same data as it did when it was first created.

mkruisselbrink added a commit that referenced this issue May 21, 2020
This redefines how blobs work by adding internal slots and hooks to
support Blob objects backed by various sources of data. By doing so this
tries to make more explicit how various edge cases behave, as well as
making it clear that blobs are immutable.

This fixes #75, fixes #99, and fixes #47.
It also lays the ground work to address w3c/webappsec-clear-site-date#49.

This also fixes #71.
mkruisselbrink added a commit that referenced this issue May 21, 2020
This redefines how blobs work by adding internal slots and hooks to
support Blob objects backed by various sources of data. By doing so this
tries to make more explicit how various edge cases behave, as well as
making it clear that blobs are immutable.

This fixes #75, fixes #99, and fixes #47.
It also lays the ground work to address w3c/webappsec-clear-site-date#49.

This also fixes #71.
@saschanaz
Copy link
Member

saschanaz commented May 19, 2022

Just started looking at this. Per my latest observation:

  1. file.text() on all of Gecko, Blink, and WebKit clips the returned string by the initial file size when the file size changes.
  2. file.text() on Blink and WebKit rejects when the timestamp changes. It does not on Gecko.
  3. fileReader.readAsBinaryString() on Gecko fires an error event when the file size changes. It does not care about the timestamp.
  4. fileReader.readAsBinaryString() on Blink and WebKit fires an error when the timestamp changes. If the size changes, it clips the returned string.
  5. file.size never changes in all browsers.
  6. file.lastModified never changes in Gecko and Blink. WebKit seemingly just reports the current time, ignoring the actual data?

@inexorabletash
Copy link
Member

Thanks for the analysis!

What are your thoughts on aligning with the most common behavior, where there's divergence? (Which I think happens to be Blink's behavior, but I swear that's coincidence not bias!)

@saschanaz
Copy link
Member

Yeah, the Blink behavior makes the most sense to me too. The Gecko way doesn't work when the content changes while keeping the size. The Blink/WebKit way still allows modification when the timestamp doesn't change, but I guess we can safely assume that such files are actively pretending as if no change happened.

(When the timestamp changes and then rolls back to the initial timestamp, Blink/WebKit reject and then allow reading it again. Same for Gecko in terms of file size. An error never means it'll fail forever in all browsers.)

@annevk
Copy link
Member Author

annevk commented May 20, 2022

Thanks @saschanaz for looking at this!

What does "clips" end up meaning? What if the new size is smaller? (And also, is the string clipped or are the bytes clipped before decoding?)

Also, this means that as long as an application has a reference to a file, they get to observe changes to it, right? But coupled with the fact that the timestamp has to remain the same I guess the tracking potential for that is rather limited in practice. If normally the timestamp would change I'd actually prefer that we irrevocably lose access on changes, but perhaps that's hard to implement?

@saschanaz
Copy link
Member

saschanaz commented May 20, 2022

Ah sorry, clipping does not happen on non-WebKit engines, one of my manual testcase was using file.slice() and that was causing it. That said, not sure how to write WPT for this 🤔 (or even mochitest)

@annevk
Copy link
Member Author

annevk commented May 20, 2022

For WPT I would commit the manual test (with inline instructions) and file an issue with regards to what you need for automation of it. And discuss that with jgraham. If we ever get testing infrastructure that will also benefit other storage APIs, but it will be some work. (For mochitest maybe ask Jan Varga or asuth?)

@saschanaz
Copy link
Member

If normally the timestamp would change I'd actually prefer that we irrevocably lose access on changes, but perhaps that's hard to implement?

I think the change is only observed when reading the file, and that means if the timestamp changes and then rolls back before reading then the browser won't notice whether such change even happened. I'd say it would be inconsistent to let changes happen only in some situations.

@annevk
Copy link
Member Author

annevk commented May 20, 2022

Thanks for following up on that point. I wasn't sure whether there was other metadata we could look at that might more clearly indicate a change. Following Chromium and WebKit makes sense then to me and I suppose we should codify it as such in the specification as well?

@saschanaz
Copy link
Member

Yeah, I'll write a test first and report again.

@saschanaz
Copy link
Member

So, per web-platform-tests/wpt#34143 in addition to #47 (comment):

  • WebKit is the only engine that clips the content
  • WebKit throws NotFoundError while Blink throws NotReadableError. I think the latter makes more sense? Gecko also throws NotReadableError but for the size change instead.

BTW, please double check the WebKit behavior if anyone here has an Apple device since I'm testing it with Gnome Web on WSL.

@inexorabletash
Copy link
Member

I was assuming "clips" means that if the new size is larger, the previous (smaller) size is used, so the results appear clipped; i.e. the byte size of the result matches the initial file.size seen. So that's not the case in non-WebKit?

@annevk's questions still need answers:

  1. What if the new size is smaller? i.e. does the result have the smaller size, or are padding bytes seen?
  2. Is the string clipped or are the bytes clipped before decoding? i.e. for text() and friends, if the string ends with an incomplete UTF-8 sequence, do we see U+FFFD at the end? I'm going to guess yes, but we should test.

@saschanaz
Copy link
Member

I was assuming "clips" means that if the new size is larger, the previous (smaller) size is used, so the results appear clipped; i.e. the byte size of the result matches the initial file.size seen. So that's not the case in non-WebKit?

You assumed right, and only WebKit does that. (I previously said all browsers did that, but I was deceived by a wrong testcase that uses file.slice(), sorry.)

What if the new size is smaller? i.e. does the result have the smaller size, or are padding bytes seen?

WebKit returns smaller sized results (as seen in assert_less_than_equal). I wonder whether the padding bytes would break anything.

Is the string clipped or are the bytes clipped before decoding? i.e. for text() and friends, if the string ends with an incomplete UTF-8 sequence, do we see U+FFFD at the end? I'm going to guess yes, but we should test.

Bytes, and yes, U+FFFD per the "replacement" behavior of decoding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants