-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Blob URL store partitioning #153
Comments
Limiting it like that doesn't seem too crazy to me, although I'm not entirely sure I understand the attack/threat model that defends against. A blob URL can only be resolved when the page that created it is still alive, and when the page trying to fetch it is same origin with the original page. If that is the case, you might as well just BroadcastChannel to talk to the original page, rather than jumping through hoops with blob URLs? Of course if storage partitioning blocks BroadcastChannel because of different top-level URLs things would be different, but in that case wouldn't it make more sense to partition blob URLs the same way as other communication mechanisms, rather than having them be partitioned even more? |
Having said all that, for MediaSource blob URLs chrome already limits them to their agent cluster. And scoping all blob URLs to their agent cluster certainly would make the implementation a whole lot simpler (well, the top-level navigation case would still be tricky). So as an implementer agent cluster seems reasonable, as a spec editor I question if that is the right scoping level. |
The keying for storage seems like it will end up being dynamic (i.e., storage access). I think if we can avoid things being dynamically keyed that's preferable. To expand on that, if the key were to change it would mean old blob URLs in that environment would no longer be accessible, which would be quite weird, especially since blob URLs also serialize some origins into their identifier. And also, it seems that if two browsing context groups needed to share a blob they could/would share the object instead of the URL. |
It would be a bit odd inconsistency if sharing objects worked, but sharing blob URLs didn't. |
That doesn't seem that different from today? Blob objects can be shared across origins, while blob URLs can only be resolved by same origin originators. As such there are already plenty of cases where sharing a blob URL doesn't work but sharing a blob, and having the receiver create a blob URL from it does work. |
That is somewhat different inconsistency tough, since it is about cross-origin. |
It seems that blob objects could always be widely shared and that does not change. They essentially have no scope. Blob URLs had a scope that was undefined to some extent (see #135). This proposes to scope blob URLs more clearly, to the agent cluster they are created in as well as any new top-level browsing contexts (which we'd ideally force noopener on). |
I don't understand why we would do this instead of just making blob URLs only loadable by contexts with the same StorageKey. It would seem better to lean on the partitioning we are pursuing in other APIs instead of introducing a new kind of isolation. |
Apart from the issues discussed above that would not work well for top-level navigations to partitioned |
What are the issues above? I looked at the links but it was not obvious to me. Dynamic keying? Not all browsers are proposing to do that. And I don't see why its problematic for top-level navigations. Can you explain that? If we have a StorageKey for the blob we can propagate that to the context. Just like how we are going to have partitioned service workers if they are registered with a partitioned StorageKey. But even if we wanted to prevent blob URLs from partitioned StorageKeys from being navigated to, I don't see why we need to block a context with the same StorageKey as the blob from loading it as a subresource. |
An upside of limiting Blob URL's to agent clusters is that it limits the potential for browser compat issues related to races for resolving Blob URLs against the Blob URLs being revoked. Right now resolve a Blob URL references the Blob URL Store like it's something that's synchronously accessible across all agent clusters. (I believe this leads to the situation described in #157 where Chrome has to use sync IPC when creating Blobs.) That could alternately be addressed by specifying Blob resolution and Blob URL Store manipulations in a more multi-process-aware way. The previously referenced #157 would likely entail this because of the integration with storage. |
(kind of aside, but Chrome doesn't actually need to use sync IPCs when creating Blobs anymore, at least not for any web-exposed API. We still do because some internal and chrome extension usage of blobs might otherwise have race conditions). Blob URL creation however does indeed need sync IPCs in particular to solve situations like: 1) agent1 registers a blob URL; 2) agent1 postMessages said URL to some other agent2; 3) agent2 tries to resolve the blob URL. With chrome's IPC there is no guarantee that the registration in step 1 arrives in the browser process before the attempt to resolve from step 3 arrives. As you say, if all blob URL registrations were scoped to the agent cluster they are created in this would no longer be an issue, and we could possibly eliminate these as sync IPCs. So with my implementer hat on I think scoping by agent cluster makes a lot of sense. I don't have a good idea how web compatible that would be, but it shouldn't be too hard to collect metrics for that. With my spec editor hat on, I'm not sure what makes more sense; scoping by storage key seems like it would solve all the same issues, while probably being more web compatible. On the other hand it does feel a bit weird to me to use storage key for this as blob URLs don't really seem like anything storage related to me (but then we're also using storage key for things like broadcast channel, so that isn't much of an argument). Also I'd say that anything that drives down usage of blob URLs might be a good thing, so being as restrictive as we can while not breaking too much does seem attractive. Overall I don't feel particularly strongly either way. |
FWIW, if there are other reasons to restrict blob URL loading to same agent, I'm not objecting to that. But lets be clear that's the reason its being done. I just want to avoid conflating agent isolation with storage partitioning because they are conceptually different things. |
Heads up that I'm examining the potential breakage this would cause if implemented in chrome: https://bugs.chromium.org/p/chromium/issues/detail?id=1224926 |
@wanderview agreed, though I would add that agent cluster isolation does obviate the need for partitioning. (To explain the scenario above: if A embeds B and B mints a blob URL, you'd want the user to be able to copy that URL and navigate to it. It's not clear to me how that would work if lookup for blob URLs would use the storage key.) |
Sure, but the first post explaining why you want to do agent clustering isolation uses 3rd party partitioning as its primary and only motivation. An explainer laying out the additional motivations would be useful for clarifying why we are attempting a bigger hammer than needed for 3rd party partitioning.
(Just to answer this, but not suggesting we need to do this if agent cluster isolation is preferred for other reasons, but:) A blob URL has a uuid in it, correct? That can be used to look up the storage key associated with the blob and apply it to the context created by navigation. This seems just like how a browser has to use the uuid to lookup the origin of the blob based on the URL today. |
I think we're all pretty much describing the same thing, just in a different manner. We want to change the "map" we look up blob URLs in to not only be keyed on blob URL/UUID, but also be keyed on storage key or agent cluster. However for navigations we still need to be able to look up an entry in this map while ignoring the second part of the key, i.e. we'll still need the existing blob URL -> blob map as well. |
FWIW, we landed metrics in Chrome a while ago to determine potential breakage if we'd partition blob URLs by agent clusters: https://www.chromestatus.com/metrics/feature/timeline/popularity/3963. So about ~0.1% of page loads might be broken if blob URLs are tied to an agent cluster. |
@mkruisselbrink does that exclude top-level navigations to blob URLs? As those could result in a new agent cluster but would nonetheless remain working. |
I believe those accesses are split out here: https://www.chromestatus.com/metrics/feature/timeline/popularity/3964 |
3963 is only for subresources, 3964 would be for top-level navigations where the initiators agent cluster doesn't match the blob URLs agent cluster (not sure if we would block those as well or not) |
Interesting, I don't think we ran into any issues with subresources. We did ran into issues with top-level navigations as we broke those and have yet to try again. (As you note in #153 (comment) those should work.) |
Discussing this with @artines1 again today he pointed out that we hit a problem in Firefox with an A nests (sandboxed) B scenario and B then creating a blob URL and attempting to download it. For some reason that doesn't use the agent cluster of B. He'll look into it a bit more. (If that ends up being a blocker I suppose we might be stuck with the "storage key" here, which is unfortunate given the IPC calls that would be needed still in certain scenarios. What's less of a concern these days is unpartitioning as all browsers seem aligned on always partitioning storage.) |
https://bugs.webkit.org/show_bug.cgi?id=260035 rdar://problem/113705298 Reviewed by Alex Christensen and Sihui Liu. Public blob URLs are only accessible from same-origin dcuments, but access is not restricted by the top-level origin. This means that Blob URLs can be used as a cross-origin tracking mechanism within iframes. In this patch we partition public blob URLs within the Blob Registry by top-level origin. This partitioning is controlled by a feature flag that is disabled by default. I took a few approaches at solving this. The most difficult challenge was finding a solution that allowed retrieving BlobData using a public blob URL from WKWebView APIs. In that case, the relevant top document may not be obvious, or may not exist. As a result, the design of this partitioning is more like access control rather than adding another key into the hashmap. Two alternative designs I considered include creating a second hashmap that is keyed by <URL, SecurityOriginData> and we lookup the BlobData in that map if we have a SecurityOriginData, otherwise we use the unpartitioned map. Or, we create a new map from URL -> SecurityOriginData where we can lookup the associated top origin SecurityOriginData if we don't already know it. However, both of these options are more complex than the chosen implementation, and neither of them seemed safer. This change also enforces a noopener policy on new windows when the top origin of the opener is cross-origin with the blob's security origin. This is a mitigation that was discussed in the blob URL storage partitioning issue [0] with cross-engine support, and that seemed reasonable to me. [0] w3c/FileAPI#153 * LayoutTests/TestExpectations: * LayoutTests/http/tests/local/blob/download-blob-from-iframe-expected.txt: Added. * LayoutTests/http/tests/local/blob/download-blob-from-iframe.html: Added. * LayoutTests/http/tests/local/blob/navigate-blob-expected.txt: Added. * LayoutTests/http/tests/local/blob/navigate-blob.html: Added. * LayoutTests/http/tests/local/blob/resources/broadcast-channel-proxy.html: Added. * LayoutTests/http/tests/local/blob/resources/iframe-creating-or-downloading-blob.html: Added. * LayoutTests/http/tests/local/blob/resources/iframe-for-creating-and-navigating-to-blob.html: Added. * LayoutTests/http/tests/local/blob/resources/main-frame-with-iframe-creating-or-navigating-to-blob.html: Added. * LayoutTests/http/tests/local/blob/resources/main-frame-with-iframe-downloading-blob.html: Added. * LayoutTests/http/tests/security/blob-null-url-location-origin-expected.txt: * LayoutTests/http/tests/security/blob-null-url-location-origin.html: * LayoutTests/http/tests/security/cross-origin-blob-transfer-expected.txt: Added. * LayoutTests/http/tests/security/cross-origin-blob-transfer.html: Added. * LayoutTests/http/tests/security/resources/iframe-cross-origin-blob-transfer.html: Added. * LayoutTests/http/tests/security/top-level-unique-origin2.https.html: * LayoutTests/platform/gtk-wk2/http/tests/local/blob/download-blob-from-iframe-expected.txt: Added. * LayoutTests/platform/mac-wk1/TestExpectations: * Source/WTF/Scripts/Preferences/UnifiedWebPreferences.yaml: * Source/WebCore/fileapi/BlobURL.cpp: (WebCore::BlobURL::isInternalURL): * Source/WebCore/fileapi/BlobURL.h: * Source/WebCore/fileapi/ThreadableBlobRegistry.cpp: (WebCore::ThreadableBlobRegistry::registerInternalFileBlobURL): (WebCore::ThreadableBlobRegistry::registerInternalBlobURL): (WebCore::ThreadableBlobRegistry::registerInternalBlobURLOptionallyFileBacked): (WebCore::ThreadableBlobRegistry::registerInternalBlobURLForSlice): (WebCore::isInternalBlobURL): Deleted. * Source/WebCore/loader/FrameLoader.cpp: (WebCore::FrameLoader::loadURL): (WebCore::FrameLoader::loadPostRequest): (WebCore::createWindow): * Source/WebCore/platform/network/BlobRegistryImpl.cpp: (WebCore::BlobRegistryImpl::registerBlobURLOptionallyFileBacked): (WebCore::BlobRegistryImpl::unregisterBlobURL): (WebCore::BlobRegistryImpl::getBlobDataFromURL const): (WebCore::BlobRegistryImpl::addBlobData): (WebCore::BlobRegistryImpl::registerBlobURLHandle): (WebCore::BlobRegistryImpl::unregisterBlobURLHandle): * Source/WebCore/platform/network/BlobRegistryImpl.h: Canonical link: https://commits.webkit.org/267172@main
It sounds like Firefox has shipped partitioning blob URL fetches by top-level site [1] and frame origin, and WebKit has shipped partitioning blob URL fetches by top-level origin [2] and frame origin. Also, Safari enforces noopener when blob URLs are window.opened and the blob URL origin is cross-origin from the opening page's top-level origin. Chrome is investigating partitioning blob URL fetches by storage key (top-level site, frame origin, ancestor-chain-bit) and enforcing noopener on window.opened blob URLs when the opening page's top-level site is cross-site to the blob URL site. From a spec perspective, would Firefox and Safari be supportive of updating the Blob URL spec to partition blob URL fetches by storage key and enforce noopener on window.opened blob URLs that are at least cross-site? CC @miketaylr [1] https://groups.google.com/a/mozilla.org/g/dev-platform/c/1gt1CVIoffc/m/cFloZuPPAAAJ |
cc @sysrqb |
WebKit is investigating moving toward partitioning by a site-based storage key (either double-keying or triple-keying, TBD), but we can't commit to either scheme at this time. However, we are interested in defining a standard so we can avoid interop (and compat) issues. |
CC @artines1 for a Firefox perspective |
Firefox has introduced an ancestor bit to the partition key implementation, and I believe it applies to Blob URL partitioning as well. Enforcing We are interested in aligning behavior across browser engines to avoid compatibility issues. So, defining a standard for Blob URL partitioning is probably the right approach. |
Thanks for the feedback everyone! I have PRs up to partition Blob URL fetches and revocation by Storage Key:
Regarding enforcing noopener on cross-top-level-site Blob URLs, while reading the spec I learned that there are more cases than just window.open where window.opener can be set in the new window, specifically clicks on 'a' elements, clicks on 'area' elements, or form submissions where |
I have a PR up for the noopener changes as well: Also, on the Chrome side we recently sent out an Intent to Prototype and Ship for these changes: https://groups.google.com/a/chromium.org/g/blink-dev/c/erVBugcYwRc |
Cause noopener to be set for window.open(), clicks on <a> / <area> elements, and form submissions where the corresponding blob URL is cross-site to the top-level site of the context performing the action. This corresponds to the discussion in w3c/FileAPI#153.
See w3c/FileAPI#153 (comment) and w3c/FileAPI#201 for additional context. And https://privacycg.github.io/storage-partitioning/ for the overall partitioning story.
https://privacycg.github.io/storage-partitioning/ has some general background here and https://trac.torproject.org/projects/tor/ticket/15502 is much more specific.
@bakulf was thinking that we could restrict blob URL lookup to the agent cluster (in addition to origin, that is). The one tweak I would suggest to that is that navigating a top-level browsing context (including a noopener one) to a blob URL still ought to work.
Concretely, this would mean that if you have
https://example.com/
open twice, in separate browsing context groups, any blob URLs they mint cannot be used by the other.The one gotcha with the tweak I suggested is that the other could observe existence through a popup then. Now that's an attack that's unlikely to yield anything useful in practice, but we could break that too by forcing noopener or a version of COOP that never matches (and thus always creates a new browsing context group).
We suspect this to be web-compatible and are happy to try it out in Firefox.
cc @mkruisselbrink @hober @SubhamoyS
The text was updated successfully, but these errors were encountered: