-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fetch: CORB #81
Comments
I think that sounds right, but damn, that explainer is amazingly obtuse. It fails to explain spectacularly. It says why, then the benefits of it, but you have to read a lot of a very large document to understand a) what principles it applies and b) how it applies them. |
I'm trying to put my finger on what makes me uncomfortable about this, so forgive the fact that this is going to ramble a bit. As I understand it, the idea with this proposal is to reject responses which potentially contain sensitive data in the parent process (where the content type is resolved) rather than in the child process, where it would be exposed to spectre-type attacks. Assuming that's correct, it seems like we are defining a Web platform feature that is really tied to specific process architecture, which in turn is designed to match a specific processor architecture. So, I think we ought to ask whether that's really a good thing to bake into the Web architecture, |
Being stricter on Content-Type, even if just for "no-cors", seems like a net positive. And in general being stricter with "no-cors" seems like a positive, since that's often a source of vulnerabilities that apply regardless of architecture. I also wonder if different architectures might not benefit as well from being able to terminate the connection/response early on. The other thing I would take into consideration here is that we're considering expanding Origin or adding From-Origin (see #80) for similar reasons and CORB would obviate the need for specifying those to some extent, thereby protecting more by default. |
This seems right, so my preliminary view of the general CORB concept is positive. However, I'm worried about the details. In particular, I'm worried about using mimesniff for a purpose that it wasn't designed for. It was not designed to confirm types all the formats that it can second guess resources as. Worse, the explainer suggests unfamiliarity with these details. In particular, the explainer says: "Confirming an XML content-type via sniffing is more straightforward than JSON or HTML: XML is signified by the pattern <?xml, possibly preceded by whitespace." However, this is incorrect. It is both too broad and too narrow. Any resource where "<?xml" is preceded by whitespace fails to parse as XML. More importantly, XML doesn't require <?xml, so CORB would fail to protect XML that doesn't have an XML declaration. (It's unclear to me if the letting <?xml be preceded by whitespace in mimesniff makes sense in the light of the purpose that mimesniff was designed for or if it's a plain spec bug in mimesniff.) It's a bit unclear to me why the whole thing isn't reversed. Why is CORB focused on finding e.g. markup traits in loads initiated for images instead of finding image format magic numbers? Having removed previous confusing packet boundary-dependent behavior from Gecko, I'm especially not a fan of "CORB MAY limit sniffing to the first few network packets." So, I'd like to see a proper spec for this and I'd this not to depend on packet boundaries or timing, i.e. I'd want all browsers to look at a deterministic number of bytes, which is easier to accept when looking for the magic numbers of the formats typically mislabeled than when trying to confirm whether text-based formats are correctly labeled) |
The Spectre defense is one part of it, but CORB has some benefits regardless of process architecture, as @annevk mentions. For example, it can also help prevent things like XSSI (and similar) attacks where a page tries to poke at a sensitive response to find out what's in it. @arturjanc gave some examples in whatwg/fetch#687 (comment).
To clarify, CORB doesn't use traditional mimesniff logic for its confirmation sniffing, since you're right that it's not designed for that purpose. For example, things like <!-- can be at the start of JavaScript files, so we can't use it to confirm a text/html response is actually HTML.
We're happy to refine the details here to fix issues.
Unfortunately, there isn't really an equivalent to magic numbers for JavaScript responses. JavaScript files can be requested from any origin, and it's quite hard to confirm whether a response contains JavaScript by sniffing it. Instead, we went with blocking confirmable content types that aren't allowed cross-origin (and which were more likely to contain sensitive info): HTML, XML, JSON.
I think specifying confirmation sniffing probably makes sense, and we're happy to discuss the tradeoffs of how much to sniff. I think our implementation does use a deterministic kMaxBytesToSniff=1024 regardless of packet boundaries, if that's what you mean. @nick-chromium knows the details. (Just for context, it's theoretically possible to do CORB without confirmation sniffing and only protect responses that are labeled as nosniff or From-Origin, but our goal was to protect as many current resources as possible. We did find that confirmation sniffing was necessary for protecting responses without nosniff: almost 20% of responses would be blocked without it, thanks largely to JavaScript responses labeled as text/html.) |
Thanks @csreis! We've had some more discussion about this internally and arrived on Any implementation will be tracked by https://bugzilla.mozilla.org/show_bug.cgi?id=1459357. FWIW, the initial set of changes to Fetch does not inspect the response body and is therefore a bit more straightforward: whatwg/fetch#686. Any future change that does need to inspect the response body will be done in such a way that it's fully deterministic. |
These days I think "CORB++" is a favorable alternative if we can proof it out: |
As an update, I still stand by my prior comment. https://github.com/annevk/orb is what I think Firefox should pursue prototyping as it's safelist- rather than blocklist-based (i.e., what hsivonen suggested above). So at this point I think CORB is Does that seem reasonable? |
Note that I have added this as Proposal as it ought to be removed from Fetch due to its single implementer status. Closes #81.
Note that I have added this as Proposal as it ought to be removed from Fetch due to its single implementer status. Closes #81.
Just curious how you plan to do step 14 of https://github.com/annevk/orb:
I mentioned that confirming something is JavaScript is a bit tricky in #81 (comment), and that's one of the main reasons we went the other way.
If there's a good way to solve these, then ORB / CORB++ seems useful for protecting more types by default. |
@csreis the idea is to accept the performance hit for responses that do not specify a MIME type (and are opaque) and parse them, but in an oracle process rather than a privileged process. |
CORB (cross-origin resource blocking) is a proposal by Google to disallow more responses to cross-origin "no-cors" requests: whatwg/fetch#681.
Given Spectre and issues with "no-cors" in general this seems worth doing to me, but I'd like to double check.
If we want to mark these smaller proposals, I'd suggest
worth-prototyping
.cc @ckerschb
The text was updated successfully, but these errors were encountered: