Subresource Integrity integration #48
Comments
If I understand correctly, the hash is independent from the content encoding, right? If so, that's going to be complicated. |
There are two options here: one is to hash the (normalized) source text. Another is to hash the "simple" encoding of BinAST, prior to any compression steps. The latter seems more appropriate in this case. |
I don't think it's appropriate for a transfer-encoding to change the semantics of SRI (e.g. gzip and brotli don't) |
@yoavweiss I'm not sure I understand where there would be a need for semantic changes. Could you elaborate? |
Currently SRI hashes are hashes of the content before gzip/brotli are applied. If AST encoding is just a content encoding, the same principles should apply, and SRI hashes should be calculated before AST encoding and after AST decoding is applied. |
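To make the point above concrete, here is a minimal sketch (not taken from any browser implementation) of how an SRI token relates to a content encoding such as gzip. The helper name `sri_hash` and the sample script are assumptions made for illustration; the only point is that the digest is taken over the decoded bytes, so it is unaffected by the encoding used on the wire.

```python
import base64
import gzip
import hashlib


def sri_hash(body: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI-style token such as 'sha384-<base64 digest>'.

    Hypothetical helper for this sketch, not part of any library.
    """
    digest = hashlib.new(algorithm, body).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


# Sample script body (made up for the example).
source = b"console.log('hello');\n"

# What a gzip content encoding would put on the wire.
on_the_wire = gzip.compress(source)

# What the user agent sees after undoing the content encoding.
decoded = gzip.decompress(on_the_wire)

# The page author computes the hash over the unencoded source.
expected = sri_hash(source)

print(sri_hash(decoded) == expected)      # hash of the decoded body
print(sri_hash(on_the_wire) == expected)  # hash of the compressed bytes
```

If BinAST were treated the same way, the hash would have to be computed over whatever the decoder reproduces, which is exactly where the thread's difficulty lies: a BinAST decoder does not reproduce the original source bytes.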
But can't we express this simply as a hash variant, which is already a supported concept in SRI? More formally, a hash value This really feels more like a nit than an actual issue of semantics. |
@yoavweiss Hold on, I think I understand the problem a bit better now. I see where the issue is. The problem is we're multiplexing the URL to serve both BinAST and plainjs files, but we only have one SRI to serve up. |
It seems there isn't a way to slice this salami without introducing a hash specifically for the BinAST code. This would require the referrer page to include two hashes. From a standards perspective this is not a major issue: it is an extra hint attribute that will be ignored by other browsers. Firefox, when requesting SRI-checked resources, would add the binast mimetype to the accept header when it detected the presence of the second hash, and verify using that. The problem here is that it requires changes on the content provider end: the referrer page must be modified. However, I'd assume that SRI hashes are generated by toolchains these days anyway (as you'd want to recompute them on changes to source). Is that the case? If so, whatever process does that should be modifiable to also produce a BinAST hash and include it as well. @yoavweiss What do you think? |
Yeah, that adds a lot of complexity to the developer's flow and forces the page to know if some of its scripts will be binAST encoded, and if so, add two hashes instead of one.
If you have a script that blindly adds SRI hashes, that won't help you if/when the origin gets hacked (which is a major use-case for SRI). Overall, this seems like a discussion that should happen with the SRI folks. /cc @mikewest |
@mozfreddyb, @fmarier, @metromoxie, and @devd are the "SRI folks". :) @otherdaniel might also have thoughts. Also, https://tools.ietf.org/html/draft-thomson-http-mice-03 is relevant. |
My context on 'binary AST' is a bit outdated, but my understanding is:
I suspect this answer won't make Yoav very happy, but I'm having a really hard time imagining a solution where 'binary AST' could be served transparently and with integrity. 'Binary AST' just does a lot more than a mere content encoding could be expected to. |
Variable names are maintained, and we have ideas for making source code comments stripping optional, but yes, that's the general idea.
Ah, well, I was about to suggest that. Out of curiosity, when (and how often) is the hash calculated? |
Talking around the office, a colleague observed that what we are trying to do here is in effect comparable to In general I agree with @otherdaniel's assessments. I'm not sure I agree on the "it's not a content encoding" bit. We're running into this issue because we're using hashes to check resource integrity, and hashes are inherently tied to the representation of a particular piece of content. They're convenient because representational equality subsumes all other equivalence class models axiomatically. As you noted, theoretically we could store Philosophical waxing aside, though, I agree it seems we can't slide this through purely transparently on a mime-type basis and still keep SRI support. |
Hold on. You can easily support binary AST with SRI as it is! Example: <script src="https://example.com/example-framework.js"
integrity="sha384-hash-of-normal-JS-file
sha384-hash-of-binary-ast-file"
crossorigin="anonymous"></script> The user agent will notice that there are multiple hashes with the same strength (i.e., sha384), so only one of them has to match. User agents that support binary AST will receive a file that matches the second hash. User agents without support will receive the JS file, which matches the first hash. (This is a rephrasing of Example 7 in the SRI specification. I've quoted it for this example and rephrased for clarity, but feel free to read the original source!) |
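The matching rule described in the comment above can be sketched roughly as follows. This is a simplified illustration, not the SRI specification's algorithm verbatim: the function names, the `STRENGTH` table, and the sample byte strings are assumptions, and option suffixes on hash tokens are ignored. It shows that with two hashes of equal strength, a resource passes if it matches either one.

```python
import base64
import hashlib

# Strength ordering assumed for this sketch (weakest to strongest).
STRENGTH = {"sha256": 0, "sha384": 1, "sha512": 2}


def sri_hash(body: bytes, algorithm: str = "sha384") -> str:
    """Hypothetical helper: build a token like 'sha384-<base64 digest>'."""
    digest = hashlib.new(algorithm, body).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(body: bytes, integrity: str) -> bool:
    """Check a body against a whitespace-separated integrity value."""
    tokens = []
    for token in integrity.split():
        algorithm, _, value = token.partition("-")
        if algorithm in STRENGTH and value:
            tokens.append((algorithm, value))
    if not tokens:
        # This sketch treats "no usable metadata" as nothing to enforce.
        return True
    # Only the strongest algorithm present is considered.
    strongest = max(tokens, key=lambda t: STRENGTH[t[0]])[0]
    return any(
        sri_hash(body, algorithm) == f"{algorithm}-{value}"
        for algorithm, value in tokens
        if algorithm == strongest
    )


# Stand-ins for the two representations served from one URL.
plain_js = b"function f() { return 1; }\n"
binast = b"\x00placeholder-binast-bytes"

integrity = f"{sri_hash(plain_js)} {sri_hash(binast)}"

print(matches_integrity(plain_js, integrity))
print(matches_integrity(binast, integrity))
print(matches_integrity(b"tampered", integrity))
```

Note that this keeps the page author in the loop: whoever writes the `integrity` attribute must know both representations in advance, which is the deployment cost raised elsewhere in the thread.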
@Yoric Currently, in Chrome/Chromium, the hashes are checked once, after the network has delivered the last byte to the renderer, just before the resource is used. There is a very annoying but hard to fix bug where sometimes that doesn't work and we reload and recheck the resource. The intent is to move this 'lower' into the browser process or network service, although I'm not sure if or when this is happening.
@kannanvijayan Granted, one can see the "content encoding" thing either way. One additional thought: Hashes apply universally to all resource types, have well-understood security properties, and are hard to misuse. I bet that once a js-equals-binast-hash is created, some clown will create a pair of .css files (or other resources) that are equivalent under that hash but have otherwise quite different properties. And while obviously a js-specific hash shouldn't be applied to non-js resources, similar things have happened elsewhere (e.g. MIME-type confusion attacks) and this might lead to similar problems.
@mozfreddyb Yes, that works as of today. I think the use case implied here is that 'binary AST' can be applied transparently by the web server or a CDN, just like those instances could decide to apply gzip without requiring the page author to change the page. I find that a super valid use case, and without a capability like that deployment will be a good bit harder. But so far I'm not seeing a good mechanism that would facilitate that.
Generally speaking, I expect a custom hash with any appreciable complexity is going to be a very hard sell, to both implementor and security communities. |
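As a rough illustration of "checked once, after the last byte", the sketch below feeds chunks into a running digest and compares only at the end. It is not Chromium's code; `verify_streamed` and the sample data are hypothetical names chosen for this example.

```python
import base64
import hashlib
from typing import Iterable


def verify_streamed(chunks: Iterable[bytes], expected: str) -> bool:
    """Hash a body chunk by chunk and compare once the stream ends.

    `expected` is a single token of the form '<algorithm>-<base64 digest>'.
    """
    algorithm, _, value = expected.partition("-")
    hasher = hashlib.new(algorithm)
    for chunk in chunks:
        # The digest is updated as data arrives; no verdict is possible yet.
        hasher.update(chunk)
    actual = base64.b64encode(hasher.digest()).decode("ascii")
    return actual == value


# Sample body and its expected token (made up for the example).
body = b"var x = 1;\n" * 1000
token = "sha384-" + base64.b64encode(hashlib.sha384(body).digest()).decode("ascii")

# Complete delivery in two chunks, then a truncated delivery.
print(verify_streamed([body[:4096], body[4096:]], token))
print(verify_streamed([body[:4096]], token))
```

One consequence of this design is that nothing can be trusted until the whole body has arrived, so a streaming decoder would still have to wait for the final comparison before its output is used.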
@mozfreddyb I did not realize the integrity attribute supported multiple hashes! Thanks for bringing that to our attention. As @otherdaniel noted, it doesn't get us to full mimetype-only level transparency, but it's still a far step above another hint attribute on script tags. Good to know! |
How would a binary AST encoded resource be delivered when SRI is involved? How should the hash be calculated on both the encoder and the decoder sides?