HEIST #64
I could use some help understanding how this attack works cross-origin. Shouldn't the fetch/XHR/resource timing/etc. automatically and instantly fail for such resources? (Assuming they are not shared via CORS.) |
CSP headers would be needed as they handle the outbound rules. GET and POST aren't blocked by CORS. |
So one problem is that |
I think whatwg/fetch#355 is the solution here, though it would be interesting to hear from @tomvangoethem and Mathy Vanhoef why they think that is infeasible (per section 4.1.1 of their paper). |
As @jakearchibald mentions that issue has obvious perf implications that are not desirable. Arguably resource timing should not expose responseEnd for no-cors, especially since it's not always readily available (it's not with None of this is ideal obviously. |
Would padding end impact render? The rendering of an iframe can't be detected. The width/height of an image can, but that's way before response end |
The problem is response timing is designed to give an accurate number here, and that's baaad. Others we can (hopefully) make less accurate. |
I'm wondering if we're not trying to route around the problem in hope that it would go away instead of tackling it. As mentioned in #64 (comment), RTTs can be measured without any fancy APIs, using So, unless we're willing to delay certain events (and observable implications of resource loading) in tens of milliseconds to all users, this type of timing attack is not going away. The underlying issue is that with BREACH, exposing response sizes equals exposure of CSRF tokens and login cookies. Are all types of compression vulnerable to BREACH? Can we devise a compression scheme that won't be? (e.g. by adding random padding chars of random size |
@yoavweiss can you produce a demo of this attack using only |
Delaying iframe load events seems ok, as long as it's fired after the frame has loaded. Img is only useful if the response is a valid image, otherwise error can happen early. |
An image resource on the same host as the HTML under attack can be used to measure the RTT, as demonstrated in the article @igrigorik linked to. Since that article, img loading became async (in most browsers), but an attacker could probably use RAF in order to estimate the time gap between adding the img and when it's actually triggered. That code snippet above ignores potential connection establishment time which can skew its results, but that can be mitigated with That would give an attacker one piece of the puzzle, which is the base RTT. And even if we'd delay @domenic - I hope that makes what I meant by "RTTs can be measured without any fancy APIs, using Regarding delaying @jakearchibald - would we also need to introduce similar delays to the time |
RTT is not that useful on its own I think. What is important is time-to-headers, time-to-end-of-response-body, the difference between those, how much we expose in a single roundtrip, how much is exposed on several roundtrips, and how reliable those all are. We can prevent exposing time-to-headers for no-cors, but the cost is a performance hit when you use service workers. (And an argument has been made that we also expose time-to-headers when you enable CORS, since CORS will fail at that point, but the request will also be different.) We can also prevent exposing time-to-end-of-response-body for no-cors, at no cost, but then the question is whether you can still reliably determine it across several roundtrips and whether that is problematic. |
Excusing my ignorance in advance - If the purpose of resource timing is to provide performance metrics (effectively a reporting tool) to a web dev/web admin, why does the API have to return those statistics directly to the requester? Shouldn't the resource timing data be sent to a specified (by the page) directive, and not necessarily back to the requester? |
Good point that RTT and time-to-headers are only ~identical with static resources, whereas HEIST mostly targets dynamic components of a page (CSRF tokens, cookies), whose generation can add server-side time.
Can you elaborate on that? |
By not handing out responseEnd. That way if you get time-to-headers with |
@djohns14 I'm not sure what you're suggesting, but changing the fundamental nature of any API under discussion is not an option. |
I don't think a CORS request and a no-cors request are different enough to matter much, but maybe I'm missing something. In terms of response end, we need to delay cache put, which is fine as it isn't all that perf sensitive. We might need to think about appcache writes too. |
From the peanut gallery, it does seem strongly inadvisable to try to just monkey-patch around this without articulating your threat model or assumptions. I think @yoavweiss has the right idea, which is pointing out we have a variety of leaks. It's also important to keep in mind the context of what we're discussing, re: BREACH attacks, and the available server-side mitigations for these. I'm not suggesting we "do nothing", but rather we work from first principles to make sure the platform is both secure and consistent. Minimally, that starts with elaborating the actual threat model we're wishing to defend against. Are we attempting to stop BREACH attacks? Are we attempting to stop knowledge of response body size? etc. Once we get those principles down, collectively take a look at the platform and look where we can leak that information, so we have a sense of the damage, and then we can brainstorm mitigations. While this is very exciting, sexy, and arguably disturbing (in that "We probably should have seen this coming" feeling afforded by 20/20 hindsight), we shouldn't be reactionary. In my mind, this strikes as similar to concerns regarding privacy/tracking: concerns which are real, and grounded, but when you see something like https://www.chromium.org/Home/chromium-security/client-identification-mechanisms , you have a better appreciation for the holistic picture and where and why various piece-meal solutions ... aren't. But that's just my $.02 |
@sleevi 100% agree. I'll take a run at it...
Cross-site timing attacks
Make a cross-origin request and observe some properties of it: how long it takes to error, succeed, whatever. The canonical example here is to make an authenticated request (e.g. load an image) against some behind-the-login-screen resource and time the response: if the user is logged in they’ll get back the page and otherwise a login page/redirect; the delta between those responses may be large enough to learn something about the user.
Practically speaking, I don’t think the UA can “defend” against this type of attack as long as it allows any form of cross-origin authenticated requests, as the timing is subject to server-specific logic and response processing. We’re not going to enforce a constant time on all responses (that’s ludicrous), and padding random deltas to responses is similarly silly.. you’d have to make those very large, and the performance implications of that would be unacceptable.
The server, on the other hand, can and should protect its sensitive resources against such attacks. Set
In practice these timing attacks are also hard to exploit due to variable response times, network latency, buffering delays, etc. HEIST offers an accuracy improvement: use TCP congestion window to trigger an extra RTT to learn information about the size of the resource. I still have reservations about how many queries you can practically make with this approach, but regardless, it is an improvement on what was documented earlier.. and hence server-side mitigations mentioned earlier are only ever more relevant.
Compression oracle attacks
These attacks require (a) mechanism to estimate size of encrypted response and (b) ability to reflect known data within the response. HEIST outlines how you can use TCP’s congestion window to achieve (a). You then need to find a target that satisfies (b)...
Mitigations: all the same as above, plus other precautions like masking your tokens on each request.
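To make "masking your tokens on each request" concrete, here is a minimal sketch of one common way to do it: XOR the secret token with a fresh random pad and send the pad alongside the result. The function names `mask_token` and `unmask_token` are hypothetical and not taken from this thread or from any particular framework; the point is only that the bytes reflected in the response differ on every request, so a compression oracle has nothing stable to match against.

```python
import secrets


def mask_token(token: bytes) -> bytes:
    """Return pad || (token XOR pad), using a fresh random pad each call."""
    pad = secrets.token_bytes(len(token))
    masked = bytes(t ^ p for t, p in zip(token, pad))
    return pad + masked


def unmask_token(blob: bytes) -> bytes:
    """Recover the original token from the output of mask_token."""
    if len(blob) % 2 != 0:
        raise ValueError("masked token must have an even length")
    half = len(blob) // 2
    pad, masked = blob[:half], blob[half:]
    return bytes(m ^ p for m, p in zip(masked, pad))
```

The server would embed the masked value in the page and unmask whatever comes back before comparing it with the stored token. This doubles the size of the token on the wire, which is the price of the per-request randomness.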
Re, Re, fetch resolving on headers: the reason this one is used in HEIST is because it allows the attacker to measure [time to first byte, time to response end] delta and obtain a more accurate answer for whether the body came in within one RTT of header data or if it was split over multiple RTTs. That said, even without this mechanism, the attacker can estimate the RTT by making other requests / observing traffic and then subtract that from total fetch duration to get a similar estimate for whether a request triggered extra RTT due to exhausted CWND. The latter approach is less accurate, but my hunch is that not significantly so once you apply any statistical method (i.e. gather this across multiple responses). On the other hand, just because you know when the header came in also doesn’t mean you can reliably state that a response that is less than the current-CWND will arrive within one RTT—e.g. packet drops and reordering, high BDP between client and server, random buffer delays along the way, etc. So.. shrug, you win some, you lose some, but the exposure is still there regardless. With above in mind... We already knew that timing and guesstimates on response size can be used as a side channel and we have established best practices that app developers should deploy to protect themselves. HEIST is a reminder for why those are important and at least as far as I can tell, if the origin has already taken precautions against BREACH, then HEIST doesn’t add any new surface area. I think that's the main message and takeaway from this exercise. Discussed changes to RT/Fetch won't solve the underlying attacks. They may reduce accuracy in some instances, but (a) it's not clear how much so, and (b) they would introduce significant negative performance implications for the rest of the platform.. Which makes me think that the tradeoff, at least for the options on the table so far, is not worth it. |
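As a rough illustration of the "subtract the RTT and apply a statistical method" estimate described above, the sketch below takes several total-duration samples, uses the median to damp jitter, and rounds the excess over one round trip. The function name and the simplifications are assumptions: it ignores server processing time, packet loss, and reordering, all of which the comment above notes will blur the result in practice.

```python
from statistics import median
from typing import Sequence


def estimated_extra_round_trips(durations_ms: Sequence[float], rtt_ms: float) -> int:
    """Estimate how many round trips beyond the first a response needed.

    Assumes each duration is roughly (1 + extra) * RTT, i.e. server time
    and network jitter are small relative to the RTT.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    if not durations_ms:
        raise ValueError("need at least one duration sample")
    typical = median(durations_ms)  # the median resists occasional slow outliers
    return max(0, round((typical - rtt_ms) / rtt_ms))
```

With a 50 ms RTT, samples clustered near 100 ms suggest one extra round trip, while samples near 50 ms suggest none. When jitter is comparable to the RTT the two cases overlap, which is the accuracy limit mentioned above.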
The reason we didn't consider resolving Knowing the time-to-headers is indeed not a requirement. On connections where the jitter/RTT ratio is small, it shouldn't be too hard to determine whether 1 or 2 RTTs were needed (in contrast to 0 vs 1 in HEIST). As such, just knowing time-to-end-of-response-body will probably be enough. Ideally, I would like to see a general defence where it's simply not possible to perform these attacks, regardless of the possible side-channels that may be exploited. Currently, the only way I see how this would work is by disabling authenticated We are currently exploring a general technique based on leveraging the DEFLATE algorithm to counter BREACH-like attacks. However, simply knowing the length of a resource (regardless of compression) has its own security/privacy consequences (see whatwg/storage#31 for example). |
If we rebooted the web today, any cross-origin communication would require CORS, or be no-credentials like you suggest. I just don't see how we can do that now with 20 years of content depending on current behaviour. |
We could have a header to opt out of no-cors access, similar to https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Frame-Options |
I realise an opt-in isn't ideal, but some kind of |
See my From-Origin header from some years back. We want something that prevents embedding from the wrong origin, including nested browsing contexts, but does not prevent top-level navigation. Note that omitting credentials does not help with intranet resources necessarily. |
https://www.w3.org/TR/from-origin/ Seems like what we're after (if opt-in is the best we can do), but should apply to all requests, not just embedding (maybe this is the intent, the spec talks about embedding a lot). What stopped work on this? |
Yeah, it needs some tweaks, but we should not block top-level navigation. (And maybe we should continue allowing requests without credentials…) Work stopped since nobody wanted to implement it. |
Yeah, probably. I'd like to sort out the whole popups thing a bit more and which browsing contexts end up being in a group together before firmly answering that. But definitely "noopener" has the semantics of creating a top-level rather than auxiliary browsing context. So yeah, maybe reviving that header in some form in Fetch is a good step towards offering some kind of protection against this. Especially now that https://w3c.github.io/webappsec-epr/ is being parked. |
@jakearchibald no, but see my speculation above how it's likely due to CORS. |
(side note: Adam pointed out that I was linking to an outdated draft earlier in the thread, we should be referring to The HTTP Origin Header Field in RFC6454 instead). I asked Adam about the change from "MUST send" in earlier drafts to "MAY send" in the RFC: "the IETF didn't want to publish a spec that made every existing HTTP implementation non-conformant". That said, note that UA requirements section leaves this wide open:
So, change from MUST to MAY is/was a spec-compat issue. The UA is free to send
|
It would need to be on all requests right? If I'm trying to emulate |
Again, I don't think we can send |
@jakearchibald if all UAs sent
|
@igrigorik good point. I was thinking no-referrer would also remove the origin header, but it would make it explicitly null. Happy for So are you suggesting that One less header to vary on. |
Tried to write down what we could recommend to developers with existing mechanisms... Use First Party (FP) cookies for all authenticated content that's not meant to be embedded or accessible cross-origin: "Strict" mode provides strong protection, "Lax" provides reasonable protection (modulo top-level navigations). With FP in place... If Otherwise, if
BigBank.com has authenticated resources that should not be accessible cross-origin. It can either:
Do we need anything else? It seems like combination of FP, Origin, and Referrer might do the trick. |
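A minimal server-side sketch of the Origin/Referrer part of this idea follows. Everything here is an assumption made for illustration: the helper names, the plain-dict header representation with canonical header names, and the fail-closed choice when neither header is present. As discussed earlier in the thread, browsers do not send `Origin` on every request, so a check like this can only complement the cookie-based protection.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def origin_of(url: str) -> Optional[str]:
    """Return scheme://host[:port] for a URL, or None if it has no origin."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}".lower()


def looks_cross_origin(headers: Mapping[str, str], own_origin: str) -> bool:
    """True if the request appears cross-origin or its source is unknown."""
    own = own_origin.lower()
    origin = headers.get("Origin")
    if origin is not None:
        # An explicit "null" origin never matches, so it is treated as cross-origin.
        return origin.lower() != own
    referer = headers.get("Referer")
    if referer is not None:
        return origin_of(referer) != own
    return True  # neither header present: fail closed (an assumption)
```

A server could use the result to decide whether a response is allowed to contain authenticated or secret data, which is the decision this comment is trying to enable.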
We can only give this advice if a) all browsers implement same-site cookies, b) all browsers implement the "new" And again, this does not deal with 1) HTTP authentication, 2) TLS client certificates, 3) firewalled content. |
If we did |
Contrary to what many may believe, I do prefer solutions that avoid minting new headers. Especially so when we're talking about a header that would have to be attached on most every request, which, I think, is what we're looking at here. Hence, me trying to understand if we actually need a new header, or if we can compose a solution out of existing parts.
As far as I can tell, Chrome is already effectively (b), we have (a) on the way, and in this day and age (c) goes without saying... 😎 . Further, both (a) and (b) have existing efforts behind them, so if we can give them a kick through additional use-cases/motivation, then that's a win in my books. Building our own thing will take just as long if not longer. That aside, you're right, I ignored HTTP/TLS auth and firewalled content. It wasn't clear to me how it impacts what we're discussing here.. can you elaborate a bit more? |
You could still do timing attacks on resources that are firewalled or HTTP/TLS authenticated. Basically the same as with cookies. (It's not entirely clear by the way what |
Hmm, well.. you could just drop a first-party cookie on any such resources, right? Same logic.
Yeah, that's fair, you can probably simplify my earlier post to FP + Referrer. Stepping back, we covered a lot of ground in this thread.. My takeaways:
With above in mind, I propose that we resolve this thread and go nudge folks working on FP. I can write up a summary of our discussion here. |
It's still not clear to me when cookies would be sent with "lax" mode. What about:
I don't see how referrer helps here either. If you're vulnerable to GET, you're in a really bad place. |
The behavior is defined in https://tools.ietf.org/html/draft-west-first-party-cookies-07#section-4.3. If that's not sufficient, then we should open a bug against httpwg: https://github.com/httpwg/http-extensions/issues?q=is%3Aopen+is%3Aissue+label%3A6265bis /cc @mikewest @mnot Re, prerender: I'll follow up on the linked issue. |
Updated RT spec (7358cbd) and removed the (broken) TAO-check reference -- see my very first post + linked commit on this thread for details. I believe that's the only actionable change for RT from this discussion.. If anyone else has other suggestions, let me know. |
Seems like
|
I don't understand. |
Nit: We renamed this before shipping, so if/when you write anything up, please try to minimize the confusion. :)
As noted in the bug Jake filed against SameSite, |
@annevk as in, regardless of what authentication mechanism you use (HTTP auth, TLS auth), when you first authorize the user you can just drop a SameSite cookie and then observe if it's echoed in subsequent requests to identify if its origin is same or cross origin. |
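The "drop a SameSite cookie and check whether it is echoed" idea can be sketched with the standard library as below. The cookie name `same_site_marker` and both helper names are hypothetical; the sketch assumes Python 3.8 or later, where `http.cookies` knows the `samesite` attribute, and it assumes the browser implements SameSite cookies at all.

```python
import hmac
from http.cookies import SimpleCookie
from typing import Optional

MARKER_NAME = "same_site_marker"  # hypothetical cookie name


def marker_set_cookie(value: str) -> str:
    """Build a Set-Cookie header value for a strict same-site marker cookie."""
    cookie = SimpleCookie()
    cookie[MARKER_NAME] = value
    morsel = cookie[MARKER_NAME]
    morsel["samesite"] = "Strict"
    morsel["secure"] = True
    morsel["httponly"] = True
    morsel["path"] = "/"
    return morsel.OutputString()


def request_has_marker(cookie_header: Optional[str], expected: str) -> bool:
    """True if the request's Cookie header echoes the expected marker value."""
    if not cookie_header:
        return False
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(MARKER_NAME)
    if morsel is None:
        return False
    # Constant-time comparison, to avoid adding a timing signal of our own.
    return hmac.compare_digest(morsel.value.encode(), expected.encode())
```

The server would send the header once, when the user is first authorized by whatever mechanism, and afterwards treat requests without the marker as cross-site when deciding whether to return sensitive content.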
I'm starting to agree with @mikewest's remarks elsewhere that the opt-in defense is not ideal and unlikely to be deployed. But I'm not sure how we can do better, other than maybe making the opt-in not rely on new cookie infrastructure, but something closer to a boolean. |
While I'm sympathetic to @mikewest's position, I'm absolutely convinced this is not something that the browser caused, contributes to, or can fix at this point. This is, at the core, similar to something like SQL injection - it's a server choice (to compress responses), and using compression - whether over a secure channel or otherwise - leaks status information. While I appreciate @igrigorik's efforts at coming up with a threat model, I think it's utterly futile to suggest that the browser could be in a position so it could make all loads unobservably side-effect free, without the cooperation or knowledge of the server. There are going to be timing leaks throughout the system - whether at the CPU cache layer, at the IO interrupt layer from the NIC, from contention on the network, from congestion windows, from system timers - all of these things outside of the browsers' ken and remit. We simply can't design a system that, under this threat model, is constant-time, and without that aspect of constant-time, we cannot guarantee that the secrets remain secret. So we have to do one of two things - prevent requests from being made that might result in secrets being sent, or help server operators understand the risks of sending secrets. While SameOrigin or Bikeshed or whatever we want can quasi-help with the former, the past two decades of the Web have also taught us that servers are, almost universally, very bad at determining what is or should be secret (case study: The opposition to HTTPS). So even if we were to prevent most (all?) forms of cross-origin credential sharing, there are going to be secrets, and so there are going to be side-channels here. Which leads to the only other option that seems at all practical, which is educating on the server side. 
I think the focus on the "Web Platform Features" enabling this are arguably misguided, because what we're talking about is the ability for constant-time vs variable-time, and anyone who works in that space can tell you how hard it is to make sure it's right. To the extent we could block cookies (and TLS client certs, and whatever H/2+TLS1.3 method the IETF comes up with), great, but I think we know that the risk of breakage is high for any solution, because it's not backwards compatible, and thus our option is "Opt In". Or tell servers to stop supporting compression if they can't do it securely. Or stop advertising in browsers that we support compression. |
Agreed, and my direction here has been to figure out what (if anything) we need to give server operators to make it possible to make an informed decision on their end for whether the response should be allowed to contain secret / authenticated data.
So, to clarify.. The problem, as I see it, is that we allow authenticated cross-origin requests that the origin server can't distinguish from same-origin requests. As a result, the server ends up leaking secrets because existing recommendations (e.g. look at |
A quick summary of our discussion here: https://www.igvita.com/2016/08/26/stop-cross-site-timing-attacks-with-samesite-cookies/ - tl;dr: use SameSite cookies. As outlined in #64 (comment), we are not making any changes to RT or Fetch; closing this thread. Perhaps there are other mechanisms we ought to consider, in addition to encouraging adoption of SameSite cookies, but whatever that may be.. we should take that discussion to the appropriate forum (webappsec group, probably). p.s. thanks everyone for your help and input! |
HEIST paper, ArsTechnica coverage, Twitter discussion
Reading through the paper, the core observation is: TCP congestion control can be (ab)used to infer size of a (cross-origin) resource. Roughly:
Given both of the above, you can compute the time delta between when the headers were received and response end:
If TCP connection is brand new, the above can effectively tell you if response is <14KB. An extended version of this for beyond IW10 is:
Aside, but related, it’s possible to estimate RTT via javascript: if you know the RTT and have a model for what you expect the congestion window to be, you can use that to observe whether the response took >1 RTT and arrive at the same results with a similar padding technique (knowledge of when response headers are received improves accuracy, but is not necessary).
Armed with above, you can apply a compression oracle attack against the origin, a la BREACH. That said, I’m dubious of the claims in the paper about how practical this actually is: tripping over a congestion window boundary doubles said congestion window and that ramps quickly and hence the query rate should be low... Am I missing something here?
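The arithmetic behind the "<14KB" figure and the doubling window can be modelled roughly as follows. This is a simplified sketch under stated assumptions: an MSS of 1460 bytes, an initial window of 10 segments (IW10), pure slow start with the window doubling every round trip, and no loss, TLS overhead, or header bytes. The function names are hypothetical.

```python
from typing import List

ASSUMED_MSS_BYTES = 1460        # typical Ethernet MSS, an assumption
ASSUMED_INITIAL_WINDOW = 10     # IW10, in segments


def round_trips_needed(response_bytes: int,
                       initial_window: int = ASSUMED_INITIAL_WINDOW,
                       mss: int = ASSUMED_MSS_BYTES) -> int:
    """Round trips needed to deliver a response under idealized slow start."""
    remaining = response_bytes
    cwnd = initial_window
    rtts = 0
    while remaining > 0:
        rtts += 1
        remaining -= cwnd * mss  # one full window is delivered per round trip
        cwnd *= 2                # slow start: the window doubles each round trip
    return rtts


def cumulative_window_bytes(round_trips: int,
                            initial_window: int = ASSUMED_INITIAL_WINDOW,
                            mss: int = ASSUMED_MSS_BYTES) -> List[int]:
    """Largest response size that fits within 1, 2, ... round trips."""
    totals = []
    total = 0
    cwnd = initial_window
    for _ in range(round_trips):
        total += cwnd * mss
        totals.append(total)
        cwnd *= 2
    return totals
```

Under these assumptions the first boundary sits at 10 × 1460 = 14,600 bytes, and the next ones at 43,800 and 102,200 bytes. The boundaries spread apart quickly as the window grows, which is the basis for the doubt about the achievable query rate expressed above.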
Mitigations: all existing BREACH recommendations apply.
In terms of practical implications... /cc @annevk @slightlyoff @domenic
Last but not least, scrubbing through RT spec it looks like we introduced a bug in 88bb585?diff=split#diff-eacf331f0ffc35d4b482f1d15a887d3bL543 ~implying that responseEnd is subject to TAO. It's not.. Unless @plehegar had something else in mind here?