request: allow to retrieve a frameID
from an <iframe>
element
#12
Side note: even if Getting the frameID for a document embedded through |
Another way to do this is to use webNavigation.getAllFrames(), but this requires the webNavigation permission. Offhand I'm not sure what permission prompt other browsers display for this, but Chrome warns users that this permission allows extensions to "Read your browsing history." Can you share a bit more about why the content script needs to know the ID of its host frame? As opposed to, say, the background script using this data. Exposing a way for the host frame to get its own frame ID sounds reasonable. We may also want to investigate a more ergonomic way of getting all frame IDs for a given tab than the current webNavigation approach, as that grants access to much more data than necessary to, say, properly target script injection for a given frame. |
We're a Password Manager. When a form contains iframes that themselves contain form fields (well-known example visible at https://account.protonmail.com/signup?language=en but many banks do that too), we need to "replace" an iframe in such a form by its inner form fields when we "recognize" forms at the outermost level. Reaching the frameElement is often impossible, and a direct postMessage() to the iframe's window can be unreliable if the contents of the iframe themselves rely on messages and do not easily accept third-party messages (well-known example is Office365, ahem...). So one of the only reliable ways to associate an iframe element with its inner contents is the frameID. All Password Managers share the same burden. As of today, of course, we ping/pong the background to get the frameID, but that feels like a hack: in short, you send a dummy message to the background to grab the originating frameID from the message and send it back. |
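For readers unfamiliar with the ping/pong workaround described above, here is a minimal sketch. It assumes the usual `runtime.onMessage` listener shape, where the `sender` argument carries a `frameId`. The message type `"get-frame-id"`, the helper names, and `createFakeRuntime` (an in-memory stand-in so the snippet runs outside a browser) are hypothetical and not part of any extension API.

```javascript
// Background side: answer a dummy message with the frame ID of whoever sent it.
function registerFrameIdResponder(runtime) {
  runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message && message.type === "get-frame-id") {
      sendResponse({ frameId: sender.frameId });
    }
  });
}

// Content-script side: send the dummy message and pass the echoed ID to a callback.
function requestOwnFrameId(runtime, callback) {
  runtime.sendMessage({ type: "get-frame-id" }, (response) => {
    callback(response.frameId);
  });
}

// Hypothetical in-memory stand-in for chrome.runtime (NOT a browser API).
// It pretends that every message originates from the frame with the given ID.
function createFakeRuntime(frameId) {
  const listeners = [];
  return {
    onMessage: { addListener: (listener) => listeners.push(listener) },
    sendMessage(message, onResponse) {
      for (const listener of listeners) {
        listener(message, { frameId }, onResponse);
      }
    },
  };
}

const fakeRuntime = createFakeRuntime(7);
registerFrameIdResponder(fakeRuntime);
requestOwnFrameId(fakeRuntime, (frameId) => console.log("own frame ID:", frameId));
```

In a real extension the two halves live in different scripts and the round trip is asynchronous, which is the overhead a direct lookup would remove.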
100% agree with this suggestion, this change would make it easier to implement secure messaging between content scripts in cross domain iframe environments. It is currently very difficult with the messaging hack mentioned above being one of the ways to do it. |
We face exactly the same problem at 1Password, working on our own password manager. Some sites split username and password fields in login forms across frames, and for the purposes of filling, we need a reliable way of knowing where each is located. Additionally, some of the UI we display is added to the DOM of the top frame, but positioned based on the location of a field within an iframe. For this, we need to add the offset of the field within the frame to the offset of the frame itself in the page. Doing so requires that we identify the particular iframe we are showing UI for. Workarounds include comparing the size of an iframe with the size of the window within it. Ideally, however, this would be as simple as using the frame ID that sent us the message. |
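The offset arithmetic mentioned above can be sketched as a small pure function. This is an illustration under stated assumptions, not any vendor's implementation: rectangles are plain `{ left, top, width, height }` objects (for example derived from `getBoundingClientRect()`), the function name is hypothetical, and iframe borders, padding, and CSS transforms are ignored.

```javascript
// Translate a rectangle measured inside a frame into top-frame coordinates
// by adding the offset of every enclosing <iframe>.
// Assumption: borders, padding, and transforms on the iframes are ignored.
function toTopFrameRect(fieldRect, enclosingFrameRects) {
  let left = fieldRect.left;
  let top = fieldRect.top;
  for (const frameRect of enclosingFrameRects) {
    left += frameRect.left;
    top += frameRect.top;
  }
  return { left, top, width: fieldRect.width, height: fieldRect.height };
}

// Made-up numbers: a field at (10, 20) inside an iframe placed at (100, 300).
const fieldInTopFrame = toTopFrameRect(
  { left: 10, top: 20, width: 200, height: 30 },
  [{ left: 100, top: 300 }]
);
console.log(fieldInTopFrame); // left: 110, top: 320
```

The hard part is not the addition but knowing which `<iframe>` element supplies each enclosing rectangle, which is where a frame ID lookup would help.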
Dark Reader does this too.
It would be a great solution. Perhaps it could live under |
I tried to design something to address all mentioned use cases, but still keep it minimal to enable quick and easy adoption by other implementations. My assumptions are:
So here's the proposal:
This returns the You can pass the One potential issue is that AFAICT this would be the first extension API that receives a DOM element as an argument. I had to work around a minor issue (for cross-origin values), and it might be a larger issue in other engines. I've put together a quick prototype; you can test it yourself if you're able to build Firefox locally. (That patch also includes an alternative design for |
Not only "probably nice to have" but absolutely necessary. Let me explain....
This is doable but only through |
I forgot you can actually download builds from our CI infra for Linux, OSX or Windows -- though I only tested the last one, please ping me if something doesn't work for you.
Thank you for the clarification. I didn't notice that requirement mentioned in above use cases, but while trying to work through them, I realized it's probably important, so thanks for confirming. |
That would perfectly fit our needs. Thanks a lot @zombie !!! |
This works for the Safari team. |
One point of feedback on the API design. The parameter being a mixed type and potentially optional may be problematic. Consider the following example:
let frameId = chrome.runtime.getFrameId(window.opener);
The intention behind this code is to get the frameId of the window that opened this page. This could be resolved by disambiguating the parameters, by being explicit about which frame is meant. For example by always requiring a parameter instead of Or by being even more explicit, taking a dictionary that does not have "unsafe" fallbacks (i.e. overlap between optional parameters and different results). For example:
// Get frameId of current window.
let frameId = chrome.runtime.getFrameId({
type: "self",
});
let frameId = chrome.runtime.getFrameId({
type: "windowProxy",
windowProxy: window, // or parent / opener / top / frames[x] / iframe.contentWindow / event.source, ...
});
let frameId = chrome.runtime.getFrameId({
type: "htmlElement",
htmlElement: iframe, // One of: <iframe> / <frame> / <embed> / <object>
}); |
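To illustrate why the dictionary form avoids the "unsafe fallback", here is a rough sketch of how such an argument could be validated. `parseFrameTarget` is a hypothetical helper written for this illustration, not part of any proposed or shipped API. It rejects a missing `windowProxy` (for example a `window.opener` that is `null`) instead of silently treating the call as a request for the current frame.

```javascript
// Hypothetical validator for the dictionary form shown above.
// A missing value is an error; it never falls back to "self".
function parseFrameTarget(target) {
  if (target === null || typeof target !== "object") {
    throw new TypeError("a target dictionary is required");
  }
  switch (target.type) {
    case "self":
      return { type: "self" };
    case "windowProxy":
    case "htmlElement": {
      // The payload lives under a key named after the type.
      const value = target[target.type];
      if (value === null || value === undefined) {
        throw new TypeError(`"${target.type}" must be provided`);
      }
      return { type: target.type, value };
    }
    default:
      throw new TypeError(`unknown target type: ${String(target.type)}`);
  }
}

// Simulates a page whose window.opener is null.
const missingOpener = null;
try {
  parseFrameTarget({ type: "windowProxy", windowProxy: missingOpener });
} catch (error) {
  console.log("rejected:", error.message);
}
```

With a polymorphic, optional parameter the same `null` could be indistinguishable from "no argument", so the caller would get the current frame's ID without noticing.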
I wonder if the dual type argument, |
As mentioned in the example in my last comment, there are more ways to embed a document other than
If that issue gets resolved, then |
Ok, thanks a lot @Rob--W . |
Good callout, the <no params to get current frame's id> is a leftover from when the method only worked with |
@dotproto Can you please check with the Chrome team if this would present implementation difficulties, or if they have any other feedback to this proposal? |
This discussion sounds similar to what has been done for closed shadow roots. Closed shadow roots are meant to be inaccessible from the main world, and it has been requested to allow extensions to access them. It is now available on I mostly agree with spec writers' rationale for not wanting to expose Previous proposals add unnecessary overhead to the API call just for one edge case: Making the API polymorphic will likely require the browser's wrapper code to introduce a type check, and requiring consumers to pass a one-off dictionary seems bad for garbage collection. Moreover, once we have an API for accessing For completeness's sake, below is one possible API design that I just came up with.
chrome.dom.contentWindow(element: HTMLIFrameElement | HTMLFrameElement | HTMLObjectElement | HTMLEmbedElement): Window | null
interface HTMLEmbedElement {
  contentWindow: Window | null
}
For extending existing |
As mentioned here: @dotproto created a crbug for the proposal here: |
Chiming in from the Chromium side. I was able to bring this up with our security team, and there were a few different considerations. First off, to explain a few of our primary considerations: First, we often try to assume that almost any renderer process can be compromised (and, it turns out, that this is pretty much true : )). Because of these, we try to limit the amount of information and capabilities we give to renderer processes. In particular, it ideally shouldn’t be possible for a compromised renderer process to access or affect other renderer processes. Second, the boundary between content scripts and the main page is also weaker, since it’s only a JS context boundary (and not a hard process boundary). In either the case of a compromised renderer or just a “pierced” boundary between JS contexts, an attacker can potentially gain access to any power or knowledge that a content script has. These are the reasons that the vast majority of extension APIs aren’t accessible from a content script context and that we’ve adjusted the cross-origin fetching capabilities of content scripts, and that we are generally opposed to expanding those capabilities. In this case, though, it’s a bit of a blurred line. For one thing, I can absolutely see the benefit and desire here for extension developers - dealing with frame IDs is painful, and if there’s frame-specific logic you need in content scripts, it’s even more so (and sometimes impractical, if you need it synchronously for a script running at document_start). There's also an argument that, at least for retrieving the current frame's ID, it's not really an "extra" capability (though this is somewhat inaccurate for Chromium; discussed below). From a platform viewpoint, I’m supportive of exposing this information to ease development. Now, though, we get into the nitty gritty. In Chromium, frame IDs (apart from the main frame, which is always 0) are currently associated with something called the Frame Tree Node ID. 
This is a globally unique, monotonically increasing identifier given to a frame. If we expose this information to a compromised renderer, it can give away information about the current state of the browser, as well as being able to detect happenings in other tabs (by adding new frames and seeing the delta from the previous ones). This type of information can then help attackers mount timing attacks, among others. A second risk here is that if an attacker has control over a process and the knowledge of the frame tree node IDs within it, that attacker could try and leverage an inter-process message to the browser targeting different frame tree node IDs that are likely to exist. It’s true that these are largely only security concerns when one (or frequently, more than one) thing already goes wrong, but we very much strive to have defense-in-depth. We’ve historically seen many vulnerabilities that chain together multiple things going wrong. Now, for the path forward. On the Chromium side, I think many of these concerns would be assuaged if we had an unguessable identifier associated with frame IDs. I think this is an approach that we could take (I’ll investigate it), but we’d likely block runtime.getFrameId() on that being completed. Even then, there could be a slight (though significantly mitigated) concern around allowing frames to retrieve the unique identifier of other frames in the hierarchy, especially ones that might be cross-origin and/or cross-process; I’m continuing the discussion there to see if, with an extension-specific identifier here, that’s a reasonably low risk. I’m curious for the other browsers’ implementations here - are frame IDs (apart from the main frame) already random? Or are any of these also concerns for other implementations? |
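The "delta" concern and the unguessable-identifier idea can be illustrated with a toy model. This is only a sketch and says nothing about how Chromium actually allocates or would remap IDs: `createSequentialAllocator` and `createOpaqueIdMap` are hypothetical names, and the default token generator assumes an environment where `globalThis.crypto.randomUUID()` is available.

```javascript
// Toy stand-in for a globally increasing internal frame ID.
function createSequentialAllocator() {
  let next = 1;
  return () => next++;
}

// Maps internal IDs to opaque tokens, so an exposed ID reveals nothing about
// how many frames were created in between.
// Assumption: the default generator needs globalThis.crypto.randomUUID.
function createOpaqueIdMap(generateToken = () => globalThis.crypto.randomUUID()) {
  const tokenByInternalId = new Map();
  const internalIdByToken = new Map();
  return {
    // Returns a stable token for the given internal ID, minting one if needed.
    expose(internalId) {
      if (!tokenByInternalId.has(internalId)) {
        const token = generateToken();
        tokenByInternalId.set(internalId, token);
        internalIdByToken.set(token, internalId);
      }
      return tokenByInternalId.get(internalId);
    },
    // Returns the internal ID for a token, or undefined for unknown tokens.
    resolve(token) {
      return internalIdByToken.get(token);
    },
  };
}

// With sequential IDs, an observer learns how many frames appeared elsewhere.
const allocate = createSequentialAllocator();
const firstObserved = allocate(); // observer's own frame
allocate();                       // frame created in another tab
allocate();                       // frame created in another tab
const secondObserved = allocate(); // observer's next frame
console.log("frames created elsewhere:", secondObserved - firstObserved - 1);
```

With opaque tokens the subtraction above has no meaning, and guessing a token that resolves to a live frame becomes impractical.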
Thank you Devlin for the Chromium perspective. I think your proposal for using random frame IDs is compatible with the Since additional exposure seems marginal (at least for Firefox), and using random IDs would be an implementation-level change transparent to extensions, it doesn't look like a blocker for the proposed design. I think we can proceed with shipping this as an experimental API to get it into the hands of developers, to test it and provide any potential feedback.
I'm somewhat confused by this. My presumption based on talking to Gecko's site isolation team is that these frame node IDs (or Browsing Context IDs I mentioned in the proposal above) would already be available in each render process, since we already need to support Of course this could be mitigated by parent process generating a per-process mapping of frame node IDs, or maybe with something similar to what you're proposing about randomized IDs exposed to extensions, and Chromium might already be doing something along those lines, but at least Gecko isn't (perhaps partially because of the second point below).
They are not random in Gecko either, but they're (mostly) monotonically-increasing only per-process. At least for the case you described, a compromised child process wouldn't be able to observe the creation of frames initiated from other processes. |
If the iframe has other message listeners, they could be calling stopPropagation/stopImmediatePropagation and preventing your listener from being called. To make it work reliably you would need to make sure your content script runs before anything else on the page (document_start), and use "capture" in the event listener. |
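A small sketch of the advice above: register the listener first and in the capture phase, so that a later listener calling `stopImmediatePropagation()` cannot suppress it. `installEarlyMessageListener` is a hypothetical helper name. The demonstration uses a bare `EventTarget` in place of `window` so it can run outside a browser, which means it only shows the registration-order effect, not the capture phase across a DOM tree.

```javascript
// Register a "message" listener in the capture phase. If this runs before any
// page script (for example from a content script injected at document_start),
// a later listener that stops propagation cannot prevent it from running.
function installEarlyMessageListener(target, onMessage) {
  target.addEventListener("message", onMessage, { capture: true });
}

// Demonstration on a bare EventTarget standing in for `window`.
const fakeWindow = new EventTarget();
const calls = [];

installEarlyMessageListener(fakeWindow, () => calls.push("extension"));

// A page listener that swallows the event for everyone registered after it.
fakeWindow.addEventListener("message", (event) => {
  calls.push("page");
  event.stopImmediatePropagation();
});

// A listener registered too late never sees the event.
fakeWindow.addEventListener("message", () => calls.push("late listener"));

fakeWindow.dispatchEvent(new Event("message"));
console.log(calls);
```

If the extension's listener were the one registered last, it would be in the position of the "late listener" here and would miss the message.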
Yeah, we used document_start iirc |
an iframe without a @src set ? curious :) |
@hamax here is my manifest: |
@sublimator I did not introduce the scope of the project but I'm working on a research program which wants to track ads inserted in webpages. |
I was about to say, "Have you tried setting match_about_blank: true", but I see ... |
@sublimator yes I need to dig deeper, thanks! My life would be so much easier if we had chrome.runtime.getFrameId ready :) |
I hear ya! |
@oliverdunk any chance you can give a kick to this feature? everybody has it except chrome since quite a while... |
I'll try to bring it up :) |
thanks @oliverdunk there are clearly situations where we are stuck getting the frameId with workarounds only. |
It would be great if at least I'm posting it just in an attempt to draw some more attention to this old issue that makes all of us use workarounds |
To bump this up, here is my use case: When a user sends a shortcut command to my extension, there is no information at all about the active frame/selection/etc., so it's impossible to know what frame to execute a script in. This is understandable behaviour since the shortcut could be hit without anything marked, but in order to handle the request properly I currently need to try and drill down through frames checking for .contentWindow all the way. The problem happens as soon as I hit a cross-origin frame. Since the content script is running inside the page's context, the extension's site permissions do not apply and it's not possible to break through. What I need is to have the frameID of the element blocking the content script so I can inject directly into the frame instead. The only way of doing that currently (it seems) is to inject message listeners into all frames on a page or to try to spy on the frame using webNavigation. The first approach is convoluted and the second is not 100% reliable in case of duplicate frames, as well as currently requiring additional permissions that aren't necessary for the extension to function otherwise. I'm currently handling it with optional permissions, but I'd really like to require as few permissions as possible, and getting the frameID of cross-origin frames would make that much, much easier. Since even Safari's on board, I don't see why Chrome should lag behind on this. |
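The duplicate-frame ambiguity of the webNavigation approach can be shown with a short sketch. It assumes entries shaped like the results of `webNavigation.getAllFrames()` (`frameId`, `parentFrameId`, `url`). The function name and all sample data are made up for illustration.

```javascript
// Given getAllFrames()-style entries, list the direct children of
// `parentFrameId` whose URL equals `url`. More than one hit means the URL
// alone cannot tell which <iframe> element is which.
function findChildFrameCandidates(frames, parentFrameId, url) {
  return frames.filter(
    (frame) => frame.parentFrameId === parentFrameId && frame.url === url
  );
}

// Made-up sample: two sibling frames load the same URL.
const sampleFrames = [
  { frameId: 0, parentFrameId: -1, url: "https://example.com/" },
  { frameId: 12, parentFrameId: 0, url: "https://ads.example.net/slot" },
  { frameId: 13, parentFrameId: 0, url: "https://ads.example.net/slot" },
  { frameId: 14, parentFrameId: 0, url: "https://pay.example.org/form" },
];

const candidates = findChildFrameCandidates(sampleFrames, 0, "https://ads.example.net/slot");
console.log(candidates.map((frame) => frame.frameId)); // two candidates: ambiguous
```

When the filter returns exactly one entry the match is usable; with two or more, some extra signal (size comparison, messaging) is needed, which is the unreliability described above.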
I wonder if Google would feel more comfortable with documentId ? Something along the lines of: function getTarget(target: Window): Promise<{tabId: number, documentId: string}> Given it's already a random string? That would likely suffice for most people. It's relatively simple to trade up for a frameId if you really need one. |
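As a rough sketch of "trading up" from a documentId to a frameId: the helper below assumes that the entries returned by `webNavigation.getAllFrames()` carry both `frameId` and `documentId`, which should be checked against the target browser. `frameIdForDocument` and the sample identifiers are hypothetical.

```javascript
// Look up the frame ID for a document ID in getAllFrames()-style entries.
// Assumption: each entry exposes both `frameId` and `documentId`.
// Returns undefined when no frame currently hosts that document.
function frameIdForDocument(frames, documentId) {
  const match = frames.find((frame) => frame.documentId === documentId);
  return match ? match.frameId : undefined;
}

// Made-up sample entries with made-up document IDs.
const framesWithDocuments = [
  { frameId: 0, documentId: "doc-top" },
  { frameId: 21, documentId: "doc-login" },
];

console.log(frameIdForDocument(framesWithDocuments, "doc-login"));
```

Because a document ID names one specific document, the lookup cannot hit the duplicate-URL ambiguity, though it still needs a round trip outside the content script.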
Of course that's only going to work for MV3, but I guess at this point, that's fine? |
@sublimator To find the permalink, link to the file instead of the pull request. GitHub had automatically collapsed the diff in the PR because of its large size. webextensions/_minutes/2024-03-18-san-diego-meetup.md Lines 265 to 269 in 5869199
Why would this only work with MV3?
What kind of status are you looking for? There is an issue called "Limited event pages" and also another one about neutral background pages, with relevant discussion. |
Thanks @Rob--W for correcting me, as it was not my intent to mislead anyone there :) I spoke too carelessly. I was thinking you couldn't use
https://developer.chrome.com/docs/extensions/mv2/reference/tabs#method-sendMessage |
One of the problems we faced at Dashlane is dealing with frameIDs. It's currently super tricky to associate a frameID with a given <iframe> element, and it's overly complicated to retrieve a frameID at all, since the only existing way to do that ATM is to send a message from the inner context of that iframe to the background, collect the frameID there, and send it back to the originating iframe.
I think there should be an easy way to retrieve a frameID for a given iframe element, from the enclosing context of that iframe element. Similarly, from the JS context inside an iframe, there should be an easy way to retrieve the current frameID without having to ping-pong the background.