Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

request: allow to retrieve a frameID from an <iframe> element #12

Open
therealglazou opened this issue Jun 17, 2021 · 59 comments
Open

request: allow to retrieve a frameID from an <iframe> element #12

therealglazou opened this issue Jun 17, 2021 · 59 comments
Assignees
Labels
enhancement Enhancement or change to an existing feature follow-up: chrome Needs a response from a Chrome representative implemented: firefox Implemented in Firefox implemented: safari Implemented in Safari topic: frame targeting

Comments

@therealglazou
Copy link

therealglazou commented Jun 17, 2021

One of the problems we faced at Dashlane is dealing with frameIDs. It's currently super tricky to associate a frameID to a given <iframe> element and it's even too complicated to retrieve a frameID at all since the only existing way to do that ATM is to send a message, from the inner context to that iframe, to the background, collect the frameID there and send it back to the origin iframe.

I think there should be an easy way to retrieve a frameID for a given iframe element, from the enclosing context of that iframe element. Similarly, from the JS context inside an iframe, there should be an easy way to retrieve the current frameID without having to ping-pong the background.

@therealglazou therealglazou changed the title request: impossible to retrieve a frameID from an <iframe> element request: allow to retrieve a frameID from an <iframe> element Jun 17, 2021
@therealglazou
Copy link
Author

therealglazou commented Jun 18, 2021

Side note: even if <embed> is deprecated, some sites still use it (yeah, fools...). Example: https://www2.rxnt.com/phr/

Getting the frameID for a document embedded through <embed> is even more complicated because it's not possible to postMessage() the inner window.

@dotproto
Copy link
Member

dotproto commented Jun 18, 2021

… it's even too complicated to retrieve a frameID at all since the only existing way to do that ATM is …

Another way to do this is to use the webNavigation.getAllFrames(), but this requires the webNavigation permission. Offhand I'm not sure what permission prompt other browsers display for this, but Chrome warns users that this permission allows extensions to "Read your browsing history."

Can you share a bit more about why the content script needs to know the ID of its host frame? As opposed to, say, the background script using this data.

Exposing a way for the host frame to get it's own frame ID sounds reasonable. We may also want to investigate a more ergonomic way of getting all frame IDs for a given tab than the current webNavigation approach, as that grants access to much more data than necessary to, say, properly target script injection for a given frame.

@therealglazou
Copy link
Author

Can you share a bit more about why the content script needs to know the ID of its host frame? As opposed to, say, the background script using thi data.

We're a Password Manager. When a form contains iframes themselves containing form fields (well-known example visible at https://account.protonmail.com/signup?language=en but many banks do that too), we need to "replace" an iframe in such a form by its inner form fields when we "recognize" forms at the outermost level. Reaching the frameElement is often impossible and a direct postMessage() to the iframe's window can be unreliable if the contents of the iframe rely themselves on messages and do not accept easily third-party messages (well-known example is Office365, ahem...). So one of the only reliable things to associate an iframe element and its inner contents is the frameID. All Password Managers share the same burden.

As of today, of course, we ping/pong the background to get the frameID but that feels like a hack: in short you send a dummy message to the background to grab the originating frameID from the message and send it back. getAllFrames() is hardly usable because of the permission constraints it adds on our webextension. Maybe getAllFrames() should not live inside webNavigation. And even if you can capture the frameID that way, you can capture from within the iframe, it's still not directly associated to the iframe element at the upper level...

@dylanb
Copy link

dylanb commented Jun 21, 2021

100% agree with this suggestion, this change would make it easier to implement secure messaging between content scripts in cross domain iframe environments. It is currently very difficult with the messaging hack mentioned above being one of the ways to do it.

@oliverdunk
Copy link
Member

We face exactly the same problem at 1Password, working on our own password manager. Some sites split username and password fields in login forms across frames, and for the purposes of filling, we need a reliable way of knowing where each is located.

Additionally, some of the UI we display is added to the DOM of the top frame, but positioned based on the location of a field within an iframe. For this, we need to add the offset of the field within the frame to the offset of the frame itself in the page. Doing so requires that we identify the particular iframe we are showing UI for. Workarounds include comparing the size of an iframe with the size of the window within it. Ideally, however, this would be as simple as using the frame ID that sent us the message.

@bershanskiy
Copy link
Member

As of today, of course, we ping/pong the background to get the frameID

Dark Reader does this too.

Maybe getAllFrames() should not live inside webNavigation.

It would be great solution. Perhaps, it can live under tabs? Also, "tabs" permission could gate some information (like URL)like it already does Tab.url, Tab.title, and Tab.faviconUrl.

@dotproto dotproto added the enhancement Enhancement or change to an existing feature label Aug 5, 2021
@xeenon xeenon changed the title request: allow to retrieve a frameID from an <iframe> element request: allow to retrieve a frameID from an <iframe> element Sep 24, 2021
@zombie
Copy link
Collaborator

zombie commented Sep 29, 2021

I tried to design something to address all mentioned use cases, but still keep it minimal to enable quick and easy adoption by other implementations. My assumptions are:

  • Content scripts want frameId of the current frame.
  • Also a way to get frameId for all descendant <iframes>.
  • (probably nice to have) Get frameId for parent and ancestor frames.
  • (desirable because permissions) Avoid the webNavigation namespace.
  • (if possible) Make it a sync method, it would be used for communication between frames on page load.

So here's the proposal:

browser.runtime.getFrameId(target: WindowProxy | HTMLIFrameElement): number

This returns the frameId from the passed Window or container element, or for the current (caller) frame if omitted. Since the Browsing Context subtree already needs to be available in the process, it should be possible to make this a sync method in all implementations.

You can pass the <iframe> directly if you're interested in positioning inside the page, window.parent.parent... to get all ancestor frames, or any descendant WindowProxy from the postMessage() event.source.

One potential issue is that AFAIKT this would be the first extension API that receives a DOM element as an argument. I had to work around a minor issue (for cross-origin values), and it might be a larger issue in other engines.

I've put together a quick prototype, you can test it yourself if you're able to build Firefox locally.

(That patch also includes an alternative design for getFrameInfo I was considering, which enables enumerating all child frames, but it turned out overall more complex and less capable, and it's not clear that's a real use case).

@therealglazou
Copy link
Author

therealglazou commented Sep 30, 2021

  • (probably nice to have) Get frameId for parent and ancestor frames.

Not only "probably nice to have" but absolutely necessary. Let me explain....

  • injected JS code needs to use the various origins of the frames in a page (or worse "*") to postMessage() a parent frame because otherwise the messages will always trigger an origin error.
  • because of that, Website JS cannot make any difference between messages sent by the site itself and messages sent by an extension
  • most websites, including MAJOR ones (Microsoft, Google, etc.) do not make any coherence control on incoming messages, ie, verify that incoming messages are conformant to an expected grammar AND do nothing if they don't. As an example, a major website, a bank I won't name, has a very weird behaviour: it does one thing on incoming string messages it understands, another one on incoming messages being a number and always refresh the page in ALL other cases... So if a WebExtension sends itself, from a frame to another, a JSON, k-boom the website gets a refresh. No, we won't be able to evangelize the whole world on good postMessage() handling practices, that's too late.
  • in other words, postMessage() to communicate between frames is totally unusable in WebExtensions.
  • a way to associate a <frame> or <iframe> element to a frameId goes through finding the index of a frame's window into its window.parent.frames array. BUT window.frames and document.querySelectorAll('frame,iframe') are NOT in sync, contrarily to what developer docs are saying. Frame windows seem to be added to that array at creation time while the other list is in tree traversal order.... It's then completely impossible to reliably use that path.
  • conclusions: 1. only sendMessage() is reliable between frames inside WebExtension code, postMessage() just cannot be used, it breaks websites; 2. it's crucial for each frame to have its frameId and its parentFrameId (communication upwards); 3. it's crucial for extension code to be able, for any arbitrary <frame> or <iframe> element, to get its frameId (communication downwards)

This is doable but only through webNavigation permission (that many extensions are reluctant to ask for given the scary message browsers show) or non-blocking webRequest. In both cases, that's hacky and complicated for something that should be trivial. And we have absolutely nothing to fix the conclusion3 issue ATM.

@zombie
Copy link
Collaborator

zombie commented Sep 30, 2021

you can test it yourself if you're able to build Firefox locally.

I forgot you can actually download builds from our CI infra for Linux, OSX or Windows -- though I only tested the last one, please ping me if something doesn't work for you.

Not only "probably nice to have" but absolutely necessary. Let me explain....

Thank you for the clarification.

I didn't notice that requirement mentioned in above use cases, but while trying to work through them, I realized it's probably important, so thanks for confirming.

@therealglazou
Copy link
Author

therealglazou commented Sep 30, 2021

browser.runtime.getFrameId(target?: WindowProxy | HTMLIFrameElement): number

That would perfectly fit our needs. Thanks a lot @zombie !!!

@xeenon
Copy link
Collaborator

xeenon commented Sep 30, 2021

This works for the Safari team.

@Rob--W
Copy link
Member

Rob--W commented Sep 30, 2021

One point of feedback on the API design.

The parameter being a mixed type and potentially optional may be problematic. Consider the following example:

let frameId = chrome.runtime.getFrameId(window.opener);

The intention behind this code is to get the frameId of the window that opened this page.
However, window.opener can be null, in which case the API would return the current frame.
This unexpected result can potentially cause the extension to use the wrong frameId, and therefore the wrong frame/window.

This could be resolved by disambiguating the parameters, by being explicit about which frame is meant.

For example by always requiring a parameter instead of null (i.e. getFrameId(self) aka getFrameId(window) to get the current frame's windowId).

Or by being even more explicit, taking a dictionary that does not have "unsafe" fallbacks (i.e. overlap between optional parameters and different results). For example:

// Get frameId of current window.
let frameId = chrome.runtime.getFrameId({
  type: "self",
});

let frameId = chrome.runtime.getFrameId({
  type: "windowProxy",
  windowProxy: window, // or parent / opener / top / frames[x] / iframe.contentWindow / event.source, ...
});

let frameId = chrome.runtime.getFrameId({
  type: "htmlElement",
  htmlElement: iframe, // One of: <iframe> / <frame> / <embed> / <object>
});

@therealglazou
Copy link
Author

I wonder if the dual type argument, WindowProxy | HTMLIFrameElement, is really necessary. Since chrome.runtime.getFrameId(frameElement) and chrome.runtime.getFrameId(frameElement.contentWindow) should be strictly equivalent, maybe WindowProxy is enough.

@Rob--W
Copy link
Member

Rob--W commented Oct 1, 2021

I wonder if the dual type argument, WindowProxy | HTMLIFrameElement, is really necessary. Since chrome.runtime.getFrameId(frameElement) and chrome.runtime.getFrameId(frameElement.contentWindow) should be strictly equivalent, maybe WindowProxy is enough.

As mentioned in the example in my last comment, there are more ways to embed a document other than <iframe>. For frames (<iframe> and <frame>), it's indeed possible to get the window with iframe.contentWindow. It's even possible to do the same for <object>. For <embed>, that's not possible. Interestingly, documents from embeds do appear in the window.frames object.

<embed> used to be used to embed plugins in web pages. Plugins can expose a scripting interface, so up until a few years ago, it was not really safe to expose more properties. Now that support for plugins have been dropped from browsers, it may be feasible to add properties such as contentWindow and contentDocument to <embed>, which would indeed make the htmlElement kind of parameter redundant. I filed an issue with a request to add .contentWindow to <embed> at whatwg/html#7140

If that issue gets resolved, then getFrameId only needs to take a window.

@therealglazou
Copy link
Author

Ok, thanks a lot @Rob--W .

@zombie
Copy link
Collaborator

zombie commented Oct 1, 2021

For example by always requiring a parameter instead of null (i.e. getFrameId(self) aka getFrameId(window) to get the current frame's windowId).

Good callout, the <no params to get current frame's id> is a leftover from when the method only worked with EmbedderElements, and since you suggested including WindowProxy, it's not really needed anymore (I've edited the original proposal to make it clearer).

@zombie
Copy link
Collaborator

zombie commented Oct 1, 2021

@dotproto Can you please check with the Chrome team if this would present implementation difficulties, or if they have any other feedback to this proposal?

@theseanl
Copy link

theseanl commented Oct 2, 2021

This discussion sounds similar to what has been done for closed shadow roots. Closed shadow roots are meant to be inaccessible from main world, and it has been requested to allow extensions to access them. It is now available on chrome.dom.openOrClosedShadowRoot from Chrome's side, and Element.openOrClosedShadowRoot from Firefox's side.

I mostly agree with spec writers' rationale for not wanting to expose contentWindow on HTMLEmbedElement. I propose that browser venders should implement a corresponding chrome.dom.contentWindow / HTMLEmbedElement.contentWindow API to allow content scripts to access content window of <embed> nodes.

Previous proposals add unnecessary overhead to the API call just for one edge case: Making API polymorphic will likely require browser's wrapper code to introduce a type check, and requiring consumers to pass a one-off dictionary seems bad for garbage collection.

Moreover, once we have an API for accessing contentWindow of <embed> nodes, developers may find other good uses of it.

For completeness's sake, below is one possible API design that I just came up with.

chrome.dom.contentWindow(element:HTMLIFrameElement | HTMLFrameElement |
    HTMLObjectElement | HTMLEmbedElement): Window | null

interface HTMLEmbedElement {
	contentWindow: Window | null
}

For extending existing HTMLEmbedElement, one may consider using a different property name to signal developers that this is an extension-only API.

@carlosjeurissen
Copy link
Contributor

As mentioned here:
#77 (comment)

@dotproto created a crbug for the proposal here:
https://bugs.chromium.org/p/chromium/issues/detail?id=1256872

@rdcronin
Copy link
Contributor

Chiming in from the Chromium side.

I was able to bring this up with our security team, and there were a few different considerations. First off, to explain a few of our primary considerations:

First, we often try to assume that almost any renderer process can be compromised (and, it turns out, that this is pretty much true : )). Because of these, we try to limit the amount of information and capabilities we give to renderer processes. In particular, it ideally shouldn’t be possible for a compromised renderer process to access or affect other renderer processes.

Second, the boundary between content scripts and the main page is also weaker, since it’s only a JS context boundary (and not a hard process boundary). In either the case of a compromised renderer or just a “pierced” boundary between JS contexts, an attacker can potentially gain access to any power or knowledge that a content script has.

These are the reasons that the vast majority of extension APIs aren’t accessible from a content script context and that we’ve adjusted the cross-origin fetching capabilities of content scripts, and that we are generally opposed to expanding those capabilities.

In this case, though, it’s a bit of a blurred line. For one thing, I can absolutely see the benefit and desire here for extension developers - dealing with frame IDs is painful, and if there’s frame-specific logic you need in content scripts, it’s even more so (and sometimes impractical, if you need it synchronously for a script running at document_start). There's also an argument that, at least for retrieving the current frame's ID, it's not really an "extra" capability (though this is somewhat inaccurate for Chromium; discussed below). From a platform viewpoint, I’m supportive of exposing this information to ease development.

Now, though, we get into the nitty gritty. In Chromium, frame IDs (apart from the main frame, which is always 0) are currently associated with something called the Frame Tree Node ID. This a globally-unique, monotonically-increasing identifier given to a frame. If we expose this information to a compromised renderer, it can give away information about the current state of the browser, as well as being able to detect happenings in other tabs (by adding new frames and seeing the delta from the previous ones). This type of information can then help attackers mount timing attacks, among others. A second risk here is that if an attacker has control over a process and the knowledge of the frame tree node IDs within it, that attacker could try and leverage an inter-process message to the browser targeting different frame tree node IDs that are likely to exist.

It’s true that these are largely only security concerns when one (or frequently, more than one) thing already goes wrong, but we very much strive to have defense-in-depth. We’ve historically seen many vulnerabilities that chain together multiple things going wrong.

Now, for the path forward. On the Chromium side, I think many of these concerns would be assuaged if we had an unguessable identifier associated with frame IDs. I think this is an approach that we could take (I’ll investigate it), but we’d likely block runtime.getFrameId() on that being completed. Even then, there could be a slight (though significantly mitigated) concern around allowing frames to retrieve the unique identifier of other frames in the hierarchy, especially ones that might be cross-origin and/or cross-process; I’m continuing the discussion there to see if, with an extension-specific identifier here, that’s a reasonably low risk.

I’m curious for the other browsers’ implementations here - are frame IDs (apart from the main frame) already random? Or are any of these also concerns for other implementations?

@zombie
Copy link
Collaborator

zombie commented Oct 28, 2021

Thank you Devlin for the Chromium perspective. I think your proposal for using random frame IDs is compatible with the getFrameId() design, and something we could implement if/when we decide to (though it might not be a priority for us, see notes below for mitigating circumstances in Gecko).

Since additional exposure seems marginal (at least for Firefox), and using random IDs would be an implementation-level change transparent to extension, it doesn't look like a blocker for the proposed design. I think we can proceed with shipping this as an experimental API to get it into the hands of developers, to test it and provide any potential feedback.

A second risk here is that if an attacker has control over a process and the knowledge of the frame tree node IDs within it, that attacker could try and leverage an inter-process message to the browser targeting different frame tree node IDs that are likely to exist.

I'm somewhat confused by this. My presumption based on talking to Gecko's site isolation team is that these frame node IDs (or Browsing Context IDs I mentioned in the proposal above) would already be available in each render process, since we already need to support WindowProxy references to each frame in the same subtree (or perhaps even the whole Browsing Context Group), including cross-origin ones. In case of a compromised render process you mention, these IDs would already be in the process's memory and available to the attacker.

Of course this could be mitigated by parent process generating a per-process mapping of frame node IDs, or maybe with something similar to what you're proposing about randomized IDs exposed to extensions, and Chromium might already be doing something along those lines, but at least Gecko isn't (perhaps partially because of the second point below).

I’m curious for the other browsers’ implementations here - are frame IDs (apart from the main frame) already random? Or are any of these also concerns for other implementations?

They are not random in Gecko either, but they're (mostly) monotonically-increasing only per-process. At least for the case you described, a compromised child process wouldn't be able to observe the creation of frames initiated from other processes.

@addeective
Copy link

@sublimator

What did you set as targetOrigin ?
targetOrigin '*' (unfortunately I need to communicate from any origin.)

It's probably not timing issues, if you are getting consistent results. Cross origin restrictions? Do you see any commonality between the pages that don't work?
I'm on a blank src commonality lead at the moment.

@hamax
Copy link

hamax commented Oct 13, 2023

If the iframe has other message listeners, they could be calling stopPropagation/stopImmediatePropagation and preventing your listener from being called. To make it work reliably you would need to make sure your content script runs before anything else on the page (document_start), and use "capture" in the event listener.

@sublimator
Copy link
Member

Yeah, we used document_start iirc

@sublimator
Copy link
Member

I'm on a blank src commonality lead at the moment.

an iframe without a @src set ? curious :)

@addeective
Copy link

@hamax here is my manifest:
{ "matches": [ "<all_urls>" ],
"js": [
'adsfilter.bundle.js'],
"match_about_blank": true,
"all_frames": true,
"run_at": "document_start"
}

@addeective
Copy link

@sublimator I did not introduce the scope of the project but I'm working on a research program which wants to track ads inserted in webpages.
I assume those empty urls iframes are dynamic frames built by some advertisor javascript

@sublimator
Copy link
Member

I was about to say, "Have you tried setting match_about_blank: true", but I see ...
Shrug, try setting up a controlled context with your own blank iframe and see if you can send it a message from the dev tools?

@addeective
Copy link

addeective commented Oct 13, 2023

I was about to say, "Have you tried setting match_about_blank: true", but I see ... Shrug, try setting up a controlled context with your own blank iframe and see if you can send it a message from the dev tools?

@sublimator yes I need to dig deeper thanks!

my life would be so easier if we had the chrome.runtime.getFrameId ready :)

@sublimator
Copy link
Member

I hear ya!

@addeective
Copy link

addeective commented Oct 13, 2023

@oliverdunk any chance you can give a kick to this feature? everybody has it except chrome since quite a while...

@oliverdunk
Copy link
Member

@oliverdunk any chance you can give a kick to this feature? everybody has it except chrome since quite a while...

I'll try to bring it up :)

@addeective
Copy link

@oliverdunk any chance you can give a kick to this feature? everybody has it except chrome since quite a while...

I'll try to bring it up :)

thanks @oliverdunk there are clearly situations where we are stuck getting the frameId with workarounds only.

@AntonKhurs
Copy link

Would be very great if at least getAllFrames would be moved outside from webNavigation. Currently have to take a lot of unnecessary actions to obtain frame id to be able to send a message only to a specific frame without processing same message in other frames. Also with the fact that even if we send a message to all frames - we need to keep in mind that only one response will be processed so we can not just reject or something like that in message handlers.

I'm posting it just in an attempt to draw some more attention to this old issue that makes all of us to use workarounds

@jpc-ae
Copy link

jpc-ae commented Mar 18, 2024

To bump this up, here is my use case:

When a user sends a shortcut command to my extension, no information at all about the active frame/selection/etc... so it's impossible to know what frame to execute a script into. This is understandable behaviour since the shortcut could be hit without anything marked, but in order to handle the request properly I currently need to try and drill down through frames checking for .contentWindow all the way. The problem happens as soon as I hit a cross-origin frame. Since the content script is running inside the page's context, the extension's site permissions do not apply and it's not possible to break through.

What I need is to have the frameID of the element blocking the content script so I can inject directly into the frame instead. The only way of doing that currently (it seems) is to inject message listeners into all frames on a page or Try and spy the frame using webNavigation. The first approach is convoluted and the second is not 100% reliable incase of duplicate frames as well as currently requiring additional permissions that aren't necessary for the extension to function otherwise. I'm currently handling it with optional permissions, but I'd really like to require as few permissions as possible and getting the frameID of cross-origin frames would make that much, much easier. Since even Safari's on board, I don't see why Chrome should lag behind on this.

@sublimator
Copy link
Member

@oliverdunk

I wonder if Google would feel more comfortable with documentId ?

Something along the lines of:

function getTarget(target: Window): Promise<{tabId: number, documentId: string}>

Given it's already a random string? That would likely suffice for most people. It's relatively simple to trade up for a frameId if you really need one.

@sublimator
Copy link
Member

@oliverdunk ?

@sublimator
Copy link
Member

Of course that's only going to work for MV3, but I guess at this point, that's fine?
Apple/Mozilla went ahead with event pages?
Where can I find the status on that?

@Rob--W
Copy link
Member

Rob--W commented Apr 30, 2024

@sublimator To find the permalink, link to the file instead of the pull request. Github had automatically collapsed the diff in the PR because of its large size.

* **[Issue 12](https://github.com/w3c/webextensions/issues/12): request: allow to retrieve a frameID from an &lt;iframe> element**
* [devlin] There were some concerns about exposing the frameId directly to the web page, due to the amount of information captured by the ID. I'd like to follow up with security and follow up.
* [simeon] Issue is being tracked at https://issues.chromium.org/issues/40200404
* [oliver] A recent comment requested documentId as an alternative, which would not have the same issue.
* [oliver] I'll follow up with Chrome Security about this

Of course that's only going to work for MV3, but I guess at this point, that's fine?

Why would this only work with MV3?

Apple/Mozilla went ahead with event pages?
Where can I find the status on that?

What kind of status are you looking for? There is an issue called "Limited event pages" and also another one about neutral background pages, with relevant discussion.

@sublimator
Copy link
Member

Thanks @Rob--W

Thanks for correcting me as it was not my intent to mislead anyone there :) I spoke too carelessly. I was thinking you couldn't use documentId with mv2 apis. I had only checked the chrome.tabs.executeScript api which didn't seem to support it.

chrome.tabs.sendMessage is the important api, and it seems it's more a matter of chrome(or Safari/FF etc) version ? It supports documentId even in MV2, on newer browsers?

https://developer.chrome.com/docs/extensions/mv2/reference/tabs#method-sendMessage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or change to an existing feature follow-up: chrome Needs a response from a Chrome representative implemented: firefox Implemented in Firefox implemented: safari Implemented in Safari topic: frame targeting
Projects
None yet
Development

No branches or pull requests