Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Are platform object references common across sandboxes #176

Open
jgraham opened this issue Feb 24, 2022 · 5 comments
Open

Are platform object references common across sandboxes #176

jgraham opened this issue Feb 24, 2022 · 5 comments

Comments

@jgraham
Copy link
Member

jgraham commented Feb 24, 2022

Started discussion offline:
Should the ObjectIDs for JS Node wrapper objects (and other DOM objects) be consistent in different sandboxes?

  • yes, as user can use those nodes both in page realm and in sandbox.
  • no, as the JS wrapper objects are different and pretending they are the same (share the same objectId) seems odd.

Potential solution for that would be excluding DOM instances from the serialization and consider serialization only from the ECMAScript realm perspective. Having that, another way to describe the node is needed.

Another solution is to add something like domID to DOM objects so that they can be referenced when needed across sandboxes.

Implementation-wise in CDP we face the need of somehow merging domains (Runtime (V8) runs evaluate but cannot serialize the result, as it doesn't know of DOM). Some of the CDP implementation approaches can be found in draft document BiDi Serialization in CDP.

Having that, we'd consider split DOM instances from it's JS wrappers, leaving the last in default serialization. And introducing DOM-specific serialization like describeDomObjects or decribeNodes.

Originally posted by @sadym-chromium in #158 (comment)

@jgraham
Copy link
Member Author

jgraham commented Feb 24, 2022

So, as I understand it, the problem is that in CDP a RemoteObjectId is specific to the wrapper object. So if you have something like a Node which can have multiple wrapper objects in different isloated worlds, the calling code has to get an object id, and then know the result is expected to be a node and call Dom.describeNode to get back a NodeId that can then be exchanged for an object id in a different realm by calling resolveNode?

It looks like Puppeteer handles this logic on its own, relying on the fact that it knows which commands are running in an isolate, and which are going to return nodes e.g. waitForSelector, which calls _adoptElementHandle, which does this complex dance.

From a user point of view pushing all this complexity onto end users and client authors seems suboptimal. It requires the user to know when a script can return a node and implement special handling of the return value for that case. Of course that's not easy to do in general (it's presumably the exact same problem that you're running into here). It's arguably a regression in functionality compared to WebDriver-classic where arbitary scripts can return Elements including as part of a larger data structure, without the client having to do anything special (the serialization is very limited, and we aren't providing any details, but you can at least be sure that you have an Element).

I think we should try quite hard to avoid exposing this implementation detail of Blink as part of the protocol. For comparison, whilst Gecko obviously has C++-implemented objects and JS wrappers, we don't have this 1:many relationship so it's actually more work for us to give objects different ids in each sandbox.

@sadym-chromium
Copy link
Contributor

sadym-chromium commented Mar 9, 2022

The scenario I think of is the following:

  1. Get serialized Node from main context
  2. Create Sandbox
  3. Access Node wrapper from sandbox (DOM.describeNode converts between different IDs)
  4. Pass the ID to callFunctionOn as an argument in main context and in sandbox

Thoughts about that:

  1. We want to be able to share Node (especially Element) references between sandboxes
    2. All browsers have JS wrapper objects for anything defined by Web IDL
    UPD: not all.
  2. BiDi spec currently suggests to put JS wrapper and node IDs into a single objectId.
  3. We want to avoid confusion from the user side, when they can share some ObjectIds (Node), and not others (JS Array) across sandboxes. And to avoid confusion of different JS wrapper in different sandboxes having the same IDs.
  4. We (Chromium) want the specification can be implementable based on CDP.

Potential solutions I can see:

  1. Use different format for JS objects ID vs DOM object ID and differentiate them on the server side.
  2. Keep a big ID map with all the DOM wrappers.
  3. Introduce a new field DomID or NodeID, which will be consistent between sandboxes.

Options [1] and [2] tries to improve the usability by hiding the difference between JS and DOM instances. But it can cause more confusion.
Option [3] makes users aware of the difference between DOM instances and it's wrappers, but it seems to be more logical.

@jgraham
Copy link
Member Author

jgraham commented Mar 9, 2022

We want to avoid confusion from the user side, when they can share some ObjectIds (Node), and not others (JS Array) across sandboxes. And to avoid confusion of different JS wrapper in different sandboxes having the same IDs.

Note that this is an implementation detail. In particular gecko doesn't have different wrapper objects per sandbox. Instead the properties you can access are filtered according to the context you access them from (I don't think this is a correct model of the implementation but you can roughly imagine that wrapper properties are double-keyed as (sandbox name, name)). So for us, having to pretend there are different objects ids for different sandboxes would be additional work that wouldn't match the actual implementation.

To me exposing the difference between a DOM id and an Object id seems to be pushing complexity onto users. I'd prefer that we start by just having an object id that (for DOM objects, but not JS objects) is useful across sandboxes, and if we later need to seperate out the wrapper object we give that a seperate id.

@css-meeting-bot
Copy link
Member

The Browser Testing and Tools Working Group just discussed Are platform object references common across sandboxes.

The full IRC log of that discussion <AutomatedTester> topic: Are platform object references common across sandboxes
<AutomatedTester> github: https://github.com//issues/176
<AutomatedTester> jgraham: THis is a continuation of a discussion that started in the editorial meeting
<AutomatedTester> ... if I have a scrip that executes in a sandbox and it reutnrs a DOM node you will get an object ID
<AutomatedTester> ... what happens when we try access that object ID in a different sandbox?
<AutomatedTester> ... Implementation wise: the way that is implmented in sanboxes in blink/webkit they create a wrapper object
<AutomatedTester> ... and the object id is linked to a wrapper object
<AutomatedTester> ... gecko is different
<AutomatedTester> ... I don't think that thinking about this in implementation properties is correct, we should focus on what our users want
<AutomatedTester> ... we could have a different wrapper for each object id and just try make it work in gecko
<AutomatedTester> ... [ describes DOM nodes being used across sandboxes in CDP]
<AutomatedTester> ... from a user point of view its a lot work that doesn't match how webdriver tries to support users
<AutomatedTester> ... one proposal: we could return 2 ID, the sandbox ID and a node ID . THe node ID is constant across sandboxes
<foolip> q+
<AutomatedTester> ... I guess... to what extent do we want to expose to our users. [describes sharing NodeID in different ways]
<AutomatedTester> ack foolip
<sadym> q+
<AutomatedTester> foolip: question about wrapper objects in gecko? If you implement serialisation in gecko... do you use the wrapper object id or...
<AutomatedTester> ... in v8 we would need to treat certain objects differently. If a wrapper goes out of scope it will be GCed
<AutomatedTester> jgraham: I am not sure about this in gecko but I don't think it's a problem for us
<AutomatedTester> ... I would be surprised if this was an issue for gecko as we would have hit it with Marionette in the past
<AutomatedTester> foolip: how does it work? e.g. never GC wrapper objects and then just use the underlying object?
<AutomatedTester> jgraham: the relevant point here is that Marionette is written in JS so it's not like we have c++ code that gives us extra info. We only have access to the wrappers
<AutomatedTester> q?
<AutomatedTester> ack sadym
<AutomatedTester> sadym: what happens when the user tries to access an object thats not in the sandboxx? we error?
<AutomatedTester> jgraham: I am not sure how this would work in gecko. It's a great question
<AutomatedTester> ... if it's not a platform object you shouldn't be able to move between JS sandboxes
<AutomatedTester> ... [describes objects that can/can't be shared]
<AutomatedTester> sadym: [describes objects that can/can't be shared]
<AutomatedTester> jgraham: as long as people can understand which different objects, via properties, can be shared and if it's not a platform object then it can't be shared
<AutomatedTester> ... if we were just returning an opaque handle to an object, like CDP, then you can't just share this between objects
<AutomatedTester> sadym: Our suggestion here is to add an exxtra field to explain what type of object is being returned
<AutomatedTester> ... another question: what it make a difference if we use node id that can be shared between all sandboxes while object id would be to that one sandbox
<AutomatedTester> jgraham: I think that would but I think it might create some invariance to users
<foolip> q+
<AutomatedTester> ... [jgraham describes window proxy needing both]
<AutomatedTester> ... and my main concern is that if we only returned opaque handles
<AutomatedTester> ... if we are going to add properties then that would be good
<AutomatedTester> whimboo: I haven't looked at this for a while so would need to check
<AutomatedTester> sadym: so to check to add other properties is what you want?
<patrickangle> q+
<AutomatedTester> jgraham: yes, that way you won't need out of band info
<AutomatedTester> q?
<AutomatedTester> ack foolip
<brwalder> q+
<AutomatedTester> foolip: on the window situation, they are unique as they are global in nature
<AutomatedTester> ... about the 2 fields proposal. That seems good but what guarantees would we want to make sure they are differentiable?
<AutomatedTester> jgraham: we should make it so that are guaranteed to be different. We need to check gecko but at worst case we can have double maps to do this
<AutomatedTester> q?
<jgraham> s/doubble maps/double keyed object maps/
<AutomatedTester> ack patrickangle
<AutomatedTester> patrickangle: I am not sure we need two ideas, we could just have a boolean to say this would survive sandboxes. that way people wouldn't need to try figure out which id is important to them
<AutomatedTester> sadym: We have some fields that ready, e.g. window proxy. In blink it would make it hard for users to see the differences
<AutomatedTester> jgraham: I don't mind either solution or a wrapper id. I think we just need a way to help users
<AutomatedTester> q?
<AutomatedTester> ack brwalder
<AutomatedTester> brwalder: I don't think we have discussed this yet but we have generally agreement about what is moved about
<AutomatedTester> ... how do we then pass the returned object to `executeScript`?
<AutomatedTester> sadym: there are a few approaches
<jgraham> q+
<AutomatedTester> ... we check if the object id then use it if it is there
<AutomatedTester> foolip: we just pass back the serialised object back
<AutomatedTester> brwalder: so sadym, to make sure that I understand we just check if its in the sandbox and use it if there
<AutomatedTester> sadym: yes
<AutomatedTester> q?
<AutomatedTester> ack jgraham
<AutomatedTester> jgraham: I understood the question differently, e.g. what would the API would look like
<brwalder> q+
<AutomatedTester> ... we have 1) you pass in UUID and check the node/object ID
<AutomatedTester> ... 2) we do what CDP does re: object ID and you pass what you need in
<AutomatedTester> ... for simple objects its good but it can get very complex
<AutomatedTester> ... 3) we have a type field and it describes the node
<AutomatedTester> ... I don't have a preference
<AutomatedTester> q?
<AutomatedTester> ack brwalder
<AutomatedTester> brwalder: I think agree with you. I prefer options 2/3 of what you said
<AutomatedTester> ... 1 has a potential perf hit
<AutomatedTester> ... and option 2 is better than option 3
<sadym> q+
<AutomatedTester> jgraham: the only time would be 3 better than 2 is that matches the rest of the spec
<AutomatedTester> ack sadym
<AutomatedTester> sadym: One of the options between 2/3 is that we make sure what is returned to the user and what they send back can be serialised
<AutomatedTester> jgraham: I think it would work for everything unless it's a node id
<AutomatedTester> ... I think we're always going to need to construct new objects
<AutomatedTester> sadym: if we have a node id return that else a object id
<AutomatedTester> q?
<AutomatedTester> actions: A PR for the spec change on this can be created

@sadym-chromium
Copy link
Contributor

sadym-chromium commented Mar 9, 2022

I can make the serialization/deserialization spec part regarding what we just discussed

UPD: Define CrossSandboxReference deserialization #180

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants