Captured Surface Control #962

eladalon1983 · 2024-06-04T19:05:28Z

Hello TAG! I am a budding cryptography expert.

I'm requesting a TAG review of Captured Surface Control.

Summary

We introduce a new Web API that allows Web applications to:

- Read and write the zoom level of a captured display surface (tab or window).
- Produce wheel events in a captured tab or window.
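
As a rough illustration of these two capabilities, here is a minimal TypeScript sketch built on an in-memory stand-in for a captured surface. `FakeCapturedSurface`, its method names, and the zoom steps are all assumptions made for this example; the actual interface is whatever the specification linked under Details defines.

```typescript
// Hypothetical in-memory model of a captured tab; this is NOT the real Web API.
type WheelDelta = { deltaX: number; deltaY: number };

class FakeCapturedSurface {
  // Assumed zoom steps in percent; real user agents choose their own.
  private readonly levels: number[] = [50, 75, 100, 125, 150, 200];
  private index = 2; // start at 100%
  scrollX = 0;
  scrollY = 0;

  // Read the current zoom level.
  getZoomLevel(): number {
    return this.levels[this.index];
  }

  // Write the zoom level by stepping to the next supported value, if any.
  increaseZoomLevel(): void {
    if (this.index < this.levels.length - 1) this.index++;
  }

  // Step down to the previous supported value, if any.
  decreaseZoomLevel(): void {
    if (this.index > 0) this.index--;
  }

  // Deliver a wheel event to the surface; in this model it simply scrolls.
  deliverWheel(delta: WheelDelta): void {
    this.scrollX = Math.max(0, this.scrollX + delta.deltaX);
    this.scrollY = Math.max(0, this.scrollY + delta.deltaY);
  }
}

const surface = new FakeCapturedSurface();
surface.increaseZoomLevel();
surface.deliverWheel({ deltaX: 0, deltaY: 120 });
console.log(surface.getZoomLevel(), surface.scrollY); // 125 120
```

The model steps through discrete zoom levels instead of accepting arbitrary values; that is a simplification chosen for the sketch, not a statement about the spec.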

Details

Explainer: https://github.com/screen-share/captured-surface-control/blob/main/README.md
Specification URL: https://screen-share.github.io/captured-surface-control/
User research: N/A
Security and Privacy self-review: https://github.com/screen-share/captured-surface-control/blob/main/questionnaire.md
GitHub repo: https://github.com/screen-share/captured-surface-control
Primary contacts (and their relationship to the specification):
- Elad Alon (@eladalon1983), Google
- Tove Petersson (@tovepet), Google
- Guido Urdaneta (@guidou), Google
Organization(s)/project(s) driving the specification: Google
External status/issue trackers for this feature:
- ChromeStatus: https://chromestatus.com/feature/5092615678066688

Further details:

I have reviewed the TAG's Web Platform Design Principles
The group where the incubation/design work on this is being done (or is intended to be done in the future): Screen Capture Community Group and WebRTC Working Group
The group where standardization of this work is intended to be done: WebRTC Working Group
Existing major pieces of multi-implementer review or discussion of this design: https://www.w3.org/2024/05/21-webrtc-minutes.html
Major unresolved issues with or opposition to this design: N/A
This work is being funded by: Google

martinthomson · 2024-09-03T22:32:55Z

@jyasskin, @hober, and I discussed this today.

Thank you for bringing this to us. We think this seems like a generally useful feature, but we have some questions and suggestions for the explainer:

The explainer should discuss the alternative design of having the page cooperate, and accept dedicated events from the capturing process. We think there are both upsides and downsides to that option that deserve exploration.

The two interactions that are considered are scrolling and zooming. Is that list exhaustive? Are these uniformly safe to do? Are there not occasions where scrolling results in changes to things like form elements? That could require a change of focus before sending the events in, maybe, though with precise X and Y on events, that might still engage the element that is targeted. We're inferring that this is limited to those two actions because "spoofing" those events is safe, but the explainer doesn't give enough details to show that that's true.

There seems to be some heightened permissions UX being contemplated here. It's not clear to us what would be different from a regular screen capture. It would be helpful if the explainer could show a proof of concept that highlights those differences.

eladalon1983 · 2024-10-16T12:58:53Z

Apologies for taking some time here. I'll respond soon.

eladalon1983 · 2024-10-22T14:42:31Z

We think this seems like a generally useful feature

That's great to hear!

The explainer should discuss the alternative design of having the page cooperate, and accept dedicated events from the capturing process. We think there are both upsides and downsides to that option that deserve exploration.

I have now added a discussion of select alternatives to the explainer.

The two interactions that are considered are scrolling and zooming. Is that list exhaustive?

For the time being - yes.

Apple's representative, Youenn, suggested adding pinch. No Web developers have requested this feature, so we are leaving this as a potential extension. But note that the current API shape does not prevent such future extensions.

- We could, in the future, define forwardPinch(element).
- We could, in the future, transition to forwardGestures(element, gestures), where the second argument is a dictionary of relevant gestures.
- Other API shapes would be possible.
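
To make the dictionary idea a bit more concrete, here is one possible sketch in TypeScript. Everything in it is hypothetical: the `Gesture` names, the `GestureOptions` shape, and the `enabledGestures` helper are invented for illustration and are not part of the spec.

```typescript
// Hypothetical shapes for a possible future extension; none of this is specified.
type Gesture = "wheel" | "pinch";

// A dictionary of relevant gestures: omitted keys are treated as "not forwarded".
type GestureOptions = Partial<Record<Gesture, boolean>>;

// Returns the gestures that would be forwarded for a given dictionary.
function enabledGestures(options: GestureOptions): Gesture[] {
  const all: Gesture[] = ["wheel", "pinch"];
  return all.filter((gesture) => options[gesture] === true);
}

console.log(enabledGestures({ wheel: true }));              // [ 'wheel' ]
console.log(enabledGestures({ wheel: true, pinch: true })); // [ 'wheel', 'pinch' ]
```

With a shape like this, supporting a new gesture would mean adding a key to the dictionary instead of adding a method, which is the kind of extensibility meant above.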

Note: We intentionally exclude any interaction like clicking, delivering keystrokes, etc. We have no plans of ever extending the API to cover such gestures.

Are these uniformly safe to do? Are there not occasions where scrolling results in changes to things like form elements?

Web applications can attach any meaning to any user action, and that property is desirable and necessary to retain: the user expects scrolling to work identically when delivered from the capturing application, always, not just when it's a simple scroll. A concrete example is Google Maps, where scrolling changes the region of the map being displayed, triggers the fetching of new assets, etc. Or think of how Apple's main page often uses fancy animations of laptops opening and closing when scrolling.

We believe that this risk is sufficiently mitigated by (1) the pre-existing safeguards associated with screen-sharing to begin with, (2) the additional permission prompt involved, and (3) the steps taken to ensure that only the user's immediate interaction with the capturing application can trigger scroll-forwarding to the captured application.
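
As a simplified model of how these three safeguards combine, consider the TypeScript sketch below. The `ForwardingState` fields and the `mayForwardScroll` function are assumptions made for illustration; they are not terms from the spec, and a real user agent's checks are more involved.

```typescript
// Illustrative model only; the field names are assumptions, not spec terms.
interface ForwardingState {
  captureActive: boolean;        // (1) the user already chose to share this surface
  permissionGranted: boolean;    // (2) the additional permission was granted
  userGestureOnPreview: boolean; // (3) the event stems from the user's own, immediate interaction
}

// Forwarding is allowed only when every safeguard holds at the same time.
function mayForwardScroll(state: ForwardingState): boolean {
  return state.captureActive && state.permissionGranted && state.userGestureOnPreview;
}

// A script-generated scroll (no user interaction) is rejected even with permission.
console.log(
  mayForwardScroll({ captureActive: true, permissionGranted: true, userGestureOnPreview: false })
); // false
```

The point of the conjunction is that no single grant is sufficient: an application that holds the permission still cannot scroll the captured surface on its own initiative.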

That could require a change of focus before sending the events in

Mandating a change of focus could break the experience for the user and subvert their expectation that a scroll delivered on the capturing application's preview tile elicits exactly the same behavior on the captured surface as though it were delivered directly there.

There seems to be some heightened permissions UX being contemplated here. It's not clear to us what would be different from a regular screen capture. It would be helpful if the explainer could show a proof of concept that highlights those differences.

When users are currently asked to grant permission to capture a tab/window/screen, they are used to a specific interpretation. Before elevating this permission to something new (capture plus scroll plus zoom), an additional prompt is required. User agents are free to infer this heightened permission using any heuristic, and may change that based on how user expectations evolve over time. For the time being, Chrome intends to use a run-of-the-mill permission prompt, and to use some extra UX to clarify to the user that this permission is active. This UX is not mandated by the spec, nor do we guarantee that Chrome will retain it.

matatk · 2024-10-30T08:40:24Z

Thank you for your reply and all the info in the Explainer. We discussed this on our breakout today.
We still feel the explainer needs more information on possible abuse cases and a bit more discussion of attack surface. The security considerations section talks about potential confusion, but doesn't talk about how the API could be abused by bad actors. So we recommend a security analysis (and there is a W3C process spinning up for this) but in the mean time if you could bolster the current security considerations doc with some discussion of abuse cases and mitigations that would be great.
As there's a lot going on UI-wise here, we'd really like to see an 'Accessibility considerations' section in the Explainer (it's totally fine to use this section to show what the positives are) - please could you add one? Please also consider requesting a review from the APA WG: https://github.com/w3c/a11y-request/issues/new/choose

eladalon1983 · 2024-10-30T17:38:23Z

We still feel the explainer needs more information on possible abuse cases and a bit more discussion of attack surface.

I have now added a "Security and Privacy Considerations" section in the explainer. It simply links to the corresponding section in the spec, where this information actually lives, so as to avoid duplication.

but in the mean time if you could bolster the current security considerations doc [Emphasis mine - Elad.]

Do I understand correctly, that you are asking for the information already in the spec (this section) to be replicated in questionnaire.md? I think it would be better to go with linking; maybe from section 2.18 to the spec's "Security and Privacy Considerations" section. Wdyt?

As there's a lot going on UI-wise here

Could you please clarify which UI changes you are referring to? As far as I can tell, this spec does not deal with anything UX-related. Although bespoke user agent UX associated with these APIs is possible, this is completely up to the UA's discretion; a spec-compliant implementation is possible even without any additional user agent UX.

To clarify, this mock is of the Web application's possible UX, not the user agent's.

jyasskin · 2024-10-30T19:43:18Z

FWIW, I don't think you should duplicate any information into https://github.com/screen-share/captured-surface-control/blob/main/questionnaire.md. Instead, questionnaire.md should include links to the places in the specification that answer the questions. We should improve the questionnaire and template to say that. I wasn't in the relevant breakout, so I don't want to comment on the other questions.

matatk · 2024-11-20T16:46:46Z

@eladalon1983:

Could you please clarify which UI changes you are referring to? As far as I can tell, this spec does not deal with anything UX-related. Although bespoke user agent UX associated with these APIs is possible, this is completely up to the UA's discretion; a spec-compliant implementation is possible even without any additional user agent UX.

To clarify, this mock is of the Web application's possible UX, not the user agent's.

Totally agree that we generally aim to avoid specifying UI/UX, and ACK that the UI in the example is from the app (and, of course, UI is already covered by WCAG - though I'll come back to that). Let me hopefully clarify...

Whilst a spec may be for a low-level API, products built with the API are often user-facing. Developers building things with the API may not imagine some of the ways users could be using them; it can be helpful to raise awareness of the opportunities, and any risks, and makes sense to do that in the spec itself.

A concrete and helpful example of some big accessibility wins, and some patterns to avoid, can be found in the Compute Pressure API's Accessibility Considerations section. This example is great because it shows how the API can affect users (UI decisions being made based on its output), how this can help users, and also the importance of meeting WCAG while thinking beyond it in a particular domain.

In the case of Captured Surface Control, there is a new avenue through which to interact with the preview, and a new avenue to scroll and zoom the target tab. As an extensive sample of one vision-impaired person, this seems like a helpful thing to me :-). I am not 100% sure how/if focus considerations would come into play (focusing the PiP window is likely out of scope, but would you expect there to be interactive controls floating within it?). It'd be great to read your thoughts on this in the explainer.

APA WG would be happy to follow the development of this API—please consider requesting a review, or tagging APA WG via the "a11y-tracker" label in any issue where you think some input may be of help.

matatk · 2024-12-20T15:52:11Z

Thanks again @eladalon1983 for your review request, and the updates you've made to the explainer. We discussed Captured Surface Control again this week, and are happy to close this review as satisfied.

eladalon1983 added the Progress: untriaged label Jun 4, 2024

plinss added this to the 2024-08-19-week milestone Aug 19, 2024

guidou mentioned this issue Aug 22, 2024

Captured Surface Control WebKit/standards-positions#388

Open

torgo added Venue: WebRTC WebRTC and media capture and removed Progress: untriaged labels Aug 29, 2024

torgo modified the milestones: 2024-08-19-week, 2024-09-02-week Aug 29, 2024

torgo assigned martinthomson and maxpassion Aug 29, 2024

torgo added Focus: API design (pending) Focus: Security (pending) labels Aug 29, 2024

martinthomson added the Progress: pending editor update TAG is waiting for a spec/explainer update label Sep 3, 2024

torgo modified the milestones: 2024-09-02-week, 2024-09-09-week Sep 5, 2024

plinss removed this from the 2024-09-09-week milestone Sep 16, 2024

torgo added this to the 2024-10-07-week milestone Oct 4, 2024

torgo modified the milestones: 2024-10-07-week, 2024-10-14-week Oct 11, 2024

plinss removed this from the 2024-10-14-week milestone Oct 21, 2024

torgo added this to the 2024-10-28-week milestone Oct 25, 2024

plinss removed this from the 2024-10-28-week milestone Nov 4, 2024

jyasskin added Progress: propose closing we think it should be closed but are waiting on some feedback or consensus and removed Progress: pending editor update TAG is waiting for a spec/explainer update labels Nov 5, 2024

plinss added this to the 2024-11-18-week milestone Nov 5, 2024

plinss modified the milestones: 2024-11-18-week, 2024-12-16-week Dec 16, 2024

matatk added Resolution: satisfied The TAG is satisfied with this design and removed Progress: propose closing we think it should be closed but are waiting on some feedback or consensus labels Dec 20, 2024

matatk closed this as completed Dec 20, 2024
