
Detectability of assistive technology #1371

Closed
ShivanKaul opened this issue Dec 16, 2020 · 38 comments

@ShivanKaul

ShivanKaul commented Dec 16, 2020

I reviewed https://raw.githack.com/w3c/aria/2020-09_CR/index.html for privacy issues as requested by the ARIA WG. Currently the document does not have a Privacy Considerations section.

It seems that aria-hidden=true can be used in combination with focus or mousedown/up JS events to detect whether or not the user of a website is browsing using a screen reader. Here's a demo website: https://dylanb.github.io/screenreaderdetection. I believe a website developer could use role=none/presentation similarly.
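For concreteness, here is a minimal sketch of the general class of heuristic such a page can use (this is not the demo's actual code; the ids, styling, and inference logic are illustrative, and as the comments below show the inference is unreliable):

```html
<!-- Visible button; aria-hidden="true" keeps it out of the accessibility
     tree, so an AT user never encounters two copies of the control. -->
<button id="visible-btn" aria-hidden="true">Subscribe</button>

<!-- Probe button: positioned off screen so no pointer can reach it, but
     still focusable and still exposed to assistive technology. -->
<button id="probe-btn" style="position: absolute; left: -9999px;">Subscribe</button>

<script>
  // Illustrative heuristic only. A keyboard user who Tabs to the probe and
  // presses Enter/Space produces a keydown before the click; many screen
  // readers instead dispatch a click-like activation with no preceding
  // keydown. Mouse users can never reach the probe at all.
  const probe = document.getElementById('probe-btn');
  let sawKeydown = false;

  probe.addEventListener('keydown', () => { sawKeydown = true; });
  probe.addEventListener('click', (event) => {
    if (sawKeydown) {
      console.log('guess: keyboard user');          // unreliable in practice
    } else if (event.detail === 0) {
      console.log('guess: possible screen reader'); // unreliable in practice
    }
  });
</script>
```

The essential ingredient is a control that a pointer cannot reach, so the pattern of events arriving on it leaks information about how the visitor navigates.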

I'm curious as to why there is no text around this in the document. Seems like it is a widely-recognized privacy harm [0, 1] and something that could be discussed along with possible mitigations [2] in a Privacy Considerations section.

@cookiecrook
Contributor

cookiecrook commented Dec 16, 2020

My results indicate this is not as straightforward as the demo suggests. It does catch a few real instances, but it's also wrong a lot of the time.

  1. Keyboard Nav on Mac: "keyboard-only user on Apple device, screen reader might be present" (Screen reader was not active on my Mac.)
  2. VO on iOS with default activation: noted as "Screen reader user on OS X"
  3. VO on iOS with pass-through: noted as "mouse/touch user on Apple device"
  4. VO on macOS with default activation: noted as "Screen reader user on OS X"
  5. VO on macOS with click activation: noted as "mouse/touch user on Apple device"
  6. Keyboard Nav on iOS noted as "Screen reader user on OS X" (VoiceOver was off)

I'm pretty sure I could get it to fail in a number of other circumstances and on other platforms, but I think this is enough for now.

This approach is also not limited to ARIA. You can use a similar approach with plain old HTML+CSS and get similar faulty but sometimes correct results.

@jnurthen
Member

Indeed, the linked demo site shows exactly the same issues if you remove the aria-hidden. The only difference is that a very savvy screen reader user might notice that the buttons were doubled up.

This is not to say that there are no potential issues - however, from this demo I'm not convinced that they come from ARIA rather than from HTML and CSS. Indeed, the mitigations discussed in https://bugzilla.mozilla.org/show_bug.cgi?id=988896 have nothing to do with ARIA, and the problem shows itself with a standard HTML button.

@JAWS-test
Contributor

JAWS-test commented Dec 17, 2020

If I understand the test page correctly, the cause is the screen reader, because screen readers convert keyboard events into click events. I have tested Firefox with JAWS and with NVDA, and in both cases the screen reader is recognized correctly. In JAWS, I can configure it so that this event conversion does not take place, and then JAWS is no longer recognized. In NVDA, the problem does not occur when I switch to forms mode, because then the events are not converted either.

In this respect it is a screen reader problem; in my opinion it has nothing to do with ARIA.
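To make that concrete, here is a minimal sketch of the event-conversion check on a plain HTML button, with no ARIA or CSS involved (illustrative only; exactly which events an AT lets through varies by screen reader, mode, and browser):

```html
<button id="plain-btn">Buy now</button>

<script>
  // On an ordinary visible button, note which low-level event preceded the
  // activation click. Screen readers that convert key presses into click
  // events tend to produce a click with neither a preceding mousedown nor
  // a preceding keydown on the button.
  const btn = document.getElementById('plain-btn');
  let lastInput = 'none';

  btn.addEventListener('mousedown', () => { lastInput = 'mouse'; });
  btn.addEventListener('keydown', () => { lastInput = 'keyboard'; });
  btn.addEventListener('click', () => {
    if (lastInput === 'none') {
      console.log('guess: synthesized click, possibly a screen reader');
    }
    lastInput = 'none';
  });
</script>
```

This is also why switching the screen reader into a mode that passes key events through to the page defeats the check.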

@ShivanKaul
Author

@cookiecrook - point taken that the detector script doesn't work for every combination, but from your notes it does seem to identify the default configuration most of the time - I'd imagine that covers most users?

@jnurthen - agreed that aria-hidden would simply prevent the user from even knowing that there was a strange second button when this detection attack was carried out. But having it makes the attack more passive, which I think is not great. Also, I haven't had much time to look into it, but could a potential attack for the screen-reader-vs-keyboard-user case also be to have 3 elements on a page, with the middle one aria-hidden? If there is a focus event on the first and last but not on the middle, aria-hidden one, that would be evidence of a screen reader being used.
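A rough sketch of what that speculative three-element check could look like (purely illustrative; whether a given screen reader ever moves DOM focus to, or past, the aria-hidden element varies by AT and browser):

```html
<a href="#a" id="first">First</a>
<a href="#b" id="middle" aria-hidden="true">Middle</a>
<a href="#c" id="last">Last</a>

<script>
  // If focus is observed on #first and later on #last without ever landing
  // on the aria-hidden #middle, the page guesses that something other than
  // plain Tab navigation (e.g. a screen reader cursor) skipped it.
  const focused = new Set();
  document.addEventListener('focusin', (event) => {
    focused.add(event.target.id);
    if (event.target.id === 'last' &&
        focused.has('first') && !focused.has('middle')) {
      console.log('guess: focus skipped the aria-hidden element');
    }
  });
</script>
```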

In general, the ability for a server to turn off or modify ARIA handling by the browser/screen reader such that the user's behaviour is measurably (by the page) changed is a problem. While I haven't yet had time to fully explore all the possible attacks, it does seem worthwhile to think about them and call out their potential in the ARIA spec.

@JAWS-test
Contributor

JAWS-test commented Dec 17, 2020

@ShivanKaul Everyone has written so far that it has nothing to do with ARIA. The test page takes advantage of a feature of screen readers. The example at http://dylanb.github.io/screenreaderdetection/ would also work in the same way without ARIA

@carmacleod
Contributor

In case it's helpful, here are 2 additional articles [3, 4] I had saved on why detection is a bad idea.

@samuelweiler samuelweiler added the privacy-needs-resolution Issue the Privacy Group has raised and looks for a response on. label Dec 17, 2020
@samuelweiler
Member

@ShivanKaul Everyone has written so far that it has nothing to do with ARIA. The test page takes advantage of a feature of screen readers. The example at http://dylanb.github.io/screenreaderdetection/ would also work in the same way without ARIA

Should this not still be documented here?

@ShivanKaul
Author

The example at http://dylanb.github.io/screenreaderdetection/ would also work in the same way without ARIA

I was under the impression that the aria-hidden is what causes the simulated button to not show up for screen reader users. If so, IMO this is a problem because screen reader users won't even be aware that there is a weird button there. In general, we would prefer attacks to be active (detectable by the user), not passive. FWIW I don't think it's a serious enough issue to hold up publication.

@jnurthen jnurthen added this to the ARIA 1.3 milestone Dec 17, 2020
@jnurthen jnurthen added the 1.3-Blocking Blocking issues for 1.3 WRWD label Dec 17, 2020
@jyasskin
Member

In the PING call today, I suggested that it'd be good to have a Privacy Considerations section in ARIA 1.2 saying something like,

The fact that someone is using assistive technology is sensitive information that would ideally be hidden from webpages. Unfortunately, many aspects of the web platform reveal that information, and many web pages rely on it in order to provide an acceptable experience for people using assistive technology. Within the ARIA spec, aria-hidden, X, Y, and Z can be used to detect that assistive technology is in use.

Or whatever y'all think is accurate.

There's no need to figure out how to make screen readers and non-screen-readers indistinguishable in ARIA 1.2, as that's a much bigger project.

@cookiecrook
Contributor

Within the ARIA spec, aria-hidden, X, Y, and Z can be used to detect that assistive technology is in use.

I would definitely object to this phrasing... It's inaccurate and alarmist.

@cookiecrook
Contributor

cookiecrook commented Dec 17, 2020

By that phrasing, you might also need to say "In the HTML Spec, <button>, <a href>, and features x, y, and z can be used to detect assistive technology. In the CSS spec, the position and visibility properties can be used to detect assistive technology." I don't think that's accurate, but there are observable differences across all aspects of the web platform, and each point might be used to infer a higher or lower probability that "AT might be in use"... Listing the exact heuristics in the specs seems like it would only be useful as a checklist for malicious actors.

Perhaps we should have an ARIA/PING joint meeting to discuss.

@cookiecrook
Contributor

Not to belabor the point, but does mousemove list that it can be used to differentiate left-handed versus right-handed users? Should it? Without a clear path to resolve it, it seems the only benefit to listing it in the spec would be to scare users and give more ideas to malicious fingerprinting engineers.

@cookiecrook
Contributor

FYI @alice @LJWatson @jcsteh

@JAWS-test
Contributor

Within the ARIA spec, aria-hidden, X, Y, and Z can be used to detect that assistive technology is in use.

There would first have to be an example where this is the case. The current example relies on an invisible button being activated by keyboard users with a key event and by screen reader users with a mouse event. Mouse users can't activate it because it was hidden by CSS. ARIA is not in play at all. An accurate warning would read: if you use a screen reader that converts events, it may be discovered that you have a screen reader.

@carmacleod
Contributor

Also FYI @MarcoZehe

@jyasskin
Member

I wouldn't want the spec to say anything that the WG thinks is inaccurate, but I do feel like it's accurate to say that the bits of both ARIA and CSS that make it possible to show different content to screen-reader users vs non-users are potential privacy problems.

It's true that we'd hit diminishing returns in trying to list every feature that can probabilistically distinguish between the two sets of users, and in the long run, I'd like to find a way for the privacy threat model to describe the boundary between "attacks" that are worth blocking vs ones that we can't block.

+1 for a joint meeting to figure out a rough boundary to drive Privacy Considerations sections for specs in the near term.

@jnurthen
Member

@jyasskin I'm assuming you don't want to try to schedule this this year? Early Jan ok?

@jyasskin
Member

I defer scheduling to @pes10k and Christine (whose github handle I've lost), but I also assume not this year.

@JAWS-test
Contributor

JAWS-test commented Dec 18, 2020

but I do feel like it's accurate to say that the bits of both ARIA and CSS that make it possible to show different content to screen-reader users vs non-users

That is not true. It has nothing to do with ARIA or CSS. JavaScript is enough. I don't need to hide the visible button via aria-hidden. There are many other possibilities, e.g. an image of text with alt="" or a CSS graphic that the screen reader does not announce.

Two things are crucial:

  • Screen readers that convert events.
  • JavaScript, which I use to detect that different events have been triggered.

@cookiecrook
Contributor

cookiecrook commented Dec 18, 2020

@JAWS-test wrote:

Two things are crucial: [AT and JavaScript…]

You don't even need JavaScript. You can position a link off screen (most easily with CSS) to prevent mainstream pointer activation, and add tabindex="-1" to prevent non-AT focus activation. This results in a reasonably high confidence that anything activating the link will be either an AT user, or a bot. Though JavaScript certainly makes it even easier.
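A sketch of that script-free variant (the /probe-landed URL is hypothetical; the server simply logs any request to it):

```html
<!-- Positioned off screen, so no pointer can reach it; tabindex="-1"
     removes it from the Tab order, so ordinary keyboard users will not
     land on it either. A screen reader's reading cursor (or a bot
     following links) can still reach and activate it, so any hit on the
     hypothetical /probe-landed URL suggests AT or a crawler. -->
<a href="/probe-landed" tabindex="-1"
   style="position: absolute; left: -10000px;">Special offer</a>
```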

@jyasskin wrote:

I wouldn't want the spec to say anything that the WG thinks is inaccurate, but I do feel like it's accurate to say that the bits of both ARIA and CSS that make it possible to show different content to screen-reader users vs non-users, are potential privacy problems.

Thanks for raising this comment, Jeffrey, and thanks @ShivanKaul for raising the original issue. I agree that there are privacy risks with many features of the Web Platform. As @JAWS-test points out, most of these are related to variance in JavaScript event objects and timing. Although JavaScript events are the core of the issue, they can be used in conjunction with CSS, ARIA, and other HTML features to increase heuristic confidence.

The W3C should shine a light on any risk area, including those related to assistive technology (AT). I think it's reasonable to include some note on privacy risk in the ARIA spec, as long as we're clear that similar notes about AT-specific privacy risk should be added to HTML, CSS, and almost every other W3C spec that defines UI and/or interaction.

I look forward to discussing this more in a joint working group meeting in 2021. Happy New Year!

@LJWatson
Contributor

LJWatson commented Dec 18, 2020

+1 to the suggestion that ARIA include a privacy section, and a suggestion that principle 2.7 from the TAG Web Platform Design Principles be referenced.

@pes10k

pes10k commented Jan 19, 2021

Hi all, just wanted to see if the WG is still interested in a joint call with PING. Just to get a sense of the room, if a call appeals please give a 👍 (or, if you're opposed, a 👎) to this comment. If there is interest, I'm happy to talk with the chairs and figure out a good time for a call. (Thanks for the nudge, @cookiecrook!)

@jnurthen
Member

jnurthen commented Feb 4, 2021

@pes10k The ARIA group would like to propose 9am Pacific / noon Eastern on Thursday, Feb 11 for this discussion.

@ShivanKaul
Author

9 am Pacific Feb 11 works well for me too.

@pes10k

pes10k commented Feb 9, 2021

9am Feb 11 also works for me

@jnurthen
Member

jnurthen commented Feb 9, 2021

OK - I have scheduled a meeting in Zoom.
Topic: ARIA and PING discussion
Time: Feb 11, 2021 09:00 AM Pacific Time (US and Canada)

Join Zoom Meeting
https://mit.zoom.us/j/97458924977

Meeting ID: 974 5892 4977

US : +1 646 558 8656 or +1 669 900 6833

IRC: #aria-dive

@cookiecrook
Contributor

Actions from yesterday's meeting as I recall them:

  • @jyasskin (or another PING member?) to start researching how to store a semi-confidential list of web platform features with AT-detection risk, like security bugs.
  • @jcraig to file an issue against ReSpec/Bikeshed to add a method of marking AT-detection risk areas, similar to the classname for fingerprinting vectors. Note: most of these risk areas are in HTML/DOM/CSS, so this feature would need to work with each of those specs and more.
  • @jyasskin to clarify where he thought this task would live (possibly as an addition to the TAG's Web Platform Design Principles 2.7?) "Describe some 'boundaries' like 'if you just visit a page, there's no way to detect AT. If you click, you can probabilistically detect AT.'" cc @cynthia

@JAWS-test
Contributor

JAWS-test commented Feb 15, 2021

I can think of several methods to determine AT usage - not with 100% certainty, but quite reliably. These do not apply to mobile devices/touch operation, and there are probably other methods. For desktop devices, however, the following should work:

  • All input fields, but no other form fields, receive focus one after the other without being filled in = screen reader with automatic forms mode.
  • All interactive elements are focused one after the other = keyboard user
  • Whole words appear in input fields very quickly = use of speech input (sketched after this list)
  • A keyboard trap cannot be escaped and the application is closed = keyboard user
  • A modal pop-up that does not automatically receive focus is never operated = keyboard user, screen reader user, possibly also a screen magnifier user
  • An interactive element marked with aria-hidden=true is not operated when it receives focus; instead, the element that had focus before it (JAWS) or the element that follows it in the source code (NVDA) is operated = screen reader user (this method can even be used to determine whether someone is using JAWS or NVDA)
  • Certain interactive elements cannot be operated because of wrong ARIA roles = screen reader user
  • Certain interactive elements are operated according to either their ARIA role or their visual representation (where the two contradict each other): operation according to the ARIA role = screen reader user; operation according to the visual representation = sighted keyboard user
  • Activation of interactive elements with the keyboard rather than the mouse = keyboard user
  • Among all keyboard users: navigation is not always linear from the beginning of the page = screen reader user (using quick navigation)
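As an illustration, here is a minimal sketch of the speech-input item above (the field id and the 5-character threshold are arbitrary, and paste or autofill produce the same signal, so this is a weak heuristic at best):

```html
<input id="name-field" type="text">

<script>
  // Dictation tends to insert whole words or phrases in a single input
  // event, whereas typing usually adds one character at a time.
  let previousLength = 0;
  document.getElementById('name-field').addEventListener('input', (event) => {
    const delta = event.target.value.length - previousLength;
    previousLength = event.target.value.length;
    if (delta > 5) {
      console.log('guess: multi-character insertion, possibly speech input');
    }
  });
</script>
```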

@cookiecrook
Contributor

I took an action today to raise this issue with the TAG, and draft a new Privacy section for 1.3 that includes a note referencing the TAG issue.

@jyasskin
Member

jyasskin commented Mar 2, 2021

  • I checked with @samuelweiler, who said the W3C has not maintained a semi-private vulnerability database like this before, but if it would be particularly useful, they're willing to set it up. We should find someone other than me to coordinate actually setting it up: this issue isn't my focus, so I'm likely to drop important messages.

  • I think the "boundaries" mentioned in Detectability of assistive technology #1371 (comment) should go in https://w3cping.github.io/privacy-threat-model/#model and possibly also in the TAG's documents. I'm planning to write a PR to do that for the privacy threat model.

@cynthia
Member

cynthia commented Mar 2, 2021

As per the principle, even the click shouldn't expose this - but presumably that happens today because it is inevitable. Will have to read the backlog first to understand...

@cookiecrook
Contributor

I've filed the Web Platform Design Principles Issue #293 and will reference it in an upcoming PR to add this note to the spec for ARIA 1.3.

@joanmarie
Contributor

@ShivanKaul could you please make a version that is compatible with Linux? :)

@jnurthen jnurthen self-assigned this Mar 22, 2021
@cookiecrook
Contributor

Unassigning myself b/c James Nurthen is taking this for 1.2.

@jnurthen
Member

jnurthen commented Apr 5, 2021

@ShivanKaul @jyasskin @cynthia @pes10k please see the proposed Security and Privacy Considerations section at https://raw.githack.com/w3c/aria/jnurthen/issue1371/index.html#privacy-and-security-considerations.
Please let us know if this is not sufficient.

@jyasskin
Member

jyasskin commented Apr 5, 2021

That looks great to me, with a note that identifying use of AT is worse than just being part of a fingerprint: the fact itself is sensitive information. So, instead of

This content disparity could be used as a device fingerprinting method for users of Assistive Technologies.

maybe something like

This content disparity can be abused to identify [edit: or "determine"] that a user probably is or is not using Assistive Technologies, which is an invasion of their privacy and can be part of a tracking vector.

@cynthia
Member

cynthia commented Apr 6, 2021

The proposed change looks good to me as well. @jyasskin's proposal also looks good!

Are there any takeaways I should add to the principles regarding this? Should we be explicit that our principle primarily targets new work?

@jyasskin
Member

jyasskin commented Apr 6, 2021

@cynthia I think Alice's comment at w3ctag/design-principles#293 (comment) is exactly right, and I think that thread is a better place to discuss the Principles than this one. I don't immediately see a reason that https://w3ctag.github.io/design-principles/#do-not-expose-use-of-assistive-tech would be any more or less targeted at new work than all the other principles.
