
Notify Screenreader Users When Inline Suggestions Or Decorations Available #130565

Closed · hediet opened this issue Aug 11, 2021 · 82 comments
Assignee: hediet
Labels: accessibility, feature-request, on-testplan
Milestone: August 2021

@hediet (Member) commented Aug 11, 2021

This is a follow-up to the feedback given in #129677:
Screen reader users are currently not informed when there are inline suggestions available.

They have to check explicitly whether inline suggestions are available.

We should explore a push-based approach.

However, if there is almost always an inline suggestion (as is the case with Copilot), too many notifications might get spammy very quickly.

A solution here might also be applied to diagnostics and quick fixes.

hediet added this to the August 2021 milestone on Aug 11, 2021
hediet self-assigned this on Aug 11, 2021
isidorn added the accessibility label on Aug 11, 2021
@isidorn (Contributor) commented Aug 11, 2021

In general we need a more powerful hidden TextEdit field to make this happen (to make the editor content richer accessibility-wise). I have tried to make this happen in the past, but was unable to due to the limited API of contentEditable. More details here: #97154 (comment)

I think this is something we will need in general; however, I am not sure if it would be possible with the current DOM APIs.

If that is not possible, then we would need a more specific solution for this particular case (like the status bar updating, or a quick announcement).

fyi @zersiax

@zersiax commented Aug 11, 2021

> If that is not possible, then we would need a more specific solution for this particular case (like the status bar updating, or a quick announcement).

The status bar updating doesn't ping the screen reader afaik, so that wouldn't really help much. What do you mean by a quick announcement exactly?

@isidorn (Contributor) commented Aug 11, 2021

@zersiax that we aria-live announce whenever there is an inline suggestion. Though this might be too noisy...

@zersiax commented Aug 11, 2021

Hmm ... mostly out of curiosity, would the same not work for other ornaments? We'd probably need a preference section to dial in what a person wants aria-live-announced at any given point, but it feels to me that if we can detect suggestions and aria-announce them, we could do similar things for the others, granted that this would be quite noisy if you have it all on at once.

@hediet (Member, Author) commented Aug 12, 2021

What would the aria-live announcement look like? Would it say something like "(Inline) Suggestions available"? Or would it also include the suggestion text?

Maybe we could also explore playing a super short non-speech sound that could possibly be played in parallel to any text-to-speech output?

@isidorn (Contributor) commented Aug 12, 2021

Since this would be noisy, I think it should be as short as possible and not include the suggestion text. That way the user is notified, but if the user wants to know more, Cmd + K, Cmd + I is the way to go. So something like:

suggestions, Cmd + K, Cmd + I

I like the idea of a super short non-speech sound, even though we try not to do this in general and leave it to screen readers. We try to provide the context, and we leave the announcing and producing of sounds to the screen reader. However, I am open to exploring this option.
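
For illustration, a minimal sketch of how such a short announcement could be wired up through a hidden live region. The element id "a11y-status" and the helper below are made up for this example, not VS Code's actual implementation:

```typescript
// Minimal sketch of an aria-live announcement, assuming a hidden
// live-region element. The id "a11y-status" and this helper are
// illustrative only.
function announce(message: string): void {
    let region = document.getElementById('a11y-status');
    if (!region) {
        region = document.createElement('div');
        region.id = 'a11y-status';
        region.setAttribute('aria-live', 'polite'); // announced when speech is idle
        region.className = 'sr-only';               // visually hidden, exposed to AT
        document.body.appendChild(region);
    }
    // Updating the text content of a live region triggers the announcement.
    region.textContent = message;
}

// Keep the announcement as short as possible:
announce('suggestions, Cmd + K, Cmd + I');
```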

@isidorn (Contributor) commented Aug 12, 2021

@joanmarie @LeonarddeR is there some way to trigger audio cues in the screen reader through aria attributes? In this particular case we would like a short "notification sound" to be sent to the user. And there are some other use cases where this could be helpful. Thanks

@joanmarie

@isidorn Not in Orca. But could VSCode itself play sounds? (i.e. why does this need to be done through the screen readers?)

@hediet (Member, Author) commented Aug 12, 2021

> i.e. why does this need to be done through the screen readers?

I think it would be nice if various standardized notification sounds were a feature of screen readers:

  • Standardization would allow for recognizable sounds
  • Those sounds could be customized system-wide (in terms of volume, speed, pitch, etc.)
  • VS Code currently does not play any sounds by itself; I guess we would need to add a lot of dependencies to play sounds just for this use case

@LeonarddeR

Agreed with @joanmarie. Probably best to do this in Code itself. I believe they planned something like that for full Visual Studio as well.

@Neurrone

Agreed that sounds would be the best way to fix this, although I recognize that's more effort than adding an aria alert.

This is, for instance, how NVDA notifies users that there are search results available when typing in the start menu: a super short sound that can be easily understood in parallel to speech.

@isidorn (Contributor) commented Aug 12, 2021

@hediet nicely outlined the benefits of handling this on the screen reader side. Apart from that, we would need a different dependency per platform, and to me it feels like a natural separation between screen reader and app: the screen reader is in charge of producing audio (whether it is speech or sounds). Though this might be just my interpretation.

I have also emailed Dante from Visual Studio, so we can hear what Visual Studio is doing.

@joanmarie

If we were to do it through an aria alert (which is a live region), how would that work? alert and other live regions typically contain text to be spoken and optionally presented in braille. So we'd get an alert with no text and no understanding of why this is the case and should play some sound?

@joanmarie

@isidorn Regarding "need a different dependency based on platform", found this from a quick search: https://code.tutsplus.com/tutorials/the-web-audio-api-adding-sound-to-your-web-app--cms-23790.
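
For reference, a minimal Web Audio sketch of the kind of short notification beep being discussed; the frequency, gain, and duration values here are arbitrary illustration choices:

```typescript
// Play a ~50 ms notification beep via the Web Audio API.
// All values are arbitrary; real cues would be tuned carefully.
function playCue(frequencyHz = 880, durationMs = 50): void {
    const ctx = new AudioContext();
    const osc = ctx.createOscillator();
    const gain = ctx.createGain();

    osc.frequency.value = frequencyHz; // pitch of the cue
    gain.gain.value = 0.1;             // quiet enough to sit under speech

    osc.connect(gain);
    gain.connect(ctx.destination);

    osc.start();
    osc.stop(ctx.currentTime + durationMs / 1000);
    osc.onended = () => { void ctx.close(); }; // release the audio device
}
```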

@isidorn (Contributor) commented Aug 12, 2021

@joanmarie I first wanted to check whether there is already some aria attribute that could be helpful in this case. It seems like there is none. If there were one, let's call it aria-beep, then we could set aria-beep on the live region, or on some element before we pass focus to that element.
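
For illustration, where such a hypothetical attribute would be set (aria-beep is purely made up; no screen reader understands it):

```typescript
// Hypothetical sketch only: aria-beep is the made-up attribute named above,
// not a real ARIA attribute. This shows where it would be set if it existed.
const liveRegion = document.querySelector<HTMLElement>('.hidden-status');
if (liveRegion) {
    liveRegion.setAttribute('aria-beep', 'true'); // hypothetical attribute
    liveRegion.textContent = 'Inline suggestions available';
}
```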

But you are right, doing this using an alert text would be hacky, since live regions typically contain text.

Thanks for the sound API. We could investigate using it in general; there are definitely other use cases on the VS Code side. Though I would love to hear from Dante what Visual Studio is doing.

@zersiax commented Aug 12, 2021

> the screen reader is in charge of producing audio (whether it is speech or sounds).

@isidorn I both agree and disagree with you on this. Screen readers read the screen; that is their function. And you'd be absolutely correct in saying that that should include things like earcons; some screen readers already have this built in up to a point.

On the other hand, these particular earcons (lightbulb, breakpoint, error, suggestion etc.) are entirely unique to VS Code and wouldn't necessarily be used anywhere else.
So then the question becomes: does the screen reader provide the sounds for this arbitrary single application? Or does the application provide its own? And if the latter, why, given only screen reader users would be using them?

I don't think there's a single answer to this question currently; but what we do have is knowledge of screen reader capabilities. I can tell you there is no screen reader-agnostic way to have a screen reader make a particular noise at a particular event. This would essentially need to be custom-built for VS Code in all screen readers separately.
Visual Studio Community has some sounds built in already; these are not screen reader-specific, as they include things like "build complete" etc. They are now looking into expanding these to cover use cases like this, and I think that is a sane way of going about it for this case as well. I'd love to hear people's thoughts on this.

@LeonarddeR

Thinking more about this, while I personally enjoy sounds very much, relying on sounds only excludes deafblind people.

Another approach could be using contentEditable (#97154) and then providing this info as aria annotations in the editable content.

@hediet (Member, Author) commented Aug 12, 2021

In the scenario of implementing sounds in the screenreader (not in VS Code), I think something like this could work really well:

1. Initial state (no status present):

```html
...
<div aria-live="polite" class="hidden-status">
</div>
...
```

2. The user types, inline suggestions become available. The DOM is changed by VS Code/JS to this:

```html
...
<div aria-live="assertive" class="hidden-status">
    <div aria-notification-type="1">
        Inline Suggestions are available
    </div>
</div>
...
```

Instead of reading "Inline Suggestions are available" immediately, the screen reader would recognize the attribute aria-notification-type and play a sound (unique to all elements with aria-notification-type="1") immediately and in parallel to any speech. There is no meaning attached to the number 1; it refers to the first of a finite set of notification sounds.
If the user presses some global hotkey registered by the screen reader and types "1" (because the user recognizes this sound as sound 1), the screen reader could read the text "Inline Suggestions are available" to explain the meaning of it in this context.

A single application should not reuse notification types for different contexts.
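
The page-side half of this proposal could look roughly like the following sketch; again, aria-notification-type is hypothetical, and nothing would interpret it today:

```typescript
// Sketch of how VS Code/JS could flip the hidden status region described
// above. aria-notification-type is the hypothetical attribute from this
// proposal; the sound mapping would live entirely in the screen reader.
function showStatusNotification(region: HTMLElement, type: number, text: string): void {
    region.setAttribute('aria-live', 'assertive');
    const note = document.createElement('div');
    note.setAttribute('aria-notification-type', String(type)); // hypothetical
    note.textContent = text; // spoken only on explicit user request
    region.appendChild(note);
}

function clearStatusNotification(region: HTMLElement): void {
    region.setAttribute('aria-live', 'polite');
    region.textContent = ''; // drop any previous notification
}
```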

Screenreader users could then change settings in the screenreader such as:

  • Should this sound be played in parallel to speech or interrupt it?
  • What sound in particular should be played?
  • Should this notification be announced acoustically or haptically (e.g. by a vibration pattern or by some button that expands)?

There could also be other interesting configurable options, where a default could be indicated through some aria attribute:

  • Should there be a permanent background sound while the notification is present?
  • Should there be a sound when the notification disappears?

> Thinking more about this, while I personally enjoy sounds very much, relying on sounds only excludes deafblind people.

I think this is an argument for having such an aria-notification-type attribute that is processed by an external application. A screen reader with braille support could present this notification haptically rather than acoustically.

Haptic notifications seem to be very much out of scope for VS Code.

> Does the screen reader provide the sounds for this arbitrary single application?

I would say there could be something like 10 different system-wide, well-known notification sounds.
As long as these sounds have a consistent meaning within each app, multiple apps can use them however they like.

> these particular earcons (lightbulb, breakpoint, error, suggestion etc.) are entirely unique to VS Code and wouldn't necessarily be used anywhere else.

They would immediately apply to all IDEs, not just VS Code.
And they could also be reused in other apps for different purposes.

I could imagine that such a status notification could also be useful for chat applications, indicating when a user comes online or goes offline. Also, text editors/forms could use error notifications to point out typos/malformed data.

@isidorn (Contributor) commented Aug 12, 2021

Good points raised by @hediet.
Also, if VS Code started emitting sounds this could "clash" with screen reader sounds. That is another argument why it feels more natural to me that the screen reader handles all audio, because then it can prioritise and merge multiple signals into a nice audio/braille output.

@zersiax commented Aug 12, 2021

@LeonarddeR makes a good point re: deafblind users; sounds wouldn't work for them. Is the aria annotations idea feasible at all?
As for screen readers handling sound ... it would be rather unprecedented. No other app, at all, has this kind of behavior, where the screen reader plays sounds for a particular app. I have seen addons that do this, either OS-wide or, rarely, for a single screen reader within a single app, but essentially we'd need JAWS, NVDA, Orca, Chromevox, VoiceOver, ZDSR, PCTalker and who knows what else to all agree on a way to handle this one application's need for playing sounds.
If we had an aria property to do this, that would be rather easy; just wait until screen readers support it and off you go. But we don't, which means any screen reader that either isn't aware that Code is using this unsupported ARIA property, or is unwilling or unable to add support for it because it's one single use case for one single app, is now unable to make use of an accessibility improvement until the user switches screen readers. To me, even though I can relate to this technically being a screen reader's responsibility up to a point, this feels hostile to users and screen reader developers.

@hediet (Member, Author) commented Aug 12, 2021

> @LeonarddeR makes a good point re: deafblind users; sounds wouldn't work for them.

But for deafblind users VS Code (afaik) cannot do anything without a screen reader, which makes an even stronger point for having an aria attribute that is handled by the screen reader (which can potentially interact with braille devices etc.).

> But we don't, which means any screen reader that either isn't aware that Code is using this unsupported ARIA property, or is unwilling or unable to add support for it because it's one single use case for one single app, is now unable to make use of an accessibility improvement until the user switches screen readers. To me, even though I can relate to this technically being a screen reader's responsibility up to a point, this feels hostile to users and screen reader developers.

I agree.

@joanmarie

Awareness by screen reader developers of this non-existent ARIA feature would require exposure via Chrome/Chromium.

Most screen readers get information about accessible objects via the accessibility tree; they do not directly access elements and their attributes from the DOM.

In other words, just because you put aria-foo='bar' on an element does not mean screen readers can get at that information. It depends on the user agent (what to expose) and, I believe, the accessibility API (how to expose it).

@DanteGagne

Thanks @isidorn for looping me in. I'm the editor owner for Visual Studio (i.e., not Visual Studio Code), and I wanted to chime in on some of the thoughts we're executing from.

Visual Studio uses the Windows system for our audio cues (which can be accessed as the "Sound Control Panel" from the Sound section of Windows Settings). This has a couple of advantages and a couple of disadvantages.

Going through the Windows system gives us a single interface for configuring all of the sounds, lets the user use the built-in Sound Schemes that Windows provides, etc.

It also allows the system to be used by folks who don't use screen readers. In my mind, audio cues fall into the same category as subtitles: designed for one community, but you don't have to be a member of that community to use them.

The most obvious downside is that it's a platform-specific system. A cross-platform application like VS Code simply wouldn't be able to use the same system without providing mechanisms on every supported platform, which is obviously unreasonable.

The other downside is that the audio cues are coming from a different system than the screen reader. As I mentioned, this makes them usable for folks who don't use a screen reader, but now that the two systems are disjoint and not aware of each other, it's entirely possible for the sound to be played over part of what's being read. For instance, if the system is set to make a short beep (say 50ms) when the caret arrives on a line with a breakpoint, then when the caret arrives on that line, the screen reader is going to start reading the line and the beep is going to play. We don't really have a mechanism to control when either event occurs, and there's no guarantee that the beep isn't going to play over some part of the text that's important.

The solution to the second point is to use extremely short audio cues, or to reserve audio cues for operations that are independent of user interaction. In the example I gave, the audio cue for arriving on a line with a breakpoint will ALWAYS occur immediately after the user took some action (navigating to the line in question). The screen reader would also kick in immediately after the user action, so these two events are almost always going to be in conflict. By contrast, an audio cue that indicates when a test pass or a build operation has completed is going to happen at some arbitrary point in time that has a much lower chance of coinciding with a user action.

We've experimented with Live Regions in Visual Studio for audio cues, but at least in the very brief explorations we made, it wasn't consistent how the Live Region and the screen reader hit each other. Sometimes the cue would appear at the beginning of the spoken text and other times after (at least, that's what I remember running into).

I do think audio cues are extremely important and I want to figure out how to get this right. I sincerely think a beep or something should indicate when the caret has arrived on a piece of code that has an adornment on it (e.g. a compilation error or some Quick Action), as opposed to the screen reader having to explicitly say "Code has an error"... but we haven't quite figured out a general enough solution to it yet.

@hediet (Member, Author) commented Aug 12, 2021

> but now that the two systems are disjoint and not aware of each other, it's entirely possible for the sound to be played over part of what's being read

Is this that bad? I can imagine that for some subtle audio cues it should be possible to clearly identify both the audio cue and the spoken text at the same time, even if they are played in parallel (I haven't tested this though).
If you can play text-to-speech and audio cues at the same time without sacrificing clarity, I think error/inline-suggestion cues would be much less disturbing.

@DanteGagne

> > but now that the two systems are disjoint and not aware of each other, it's entirely possible for the sound to be played over part of what's being read
>
> Is this that bad? I can imagine that for some subtle audio cues it should be possible to clearly identify both the audio cue and the spoken text at the same time, even if they are played in parallel (I haven't tested this though).
> If you can play text-to-speech and audio cues at the same time without sacrificing clarity, I think error/inline-suggestion cues would be much less disturbing.

I'm actually not sure how bad it really is. We don't have a lot of users using the audio cues yet, and for a non-screen reader user who has Narrator speaking at 1x speed, it doesn't feel that bad. But how will it work in an actual "I'm doing work, so my CPU is doing 20 different things and my speech is at 6x or higher" situation?

We want to do a prototype and release it in an upcoming version of Visual Studio to get an idea of just how bad it is. But to be clear, this isn't preventing us from moving forward on the prototype. It's just one of the concerns/risks that we've considered while implementing it, and I felt like I needed to at least mention it here.

I'd love it if the problem turns out to not be a problem... but in the context of VS Code, I think the platform-specific issue is the higher order bit.

@hediet (Member, Author) commented Jan 19, 2022

There are 3 different sounds since yesterday: for lines with foldable areas, with errors, and with breakpoints. I think you might have heard the sound for foldable areas.

@isidorn (Contributor) commented Jan 19, 2022

We are in the process of increasing the sound volume, and we are considering whether we should play the folded sound only when the region is actually folded, not merely foldable. More updates in the next couple of builds :)

@Neurrone

Agreed on having the folded sound only play when the region is collapsed; otherwise it fires too often.

Keep up the great work!

@zersiax commented Jan 19, 2022

Agreed, as well. Folded means there's something to be gained from the sound being there, foldable not so much :)

@devinprater commented Jan 19, 2022 via email

@isidorn (Contributor) commented Jan 19, 2022

@devinprater smart question. I think that code in most cases can be folded. It is hard to be in some code area which is not foldable, and I would argue that in those cases the user would know that she or he is in the top-level scope. Though I do not actually use folding, so I would like to hear more feedback.

@zersiax commented Jan 19, 2022

@devinprater like @isidorn says, code can be folded pretty much anywhere and, to my knowledge at least, folding adheres to the block structure of whatever language you're working with. I don't know if you can fold in, say, the middle of a function, but you can fold at the initial line of one, or the initial line of a loop, conditional, class, that kind of thing. I rarely use it, but mostly because it wasn't all that accessible to do so. This might very well change that, as folding can also be really useful for seeing things like brace mismatches a lot more clearly.

@devinprater commented Jan 19, 2022 via email

@isidorn (Contributor) commented Jan 19, 2022

@devinprater cool. Then let's start with that: only play the sound when the region is actually folded. We can fine-tune it in the future. Once we are happy with what we have for these, we can think about word wrap; I agree with you that is a problem we should solve.

Actually, @hediet and @gino-scarpino are doing most of the work, so you should thank them instead :)

@jvesouza

This implementation has greatly increased my productivity. I didn't know the utility of folding parts of the code before; it helps me a lot in navigating code with a screen reader.

Thanks @isidorn, @hediet and @gino-scarpino.
In my opinion you deserve a beer!

@zersiax commented Jan 19, 2022

@isidorn @hediet @gino-scarpino this is a great first effort and helps a bunch; thanks a lot for the work so far.
My feedback, also shared on Twitter, is the following:

  • As was discussed, the "foldable" cue should probably rather be a "folded" cue, or they should be two different cues. But at that point we're getting into making config options for the various sounds, as some may prefer to hear when code is foldable, while others would rather depend on their knowledge of the structure of the code they're working with and not have it trigger too often. The cue would be more useful if you could distinguish between foldable and folded, which you currently cannot, unless I missed something :)
  • I love the brevity of these sounds, pretty much exactly what I hoped for, but the choice of frequency and amplitude has made them easy to drown out by text-to-speech voices, particularly ones that speak within a masculine vocal range. This is not that apparent for the foldable cue because it uses a lot of highs, but all the more so for breakpoint and error, which are particularly difficult to hear with a screen reader talking over them.

So yeah, I'd say bump up the volume of those two cues, and either split up "foldable" into "folded" and "foldable", or get rid of "foldable" entirely and opt for "folded" instead :)

@jareds commented Jan 19, 2022

Following are my thoughts after using this for a day.

  • I like the breakpoint sound. It's particularly useful when I have a debug session that kicks off and takes two or three minutes to hit a breakpoint.
  • The sound for folded and foldable should be split into two.
  • All sounds should have settings options to turn them on and off.
  • Because the foldable sound cannot be disabled, I am switching to the standard version of VS Code instead of Insiders. In Java everything is a block.
  • I would play sounds at the default volume instead of adding a VS Code-specific volume option. If VS Code and the screen reader should have different volumes, this can be done through the normal OS application volume levels.

@devinprater commented Jan 20, 2022 via email

@gino-scarpino commented Jan 20, 2022 via email

@isidorn (Contributor) commented Jan 20, 2022

Great feedback all, thanks.
@zersiax I agree 100% with your suggestion. That is what we plan to do.

@jareds I like your point about leaving volume control to the OS application volume level. This is built in on Windows, and there are apps on macOS that control this. Anyway, we can add volume control later if needed. A setting for each sound might be a bit too fine-grained for me; let's revisit this once we tune the folded sound, which is triggered too often right now.

@devinprater I am not sure. Let's ping some NVDA people so they are in the loop that we are adding sounds @LeonarddeR

@zersiax commented Jan 20, 2022

@devinprater the sounds are extremely brief at present, so I'm not sure what you mean by "drawn out". If you mean "drowned out", the suggestion makes no sense to me, given NVDA's audio ducking would make the sounds quieter, not louder :)

@devinprater commented Jan 20, 2022 via email

@gino-scarpino commented Jan 20, 2022 via email

@devinprater commented Jan 20, 2022 via email

@miguelsolorio (Contributor)

@isidorn I know I'm not the primary user for these sounds, but I do love enabling these. Are there any plans to let users customize which sounds they want? For example, I'd love to turn off the "folding" noises for myself but keep the rest.

@gino-scarpino commented Jan 21, 2022 via email

@isidorn (Contributor) commented Jan 22, 2022

@misolori we can introduce more granular control later if we feel like there are users looking for this. This could just be a new setting value for audioCues (e.g. a JSON bag). For your example: the folding noises are too common, and we are working on changing that now, to only play the sound when a region is folded, not foldable.
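
A hypothetical shape for such a JSON bag in settings.json; these keys are illustrative only and were not a shipped configuration at the time of this discussion:

```jsonc
// Hypothetical per-cue settings; none of these keys existed at the time.
{
    "audioCues.breakpoint": "on",
    "audioCues.error": "on",
    "audioCues.foldedArea": "off"
}
```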

@LeonarddeR

@devinprater wrote:

> I wonder if NVDA ducking can remedy some of the sound being drowned out?

Actually, when NVDA's ducking is enabled, the volume of sounds coming from VS Code is lowered. This also applies to JAWS and Narrator ducking, I suppose.

@MarcoZehe (Contributor)

I wish desktop screen readers were smarter about what they should duck and what not. For example, even with audio ducking enabled, it doesn't make sense to duck system notification sounds, or even the sounds the screen reader itself plays occasionally. On iOS this works great, but on both Windows and Mac, either everything is ducked or nothing is. If screen readers were smarter about that, the setting would not even matter for the kind of system notification sounds VS Code is producing now.

@zersiax commented Jan 25, 2022

@MarcoZehe not ... entirely accurate, I believe. Pretty sure VS Code uses its own thing to play sounds, not so much the same process that, say, the Windows Sound Scheme sounds use. The VS IDE does try to hook into that same system for build complete and such, I think.
As for system notification noises, I can see cases for both ducking and not ducking notification sounds: e.g. if you were to receive a whole bunch of them in rapid succession, I can see that being annoying if you're trying to hear your screen reader, particularly if you have audio processing problems. I think the only way for everyone to have their cake and eat it too is to make ducking configurable on a per-app basis, stick it in a profile or some such. I know of no screen reader that does this, though.

@isidorn (Contributor) commented Jan 27, 2022

Since we have introduced sounds for breakpoints, errors, and folded regions, I will close this issue.
Here's a follow-up issue for next milestone, where we plan to explore introducing more sounds:
#141635

We are really looking forward to feedback and to ideas for other existing features that might benefit from the use of sound.

Thanks all!
