HTML page lang and xml:lang match - problems with assumptions and applicability [5b7ae0] #1921

Closed
dd8 opened this issue Sep 15, 2022 · 5 comments · Fixed by #2086
@dd8
Collaborator

dd8 commented Sep 15, 2022

This is a follow-on issue for #1172.

We've done a lot of testing around lang and xml:lang, and there seem to be problems with both the rule assumptions and applicability.

Historically, there was a lot of spec churn over the use of lang vs xml:lang, which must have caused many implementation inconsistencies, but this was resolved in HTML 5 in 2014 and implementations are now very consistent. The inconsistencies between Chromium and other browsers for pages served as application/xhtml+xml were resolved in Chrome 88, and all browsers now behave identically. Notably, Chrome 88 was released in January 2021, after this rule was created.

I think the key piece of text in the rule is:

Since most assistive technologies will consistently use lang over xml:lang when both are used, violation of this rule may not necessarily be a violation of WCAG 2. Only when there are inconsistencies between assistive technologies as to which attribute is used to determine the language does this lead to a violation of SC 3.1.1.

Rule problems

  1. We weren't able to find any inconsistencies in AT voicing for pages served as text/html, but did find inconsistencies in pages served as application/xhtml+xml documents. Unfortunately the rule only applies to text/html so I think it's never detecting true positives, and only flagging false positives.

  2. The rule assumes that fr, fr-FR and fr-CA are interchangeable and all voice as French. This isn't true for NVDA with the default OneCore synthesiser (see below).
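To make point 2 concrete, here is a minimal Python sketch of a comparison that treats fr, fr-FR and fr-CA as interchangeable by looking only at the primary language subtag. The function names are hypothetical and this is an illustration of the assumption, not the rule's actual implementation.

```python
def primary_subtag(tag: str) -> str:
    """Return the lowercased primary language subtag, e.g. 'fr-CA' -> 'fr'."""
    return tag.strip().split("-", 1)[0].lower()


def lang_attributes_match(lang: str, xml_lang: str) -> bool:
    """Compare only the primary subtags, ignoring region and other subtags.

    Hypothetical helper: it encodes the assumption that all regional
    variants of a language are interchangeable.
    """
    return primary_subtag(lang) == primary_subtag(xml_lang)


if __name__ == "__main__":
    print(lang_attributes_match("fr", "fr-CA"))  # same primary subtag
    print(lang_attributes_match("fr", "de"))     # different primary subtag
```

Under this comparison lang=fr and xml:lang=fr-CA count as matching, even though the subtag tests described below show that they are not always voiced the same way.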

text/html documents

  • all current screen readers use the lang attribute for voicing when both lang and xml:lang are specified (matches the HTML spec)
  • the :lang() selector in browsers matches the lang attribute when both lang and xml:lang are specified (matches the HTML spec)

There are no inconsistencies in assistive technologies for documents served as text/html. Test results here:
https://www.powermapper.com/tests/screen-readers/content/html-page-lang-with-xml-lang/

application/xhtml+xml documents

  • all current screen readers use the lang attribute for voicing when both lang and xml:lang are specified (this doesn't match the HTML spec)
  • the :lang() selector in browsers matches the xml:lang attribute when both lang and xml:lang are specified (matches the HTML spec)

Test results here:
https://www.powermapper.com/tests/screen-readers/content/xhtml-page-lang-with-xml-lang/

This means the language used for voicing may not match the CSS language, leading to the wrong voice being used for content displayed using the :lang() selector.

This example displays 'Un, deux, trois' and voices it correctly as French when served as text/html, but displays 'Eins, zwei, drei' voiced as French when the page is served as application/xhtml+xml:

		<html xmlns="http://www.w3.org/1999/xhtml" lang="fr" xml:lang="de">
		<head> 
			<title>Test for mismatching lang and xml:lang</title>
			<meta charset="utf-8"/>
			<link rel="stylesheet" href="SR-content-lang.css"/>
			<style>
				div:lang(fr)::before { content: "Un, deux, trois"; } 
				div:lang(de)::before { content: "Eins, zwei, drei"; } 
			</style>
		</head>
		<body>
			<h1 lang="en">Following elements inherit page language - hover to view CSS :lang()</h1>

			<p>garage</p>
			<p>double</p>
			<p>dame</p>
			<div></div>
		</body>
		</html>
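
The behaviour reported for the two MIME types can be summarised in a small Python model. This is only a sketch of the test observations above for the case where both attributes are present; the names `ObservedLanguages` and `observed_languages` are hypothetical, and it does not reflect how any browser or screen reader is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedLanguages:
    css_lang: str     # language matched by the CSS :lang() selector
    voiced_lang: str  # language screen readers used for voicing in the tests


def observed_languages(mime_type: str, lang: str, xml_lang: str) -> ObservedLanguages:
    """Model the reported behaviour when BOTH lang and xml:lang are set."""
    if mime_type == "text/html":
        # :lang() and screen readers both follow the lang attribute.
        return ObservedLanguages(css_lang=lang, voiced_lang=lang)
    if mime_type == "application/xhtml+xml":
        # :lang() follows xml:lang, but screen readers still follow lang.
        return ObservedLanguages(css_lang=xml_lang, voiced_lang=lang)
    raise ValueError(f"unsupported MIME type: {mime_type}")


if __name__ == "__main__":
    for mime in ("text/html", "application/xhtml+xml"):
        print(mime, observed_languages(mime, lang="fr", xml_lang="de"))
```

For the example page (lang="fr", xml:lang="de"), the model gives matching CSS and voicing languages for text/html, and German CSS content voiced as French for application/xhtml+xml.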

Subtag matching

Most screen readers will voice lang=fr, lang=fr-FR and lang=fr-CA as French if a French voice is installed. The exception is NVDA using the default OneCore speech synthesiser, which voices lang=fr and lang=fr-FR as French if the French (France) language pack is installed, but voices lang=fr-CA as English unless the French (Canada) language pack is installed. The same thing happens with lang=de, lang=de-DE and lang=de-AT and the language packs for German (Germany) and German (Austria).

Test results here:
https://www.powermapper.com/tests/screen-readers/content/html-lang-subtags/

If you have a French (France) language pack installed, you can hear this happening in NVDA on https://www.canada.ca/fr.html if you use dev tools to change lang=fr to lang=fr-CA.

The problem doesn't happen with the legacy NVDA eSpeak synthesiser which maps all fr- subtags to the same robotic French voice.
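
The difference between the two synthesisers can be sketched as a toy voice-selection model in Python. Everything here is an assumption made to reproduce the test results described above: the function names, the `DEFAULT_REGION` table, and the `en-US` fallback are hypothetical and are not taken from NVDA, OneCore or eSpeak.

```python
# Assumed default region for a bare language tag, chosen to match the results above.
DEFAULT_REGION = {"fr": "fr-FR", "de": "de-DE"}


def normalise(tag: str) -> str:
    """Normalise 'FR-ca' to 'fr-CA' (only language and region subtags handled)."""
    parts = tag.strip().split("-")
    language = parts[0].lower()
    if len(parts) > 1:
        return f"{language}-{parts[1].upper()}"
    return language


def voiced_language(lang: str, installed_packs: set[str], synthesiser: str,
                    fallback: str = "en-US") -> str:
    """Return the voice this toy model would pick for a lang value."""
    tag = normalise(lang)
    primary = tag.split("-")[0]
    if synthesiser == "espeak":
        # Legacy behaviour: every regional variant maps to one voice per language.
        return primary
    if synthesiser == "onecore":
        # Exact language pack required; otherwise fall back to the default voice.
        wanted = tag if "-" in tag else DEFAULT_REGION.get(primary, tag)
        return wanted if wanted in installed_packs else fallback
    raise ValueError(f"unknown synthesiser: {synthesiser}")


if __name__ == "__main__":
    packs = {"fr-FR"}
    for value in ("fr", "fr-FR", "fr-CA"):
        print(value, voiced_language(value, packs, "onecore"),
              voiced_language(value, packs, "espeak"))
```

With only the French (France) pack installed, the OneCore branch returns the fallback voice for fr-CA, while the eSpeak branch returns the same French voice for all three values.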

@Jym77
Collaborator

Jym77 commented Sep 21, 2022

I've been crunching some numbers on our customer data.
For us:

  • The rule is Applicable to 6% of the pages we check.
  • The rule fails for 0.2% of the applicable pages
  • The rule fails for 0.01% of all the pages we check.
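
As a quick consistency check on these three figures, the overall failure rate should be the product of the first two. A minimal Python sketch (variable names are illustrative):

```python
from fractions import Fraction

applicable_share = Fraction(6, 100)  # rule is applicable to 6% of checked pages
failure_share = Fraction(2, 1000)    # rule fails for 0.2% of applicable pages

# Share of ALL checked pages on which the rule fails.
overall = applicable_share * failure_share

print(f"{float(overall):.3%}")  # 0.012%
```

The product is 0.012%, which rounds to the 0.01% quoted above.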

Given the low number of failures this rule catches in real life, and the problems it seems to be consistently causing, I'm in favour of just deprecating it and no longer spending any effort on it…

I'm curious to see whether other tool vendors are seeing similar numbers or greatly different ones 🤔 But if we all agree that the rule catches a problem on a few hundredths of a percent of all pages, I think it is not worth a lot of effort to maintain…

@dd8
Collaborator Author

dd8 commented Sep 22, 2022

I'd agree with deprecation. I'm assuming the numbers above apply to text/html (where AT behaviour is now consistent).

If the applicability were changed to application/xhtml+xml, where there are still inconsistencies, the number of failures would be tiny, because application/xhtml+xml accounts for only 0.05% of all page loads (and only a small number of those would have mismatching lang/xml:lang):
https://commoncrawl.github.io/cc-crawl-statistics/plots/mimetypes

As a general point, rules that detect specific AT behaviours (e.g. inconsistencies between implementations) need constant re-testing because AT behaviours change over time (bugs are fixed, clarifications are added to specs, etc).

Edit: This also poses problems for rule stability / reliability over time. For rules that detect AT behaviour:

  • either the rule changes as AT changes (so the rule isn't stable) or
  • the rule stays stable but starts producing false positives

@Jym77
Collaborator

Jym77 commented Sep 22, 2022

Edit: This also poses problems for rule stability / reliability over time. For rules that detect AT behaviour:

  • either the rule changes as AT changes (so the rule isn't stable) or
  • the rule stays stable but starts producing false positives

We touched on this during the last CG call.
In short, the ACT rules TF reviews rules on a yearly basis and makes sure that they stay up to date with technologies. It would probably make their job easier if we (= rule writers) clearly marked the inconsistencies we found in the Accessibility Support section.

@dd8
Collaborator Author

dd8 commented Sep 22, 2022

Definitely agree that the inconsistencies should be documented, otherwise you run into the same problem that came up when trying to deprecate WCAG 4.1.1 (w3c/wcag#770).

There was no documentation on which AT was affected by 4.1.1, so it's very hard to tell whether it still does anything useful 20 years later.

@carlosapaduarte
Member

Resolution from CG meeting: deprecate this rule

@carlosapaduarte carlosapaduarte self-assigned this Sep 22, 2022
@WilcoFiers WilcoFiers changed the title HTML page lang and xml:lang match - problems with assumptions and applicability HTML page lang and xml:lang match - problems with assumptions and applicability [5b7ae0] Sep 28, 2022
@WilcoFiers WilcoFiers self-assigned this Jul 13, 2023