Replace textDirection property with recommendation for including control characters in the text #336

aaronpk · 2016-08-04T18:14:03Z

The textDirection property is an unproven theoretical solution to setting the base text direction. My understanding is it was added to this spec before it had any real implementation experience.

Currently, there are solutions in Unicode string encoding itself that can accomplish setting the base text direction. Using the existing Unicode solution has the added benefit of being supported by many systems without them doing any extra work. There is a great article by the i18n group that covers a handful of control characters and describes how to use them to set the base text direction: https://www.w3.org/International/questions/qa-bidi-unicode-controls

My recommendation for the annotation spec is to drop the textDirection property and instead include a recommendation to use the appropriate control characters as necessary.
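As a rough illustration of what this recommendation would mean for an embedded string, here is a minimal Python sketch. The helper name `with_base_direction` is hypothetical and not part of any spec; the code points are the standard Unicode embedding controls (LRE, RLE, PDF) of the kind the linked article discusses.

```python
# Unicode bidi embedding controls.
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def with_base_direction(text: str, direction: str) -> str:
    """Wrap text in an embedding so the base direction travels with the string.

    Hypothetical helper for illustration only.
    """
    if direction == "rtl":
        return RLE + text + PDF
    if direction == "ltr":
        return LRE + text + PDF
    raise ValueError("direction must be 'ltr' or 'rtl'")


# A Hebrew word followed by Latin text, marked as right-to-left overall.
value = with_base_direction("\u05E9\u05DC\u05D5\u05DD W3C!", "rtl")
print(repr(value))
```

The wrapped string would then be stored as the body text itself, with no separate direction property.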

azaroth42 · 2016-08-04T18:22:50Z

Merging from #335:

The properties are not only for embedded strings (which in JSON we can expect to be Unicode) but for arbitrary resources with URIs. I have no idea how PDFs store text strings (for example) or how well implemented the control characters are in those strings, but I can point you to many instances of older or just badly implemented XML documents in a huge variety of encodings. As these resources can take the role of the body of the Annotation, the Unicode proposal isn't sufficient to address the requirements.

For example:

{"id": "http://example.org/annos/1",
  "type": "Annotation",
  "motivation": "commenting",
  "body": {
    "id": "http://example.com/old/text/thing",
    "type": "Text",
    "format": "application/old-text-format",
    "textDirection": "rtl",
    "processingLanguage": "ar",
    "created": "1997-08-17"
  },
  "target": "http://example.net/thing-that-document-is-about"
}

Note that the properties are listed under External Web Resources [1], not under Embedded Textual Body [2] for just this reason.

[1] https://www.w3.org/TR/annotation-model/#external-web-resources
[2] https://www.w3.org/TR/annotation-model/#embedded-textual-body
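To make the external-resource case concrete, here is a minimal sketch of how a consuming client might read the declared direction before rendering a body it cannot inspect itself. The helper `base_direction` and its fallback to "auto" are assumptions for illustration, not behavior defined in this thread.

```python
import json

# The annotation from the example above, trimmed to the fields used here.
ANNOTATION = json.loads("""
{
  "id": "http://example.org/annos/1",
  "type": "Annotation",
  "body": {
    "id": "http://example.com/old/text/thing",
    "type": "Text",
    "format": "application/old-text-format",
    "textDirection": "rtl",
    "processingLanguage": "ar"
  },
  "target": "http://example.net/thing-that-document-is-about"
}
""")


def base_direction(resource, default="auto"):
    """Return the declared base direction of a body or target.

    Hypothetical helper: a resource given only as a URI string carries
    no direction metadata, so the default is returned for it.
    """
    if isinstance(resource, dict):
        return resource.get("textDirection", default)
    return default


print(base_direction(ANNOTATION["body"]))    # rtl
print(base_direction(ANNOTATION["target"]))  # auto
```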

gsergiu · 2016-08-05T08:49:42Z

Also merged from a comment on #335:

I see it exactly the opposite.

One might need to know the text direction for correct representation of text embedded in the annotations (TextualBody), not for the correct representation of external resources.
External resources must include, inside the files/bitstreams themselves, all the information required for correct representation. It is not the responsibility of annotations to correct wrong HTML/PDF/XML.
(There might be a use case for it, but it is not included in the current version of the standard.)

Some selectors would probably need this information; "textDirection" might be relevant for text position selection. In that case, the selector must set the value inside the selector and not inside the target/body.

BigBlueHat · 2016-08-05T12:21:29Z

We can't assume that everything is in Unicode, and storing this information within the Annotation document does help with selection, as @gsergiu points out. It should not go on the selector itself, as that would only affect the direction of the selected text and say nothing about the original document's text direction.

Given that we can't assume Unicode for external resources, we should keep textDirection as proposed. Implementors are more than welcome to use Unicode control characters within Embedded Textual Bodies as @azaroth42 points out.

gsergiu · 2016-08-05T12:36:30Z

@BigBlueHat why do you need the textDirection for external resources? What can you do with it except for proper text selection in selectors? (The annotation itself has nothing to do with the correct representation of external resources!)

azaroth42 · 2016-08-05T16:23:30Z

Discussed on the telco of 2016-08-05. The resolution was that there is no new information that wasn't already discussed. The proposal to use Unicode control characters does not address the established need to cover non-Unicode content, however much we might like to simply require Unicode everywhere, retroactively.

Whether the features are valuable is the subject of #335, and thus we're closing this issue as the concrete proposal does not cover established requirements. Thank you for the proposal and bearing with us through the process!

tomerm · 2016-08-09T11:11:40Z

@aaronpk On modern operating systems (e.g. Windows, iOS, Android), text direction is associated with text rendering. In storage, text is represented the same way whether its direction is LTR, RTL, or Auto. Text direction is therefore best handled during the rendering phase.

Unicode control characters, or UCC (RLE, LRE, etc.), are a valid means of enforcing text direction at rendering time, and https://www.w3.org/International/questions/qa-bidi-unicode-controls indeed includes a lot of good use cases and examples.

However, using UCC to turn text direction into a storage-level property is cumbersome, for several reasons:

1. Leveraging UCC this way rests on a hidden assumption that all rendering engines which display text are fully compliant with the Unicode Bidirectional Algorithm (UBA). In reality this is not so. Moreover, some rendering engines (e.g. Adobe Reader) might take quite a different approach to Bidi (Arabic / Hebrew) text rendering.
2. Search / sort capabilities in all technologies / toolkits that render or process Bidi text (web technologies, back-end technologies, etc.) would have to be altered to ignore UCC (when they are injected into text to convey text direction) during search, sort, concatenation, and similar text-based operations.
3. Editable contexts (mostly rich text editors) use higher-level protocols (e.g. HTML markup in web browsers) to convey text direction, much like the textDirection property discussed above. Supporting both the higher-level protocol and UCC requires extra mapping / conversion.
4. Many higher-level protocols (i.e. GUI SDKs / toolkits) allow manipulation of text direction at the API level (instead of using UCC at the text level). Just a couple of examples:
   a. textDir is supported by all widgets in Dojo Toolkit
   b. the setTextDirection function in Android
   Supporting UCC for representing text direction in storage would require modifications to all those technologies / toolkits.
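The search / sort concern can be sketched as follows: if direction is stored as control characters inside the text, every comparison has to strip them first. This is a minimal Python illustration; the helper `strip_bidi_controls` and the particular set of characters it removes are assumptions chosen for the example, not an exhaustive or normative list.

```python
# Bidi formatting characters that might be injected to convey direction:
# LRM, RLM, LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI.
BIDI_CONTROLS = (
    "\u200E\u200F"
    "\u202A\u202B\u202C\u202D\u202E"
    "\u2066\u2067\u2068\u2069"
)
_STRIP_TABLE = str.maketrans("", "", BIDI_CONTROLS)


def strip_bidi_controls(text: str) -> str:
    """Remove bidi control characters before search, sort, or comparison.

    Hypothetical helper for illustration only.
    """
    return text.translate(_STRIP_TABLE)


stored = "\u202Bhello\u202C"  # text wrapped in RLE ... PDF for storage
query = "hello"

print(stored == query)                       # False: controls break equality
print(strip_bidi_controls(stored) == query)  # True once they are removed
```

Every search, sort, and concatenation path would need an equivalent step, which is the cost described in the second point above.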

aaronpk added the i18n-review label Aug 4, 2016

aaronpk mentioned this issue Aug 4, 2016

The textDirection and processingLanguage properties are not needed #335

Closed

azaroth42 added the invalid label Aug 5, 2016

azaroth42 closed this as completed Aug 5, 2016

r12a mentioned this issue Aug 8, 2016

Replace textDirection property with recommendation for including control characters in the text #336 w3c/i18n-activity#200

Closed

js-choi mentioned this issue Nov 11, 2017

Add text language / direction attributes w3c/web-share#6

Closed

plehegar added the i18n-tracker label and removed the i18n-review label Mar 11, 2021
