innerText: include parentheses around <rt> if there's no <rp> #1801

Closed
1 of 4 tasks
zcorpan opened this issue Sep 20, 2016 · 18 comments
Labels
addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest

Comments

@zcorpan
Member

zcorpan commented Sep 20, 2016

https://html.spec.whatwg.org/multipage/dom.html#the-innertext-idl-attribute

The innerText getter has a special case for Text nodes that are children of rp elements; the text is included even though rp is 'display:none' by default.

Demo: http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4488

This is nice but I think it is more common to omit rp and only use rt, and in that case it's not helping.

The rendering section has:

> User agents that do not support correct ruby rendering are expected to render parentheses around the text of rt elements in the absence of rp elements.

https://html.spec.whatwg.org/multipage/rendering.html#phrasing-content-3

I think if we are going to special case ruby in innerText at all it would be good to make it "nice" also if rp is not being used, like in the rendering section.

Concretely, if a ruby element has no rp children, include "(" before rt children and ")" after.
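
The concrete rule above can be sketched as follows. This is only an illustration under stated assumptions, not the spec algorithm or any engine's implementation: nodes are modeled as plain objects rather than real DOM nodes, the names `TreeNode`, `el`, `text` and `rubyAwareText` are hypothetical, and CSS visibility is ignored apart from including `rp` text the way the existing special case does.

```typescript
// Minimal stand-in for DOM nodes so the sketch runs without a browser.
type TextNode = { kind: "text"; data: string };
type ElementNode = { kind: "element"; tag: string; children: TreeNode[] };
type TreeNode = TextNode | ElementNode;

const text = (data: string): TextNode => ({ kind: "text", data });
const el = (tag: string, ...children: TreeNode[]): ElementNode => ({
  kind: "element",
  tag,
  children,
});

const isTag = (n: TreeNode, tag: string): boolean =>
  n.kind === "element" && n.tag === tag;

// Hypothetical helper: text of a subtree under the proposed rule.
// <rp> text is kept as-is; if a <ruby> has no <rp> children,
// each of its <rt> children is wrapped in "(" and ")".
function rubyAwareText(node: TreeNode): string {
  if (node.kind === "text") return node.data;
  const wrapRt =
    node.tag === "ruby" && !node.children.some((c) => isTag(c, "rp"));
  return node.children
    .map((c) => {
      const inner = rubyAwareText(c);
      return wrapRt && isTag(c, "rt") ? `(${inner})` : inner;
    })
    .join("");
}

// <ruby>漢<rt>kan</rt></ruby>
const withoutRp = el("ruby", text("漢"), el("rt", text("kan")));
// <ruby>漢<rp>(</rp><rt>kan</rt><rp>)</rp></ruby>
const withRp = el(
  "ruby",
  text("漢"),
  el("rp", text("(")),
  el("rt", text("kan")),
  el("rp", text(")")),
);

console.log(rubyAwareText(withoutRp)); // 漢(kan)
console.log(rubyAwareText(withRp)); // 漢(kan)
```

Both trees yield the same string, which is the intent of the proposal: omitting rp would no longer run the base text and the annotation together.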

cc @rniwa @rocallahan @jfkthame

Implementer interest:

  • WebKit: @rniwa (see below)
  • Chromium: ?
  • Gecko: ?
  • Edge: ?
@zcorpan zcorpan added addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest labels Sep 20, 2016
@rniwa

rniwa commented Sep 20, 2016

Yeah, falling back to parentheses when there is no rp makes sense to me.

@kojiishi

cc @yosinch

zcorpan added a commit that referenced this issue Nov 29, 2016
When there is no <rp> sibling to <rt>, include "(" before and ")"
after <rt> in `innerText`'s getter.

Fixes #1801.
@zcorpan
Member Author

zcorpan commented Nov 29, 2016

PR for spec: #2113
PR for wpt: web-platform-tests/wpt#4259

@zcorpan zcorpan changed the title innerText: include parantheses around <rt> if there's no <rp> innerText: include parentheses around <rt> if there's no <rp> Nov 29, 2016
@zcorpan
Member Author

zcorpan commented Nov 29, 2016

@jfkthame is there interest in implementing this in Gecko?
@tkent-google is there interest in implementing this in Chromium?
@travisleithead is there interest in implementing this in Edge?

@jfkthame

@upsuper wdyt about this? Would you like to take it for gecko?

@upsuper
Member

upsuper commented Nov 29, 2016

One issue is that Gecko implements the ruby model from the CSS Ruby spec, which is more complicated than the one in the current HTML spec. The CSS model supports consecutive <rt> elements as well as the <rtc> element, which means the algorithm you proposed in #2113 wouldn't work for Gecko.

[Slightly off-topic: this proposal again highlights the defect of HTML's ruby model. This model fails to express words like " → 振り仮名(ふりがな)" in a reasonable way that has the desired behavior in both rendering and plain text. The HTML spec should really adopt the CSS Ruby model.]

There is also the question of whether the parentheses should be proportional or fullwidth (w3c/csswg-drafts#762). I think for CJK languages, the majority of people would prefer either fullwidth parentheses or proportional parentheses with whitespace around. Maybe proportional parentheses are more desirable for other languages? Although it seems to me CJK languages (especially Japanese) are the main users of ruby.

Personally I don't like seeing the innerText algorithm become increasingly complicated. IIUC, it was specced this way for web compatibility, not really because of its distinct functionality (?). So I don't think it's worth adding anything to it except for compatibility reasons. I may be wrong about this.

@zcorpan
Member Author

zcorpan commented Nov 29, 2016

Thanks @upsuper.

So with rtc, an rp might be a sibling of the rtc but not a sibling of the rt. The algorithm could be changed to accommodate that, but first we should decide whether to do this at all.
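
To make the rtc case concrete, here is a rough sketch comparing a sibling-based rule with one possible accommodation. Everything in it is an assumption for illustration: the plain-object node model, the names `siblingRule` and `rubyAncestorRule`, and the idea of checking the nearest ruby ancestor's children for rp. Neither function is taken from the spec PR or from any engine.

```typescript
// Minimal stand-in for DOM nodes so the sketch runs without a browser.
type TextNode = { kind: "text"; data: string };
type ElementNode = { kind: "element"; tag: string; children: TreeNode[] };
type TreeNode = TextNode | ElementNode;

const text = (data: string): TextNode => ({ kind: "text", data });
const el = (tag: string, ...children: TreeNode[]): ElementNode => ({
  kind: "element",
  tag,
  children,
});
const isTag = (n: TreeNode, tag: string): boolean =>
  n.kind === "element" && n.tag === tag;

// Sibling-based rule: wrap an <rt> when it has no <rp> sibling.
function siblingRule(node: TreeNode): string {
  if (node.kind === "text") return node.data;
  const hasRpChild = node.children.some((c) => isTag(c, "rp"));
  return node.children
    .map((c) => {
      const inner = siblingRule(c);
      return isTag(c, "rt") && !hasRpChild ? `(${inner})` : inner;
    })
    .join("");
}

// Hypothetical accommodation: decide once per nearest <ruby> ancestor,
// so an <rp> next to an <rtc> also covers the <rt> inside that <rtc>.
function rubyAncestorRule(node: TreeNode, rubyHasRp = true): string {
  if (node.kind === "text") return node.data;
  const hasRp =
    node.tag === "ruby"
      ? node.children.some((c) => isTag(c, "rp"))
      : rubyHasRp;
  return node.children
    .map((c) => {
      const inner = rubyAncestorRule(c, hasRp);
      return isTag(c, "rt") && !hasRp ? `(${inner})` : inner;
    })
    .join("");
}

// <ruby>A<rp>(</rp><rtc><rt>x</rt></rtc><rp>)</rp></ruby>
const rtcWithRp = el(
  "ruby",
  text("A"),
  el("rp", text("(")),
  el("rtc", el("rt", text("x"))),
  el("rp", text(")")),
);

console.log(siblingRule(rtcWithRp)); // A((x))
console.log(rubyAncestorRule(rtcWithRp)); // A(x)
```

The sibling-based rule doubles the parentheses here because the rt's only sibling context is the rtc. The ancestor-based variant avoids that for this tree, but it says nothing about consecutive rt elements or more involved rtc/rp mixes.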

I think fullwidth parentheses should be used if that is commonly used by CJK.

You are correct that innerText was added mainly for better Web compat.

I'm happy to drop the proposal if people think it is not worth it. My question then is, should we also drop the special handling of rp currently in the spec, which is implemented only in Gecko at the moment?

@zcorpan
Member Author

zcorpan commented Nov 29, 2016

(The ruby model is issue #121.)

@upsuper
Member

upsuper commented Nov 30, 2016

> So with rtc, an rp might be a sibling of the rtc but not a sibling of the rt. The algorithm could be changed to accommodate that, but first we should decide whether to do this at all.

rtc, rt, and rp can all be siblings of each other. The rule for adding parentheses could be complicated. There is an attempt in the CSS Ruby spec to generate parentheses automatically, but that rule isn't perfect, and it probably doesn't fit well with the descriptive language used for the innerText algorithm.

> My question then is, should we also drop the special handling of rp currently in the spec, which is implemented only in Gecko at the moment?

I'm fine with doing this if no one else opposes.

@tkent-google
Contributor

I agree with @upsuper about the last paragraph of #1801 (comment). Introducing new behavior that is not compatible with any existing implementation isn't welcome.

@zcorpan
Member Author

zcorpan commented Dec 2, 2016

Thanks. I've withdrawn the proposal. I will make a new pull request to drop the special handling of rp.

@zcorpan zcorpan closed this as completed Dec 2, 2016
zcorpan added a commit that referenced this issue Dec 2, 2016
This was a hack that is only implemented in one engine (Gecko).
For example, an <rp> in <select> is included, which is weird;
if an <rp> has several text nodes, only the first is included.

There is not enough implementor interest in making <ruby> actually
work well in innerText (e.g. in the absence of <rp>); see #1801.
@zcorpan
Member Author

zcorpan commented Dec 2, 2016

domenic pushed a commit that referenced this issue Dec 3, 2016
This was a hack that is only implemented in one engine (Gecko).
For example, an <rp> in <select> is included, which is weird;
if an <rp> has several text nodes, only the first is included.

There is not enough implementor interest in making <ruby> actually
work well in innerText (e.g. in the absence of <rp>); see #1801.
@zcorpan
Member Author

zcorpan commented Dec 5, 2016

zcorpan added a commit that referenced this issue Dec 5, 2016
In #1801 (comment)
it is pointed out that CJK typically use fullwidth parenthesis.

Also expand the character references in the example now that the
spec toolchain is not limited to ASCII.
@rniwa

rniwa commented Dec 5, 2016

> I think for CJK languages, the majority of people would prefer either fullwidth parentheses or proportional parentheses with whitespace around. Maybe proportional parentheses are more desirable for other languages? Although it seems to me CJK languages (especially Japanese) are the main users of ruby.

That’s not quite true. People DO use proportional (half-width) parentheses in Japanese without spaces. I’ve rarely seen anyone inserting spaces around parentheses in Japanese, for that matter.

> You are correct that innerText was added mainly for better Web compat.
>
> I'm happy to drop the proposal if people think it is not worth it. My question then is, should we also drop the special handling of rp currently in the spec, which is implemented only in Gecko at the moment?

Inserting parentheses is quite important for copy & paste (otherwise important content can be lost during copy). WebKit uses the same algorithm for both innerText and copy & paste, so this is quite important for us.

@kojiishi

kojiishi commented Dec 6, 2016

> > I think for CJK languages, the majority of people would prefer either fullwidth parentheses or proportional parentheses with whitespace around. Maybe proportional parentheses are more desirable for other languages? Although it seems to me CJK languages (especially Japanese) are the main users of ruby.

> That’s not quite true. People DO use proportional (half-width) parentheses in Japanese without spaces. I’ve rarely seen anyone inserting spaces around parentheses in Japanese, for that matter.

I agree, but since we have to choose one: I think you'll find "typically" if you look at the referring bugs and the discussion at the I18N WG, and I agree with I18N that if we pick the typically used one, that'd be fullwidth.

The larger issue than width is the baseline. ASCII parentheses are usually designed to match the x-height, which is too low for CJK, while fullwidth parentheses are designed to match the em-height. There are a few fonts that have em-height ASCII parentheses, but really only a few (I know of only 3), because doing so sacrifices English rendering.

In today's font environment, if we want parentheses that match CJK without extra spacing, we need to use fullwidth code points with the pwid OpenType feature.

> Inserting parenthesis is quite important for copy & paste...

I'll leave it to @tkent-google whether we want to do this or not.

@rniwa

rniwa commented Dec 6, 2016

> In today's font environment, if we want parentheses that match CJK without extra spacing, we need to use fullwidth code points with the pwid OpenType feature.

The problem here is that the lack of rp would now result in full-width parentheses being inserted even in English and Latin text, which is highly undesirable. Half-width parentheses, on the other hand, would still work for CJK even if they aren't ideal. We might need to resolve the current language from the nearest ancestor and decide whether to use full-width or not.
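
A minimal sketch of that last idea, assuming a simplified element model with an optional `lang` value and a `parent` pointer. The names `El`, `resolveLang`, `parenthesesFor` and the set of "CJK" primary language subtags are all hypothetical choices for illustration; nothing here is specified anywhere in this thread.

```typescript
// Simplified element: only what the sketch needs.
type El = { tag: string; lang?: string; parent?: El };

// Walk up from the node to the nearest element (itself included)
// that carries a lang value; return "" if there is none.
function resolveLang(node: El | undefined): string {
  for (let n = node; n !== undefined; n = n.parent) {
    if (n.lang !== undefined) return n.lang;
  }
  return "";
}

// Assumption: treat these primary subtags as wanting fullwidth parentheses.
const FULLWIDTH_LANGS = new Set(["ja", "zh", "ko"]);

// Pick U+FF08/U+FF09 for the assumed CJK languages, ASCII otherwise.
function parenthesesFor(node: El): [string, string] {
  const primary = resolveLang(node).toLowerCase().split("-")[0] ?? "";
  return FULLWIDTH_LANGS.has(primary) ? ["\uFF08", "\uFF09"] : ["(", ")"];
}

const jaDoc: El = { tag: "html", lang: "ja" };
const jaRuby: El = { tag: "ruby", parent: jaDoc };
const enSpan: El = { tag: "span", lang: "en-US", parent: jaDoc };
const enRuby: El = { tag: "ruby", parent: enSpan };

console.log(parenthesesFor(jaRuby).join("")); // （）
console.log(parenthesesFor(enRuby).join("")); // ()
```

A ruby with no language information at all falls back to ASCII parentheses in this sketch, which matches the concern above about not forcing full-width characters into Latin text.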

@kojiishi

kojiishi commented Dec 6, 2016

I know some people are talking about ruby being useful for Latin and other languages, but I have never seen a single page using it. Have you?

Either way, it looks like Gecko and Blink don't want this. Maybe we should try to reach consensus on it first. Well, it was probably me who added the noise, sorry about that.

@kojiishi

kojiishi commented Dec 7, 2016

Found the comment from @r12a.

alice pushed a commit to alice/html that referenced this issue Jan 8, 2019
This was a hack that is only implemented in one engine (Gecko).
For example, an <rp> in <select> is included, which is weird;
if an <rp> has several text nodes, only the first is included.

There is not enough implementor interest in making <ruby> actually
work well in innerText (e.g. in the absence of <rp>); see whatwg#1801.
alice pushed a commit to alice/html that referenced this issue Jan 8, 2019
In whatwg#1801 (comment)
it is pointed out that CJK typically use fullwidth parenthesis.

Also expand the character references in the example now that the
spec toolchain is not limited to ASCII.

6 participants