HTML parser: align custom elements with the template element #630

Closed
tomberek opened this issue Mar 13, 2017 · 33 comments

@tomberek

http://jsbin.com/haceseqabi/edit?html,output

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36

Steps to reproduce the problem:

  1. Create a custom element that is nothing but a renamed <ol> or <ul>.
  2. Create a nested structure.

What is the expected behavior?
Render a nested list.

What went wrong?
The list becomes flat.

Did this work before? No

Does this work in other browsers? N/A

Chrome version: 57.0.2987.98 Channel: stable
OS Version: OS X 10.12.3
Flash Version:

<!doctype html>
<html>
<body>
<script>
  customElements.define('my-list2', class extends HTMLElement {
    constructor() {
      super();

      const shadowRoot = this.attachShadow({mode: 'open'});
      shadowRoot.innerHTML = `
        <ol>
          <slot></slot>
        </ol>
      `;
    }
  });
</script>
  
  
  Custom:
<my-list2>
    <li>Thing
        <my-list2>
            <li>another</li>
            <li>and another</li>
        </my-list2>
    </li>
</my-list2>
  
 Should be:   
<ol>
    <li>Thing
        <ol>
            <li>another</li>
            <li>and another</li>
        </ol>
    </li>
</ol>
  
  
  
</body>
</html>
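
Editorial sketch (not part of the original report): the flattening seems to follow from how the HTML parser handles a `<li>` start tag. It walks up the stack of open elements looking for an open `<li>` to close, and only stops early at elements in the "special" category, which includes `<ol>` and `<ul>` but not unknown or custom elements. The toy model below covers only that one step. `openLi` and `SPECIAL` are hypothetical names, and `SPECIAL` is a small subset of the real category, so treat this as an illustration under those assumptions rather than the spec algorithm.

```javascript
// Toy model of ONE tree-builder step: what a <li> start tag does to the
// stack of open elements. Not the full HTML Standard algorithm.

// Assumption: a small subset of the "special" category. address, div and p
// are special too, but the search walks past them, so they are left out here.
const SPECIAL = new Set(['html', 'body', 'ol', 'ul', 'table', 'td', 'template']);

function openLi(stack) {
  const next = [...stack];
  for (let i = next.length - 1; i >= 0; i--) {
    const name = next[i];
    if (name === 'li') {
      next.length = i; // pop everything up to and including the open <li>
      break;
    }
    if (SPECIAL.has(name)) break; // e.g. <ol> shields the outer <li>
  }
  next.push('li'); // the new <li> becomes a child of whatever is now on top
  return next;
}

// Custom element: the inner <li> closes the outer one, so the list flattens.
console.log(openLi(['html', 'body', 'my-list2', 'li', 'my-list2']).join(' > '));
// -> html > body > my-list2 > li

// Built-in <ol>: the search stops at the inner <ol>, so nesting is preserved.
console.log(openLi(['html', 'body', 'ol', 'li', 'ol']).join(' > '));
// -> html > body > ol > li > ol > li
```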
@hayatoito
Contributor

hayatoito commented Mar 14, 2017

Can we close this issue, and use whatwg/html#2432 ?
See also #404.

@rniwa
Collaborator

rniwa commented Mar 14, 2017

I think we should use this issue instead. The problem is more specific to web components than a generic ol / li behavior.

@domenic
Collaborator

domenic commented Mar 14, 2017

Now I'm confused. Are we hoping for a one-off parser fix for ol/li, or are we hoping to revisit #59?

@rniwa
Collaborator

rniwa commented Mar 14, 2017

I think we should revisit #113 and #624. We've seen dozens of issues with not having any mechanism to change the parser behavior in custom elements. I'm pretty convinced that we should at least re-examine the way to allow a custom element to be parsed as a self-closing element or a template-like element.

@domenic
Collaborator

domenic commented Mar 14, 2017

OK! Well I'm very excited about such a possibility, and it sounds like now we have at least one implementer interested. Thoughts from @hsivonen, @dominiccooney, or @travisleithead for introducing new parser behavior into their engines?

@annevk
Collaborator

annevk commented Mar 15, 2017

It seems much safer to go down the route of XML-with-defined-error-handling and have folks use that. That's much less likely to break things and a much less constrained environment.

@rniwa
Collaborator

rniwa commented Mar 15, 2017

I don't follow what you mean by "the route of XML-with-defined-error-handling". Can you elaborate?

@annevk
Collaborator

annevk commented Mar 15, 2017

@rniwa something I worked on a long time ago: https://annevankesteren.nl/2007/10/xml5 and more recently at https://github.com/Ygg01/xml5_draft.

@rniwa
Collaborator

rniwa commented Mar 15, 2017

I don't think we want to introduce yet another parser. We can hardly maintain XML and HTML parsers.

Using a special character like ! to trigger a specific parser behavior, or introducing some mechanism to declare that the parser should use a new mode, is probably the way to go. The latter effectively introduces versioning, but we're already there with type=module and a bunch of new things we're introducing these days.

@annevk
Collaborator

annevk commented Mar 15, 2017

It's not a new parser, it would replace the XML parser with something that's backwards compatible (except if you rely on errors).

@rniwa
Collaborator

rniwa commented Mar 15, 2017

Well, we use libxml for parsing XML so having to support this parser would likely mean that we'd have to implement it ourselves since libxml is unlikely to adopt that new parser behavior anytime soon. And then we'd have to reimplement XPath, XSLT, etc... for backwards compatibility. In practice, that's never gonna happen.

@annevk
Collaborator

annevk commented Mar 15, 2017

I don't see why you'd have to reimplement those though; you'd still generate the same kind of tree. In any event, such a change would automatically make all kinds of difficult situations in HTML extensible while being much safer to execute.

@rniwa
Collaborator

rniwa commented Mar 15, 2017

I don't think that approach is acceptable to us due to the immense implementation cost. If that's the only path Mozilla would agree to, then we should just not do this.

@annevk
Collaborator

annevk commented Mar 16, 2017

I can't really speak for Mozilla as a whole, but a couple of folks are rather worried about any HTML parser churn, in particular with regard to security and the wider ecosystem. Whenever the HTML parser changes, it's not just browsers that are affected, but also all kinds of libraries that might be harder to get updated. See e.g., whatwg/html#919 (comment).

That's why I'm trying to find alternatives, since I don't think our parser problems will go away.

@treshugart

From a user, and framework-author perspective, how would allowing lists to contain custom elements be a breaking change to the parser (or for a custom list to contain native list items)? The same question goes for tables.

From a consumer perspective, this appears to be relaxing the current parsing restrictions.

@annevk
Collaborator

annevk commented Mar 17, 2017

I think you're looking at it the wrong way. The parser handles all input and produces a certain output. Any kind of change is risky with such a setup, especially given that most content has validity errors and thereby triggers all the various branches of the parser. It's very different from no longer throwing an exception somewhere.

@annevk annevk added the v2 label Sep 4, 2017
@annevk annevk changed the title Nested List Nested List (HTML parser changes) Feb 18, 2018
@annevk annevk changed the title Nested List (HTML parser changes) HTML parser: align custom elements with the template element Mar 5, 2018
@annevk
Collaborator

annevk commented Mar 5, 2018

There is some interest in aligning custom element parsing with the way <template> parses (apart from its stacking behavior). Mozilla is likely opposed (@hsivonen did not attend), but would follow others proving it.

@hsivonen

hsivonen commented Mar 5, 2018

I'm sorry I missed the @ mention the first time round.

Like @annevk says, I'm very reluctant to tweak the HTML parsing algorithm. It's not just a matter of browsers tearing open something that's well-tested and that we've achieved interop on. It's also a matter of various other pieces of software being able to trust that their implementation of the HTML parsing algorithm matches what browsers do. This may even have security implications. It's hard enough to get Web developers to believe that the right way to sanitize HTML is to use a real HTML parser that implements the algorithm from the spec. It's even worse if that's a moving target.

As for the use case shown in the initial report, isn't that what the is attribute is for?
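
Editorial sketch (not part of the original comment): the `is` attribute creates a customized built-in element, so the parser still sees a real `<ol>` and nests it as usual. The example below assumes a browser that implements customized built-ins, which not every engine does. `defineNestedList`, `NestedList`, and the `nested-list` class name are hypothetical. Unlike the reporter's element, an `<ol is="...">` cannot host a shadow root, since `attachShadow()` is not permitted on `<ol>`.

```javascript
// Sketch of a customized built-in element: it remains an <ol> for the parser.
// `win` is a parameter so the function can be exercised outside a browser;
// in a page, pass `window`.
function defineNestedList(win, name = 'my-list2') {
  class NestedList extends win.HTMLOListElement {
    connectedCallback() {
      // Hypothetical behavior: tag the element so page CSS can style it.
      this.classList.add('nested-list');
    }
  }
  // The third argument ties the definition to the built-in <ol>.
  win.customElements.define(name, NestedList, { extends: 'ol' });
  return NestedList;
}

// In a page:
//   defineNestedList(window);
//
//   <ol is="my-list2">
//     <li>Thing
//       <ol is="my-list2">
//         <li>another</li>
//         <li>and another</li>
//       </ol>
//     </li>
//   </ol>
```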

@patrickkettner

Just wanted to note that the Edge folks have no hesitation in changing the parser. We'd be 100% fine with it.

@pshihn

pshihn commented Sep 24, 2019

I think this is easy to dismiss, but <p>, for example, is a very commonly used tag that often contains more than text, such as images and videos.

<p>
  <my-element>
    <div>caption</div>
  </my-element>
</p>

gets parsed as:

<p>
  <my-element>
  </my-element>
</p>
<div>caption</div>

I don't like the idea of not being able to create inline custom-elements inside a paragraph. For example, a custom picture or video element.

Even if the output was the following, it would have worked for most cases:

<p>
</p>
<my-element>
  <div>caption</div>
</my-element>

There may be no appetite to change the HTML parser, but if you are thinking of Web Components, this is pretty sucky parse behavior.
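
Editorial sketch (not part of the original comment): the behavior above, and the `<template>` alignment named in the issue title, can be illustrated with a toy tree builder. In the real parser, a `<div>` start tag closes any `<p>` that is open "in button scope". `<template>` is one of the scope boundaries, while custom elements are not. Everything below is a simplified assumption: `parse`, `hasOpenP`, `serialize`, and both sets are hypothetical names, the sets are partial, and stray end tags are simply dropped (the real parser would additionally create an empty `<p>` for the stray `</p>`).

```javascript
// Toy model of the "close a p element" step. NOT the HTML Standard algorithm.
const CLOSES_P = new Set(['div', 'ol', 'ul', 'p', 'figure', 'section']);
// Partial list of elements that end the search for an open <p>.
const SCOPE_BOUNDARY = new Set(['html', 'table', 'td', 'th', 'caption', 'template', 'button']);

function hasOpenP(stack) {
  for (let i = stack.length - 1; i > 0; i--) {
    if (stack[i].name === 'p') return true;
    if (SCOPE_BOUNDARY.has(stack[i].name)) return false;
  }
  return false;
}

function parse(tokens) {
  const root = { name: '#root', children: [] };
  const stack = [root];
  for (const [kind, name] of tokens) {
    if (kind === 'start') {
      if (CLOSES_P.has(name) && hasOpenP(stack)) {
        // Pop up to and including the open <p>, closing anything inside it.
        while (stack.pop().name !== 'p') { /* keep popping */ }
      }
      const node = { name, children: [] };
      stack[stack.length - 1].children.push(node);
      stack.push(node);
    } else {
      // Simplification: close the nearest matching element, drop stray end tags.
      const index = stack.map((n) => n.name).lastIndexOf(name);
      if (index > 0) stack.length = index;
    }
  }
  return root;
}

const serialize = (node) =>
  node.children.map((c) => `<${c.name}>${serialize(c)}</${c.name}>`).join('');

// <p><HOST><div></div></HOST></p> as a token list.
const tokensFor = (host) => [
  ['start', 'p'], ['start', host], ['start', 'div'],
  ['end', 'div'], ['end', host], ['end', 'p'],
];

console.log(serialize(parse(tokensFor('my-element'))));
// -> <p><my-element></my-element></p><div></div>
console.log(serialize(parse(tokensFor('template'))));
// -> <p><template><div></div></template></p>
```

Under these assumptions, treating custom elements as scope boundaries the way `<template>` is treated would keep the `<div>` inside the host element.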

@domenic
Collaborator

domenic commented Sep 25, 2019

I don't think the contents of <my-element> should be a <div>, if <my-element> is meant to be used inside <p>. That is, I think <my-element> is designed incorrectly.

@pshihn

pshihn commented Sep 25, 2019

I respectfully disagree, @domenic. The whole idea of having <slot> content is to let the user of the component inject custom HTML inside specified "slots". Component developers do not always have full control over how the component is used, i.e. they are not in charge of the application.
There is no reason why custom elements cannot be inline-block.

Take the <video> tag for example. In Chrome, the shadow DOM of the element contains lots of <div>s, yet a video element can be inside a <p>. In fact, most CMSs often put inline images and videos inside the <p> tag. I may want to expose one of these divs as a <slot>, e.g. a custom video controls bar. Or maybe a custom video element <my-video> that overlays the slotted HTML on top of the video with some opacity: overlaid content or GIFs or whatnot.

I could tell people using the <my-video> element not to use it inside a <p> because it is not designed for that, but that would suck.

I understand the reasons why solving this is hard, but I do not think it's poor design, or something that WCs should not strive to solve. The whole idea of extending HTML should not only be for block-level content.

@pshihn

pshihn commented Sep 25, 2019

BTW, I like the idea of treating custom elements like <template> which they essentially are till fully initialized.

@domenic
Collaborator

domenic commented Sep 25, 2019

Video has no light DOM children. If you only have shadow children there is no problem.

@pshihn

pshihn commented Sep 25, 2019

I understand that. I'm just saying there are technical limitations on my ability to create certain kinds of custom elements (e.g. the inline video player described above), which I would rather not have.
I'm not sure how this example could be built in any other way than by injecting slotted content via script.

@domenic
Collaborator

domenic commented Sep 25, 2019

Sure, and I'm saying the goal is not to allow no restrictions, but to allow new elements which obey the semantics of HTML. The semantics of HTML do not contain or allow children of p elements to contain divs, and we've designed custom elements to respect that invariant.

@pshihn

pshihn commented Sep 25, 2019

Fair point.

Just for the sake of argument though: the limitation of <p> is in the HTML parser, not in the DOM itself. You can append the <div> or any other block-level element as a child of the custom element via JavaScript, and it all renders fine with the inline custom element inside the <p>, since the slotted content's layout is purely controlled in the shadow DOM.

I do not want to harp on this for no reason. I have seen these issues in practice, and they are hard to explain to people who use my components. Also, I was trying to show support for the title of the issue about treating custom elements like templates for parsing purposes.
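
Editorial sketch (not part of the original comment): the script-based workaround mentioned above relies on the fact that DOM insertion does not re-run the parser's `<p>` auto-closing. A minimal version, assuming a host custom element already sits inside a `<p>`. `appendBlockChild` is a hypothetical helper, and `doc` is a parameter only so the function can be exercised outside a browser.

```javascript
// Append a block-level <div> as a light DOM child of a host element.
// In a page, pass `document` as `doc`.
function appendBlockChild(doc, host, text) {
  const caption = doc.createElement('div');
  caption.textContent = text;
  // appendChild() does not apply the parser's content-model fix-ups, so the
  // <div> stays inside the host even when the host itself is inside a <p>.
  host.appendChild(caption);
  return caption;
}

// In a page:
//   const host = document.querySelector('p > my-element');
//   appendBlockChild(document, host, 'caption');
```

As noted in the reply that follows, the resulting tree still violates the HTML content model for `<p>`; the DOM simply does not enforce it.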

@domenic
Collaborator

domenic commented Sep 25, 2019

Yeah, you can definitely break the HTML content model using JavaScript :).

@pshihn

pshihn commented Sep 25, 2019

Promise, last comment on this (tonight) :P

The spec for <p> says it allows the following as child elements:

a, audio, del, ins, map, noscript, or video element, or an autonomous custom element.

The last bit about custom elements is interesting, since in practice most custom elements act like block-level elements such as div and figure, which it explicitly doesn't allow. I'm not sure about the reasoning for allowing custom elements in the <p> tag when it is otherwise pretty restrictive. I can see why it should reject CEs, because there's no way of knowing what the custom element is or whether it should even be allowed as paragraph content.

There would have been no major issues if the custom element was rejected and the p tag would close like it does with divs.

@domenic
Collaborator

domenic commented Sep 25, 2019

It allows custom elements so that you can create things like custom-strong etc.

The ECMAScript spec has a series of emu-* custom elements which are used in this manner.

@pshihn

pshihn commented Sep 25, 2019

I understand that and have created similar elements myself; that was the whole point for my-video as well. My question was more related to being less restrictive in an otherwise very restrictive spec.
Anyways, not going anywhere here. Thanks for all the time :)

@GeorgeTailor

GeorgeTailor commented Dec 15, 2020

For a <table> element this doesn't really make sense:

  • a custom element as a child of <thead>, which means that the renderer should go through anything that is inside that custom element, strip anything that's not allowed in <thead>, and render the rest;
  • a custom element as a child of <table>, same as above;
  • a custom element as a child of <tr>, same as above.

By applying the above approach, the <style> tag will be omitted as well, and the user will be better off styling the table in global CSS.
Maybe it would be possible to allow only the <slot> tag in a <table> or any of its descendants, to preserve the built-in support for a11y while allowing additional styling for the slotted content.
That way, slotted content will be projected into the <table>, and the engine will then decide whether to render the slotted content, based on the current spec.
Example:

<table>
	<caption>
		<slot name="caption"></slot>
	</caption>
	<colgroup>
		<slot name="col"></slot>
	</colgroup>
	<thead>
		<slot name="header"></slot>
	</thead>
	<!-- zero or more tbody-->
	<slot name="body"></slot>
	<tfoot>
		<slot name="footer"></slot>
	</tfoot>
</table>

Additionally, it will give the custom element author access to the slotted content via JavaScript.

@rniwa
Collaborator

rniwa commented Apr 20, 2023

F2F Resolution: We can't change the HTML parser behavior at this point

@rniwa rniwa closed this as completed Apr 20, 2023