Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Un-merge Document and HTMLDocument #221

Open
ayg opened this issue Apr 14, 2016 · 53 comments
Open

Un-merge Document and HTMLDocument #221

ayg opened this issue Apr 14, 2016 · 53 comments

Comments

@ayg
Copy link
Contributor

ayg commented Apr 14, 2016

Does anyone object? It's been years, nobody has done it, it's claimed that IE did it and then un-did it. Boris has serious reservations about web-compatibility. Is there any actual reason to merge them? Obviously, it makes sense to keep APIs on Document where reasonable, but I don't see why it needs to be wholesale.

@ayg
Copy link
Contributor Author

ayg commented Apr 14, 2016

@bz

@bzbarsky
Copy link

Note that my serious reservations are not based on data, but rather on lack of data. I could be convinced that this is ok, with actual data and analysis...

@foolip
Copy link
Member

foolip commented Apr 14, 2016

There has been some efforts to move things from HTMLDocument to Document in Blink, and it's been successful so far. https://bugs.chromium.org/p/chromium/issues/detail?id=238368

Of what remains I think that only document.all is worth worrying about. It's already on Document internally, but the risk is that some code depends on xmlDocument.all not existing.

I don't have time to spend on this right now, but if it'll get reverted in the spec I'd like a chance to try out the final changes in Blink, so let me know if there's a deadline.

@bzbarsky
Copy link

Right, moving particular APIs to Document might well make a lot of sense. Even moving them all might make sense. What doesn't make sense to me is doing the move in a single commit with no attempt to do per-property analysis.

@ayg
Copy link
Contributor Author

ayg commented Apr 14, 2016

So how about this: leave the spec as-is for now, start moving properties to Document on a per-case basis, and if we find any properties we don't want to try moving, change the spec to reintroduce HTMLDocument with those properties. Sound good to everyone?

@foolip
Copy link
Member

foolip commented Apr 14, 2016

Yep, that sounds good. I'm pretty optimistic about Blink being able to make HTMLDocument empty, but there could be engine-specific compat issues. Not sure how risky the final step of making HTMLDocument an alias is, we'll just have to try it.

@ayg
Copy link
Contributor Author

ayg commented Apr 17, 2016

Okay, so let's all try that.

@ayg ayg closed this as completed Apr 17, 2016
@cdumez
Copy link

cdumez commented Jul 29, 2016

What's the benefit of merging HTMLDocument into Document? Do we really want various HTML-specific attributes / operations to be exposed on non-HTML documents?

This is a risky change compatibility-wise and major browser engines all had (and still have) HTMLDocument. This has been in the specification for years and yet no major browser has not implemented it yet.

I personally do not understand the benefits of this change but even if there are benefits, is it really worth the trouble / risk for all browser engines? I know I am not inclined to do this change in WebKit.

@foolip
Copy link
Member

foolip commented Jul 29, 2016

Missing not in the last sentence?

In Blink most of what was in HTMLDocument has already been moved to Document, see #221 (comment).

The set of things that are in HTMLDocument in Gecko and WebKit is not aligned.

In Edge, where the IDL files aren't available, I see that Object.getOwnPropertyNames(HTMLDocument.prototype) and Object.getOwnPropertyNames(XMLDocument.prototype) both return just ["constructor"] while Object.getOwnPropertyNames(Document.prototype) returns ["constructor", "URL", "URLUnencoded", "activeElement", "alinkColor", "all", "anchors", "applets", "bgColor", "body", "characterSet", "charset", "compatMode", "cookie", "currentScript", "defaultCharset", "defaultView", "designMode", "dir", "doctype", "documentElement", "domain", "embeds", "fgColor", "forms", "head", "hidden", "images", "implementation", "inputEncoding", "lastModified", "linkColor", "links", "location", "msCSSOMElementFloatMetrics", "msCapsLockWarningOff", "onabort", "onactivate", "onbeforeactivate", "onbeforedeactivate", "onblur", "oncanplay", "oncanplaythrough", "onchange", "onclick", "oncontextmenu", "ondblclick", "ondeactivate", "ondrag", "ondragend", "ondragenter", "ondragleave", "ondragover", "ondragstart", "ondrop", "ondurationchange", "onemptied", "onended", "onerror", "onfocus", "oninput", "oninvalid", "onkeydown", "onkeypress", "onkeyup", "onload", "onloadeddata", "onloadedmetadata", "onloadstart", "onmousedown", "onmousemove", "onmouseout", "onmouseover", "onmouseup", "onmousewheel", "onmscontentzoom", "onmsgesturechange", "onmsgesturedoubletap", "onmsgestureend", "onmsgesturehold", "onmsgesturestart", "onmsgesturetap", "onmsinertiastart", "onmsmanipulationstatechanged", "onmssitemodejumplistitemremoved", "onmsthumbnailclick", "onpause", "onplay", "onplaying", "onpointerlockchange", "onpointerlockerror", "onprogress", "onratechange", "onreadystatechange", "onreset", "onscroll", "onseeked", "onseeking", "onselect", "onselectionchange", "onselectstart", "onstalled", "onstop", "onsubmit", "onsuspend", "ontimeupdate", "onvolumechange", "onwaiting", "onwebkitfullscreenchange", "onwebkitfullscreenerror", "plugins", "pointerLockElement", "readyState", "referrer", "rootElement", "scripts", "scrollingElement", "styleSheets", "title", "visibilityState", "vlinkColor", "webkitCurrentFullScreenElement", "webkitFullscreenElement", "webkitFullscreenEnabled", "webkitIsFullScreen", "xmlEncoding", "xmlStandalone", "xmlVersion", "onpointercancel", "onpointerdown", "onpointerenter", "onpointerleave", "onpointermove", "onpointerout", "onpointerover", "onpointerup", "onwheel", "adoptNode", "captureEvents", "caretRangeFromPoint", "clear", "close", "createAttribute", "createAttributeNS", "createCDATASection", "createComment", "createDocumentFragment", "createElement", "createElementNS", "createExpression", "createNSResolver", "createNodeIterator", "createProcessingInstruction", "createRange", "createTextNode", "createTreeWalker", "elementFromPoint", "evaluate", "execCommand", "execCommandShowHelp", "exitPointerLock", "focus", "getElementById", "getElementsByClassName", "getElementsByName", "getElementsByTagName", "getElementsByTagNameNS", "getSelection", "hasFocus", "importNode", "msElementsFromPoint", "msElementsFromRect", "open", "queryCommandEnabled", "queryCommandIndeterm", "queryCommandState", "queryCommandSupported", "queryCommandText", "queryCommandValue", "releaseEvents", "updateSettings", "webkitCancelFullScreen", "webkitExitFullscreen", "write", "writeln", "createEvent", "querySelector", "querySelectorAll"].

In other words, it looks like Edge has already moved everything to Document.

@cdumez, what do you think would be a reasonable thing to attempt to converge on? Based on the above, I think the most tractable path with relatively low risk is to move everything into Document, but then leave HTMLDocument an empty interface just like XMLDocument.

@cdumez
Copy link

cdumez commented Jul 29, 2016

Yes, I was missing a 'not' in my last sentence. It is now fixed.

It is interesting that Edge made did change, I was not aware.

I guess moving everything to Document while keeping an empty HTMLDocument is relatively low risk as you say. I guess I am just not clear yet on the benefits of doing this.

@foolip
Copy link
Member

foolip commented Jul 29, 2016

@annevk will know, but probably the original motivation was that it's possible to insert any element into any document, so having a type of the document itself doesn't achieve much. Also, I think, to reduce gratuitous differences between XHTML (XMLDocument) and HTML. (Some algorithms still have different behavior based on the document type, though.)

Whatever the history, we're now in a weird state and need to plot a course towards interop. Does at least trying to make HTMLDocument empty seem reasonable? Given the changes already made in Blink and Edge, this is very likely web compatible.

Whether to attempt the final step of aliasing I don't know. If we don't, we may need to rethink the Document constructor, perhaps instead making it such that HTMLDocument and XMLDocument are the only possible instances.

@cdumez
Copy link

cdumez commented Jul 29, 2016

I am curious to know what Firefox is planning to do. If Firefox goes in this direction as well then WebKit would likely follow. So far, Gecko and WebKit are mostly in agreement.

@foolip
Copy link
Member

foolip commented Aug 1, 2016

@smaug----, can you comment on Gecko plans?

@smaug----
Copy link
Collaborator

I agree with bz. We need to do this per property, not all at once. And while doing it, it is possible that we figure out the merge isn't really possible. We don't really know the behavior of the stuff Edge has.

Does Edge still have HTMLDocument interface? What is the prototype chain of an (html) document?

I'm tiny bit worried about stuff like open/write/close for example: do they work as expect with SVG or Mathml or such?

@foolip
Copy link
Member

foolip commented Aug 1, 2016

I agree with bz. We need to do this per property, not all at once. And while doing it, it is possible that we figure out the merge isn't really possible. We don't really know the behavior of the stuff Edge has.

We moved it in smaller parts in Blink as well. Testing what others do makes sense for each step. It seems pretty likely that it's web compatible to move everything, so the question is what you'd like to attempt to do in Gecko.

Does Edge still have HTMLDocument interface? What is the prototype chain of an (html) document?

Edge still has an HTMLDocument interface, and an instance has the expected prototype chain HTMLDocument : Document : Node : EventTarget.

I'm tiny bit worried about stuff like open/write/close for example: do they work as expect with SVG or Mathml or such?

In Blink, these and writeln() all throw InvalidStateError for XML documents, as per spec:
https://html.spec.whatwg.org/#dynamic-markup-insertion

In Edge, it looks like open() and close() don't throw an exception (not sure what they do) but write() and writeln() do.

So there's still some interop work to do here...

Finally, one thing that could go badly here if we all aim to make HTMLDocument and XMLDocument aliases of Document is that one or two engines succeed, but the next to try runs into engine-specific troubles similar to that around Gecko's XMLDocument.prototype.load, leaving us in an non-interoperable state for a long time.

@cdumez
Copy link

cdumez commented Aug 13, 2016

I am working on moving some properties from HTMLDocument to Document in WebKit. I am focusing on the properties that Blink has already moved and that I think make sense to have on Document (i.e. do not throw if called on anything else than an HTML document).

Since I don't anticipate HTMLDocument going away entirely anytime soon. I don't think it makes sense to move to Document attributes / operations that work only on HTML documents (e.g. open/close/write/writeln).

@ayg
Copy link
Contributor Author

ayg commented Aug 15, 2016

If some engines can't move one or two properties, the easiest solution would be to just move those back to HTMLDocument in the spec. As long as we have interop, it doesn't really matter where the properties are, so take the path of least resistance.

@cdumez
Copy link

cdumez commented Aug 16, 2016

Here is the current situation:
Chrome 52: [‘alinkColor', 'all', 'bgColor', 'captureEvents', 'clear', 'fgColor', 'linkColor', 'releaseEvents', 'vlinkColor']
WebKit ToT: [‘alinkColor', 'all', 'bgColor', 'captureEvents', 'clear', 'close', 'fgColor', 'linkColor', 'open', 'releaseEvents', 'vlinkColor', 'write', 'writeln']

So the only difference between Chrome and WebKit at the moment are the following properties:
"open", "close", "write" and "writeln".

As I said before, I did not move them on purpose because it seems the expected behavior is for them to throw when calling on anything else than an HTMLDocument. Given this behavior and the fact that we still have an HTMLDocument interface in WebKit, I see little point in moving them up to Document.

My hope is that an HTMLDocument interface is re-introduced in the HTML specification for the properties that are not useful on other types of documents (and legacy HTMLDocument attributes such as alinkColor since I don't think we want to expose legacy properties or more document types).

@annevk
Copy link
Member

annevk commented Aug 16, 2016

If we add it back we need to be very clear which contexts end up creating it although I suppose that is the case already due to "HTML document"…

@foolip
Copy link
Member

foolip commented Sep 5, 2016

Some thoughts on the WebKit ToT list from #221 (comment)

  • document.all looks to me like it it'd work just fine with any kind of document. The implementation itself is already on Document in Blink and when I tried moving it my ad-hoc test worked fine.
  • alinkColor, bgColor, fgColor, linkColor and vlinkColor work as long as there's a body element, and document.body is already on Document. There are also a bunch of collections like document.forms on Document.
  • clear(), captureEvents() and releaseEvents() are all no-ops so it doesn't matter much.
  • close, open, write and writeln would all throw exceptions and wouldn't be useful on Document. The one benefit of moving them anyway is that the exception message could say why it didn't work.

At the end of the day, I don't feel strongly about any of these, any course of action that would get us to interop would be fine.

The missing piece here is what to do about the Document constructor? Per #278 (comment) I think we should try to retain it, but what should it return? Gecko and WebKit both return a document with contentType "application/xml", but it's really unfortunate that it isn't an XMLDocument then. @cdumez @ayg, thoughts?

@foolip
Copy link
Member

foolip commented Sep 5, 2016

I did a little bit of research with SELECT page,url FROM [httparchive:har.2016_08_01_chrome_requests_bodies] WHERE body CONTAINS 'instanceof XMLDocument' in BigQuery:

It's immediately clear that instaceof HTMLDocument is way more common. I randomized the order and the first thing I found came from mutation-summary.js by @rafaelw:
https://github.com/rafaelw/mutation-summary/blob/421110f84178aa9e4098b38df83f727e5aea3d97/src/mutation-summary.js#L112

@rafaelw, I didn't look too closely, but I take it this code actually matters for getting the correct behavior in both XML and HTML documents?

Copies of this code seem to dominate the usage. http://cdn.clicktale.net/www/ChangeMonitor-latest.js showed up a lot but seems to include an older version of mutation-summary.

Given the volume of results, I'd be cautious about changing the truthyness of instanceof HTMLDocument without further analysis.

As for XMLDocument, after checking the most common URL patterns I couldn't find anything that seemed like it would break. So I'd give better odds to making XMLDocument an alias of Document, but that's not a good idea as long as Gecko needs XMLDocument.protoype.load.

@foolip
Copy link
Member

foolip commented Sep 5, 2016

#308 is also relevant.

I realized that "object HTMLDocument" and "object XMLDocument" are also relevant for compat. I find 606 instances of "object HTMLDocument" and 9243 instances of "object XMLDocument". I didn't dig deeply, but a lot of the latter seems to be MooTools:
https://github.com/mootools/mootools-core/blob/1be4d62e912e8acc25ddccd5f8059a9cba0862ed/Source/Slick/Slick.Finder.js#L22

Sigh.

@foolip
Copy link
Member

foolip commented Sep 22, 2016

In #221 (comment) I said "Not sure how risky the final step of making HTMLDocument an alias is, we'll just have to try it" upon which the issue was closed.

After #221 (comment) I'm a lot more cautious, and I think we should actually just revive HTMLDocument, unless there's concrete implementer interest for attempting the aliasing.

@foolip foolip reopened this Sep 22, 2016
@domenic
Copy link
Member

domenic commented Sep 22, 2016

@foolip that comment doesn't give an argument for reviving HTMLDocument. Being an alias preserves instanceof HTMLDocument working.

@foolip
Copy link
Member

foolip commented Sep 22, 2016

Yes, but it'd also start "working" for things that aren't currently instanceof HTMLDocument, the whole point of the check seems to be to get at the https://dom.spec.whatwg.org/#html-document state.

It's possible that it's safe because the code doesn't actually run in any non-HTML documents, but it'd require more analysis.

@domenic
Copy link
Member

domenic commented Sep 22, 2016

Right, my impression is that there are very few non-HTML documents out there, and people are just using it as a test that something is a document (and not, say, any other kind of node). But that is an assumption.

@foolip
Copy link
Member

foolip commented Sep 30, 2016

Do we use HTMLDocument objects for application/xhtml+xml responses?

Gecko seems to be alone in doing that right now.

I think the split between HTMLDocument and Document would depend on that a bit. It'd be nice to minimize the differences between HTML and XHTML, and if application/xhtml+xml becomes XMLDocument that's a good reason to move as much as possible to Document.

@rwlbuis
Copy link

rwlbuis commented Oct 13, 2017

Edge and I think Safari (https://trac.webkit.org/changeset/218437/webkit) moved all properties from HTMLDocument to Document, I plan to do the same in Blink:
https://chromium-review.googlesource.com/c/chromium/src/+/695815

@Zirro
Copy link
Contributor

Zirro commented Jan 3, 2018

This has caused a bit of confusion in #551. As long as this is unresolved the specification ought to contain a reference to this issue with a note mentioning that it does not necessarily reflect reality at the moment.

@annevk
Copy link
Member

annevk commented Jan 4, 2018

Though maybe if Chrome, Edge, and Safari have indeed shipped what the specification suggests and there's no longer a distinct HTMLDocument class this has indeed happened now (except in Firefox). I guess we still have a distinct XMLDocument class, but that's okay.

@dstorey
Copy link
Member

dstorey commented Jan 6, 2018

Yes, Edge matches the spec here (although we haven't removed the actual HTMLDocument interface)

/// @specs "dom-level-2-html"
[Deprecated]
interface HTMLDocument : Document
{
};

domenic pushed a commit to web-platform-tests/wpt that referenced this issue Jan 16, 2018
Per current spec, document should stringify to "[object Document]". Work
to remove "HTMLDocument" is tracked in
whatwg/dom#221.
@foolip
Copy link
Member

foolip commented Jan 26, 2018

@dstorey, the spec for this is https://html.spec.whatwg.org/multipage/window-object.html#htmldocument, which would mean that window.HTMLDocument===window.Document, ad-hoc test in https://jsbin.com/kidewim. Edge fails that way in the way one would suspect from the pasted IDL, namely that HTMLDocument is a distinct interface.

However, it's the same empty interface in Blink:
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/html/HTMLDocument.idl

And almost empty in WebKit:
https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/html/HTMLDocument.idl

But still lots of stuff in Gecko:
https://github.com/mozilla/gecko-dev/blob/master/dom/webidl/HTMLDocument.webidl

The situation is similar for XMLDocument:
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/dom/XMLDocument.idl
https://github.com/mozilla/gecko-dev/blob/master/dom/webidl/XMLDocument.webidl
https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/dom/XMLDocument.idl

http://web-confluence.appspot.com/ is also a good way to check this, but the HTMLDocument changes seem recent so haven't reached stable yet.

@DavidBruant
Copy link

But still lots of stuff in Gecko:
https://github.com/mozilla/gecko-dev/blob/master/dom/webidl/HTMLDocument.webidl

For reference, Mozilla is working on the move like in bug 1415588

@ExE-Boss
Copy link

This is probably a duplicate of whatwg/html#4792.

@annevk
Copy link
Member

annevk commented Dec 2, 2019

It needs to be tracked here as well, as it'll affect createHTMLDocument() at least, I assume.

@ExE-Boss
Copy link

ExE-Boss commented Dec 2, 2019

createHTMLDocument() creates a Document with type "html", which causes it to become an HTMLDocument to JavaScript code.

That’s how engines have been handling it since forever.

@annevk
Copy link
Member

annevk commented Dec 2, 2019

That's not how that works. Internal concepts are separate from the public class. And yes, it's known that there's a mismatch with implementations there, and as part of fixing that, this needs fixing.

@ExE-Boss
Copy link

ExE-Boss commented Dec 4, 2019

Since HTMLDocument is being defined in the HTML standard, should createHTMLDocument() be moved there as well, or should a cyclic dependency be created between the specifications?

@annevk
Copy link
Member

annevk commented Dec 4, 2019

Pretty sure there is one already.

awesomekling added a commit to awesomekling/serenity that referenced this issue Jun 21, 2023
This class is currently not in the spec, but it *is* still in all the
major browser engines. For compatibility reasons, let's do what other
engines do.

There is discussion about bringing HTMLDocument back into specs:
- whatwg/html#4792
- whatwg/dom#221
awesomekling added a commit to SerenityOS/serenity that referenced this issue Jun 21, 2023
This class is currently not in the spec, but it *is* still in all the
major browser engines. For compatibility reasons, let's do what other
engines do.

There is discussion about bringing HTMLDocument back into specs:
- whatwg/html#4792
- whatwg/dom#221
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

13 participants