Clarify that new Document creates a document of type "html", not "xml" #308

Closed
rniwa opened this issue Aug 19, 2016 · 26 comments

@rniwa
Collaborator

rniwa commented Aug 19, 2016

The current DOM spec says "The Document() constructor, when invoked, must return a new document whose origin is the origin of current global object’s associated Document." and there's an informal note saying "Unlike createDocument(), this constructor does not return an XMLDocument object, but a document (Document object)."

However, document's type is "xml" by default. So I'm confused as to what kind of document we're creating here.

I think what we intend to say here is that we want to create a document whose type is "html".
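For anyone following along: the "type" discussed here is internal spec state and is not exposed as a property. One way to observe it from script is through createElement(), which lowercases the name only in HTML documents. The following is a minimal sketch under that assumption; observedDocumentType is a hypothetical helper name, not an API from the spec.

```javascript
// Hypothetical helper: infers a document's internal type ("html" or "xml")
// from createElement(), which lowercases local names only in HTML documents.
function observedDocumentType(doc) {
  return doc.createElement("DIV").localName === "div" ? "html" : "xml";
}

// Only touch the DOM when it exists, so the snippet also loads outside a browser.
if (typeof Document === "function" && typeof document !== "undefined") {
  console.log("new Document():", observedDocumentType(new Document()));
  console.log(
    "createHTMLDocument():",
    observedDocumentType(document.implementation.createHTMLDocument())
  );
}
```

Pasting this into a browser console shows which of the two types each construction path produces in that engine.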

@rniwa
Collaborator Author

rniwa commented Aug 19, 2016

@annevk @cdumez @smaug----

@cdumez

cdumez commented Aug 20, 2016

My understanding is that new Document() creates a Document object and its type will be "xml" indeed.

@rniwa
Collaborator Author

rniwa commented Aug 20, 2016

My understanding is that we use Document for HTML documents since the DOM spec merged HTMLDocument into Document.

@cdumez

cdumez commented Aug 20, 2016

If by "we" you mean the specification. All major browsers have an HTMLDocument type.

@domenic
Member

domenic commented Aug 20, 2016

All that follows is about specs; implementations have not quite converged.

Almost all documents are Documents. This includes both XML and HTML documents.

However, there is a method, document.implementation.createDocument(), which returns an XMLDocument, because sometimes people used the load() method of the return value of createDocument() for Ajax-ish purposes.

In whatwg/html#1478 we removed XMLDocument.prototype.load since it was only implemented in Gecko, making XMLDocument an empty interface. This means we could probably kill XMLDocument entirely from the specs and make all documents simply Documents; that discussion is in #278.

A further complication: as of 2011 Gecko needs XMLDocument and its load method for web compat on Gecko-only code paths. whatwg/html#1530 tracks adding it back in Gecko compatibility mode, since Gecko has expressed that they prefer that to experimenting with removing it.

@ArkadiuszMichalski
Contributor

Someone already asked about this earlier in #137. Why can't this constructor take an additional argument to decide what kind of document (internally "xml" or "html") we want to create? Right now the default is "xml", so we must use the longer document.implementation.createHTMLDocument(), which already has predefined content.
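To illustrate the workaround being complained about: createHTMLDocument() returns a document that already contains a doctype and html/head/body elements, so getting an empty HTML-type document means stripping those afterwards. A rough sketch, where createEmptyHTMLDocument is a hypothetical helper and not an existing API:

```javascript
// Hypothetical helper: returns an HTML-type document with no children by
// removing the predefined content that createHTMLDocument() generates.
function createEmptyHTMLDocument(implementation) {
  const doc = implementation.createHTMLDocument();
  while (doc.firstChild) {
    doc.removeChild(doc.firstChild);
  }
  return doc;
}

// In a browser this would be called as:
//   const doc = createEmptyHTMLDocument(document.implementation);
```

Taking the DOMImplementation object as a parameter keeps the helper independent of any particular global document.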

@rniwa
Collaborator Author

rniwa commented Aug 22, 2016

Well, it's strange for Document to create its subclass XMLDocument based on its argument. Since you could simply do new XMLDocument instead.

@annevk
Member

annevk commented Aug 23, 2016

We use Document for HTML and "XML" documents. XMLDocument exists mostly because of load() (which only Firefox has at this point I think). E.g., XMLHttpRequest always returns Document from responseXML. This can sometimes be flagged as "xml", sometimes as "html".

@foolip
Member

foolip commented Sep 5, 2016

Looks like XMLHttpRequest.prototype.responseXML returns an XMLDocument in both Gecko and WebKit even though they support the Document constructor in this test:
https://software.hixie.ch/utilities/js/live-dom-viewer/saved/4441

@annevk
Member

annevk commented Sep 6, 2016

Interesting, does any implementation even support HTML responses for XMLHttpRequest's responseXML? Using that test of yours it seems they don't.

@foolip
Member

foolip commented Sep 6, 2016

From Blink's source I see that one can get an HTMLDocument, if responseType is "document". https://software.hixie.ch/utilities/js/live-dom-viewer/saved/4443 seems to work in Chrome, Firefox and Safari everywhere, but Edge gives an "Unspecified error".

In the end, are there any APIs other than the Document constructor currently that can return a plain Document, or are they all HTMLDocument or XMLDocument? I suspect the latter.
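A quick way to check this per engine is to print the interface name of documents obtained through different entry points. This is only a sketch: interfaceNameOf is a hypothetical helper, and the list of entry points is a sample rather than a complete survey.

```javascript
// Hypothetical helper: extracts the interface name from the default
// string tag, e.g. "[object XMLDocument]" -> "XMLDocument".
function interfaceNameOf(obj) {
  return Object.prototype.toString.call(obj).slice(8, -1);
}

// Only runs where the DOM is available.
if (typeof Document === "function" && typeof document !== "undefined") {
  const samples = {
    "new Document()": new Document(),
    "createDocument()": document.implementation.createDocument(null, "", null),
    "createHTMLDocument()": document.implementation.createHTMLDocument(),
    "DOMParser (XML)": new DOMParser().parseFromString("<r/>", "application/xml"),
  };
  for (const [label, doc] of Object.entries(samples)) {
    console.log(label, "->", interfaceNameOf(doc));
  }
}
```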

@annevk
Member

annevk commented Sep 6, 2016

Not sure, I suspect you are correct.

@domenic
Member

domenic commented Sep 16, 2016

Since this has cropped up on blink-dev again, and @foolip and I have somewhat divergent opinions, let me outline what I think is the correct path forward in specs and implementations:

  • Document continues to return a Document (not a HTMLDocument or XMLDocument) whose type is "xml".
  • Implementations continue to move all members of HTMLDocument into Document. My understanding is that almost everything has been moved to Document in at least one browser, so this should be web-compatible.
  • We now have a situation where Document contains everything interesting; XMLDocument contains load() in Gecko and is empty everywhere else; and HTMLDocument is empty everywhere. The path forward could go a few ways depending on web compat.
    • If nobody wants to try any further simplification, we're done. We resurrect HTMLDocument in the specs as an empty interface, and make sure all the appropriate places return it instead of Document, like implementations do. (But the Document constructor stays unchanged.)
    • If people are up for trying a bit more simplification, we alias HTMLDocument to Document like the current specs do, and hope for the best.
    • We could even go further and non-Gecko browsers could alias XMLDocument to Document. Gecko could try to see if the web has evolved since 2011 when that was not Gecko-compatible, or it could stay the course. If it's not Gecko-compatible we encode that in the spec as part of Gecko compatibility mode.
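In script-observable terms, "aliasing" in the options above means that two global names refer to the very same constructor object. A small sketch for checking that in a given engine, assuming nothing beyond standard globals; isAliasOf is a hypothetical helper name:

```javascript
// Hypothetical helper: true when two global interface names refer to the
// same constructor object, i.e. one is an alias of the other.
function isAliasOf(globalObject, name, target) {
  return (
    typeof globalObject[name] === "function" &&
    globalObject[name] === globalObject[target]
  );
}

console.log("XMLDocument aliased to Document:",
  isAliasOf(globalThis, "XMLDocument", "Document"));
console.log("HTMLDocument aliased to Document:",
  isAliasOf(globalThis, "HTMLDocument", "Document"));
```

When a name is an alias, instanceof checks against it accept every Document, which is the main behavior difference pages could notice.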

@cdumez

cdumez commented Sep 16, 2016

For the record, here is my opinion as well:

  • new Document() continues to return a Document (not a HTMLDocument or XMLDocument) whose type is "xml".
  • Update the DOM spec so that XMLDocument becomes an alias to Document (WebKit / Blink used to do this until they introduced the XMLDocument type to align with the spec. However, given that XMLDocument brings nothing on non-Gecko browsers, I'd love to go back to it being an alias).
  • Bring back HTMLDocument because unlike SVGDocument / XMLDocument, it has a decent amount of API that is only meaningful for HTML documents (e.g. document.write(), document.open()), or legacy API that I don't really want to expose to more document types (e.g. document.all(), document.bgColor).
  • Add a constructor to HTMLDocument

If a major browser besides Edge actually manages/decides to move everything from HTMLDocument to Document, then I could be convinced otherwise. However, it has been years and it has not happened. I personally do not think the "benefits" of merging HTMLDocument into Document are worth the effort / risks involved.

@foolip
Member

foolip commented Sep 16, 2016

> Update the DOM spec so that XMLDocument becomes an alias to Document (WebKit / Blink used to do this until they introduced the XMLDocument type to align with the spec. However, given that XMLDocument brings nothing on non-Gecko browsers, I'd love to go back to it being an alias).

Oh, how did I miss this? It looks like it was none other than @cdumez who added XMLDocument to Blink and WebKit, and recently too:
https://bugs.chromium.org/p/chromium/issues/detail?id=238372
https://bugs.webkit.org/show_bug.cgi?id=153378

Given that, it seems very likely that it can be made an alias of Document again in non-Gecko engines, but if Gecko can't follow we'll be stuck in a weird place.

If not for the risk for Gecko, everything in #308 (comment) SGTM, including keeping a few things on HTMLDocument that would always throw on Document.

@bzbarsky, how do you view the chances that making XMLDocument an alias of Document in Gecko would be web compatible today? What was the original issue?

@domenic
Member

domenic commented Sep 16, 2016

To save @bzbarsky some sighing, the original issue was https://www.w3.org/Bugs/Public/show_bug.cgi?id=14037. See also whatwg/html#1478 (comment). In 2011 there was code that UA-sniffs non-"applewebkit" and then uses XMLDocument.prototype.load in such places.

@bzbarsky

> What was the original issue?

Original issue for what?

We don't so much want to put a load method on all documents, because that has compat risks that don't seem worth having, right? Is the question why we need a load method on XMLDocument? Something else?

I feel like this is the 4th or 5th time I've had this conversation, and each time no one (including me) can find the previous instances because we keep switching bug systems and because Github's setup sucks so much for searching (e.g. the document discussion is scattered across issues in multiple repos, and possibly pull requests too).

@bzbarsky

Clearly my comment crossed with Domenic's. ;) But case in point: His link to my github comment is to a pull request, not issue, and in the HTML repo, not this one. Searchability, what's that?

@cdumez

cdumez commented Sep 16, 2016

I understand that Gecko needs XMLDocument and XMLDocument.prototype.load. However, now that we dropped XMLDocument.prototype.load from the HTML specification, it seems odd to keep XMLDocument as an interface in the DOM specification.

The situation, for years, was that Firefox had XMLDocument / XMLDocument.prototype.load and WebKit / Blink had XMLDocument as an alias to Document. It is unfortunate that Firefox needs XMLDocument / XMLDocument.prototype.load for backward compatibility. However, other browsers do not have XMLDocument.prototype.load (nor do they intend to add it) and they really do not need XMLDocument as a separate type AFAIK. This is why I am arguing for the DOM spec to be changed so that XMLDocument is an alias to Document.

Anyway, I do not have strong feelings. I just feel it would be a cleaner situation for WebKit / Blink.

@bzbarsky

Sure. @foolip was asking about Gecko making them aliases, though.

@foolip
Member

foolip commented Sep 18, 2016

Thanks @domenic, I've taken a look at those issues and virginamerica.com from 2011. The problem (from Sarissa 0.9.6.1) was XMLDocument.prototype.onreadystatechange = null and the fix was [LenientThis].

This was in a non-IE codepath; "applewebkit"-sniffing actually wasn't involved here.

Note that Sarissa by itself doesn't require the existence of XMLDocument.prototype.load, it just wraps it as @bzbarsky described. But it's only if some other script calls xmlDoc.load() that it matters, and virginamerica.com didn't AFAICT. It also doesn't seem to matter for Sarissa if XMLDocument is an alias of Document or a separate interface.
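For readers unfamiliar with [LenientThis], the failure mode behind the XMLDocument.prototype.onreadystatechange = null assignment can be emulated in plain JavaScript. This is only a sketch and not how any engine implements it: FakeDocument, FakeXMLDocument, defineHandlerAttribute and the two attribute names are hypothetical stand-ins, and a WeakSet plays the role of the "is this a real document object" check.

```javascript
// Tracks which objects are "real" instances, standing in for the
// platform-object check that accessors perform on `this`.
const branded = new WeakSet();

class FakeDocument {
  constructor() { branded.add(this); }
}
class FakeXMLDocument extends FakeDocument {}

// Defines an accessor on `proto`; with lenientThis set, an invalid `this`
// is silently ignored instead of throwing a TypeError.
function defineHandlerAttribute(proto, name, { lenientThis }) {
  const values = new WeakMap();
  Object.defineProperty(proto, name, {
    configurable: true,
    enumerable: true,
    get() {
      if (!branded.has(this)) {
        if (lenientThis) return undefined;
        throw new TypeError("Illegal invocation");
      }
      return values.has(this) ? values.get(this) : null;
    },
    set(value) {
      if (!branded.has(this)) {
        if (lenientThis) return;
        throw new TypeError("Illegal invocation");
      }
      values.set(this, value);
    },
  });
}

defineHandlerAttribute(FakeDocument.prototype, "strictHandler", { lenientThis: false });
defineHandlerAttribute(FakeDocument.prototype, "lenientHandler", { lenientThis: true });

// Mimics the Sarissa pattern: assigning through the subclass prototype,
// which is not itself a branded instance, so the inherited setter sees a bad `this`.
function tryAssignOnPrototype(name) {
  try {
    FakeXMLDocument.prototype[name] = null;
    return "ok";
  } catch (e) {
    return e.name;
  }
}

console.log("strict:", tryAssignOnPrototype("strictHandler"));
console.log("lenient:", tryAssignOnPrototype("lenientHandler"));
```

The strict accessor rejects the prototype assignment while the lenient one ignores it, which is why a lenient setter is enough to keep such library code from throwing at load time.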

The question then remains, does Gecko need XMLDocument.prototype.load (and async) for compat? If it does, then it must be in some Gecko-only code path. Researching this with HTTP Archive would be very hard, any chance for use counters here?

@foolip
Member

foolip commented Sep 28, 2016

I've tried to summarize everything I could find about document interfaces here:
https://gist.github.com/foolip/103963a1ae8598d2baedd296f4a1bf4c

Since the discussion is spread out, I arbitrarily suggest discussing the larger issue in #221

@foolip
Member

foolip commented Sep 28, 2016

I think this issue ought to be closed, because the Document constructor as already implemented returns an "xml" document, so leaving that alone seems good. One of two things can then happen:

  • HTMLDocument is revived and gets its own constructor
  • Everything is successfully folded in Document, and its constructor is given options to pick between "xml" and "html", with "xml" as the default.

@rniwa
Collaborator Author

rniwa commented Sep 28, 2016

> Everything is successfully folded in Document, and its constructor is given options to pick between "xml" and "html", with "xml" as the default.

I don't think this will happen. It's a compatibility nightmare for what appears to be the most marginal gain on whatever people hoped to get out of it.

@foolip
Member

foolip commented Sep 28, 2016

I also don't think it will happen or would be a good investment of time, just saying that closing this issue doesn't prevent it.

@annevk
Member

annevk commented Sep 29, 2016

Given that we seem to have reached consensus in #221, let's close this in favor of that.

@annevk annevk closed this as completed Sep 29, 2016

7 participants