SetDocument runs every time you switch the document #292

plusplusben · 2014-10-29T13:53:40Z

If Sizzle switches between parsing a DOM document and an XML document, it calls setDocument every time it switches. Do you think it would make sense to cache more than just the current document's settings?

I created a quick jsfiddle here: http://jsfiddle.net/5j9xxocs/. If you run a profile and then click on the "ClickMe" button, you can see that Sizzle's setDocument is pretty expensive and runs twice for every iteration. I work on an application that uses Sizzle and switches between parsing XML and DOM frequently. I made a rudimentary fix on my local version of Sizzle that caches the settings for both document types, and it massively improved the performance of the app.

Would this be something worth adding to Sizzle? Or is my use case too much of an edge case?
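To make the cost pattern above concrete, here is a minimal sketch of a selector engine that remembers only the last document it saw. It is not Sizzle's code: `createSingleSlotSelector`, `select`, and `getSetupCount` are hypothetical names, plain objects stand in for real documents, and a counter stands in for the expensive feature detection.

```javascript
// Hypothetical stand-in for a single-document cache; not Sizzle's actual code.
function createSingleSlotSelector() {
	var currentDoc = null,
		setupCount = 0;

	function setDocument( doc ) {
		if ( doc === currentDoc ) {
			return; // Same document as last time: nothing to redo.
		}
		currentDoc = doc;
		setupCount++; // Stands in for the expensive feature detection.
	}

	return {
		select: function( doc ) {
			setDocument( doc );
		},
		getSetupCount: function() {
			return setupCount;
		}
	};
}

// Plain objects stand in for an HTML document and an XML document.
var htmlDoc = { kind: "html" };
var xmlDoc = { kind: "xml" };

var selector = createSingleSlotSelector();
for ( var i = 0; i < 100; i++ ) {
	selector.select( htmlDoc );
	selector.select( xmlDoc );
}

console.log( selector.getSetupCount() ); // 200: two setups per iteration
```

With only one slot, alternating between two documents defeats the cache entirely, which matches the "runs twice for every iteration" observation.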

timmywil · 2014-10-29T20:52:53Z

I would suspect this is not a common issue, but if we did want to cache all setDocument results, I'd want to limit the number of documents cached (to something like 5). The majority of cases where the document switches could probably be covered by a cache size of 2-5.
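A size-limited cache along these lines could look roughly like the sketch below. It assumes an environment with `Map` (which Sizzle's supported browsers at the time did not all provide), and `createDocumentCache` and `computeSettings` are hypothetical names, not part of Sizzle.

```javascript
// Hypothetical bounded cache of per-document settings; names are illustrative.
function createDocumentCache( maxEntries, computeSettings ) {
	var cache = new Map(); // A Map iterates in insertion order, oldest first.

	return {
		get: function( doc ) {
			var settings;
			if ( cache.has( doc ) ) {

				// Re-insert so this document becomes the most recently used.
				settings = cache.get( doc );
				cache.delete( doc );
				cache.set( doc, settings );
				return settings;
			}
			settings = computeSettings( doc );
			cache.set( doc, settings );
			if ( cache.size > maxEntries ) {

				// Evict the least recently used document so it can be collected.
				cache.delete( cache.keys().next().value );
			}
			return settings;
		},
		size: function() {
			return cache.size;
		}
	};
}

var computeCalls = 0;
var docCache = createDocumentCache( 2, function( doc ) {
	computeCalls++; // Stands in for the expensive feature detection.
	return { isHTML: doc.kind === "html" };
} );

var docA = { kind: "html" };
var docB = { kind: "xml" };
for ( var n = 0; n < 50; n++ ) {
	docCache.get( docA );
	docCache.get( docB );
}

console.log( computeCalls ); // 2: one setup per document, not per switch
```

The cap bounds how many documents the cache can keep alive, but any cached document still cannot be garbage collected until it is evicted.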

dmethvin · 2014-10-29T21:11:51Z

Are there any risks to caching a document, especially an XML one? I could see there being some downsides if you ended up caching a large XML document that would otherwise be garbage collected. With a single doc cache that's not very likely. If it's truly rare, another solution would be to load a second copy of Sizzle.

timmywil · 2014-10-29T21:31:06Z

There aren't risks to caching one document. The risk comes when there are many documents.

zachelrath · 2014-10-30T15:05:21Z

It seems like it would be more effective to have a cache based on "type" of Document, with separate entries only for each unique type of Document actually encountered, since a given browser should handle each type of Document consistently. For instance, if you're rapidly switching between separate XML documents, they're all going to have the same feature support, but they're different Documents. So the caching could be along the lines of "if isXML, use the XML doc support cache; if it's a DOM fragment, use the DOM fragment support cache; etc.".

With this approach, you don't have to deal with an arbitrary size for the number of Documents to cache, or the garbage collection problem, since actual Documents are not being cached.

Back to the Document-based caching approach, though, another possibility to address the garbage collection problem would be to remove cache entries whenever a given node that is an ownerDocument is removed or appended to another document, at which point that node can no longer be an ownerDocument.

dmethvin · 2014-10-30T17:01:57Z

> It seems like it would be more effective to have a cache based on "type" of Document, with separate entries only for each unique type of Document actually encountered, since a given browser should handle each type of Document consistently.

You'd still have the potential issue of holding onto an XML document that would otherwise be garbage collectable, for example something fetched via AJAX. The last fetched XML doc would never be released until a new one arrived, if ever.

> For instance, If you're rapidly switching between separate XML documents, they're all going to have the same feature supports, but they're different Documents. So if the caching was along the lines of "if isXML, use XML doc support cache, if its a DOM fragment, use DOM fragment support cache, etc...".

If there were really only two document implementations (XML and HTML) the setDocument() overhead could be reduced by caching feature detects the first time the document type was encountered. I don't think that's always the case though. Different implementations of XML documents can occur within the same page (for example, MSXML.DOMDocument, XML+XHTML, and XMLDocument).

> another possibility to address the garbage collection problem would be to remove cache entries whenever a given node that is an ownerDocument is removed or appended to another document, at which point that node can no longer be an ownerDocument.

How would Sizzle know that happened? Also, someone could parse an XML string, select some nodes, and let the XML root go out of scope. This in particular seems like a hairy problem to tackle for an edge case.

gibson042 · 2014-10-30T17:56:04Z

> This in particular seems like a hairy problem to tackle for an edge case.

Exactly. There are lots of good ideas and good intentions here, but in practice it's thorny. Pull requests would be evaluated on size, speed, memory consumption, and client compatibility, but I don't foresee myself taking this on.

timmywil · 2014-10-30T18:29:21Z

Consensus seems to be patch welcome. We would review a PR, but will not tackle this in the immediate future.

gibson042 · 2015-03-24T08:25:07Z

A thought for when this is picked up: cache setDocument output by documentElement nodeName/namespaceURI/etc. It's likely that implementations only privilege HTML and maybe SVG anyway, so we should be able to easily support cases that aren't absolutely bonkers.
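A rough sketch of that idea follows, keying cached settings on the root element's nodeName and namespaceURI rather than on the document itself. The names `documentTypeKey`, `getSettings`, and `settingsByType` are hypothetical, plain objects stand in for documents, and the choice of exactly these two fields is an assumption (the comment leaves the list open with "etc.").

```javascript
// Hypothetical key builder; the fields follow the suggestion in the comment above.
function documentTypeKey( doc ) {
	var root = doc.documentElement;
	if ( !root ) {
		return "no-root";
	}
	return root.nodeName + "|" + ( root.namespaceURI || "" );
}

var settingsByType = {};
var detectCalls = 0;

function getSettings( doc ) {
	var key = documentTypeKey( doc );
	if ( !Object.prototype.hasOwnProperty.call( settingsByType, key ) ) {
		detectCalls++; // Stands in for the expensive feature detection.
		settingsByType[ key ] = { key: key };
	}
	return settingsByType[ key ];
}

// Plain objects stand in for one HTML document and two distinct XML documents.
var htmlLike = {
	documentElement: {
		nodeName: "HTML",
		namespaceURI: "http://www.w3.org/1999/xhtml"
	}
};
var feedOne = { documentElement: { nodeName: "feed", namespaceURI: null } };
var feedTwo = { documentElement: { nodeName: "feed", namespaceURI: null } };

getSettings( htmlLike );
getSettings( feedOne );
getSettings( feedTwo );
getSettings( htmlLike );

console.log( detectCalls ); // 2: the two feeds share one cache entry
```

Because the cache holds only strings and settings objects, it does not keep any document alive. It would, however, treat two XML implementations with the same root name and namespace as identical, which is the concern raised earlier in the thread.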

timespinner · 2016-10-14T17:28:29Z

Any chance this can be reviewed? We are seeing that this dramatically affects performance in an application we are building that alternates between reading/writing to an XML document and using DOM elements.

wojwal · 2016-10-18T11:37:49Z

Hi guys,

I also ran into this huge performance degradation when Sizzle constantly switches the document it works with (via setDocument). As a quick fix, I added a way to disable this behaviour. That would solve the problem at least in some cases.

Hopefully this pull request will be accepted? :)

mgol · 2019-08-26T18:10:01Z

Copying my comments from #390:

If anyone wants to tackle the issue, this comment from Richard Gibson would be a good start: #292 (comment).

Note: jQuery 4.0 no longer uses Sizzle (it inlines it and simplifies it greatly), so setDocument there will not do as much as it does right now in Sizzle. This is the current code on master; without comments it's just:

function setDocument( node ) {
	var subWindow,
		doc = node ? node.ownerDocument || node : preferredDoc;

	if ( doc === document || doc.nodeType !== 9 ) {
		return;
	}

	document = doc;
	documentElement = document.documentElement;
	documentIsHTML = !jQuery.isXMLDoc( document );

	if ( preferredDoc !== document &&
		( subWindow = document.defaultView ) && subWindow.top !== subWindow ) {

		subWindow.addEventListener( "unload", unloadHandler );
	}
}

(code from https://github.com/jquery/jquery/blob/29a9544a4fb743491a42f827a6cf8627b7b99e0f/src/selector.js#L419-L445)
and it may even get smaller before the final release, who knows.

mgol · 2020-01-21T23:00:45Z

To show an example of the performance difference above, this is the test case provided in this issue using jQuery 3.4.1:
https://jsbin.com/cozegop/1/edit
If you open the browser console, you'll see the time measured; on my machine in Firefox 72.0.2 on macOS Catalina it's about 100 ms each time. The same test case with the jQuery version from master (i.e. future jQuery 4.0):
https://jsbin.com/pubobej/1/edit?html,js,output
prints times between 10 & 30 ms.

mgol · 2022-12-15T12:52:07Z

An update: jQuery 3.7.0 will not include Sizzle anymore; it has its own internal copy that's smaller than Sizzle.

I ran the above test case with the current 3.x-git version and it's also pretty fast. The main reason is that DOM manipulation in jQuery uses jQuery.contains internally (in the isAttached util used by buildFragment). That util used to be an alias of Sizzle.contains, which calls setDocument in Sizzle because it needs to detect support for the native methods used by contains. In jQuery 3.x-git there's only a single implementation of jQuery.contains, so setDocument no longer needs to be called.

Test case: https://jsbin.com/yubaket/1/edit?html,js,output

mgol · 2023-09-07T12:56:05Z

We're going to sunset Sizzle soon. That said, I won't migrate this issue to jQuery as in jQuery 4.x it will not be an issue anymore due to how simple setDocument is there.

timmywil added 2 - Feature/Enhancement needsreview labels Oct 29, 2014

timmywil added patchwelcome and removed needsreview labels Oct 30, 2014

timmywil closed this as completed Oct 30, 2014

catrope mentioned this issue Jan 24, 2015

Sizzle.matches passes null context, causes document to be set to global document #311

Closed

gibson042 reopened this Mar 24, 2015

avernet added a commit to orbeon/orbeon-forms that referenced this issue Apr 9, 2015

Optimize by using the DOM API directly

318d988

- Works around jQuery jquery/sizzle#292 - Which happens when using jQuery to get values from the DOM

avernet added a commit to orbeon/orbeon-forms that referenced this issue Apr 10, 2015

Optimize by using the DOM API directly

41990ce

- Works around jQuery jquery/sizzle#292 - Which happens when using jQuery to get values from the DOM

wojwal added a commit to wojwal/sizzle that referenced this issue Oct 18, 2016

Core: autoDocument to control setDocument() auto-switch - patching jq…

34cfbf5

…uery#292

wojwal mentioned this issue Oct 18, 2016

Core: autoDocument to control setDocument() auto-switch - patching #292 #390

Closed

timmywil assigned gibson042 Nov 14, 2016

mgol removed the patchwelcome label Apr 17, 2023
