[idea] Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!) #1570

syranide · 2014-05-19T22:38:47Z

Part 1 of #1550

This PR removes data-reactid from the DOM and reconstructs it on-demand, then storing it as a hidden property __reactID on the node instead. Nodes and instances are traversed as pairs, in sync, when reconstructing.

Peformance comparison:
We have the option to choose between keeping the current on-demand traversal of nodes, or switching to immediate traversal after insertion into the DOM. Running my old super synthetic benchmark, creating 11 level deep trees of plain divs, measuring mount+unmount. The lazy method is roughly 15% faster it seems, adding just one attribute to all nodes reduces it to about 9%, adding one more reduces it to about 5% (for IE8 and Chrome).

However, it is important to note that these numbers are significantly inflated as they do not include reflow (which seems to increase running time of my tests by about 75%), actual rendering nor any other common React overhead such as composite components, handling event listeners, etc. It should also be noted that as of current, React will immediately traverse to any node containing an onClick-listener or an image with onLoad/onError-listeners or if you call getDOMNode in componentDidMount. Additionally, any updates applied to the DOM also require traversal to each affected node. None of which are included in this benchmark.

So while end-user measurable performance impact is hard to quantify, I think it's safe to say that at worst it may be effectively increased by 5% (excluding rendering), add a bunch of attributes, add some React overhead, onClick-listeners and we're down to perhaps 1-3%? In that context, I think it's important to consider that there are important real-life implications of on-demand traversal:

For immediate traversal; testing whether a node is a React-node is always O(1). We also avoid a lot of complex code that is otherwise there to ensure that a node exists in the cache, which means that going from node to instance to node is super cheap, which may open up some interesting potential down the road. We also detect invalid DOM nesting on insertion (for free). The size of React decreases "measurably" and code complexity is decreased.

For on-demand traversal; we must be able to traverse up the component instance parent tree and each node instance must immediately register itself with ReactMount. Testing whether a node is a React node is O(1) for cached React nodes and "up to" O(depth) for uncached React nodes or non-React nodes. We also lose the guaranteed/immediate validation of DOM nesting (although we can enable it for DEV).

There are a lot of improvements that this PR enables that is independent of immediate or on-demand traversal, but lets not make this wall of text any longer. I can see the benefit of both methods, but I think immediate traversal intrigues me the most given its modest cost, it's also uniform and deterministic during insertion and over time (not that it really matters given the modest performance implications, perhaps).

#1568 and #1569 must be fixed before this PR can be taken (or simple temporary workarounds have to be put in place). Add #1578 too and React leaves no lasting print in the visible DOM.

This PR has commits for both the immediate and on-demand traversal, both are fairly final implementations (obviously pending some decisions / feedback though).

Waiting for The Great Web Worker Refactor of 2014 #1434 to land (intersecting code refactored)
Depends on Only purgeID on ReactDOMComponent unmount #1568 to land
Depends on fix for dangerouslyReplaceNodeWithMarkupByID uses getNode on an unmounted/purged ID #1569 (@sebmarkbage?)
Lazy or immediate traversal?

chenglou · 2014-05-20T03:10:46Z

Oh man =D
For 1: possibility of removing renderToStaticMarkup, which is only there because of dangling checksum and react-id from renderComponentToString. By removing react-id I think there's enough benefit to remove the renderToStaticMarkup API altogether.

syranide · 2014-05-20T08:09:05Z

@chenglou You bet! I saw the discussion on IRC and while data-checksum may keep it from going away. However, we can actually get rid of data-checksum as we now traverse the DOM and can guarantee that an "incompatible DOM" cannot be reused, but we cannot guarantee that it's identical (different styles, texts, etc) which at the very least means that we probably want to keep it for the dev-build (but possibly drop it for the prod build, or make it optional for renderComponentToString perhaps?).

string renderComponentToString(ReactComponent component, boolean noChecksum)

Is one possibility.

chenglou · 2014-05-20T09:30:13Z

string renderComponentToString(ReactComponent component, boolean noChecksum)

That was actually the initial API for renderToStaticMarkup, but boolean args wasn't encouraged so I changed it. I was more interested in killing the API altogether.

syranide · 2014-05-20T20:15:07Z

I did a simple test of unboxing ReactTextComponent spans, and I couldn't even measure a noticeable difference (artificial test tree with tons of ReactTextComponents). A neater solution is to not generate the spans at all but to split/insert TextNodes during traversal instead.

However, knowing of the (by now very old and fixed) FF bug where it automatically split TextNodes (hard to test different FF versions) I'm unsure if there may be issues and if it's perhaps preferable to solve this higher up in React (i.e, not manage the TextNodes individually, but rather as a single TextNode).

sophiebits · 2014-05-20T20:17:06Z

I'm pretty sure I want to solve this higher up. At any rate, it should be a separate diff.

syranide · 2014-05-20T20:19:00Z

@spicyj Yeah, the more I think about it, that seems like the right way.

syranide · 2014-05-24T21:49:39Z

I did some preliminary benchmarking with the perf-tests and it looks really good! Even with just this one piece in place. It's hard to give firm numbers, but it's everything from 0-25% improvement, averaging at about 5-15% in tests that actually seem affected.

So, the current implementation is lazy again, it adds 3 new properties:

_parentNodeID is added to all React component instances, since we must be able to traverse up the instance tree to support lazy evaluation. A parentID is simply passed along-side rootID to component constructors and stored on the instance.
reactDOMInstancesByReactRootID is required by getNode, it is used to translate a React ID to an instance to a React node component, which is required to lazy evaluate its parents. This is currently in mountComponent.
owningInstancesForNodeByReactRootID is required only for DEV, it's used for constructing the "human readable path" to the error, as opposed to our old method of displaying the React ID, which is no longer visible (perhaps React DevTools could have a feature for finding a React ID?).

There are some ugly hacks while we wait for #1568 and #1569, all tests pass.

I have currently left some commented code in ReactMount. I can happily delete it, I wrote it under the assumption that _renderedChildren is an ordered list, which it wasn't, but given that _mountIndex is on the kill list I left it there for now. Perhaps the best solution would be to get rid of the dependence on _mountIndex and instead call evaluateNode/InstanceParents before processing updates for a node/instance. That way it could still rely on _renderedChildren being the correct order (which it is before the DOM is mutated).

registerDOMInstance is currently run on mountComponent which means it's run for renderComponentToString and renderComponentToStaticMarkup, which is bad! What's the right solution here? Simply add a check in mountComponent for transaction.renderingToString?

Something to consider; if we immediately traverse the DOM the user gets instant feedback on invalid nesting, not when you happen to access or modify it. It would be super easy to make it do that in DEV but keep the current behavior in PROD, but I fear that having separate behaviors for DEV and PROD could be dangerous. A possibility is to have DEV immediately only traverse the DOM and not cache the node/instance pairs, traversing the DOM is not significantly expensive (up to 10% of DOM construction cost it seems, not including rendering and layouting).

There are many more improvements I want to follow up with, but one step at a time.

   raw     gz Compared to master @ 46de927a28eb788b4329589658dbd31a014e4edd
 +3204   +375 build/react-with-addons.js
  +474   +132 build/react-with-addons.min.js
 +3204   +375 build/react.js
  +474   +140 build/react.min.js

syranide · 2014-07-10T18:36:35Z

Pushed another commit that uses Map when available instead of adding a property to the DOM nodes. Measurably faster for IE11, couldn't measure any difference for Chrome (with ES6).

sebmarkbage · 2014-07-10T19:37:58Z

Interesting stuff. I do like the simplicity of this. I wonder if there's a way we can bail out quickly (sort of a shouldComponentUpdate thing) for subtrees we know that we won't need to touch.

We should also evaluate just using document.createElement instead of batch generation, or a mix depending on heuristics/browser.

syranide · 2014-07-10T19:56:48Z

@sebmarkbage I believe @petehunt experimented a bit with cloning bare subtrees and reusing them. I think he managed to get some positive/interesting results (lists of similar/identical components are ideal and common).

I'm not sure what you mean about "bail out quickly", the latest commit in this PR currently implements the same lazy approach that React already uses (some earlier commit implements the non-lazy approach). There's only one practical disadvantage and it's that events originating on non-React nodes must always traverse up to the nearest React component or the document element (for the lazy approach). However as I mention/detail in the OP, the non-lazy approach is perhaps only 1-3% slower in the practical worst-case and brings a lot of other potentially interesting benefits, so that is worth considering as well IMHO (I haven't really done extensive benchmarking, but I really don't think I'm far off).

While I got you on the hook, #1569 is that something you may have fixed in your "rumored" refactor?

sebmarkbage · 2014-10-01T22:14:04Z

It's a pain to rebase these things and risky to make the switch immediately. I think we need feature flags for stuff like this. Probably build flags that can be stripped out.

Flagging as needing revision so that we can add build flags in a different PR and then this can go in under a new build flag.

syranide · 2014-10-01T22:18:57Z

@sebmarkbage I'm not expecting this to be merged any time soon, however, when _parentInstance makes its official appearance... It should then be quite simple to implement the full range of alternatives between today and no react ID (even behind flags I'm quite confident), some of which are far "less controversial" than this. But until there's a _parentInstance (one way or another) this is a no-go.

syranide · 2014-10-02T21:01:18Z

Hmm, pondering this some more, I'm sure I can make this happen without _parentInstance quite easily by simply relying on ReactInstanceHandles instead. It won't be as superficially neat, but it should allow us to test the full range of alternatives in relative isolation; data-react and no React ID, including lazy and non-lazy traversal.

@sebmarkbage How does that sound to you?

syranide · 2014-10-16T21:08:33Z

I went ahead and implemented it using ReactInstanceHandles for fun, it worked out really quite well, but the current core architecture is not conducive keeping it lazy (VDOM/DOM are inconsistent when ReactDOMIDOperations calls getNode).

But immediat traversal is possible as-is and easy and I'm quite sure is better than what we have today (as evidenced by previous perf tests), returning to lazy traversal is possible down the line... but I'm personally quite convinced that the benefits of immediate traversal outweigh the supposedly minor traverse overhead (which is also easier to hide than stuttering animation).

The implementation is actually quite isolated but I imagine you guys don't want to go ahead with until you can test performance with confidence (including lazy traversal).

syranide · 2014-10-25T11:06:47Z

Most of the useful stuff in this PR has already been dissected into other PRs and what's left isn't very useful.

gaearon · 2015-05-30T12:48:21Z

What is the status of this? Are there plans to eventually get rid of data-reactid? Is this on hold due to more pressing issues? Thanks!

syranide · 2015-05-30T13:35:39Z

@gaearon There's a PR (can't find it) for adding internal parent references which is necessary for the event system to work without data-reactid and there is one or two bugs in React which conflicts (but they can be workaround if they are willing). There is one alternative method that can be implemented without parent references, but... they expressed a desire to extensively benchmark and compare the various approaches before moving ahead.

As for server-side rendering without data-reactid, reconstructing data-reactid client-side is doable today and shouldn't come at a significant cost (and certainly favorable if you can drop data-reactid from the server-side generated markup). I've brought this up before but no one really seemed to care much so I never moved ahead with it.

callmevlad · 2015-05-31T22:20:49Z

I've brought this up before but no one really seemed to care much so I never moved ahead with it.

@syranide I care a lot and willing to put up a sizable bounty to get this implemented. I'd love to hear your thoughts on the scope of changes still required to remove data-reactid from the DOM. (My email is in my profile just in case.)

sophiebits · 2015-05-31T22:22:32Z

@callmevlad Curious: Why do you care?

callmevlad · 2015-05-31T22:58:07Z

@spicyj At Webflow we have a design interface that wraps web inspector-type tools around an iframe that contains a website's actual DOM (as it will be published to the final hosted production environment). A huge selling point of Webflow has been that the contents of that iframe are 100% what-you-see-is-what-you-get and the-same-code-your-developer-would-write, and we routinely demo this in pitches by inspecting elements live as they are modified with Webflow's tools. It goes a long way to convince stubborn frontend developers at larger companies who believe that generating great code is not possible - by directly demonstrating it to them by looking at the DOM.

Now that we're shifting the rendering of the site canvas to React, that "WYSIWYG" property no longer holds true. Inspecting the DOM is decorated heavily with data-reactids, which now makes the code "look generated" and we start hearing things like "that's not how I would write the HTML, therefore we should stick to hand-coding our websites" from potential users/customers. We'd love to avoid that as much as possible.

Also, it just looks much nicer without all the IDs inside DevTools since you can see much more of the DOM in the same amount of real estate :)

sophiebits · 2015-05-31T23:05:37Z

Cool, thanks for the context. Webflow looks sweet. :)

syranide · 2015-06-01T08:19:12Z

@callmevlad What can be done somewhat non-invasively (i.e. an easy to maintain fork) at the moment is the "immediate traversal" version of this PR. There are two alternatives:

The very simplest being to simply immediately traverse any rendered nodes, copy data-reactid into node.reactid, remove data-reactid and update internalGetID to read from node.reactid instead (more or less). You still have to ship data-reactid with server-side markup though.
Again, immediately traverse any rendered nodes but reconstruct data-reactid (quite easy too) instead and store it in node.reactid. The rest is same as above, but this doesn't require you to ship data-reactid with the server-side markup.

I can't imagine the devs being OK with having this "hack" in the release version so it would have to be in a separate fork. But as I said earlier I could imagine the server-side part of alternative 2 being shipped though (but not my call) which would make the client-side fork near trivial. I don't imagine the performance impact of this being significant.

For reference, most of the changes in this PR is because of the lazy traversal and updated tests. It's possible I overlooked something minor above, but this PR was functional so it's definitely doable.

natew · 2015-10-20T05:40:45Z

@syranide Sorry to bring up a possible dead issue, but this is also desirable to me / my team. It seems like your summary is that it's doable, why would devs not be ok with this? Anyway, just wanted to voice more support for this, for a similar reason to Webflow.

syranide · 2015-10-20T07:09:57Z

@natew Good news #5190 #5205

chyzwar · 2016-01-25T09:35:48Z

Is there is any chance this will land in 0.15 or 0.16?

zpao · 2016-01-25T20:40:30Z

It will be a part of 0.15

callmevlad · 2016-04-08T05:19:21Z

So glad this finally landed - thanks @syranide @spicyj @gaearon @zpao for helping push this through! 👏 👏 👏

syranide mentioned this pull request May 19, 2014

Proof-of-concept, remove data-reactid and nodeCache, traverse inserted markup immediately #1550

Closed

syranide mentioned this pull request May 21, 2014

Clean up data-react-checksum after checksum match #1578

Closed

Remove data-reactid and traverse immediately after insertion instead

71bc932

syranide mentioned this pull request May 23, 2014

When updating from many-item to single-item array, re-render is forced although item key hasn't changed #1493

Closed

Lazy traverse instead

d103b3b

syranide changed the title ~~Remove data-reactid and traverse immediately after insertion instead~~ Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!) May 24, 2014

Improvements to lazy eval and documentation

260e444

This was referenced May 25, 2014

The Great Web Worker Refactor of 2014 #1434

Closed

Warn when reordering nodes that have been moved outside of React #1610

Closed

Use Map when available

815165c

This was referenced Jul 24, 2014

Temporary workaround for use of purged ID in updateComponent #1922

Closed

Add reference to parent instance for component instances #1934

Closed

This was referenced Aug 3, 2014

Provide a friendly path for reactID errors in ReactMount #1987

Closed

Merge adjacent ReactTextComponents in flattenChildren #1989

Closed

syranide mentioned this pull request Sep 14, 2014

Can we hide data-reactid from DOM? #2190

Closed

syranide mentioned this pull request Sep 27, 2014

Improve and simplify findComponentRoot, do not optimistically traverse foreign nodes #2262

Closed

2 tasks

syranide changed the title ~~Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!)~~ [not yet] Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!) Sep 29, 2014

syranide changed the title ~~[not yet] Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!)~~ [idea] Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!) Sep 29, 2014

sebmarkbage added the GH Review: needs-revision label Oct 1, 2014

syranide closed this Oct 25, 2014

syranide deleted the v2 branch October 25, 2014 11:06

syranide mentioned this pull request Dec 18, 2014

Collision between two React copies running on the same DOM (ReactMount: Two valid but unequal nodes with the same data-reactid: .0.0.2) #2713

Closed

syranide mentioned this pull request Jan 21, 2015

Feature request to strip data-reactid and return them as an array or string #2761

Closed

syranide mentioned this pull request Mar 3, 2015

Why is renderToString(<div/>) generating different id's and checksums on each call? Is this a defect? #3300

Closed

natew mentioned this pull request Oct 20, 2015

Remove React-ID steelbrain/motion#17

Closed

opichals mentioned this pull request Nov 11, 2015

Use script tags to demarcate text components #1871

Closed

facebook-github-bot added GH Review: review-needed and removed GH Review: needs-revision labels Apr 8, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[idea] Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!) #1570

[idea] Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!) #1570

syranide commented May 19, 2014

chenglou commented May 20, 2014

syranide commented May 20, 2014

chenglou commented May 20, 2014

syranide commented May 20, 2014

sophiebits commented May 20, 2014

syranide commented May 20, 2014

syranide commented May 24, 2014

syranide commented Jul 10, 2014

sebmarkbage commented Jul 10, 2014

syranide commented Jul 10, 2014

sebmarkbage commented Oct 1, 2014

syranide commented Oct 1, 2014

syranide commented Oct 2, 2014

syranide commented Oct 16, 2014

syranide commented Oct 25, 2014

gaearon commented May 30, 2015

syranide commented May 30, 2015

callmevlad commented May 31, 2015

sophiebits commented May 31, 2015

callmevlad commented May 31, 2015

sophiebits commented May 31, 2015

syranide commented Jun 1, 2015

natew commented Oct 20, 2015

syranide commented Oct 20, 2015

chyzwar commented Jan 25, 2016

zpao commented Jan 25, 2016

callmevlad commented Apr 8, 2016

[idea] Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!) #1570

[idea] Remove data-reactid and traverse immediately after insertion instead (nope, still lazy!) #1570

Conversation

syranide commented May 19, 2014

chenglou commented May 20, 2014

syranide commented May 20, 2014

chenglou commented May 20, 2014

syranide commented May 20, 2014

sophiebits commented May 20, 2014

syranide commented May 20, 2014

syranide commented May 24, 2014

syranide commented Jul 10, 2014

sebmarkbage commented Jul 10, 2014

syranide commented Jul 10, 2014

sebmarkbage commented Oct 1, 2014

syranide commented Oct 1, 2014

syranide commented Oct 2, 2014

syranide commented Oct 16, 2014

syranide commented Oct 25, 2014

gaearon commented May 30, 2015

syranide commented May 30, 2015

callmevlad commented May 31, 2015

sophiebits commented May 31, 2015

callmevlad commented May 31, 2015

sophiebits commented May 31, 2015

syranide commented Jun 1, 2015

natew commented Oct 20, 2015

syranide commented Oct 20, 2015

chyzwar commented Jan 25, 2016

zpao commented Jan 25, 2016

callmevlad commented Apr 8, 2016