Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the List "internal nodes" accessible through an API #541

Closed
slorber opened this issue Jul 8, 2015 · 19 comments
Closed

Make the List "internal nodes" accessible through an API #541

slorber opened this issue Jul 8, 2015 · 19 comments

Comments

@slorber
Copy link

slorber commented Jul 8, 2015

Hi,

If I understand correcly, ImmutableJS use a list that looks like Scala/Clojure vector with internal nodes of size 32.
I think exposing this internal structure may leverage better performances when rendering big immutable list. For example it an item has changed in an immutable list of 100000 elements, one does not have to iterate the 100000 elements and call shouldComponentUpdate 100000 times, but rather traverse the ImmutableJs List tree and call shouldComponentUpdate for the nodes, permitting to short-circuit the rendering faster and not have to call shouldComponentUpdate on all the 100000 items.

I mean if an item is modified in a big array of 100000 elements, the internal node holds an array of 32 references and the item "depth" in the tree is 3, instead of calling shouldComponentUpdate 100000 times, you can just call shouldComponentUpdate something like 4*32 times with PureRenderMixin.

I don't know how to achieve this in ImmutableJS nor if it's a good idea to use its internal structure if it's not part of the public API, but this could definitively boost performances.

See also related SO question: http://stackoverflow.com/questions/30976722/react-performance-rendering-big-list-with-purerendermixin

@slorber
Copy link
Author

slorber commented Jul 8, 2015

This React feature seems required also to avoid creating intermediate dom elements when we simply want to optimize the rendering of a flat list: facebook/react#2127

@leebyron
Copy link
Collaborator

Rather than exposing the internal structure of Immutable.js collections directly, we would probably want some method that is semantically the kind of operation you're proposing and bakes the performance improvements described into the implementation of such a method.

But I'm also not totally sure how what you're proposing would work or how you see it connecting to React.

Can you maybe type some slightly-imaginary future code that describes what you're trying to achieve from the React-side point of view, and maybe that would help me understand what Immutable.js would need to implement to fulfill it?

@slorber
Copy link
Author

slorber commented Jul 15, 2015

hi @leebyron

Here's an exemple implementation of what I want to achieve. As you can see, the number of shouldComponentUpdate calls becomes linear to the depth of the immutable structure, instead of being linear to the number of items of the list.

I simplified the implementation of an immutable list to be composed of nodes and leaves, where nodes are simple JS arrays.

It would be possible to create a facade to the ImmutableListRenderer component, so that it takes directly an ImmutableJS type.

/////////////////////////////////////////////////////////////////////
// Here is the "lib" code

var ImmutableListRenderer = React.createClass({
    // Simple implementation, but should use something like PureRenderMixin instead
    shouldComponentUpdate: function(nextProps) {
      return (nextProps.list !== this.props.list || nextProps.leafRenderer !== this.props.leafRenderer || nextProps.itemProps !== this.props.itemProps );
    },

    propTypes: {
        // The immutable structure intermediate node (or maybe a leaf)
        // Used the name "list" to make it more user-friendly from the outside only
        list: React.PropTypes.any.isRequired,
        // The way a leaf (ie an item of the list) should be rendered
        itemClass: React.PropTypes.func.isRequired,
        // A set of props that are passed to each item
        itemProps: React.PropTypes.object
    },

    render: function() {
        // Internal node case
        if ( this.props.list instanceof Array ) {
            // This should not be required to create an intermediate span element, but this feature is not available in React yet
            return (
                <span>
                    {this.props.list.map(function(subnode,index) { 
                        return <ImmutableListRenderer key={index} list={subnode} itemClass={this.props.itemClass} itemProps={this.props.itemProps}/>
                    }.bind(this))}
                </span>
            );
        } 
        // Leaf node case
        else {
            return React.createElement(
                this.props.itemClass,
                this.props.itemProps,
                this.props.list
            );
        }
    }
})


/////////////////////////////////////////////////////////////////////
// Here is the "client" code to demonstrate usage

// This is the supposedly simplified structure of an immutable list (I guess) of 1 to 18
var BigImmutableList = [
    [[1,2],[3,4],[5,6]],
    [[7,8],[9,10],[11,12]],
    [[13,14],[15,16],[17,18]]
]

// Hold the current list to render (like a global variable)
// The naming just comes from Clojure's Atom
var Atom = BigImmutableList;

// We only update the 1st item on the list and we increment it by 1
// and use structural sharing the most we can
// Obviously this should not be done this way but rather use ImmutableJS methods
function updateList() {
    Atom = [
        [[Atom[0][0][0]+1,Atom[0][0][1]],Atom[0][1],Atom[0][2]],
        Atom[1],
        Atom[2]
    ]
}


// Customize the way an item of the list is rendered
var ListItem = React.createClass({
    render: function() {
        return <span>{this.props.itemPrefix + this.props.children}</span>
    }
})

// These are "shared props" that will be passed down to all the list items
// If any of these props change, the whole list would have to re-render
var ItemProps = { itemPrefix: " -> " };

function render() {
    React.render(<ImmutableListRenderer list={Atom} itemClass={ListItem} itemProps={ItemProps}/>, document.getElementById('container'));
}

render();

setInterval(function() {
    console.debug("-------------------------------------");
    updateList();
    render();
}.bind(this),1000);

Here is a JsFiddle where you can see the shouldComponentUpdate calls in the log: https://jsfiddle.net/txfhzuey/2/

@slorber
Copy link
Author

slorber commented Jul 15, 2015

@leebyron I've made a quick POC of the same principles but using ImmutableJs internals (I mean the "Gn" object and _root/_tail properties)

You can see the core idea here:

var ImmutableListRenderer = React.createClass({
  render: function() {
    // Should not require to use wrapper <span> here but impossible for now
    return (<span>
        {this.props.list._root ? <GnRenderer gn={this.props.list._root}/> : undefined}
        {this.props.list._tail ? <GnRenderer gn={this.props.list._tail}/> : undefined}
</span>);
  }   
})


var GnRenderer = React.createClass({
    shouldComponentUpdate: function(nextProps) {
      console.debug("should update?",(nextProps.gn !== this.props.gn));
      return (nextProps.gn !== this.props.gn);
    },
    propTypes: {
        gn: React.PropTypes.object.isRequired,
    },
    render: function() {
        // Should not require to use wrapper <span> here but impossible for now
        return (
            <span>
                {this.props.gn.array.map(function(gnItem,index) { 
                    // TODO should check for Gn instead, because list items can be objects too...
                    var isGn = typeof gnItem === "object"
                    if ( isGn ) {
                        return <GnRenderer gn={gnItem}/>
                    } else {
                        // TODO should be able to customize the item rendering from outside
                        return <span>{" -> " + gnItem}</span>
                    }
                }.bind(this))}
            </span>
        );
    }
})

Here's a JsFiddle that show shouldComponentUpdate count: http://jsfiddle.net/txfhzuey/13/

As you can see, for a list of 10000 elements, it only requires ~77 calls to shouldComponentUpdate, while by normally iterating through the item list, it would have required 10000 calls

@dashed
Copy link
Contributor

dashed commented Jul 15, 2015

Seems like you should investigate cursors: https://github.com/facebook/immutable-js/tree/master/contrib/cursor

You can have ListItem re-render whenever there's an update on the data it uses. In addition, your top-down rendering path (from root to leaf) is shortened to just the leaf itself.

@slorber
Copy link
Author

slorber commented Jul 15, 2015

@dashed I know cursor and I actually implemented them in our startup framework: https://github.com/stample/atom-react

Most cursor implementations out there are "listenable" (ie we can register a callback). This permits to get better performances, but I think this is absolutly not necessary. If you look at Om's implementation, cursors are not listenable (https://github.com/omcljs/om/wiki/Cursors). I think the listener thing can improve the performances and works probably well but it's just not the path we are following.

In our framework, we ALWAYS render from the VERY TOP. We have something very akin to Flux stores, but they are not listenable either. This sure has some performance drawbacks but I'd like to keep the model very simple: f(M) = V. The only listenable structure we have is the Atom, which holds the global app state. For now it works fine in a complex SPA production app (check this video https://www.youtube.com/watch?v=zxN8FYYBcrI) and we'd like to push this architecture to its limits before introducing new abstractions like listenable stores and cursors.

See also this discussion: reduxjs/redux#155

Edit Oups Om have observable cursors as well (https://github.com/omcljs/om/wiki/Advanced-Tutorial#reference-cursors)


Also, using cursors, if instead of modifying one value of the list I push a new value at the end of the list, how would you do so that you don't have to iterate on the 10000 elements of the list and call shouldComponentUpdate on all the 10000 iterated items? Cursors seems to work fine to shortcircuit some React component renderings, but as far as I know it's a bit harder when the list size / cursor paths are dynamic.

@dashed
Copy link
Contributor

dashed commented Jul 16, 2015

In our framework, we ALWAYS render from the VERY TOP.

Is this something you absolutely need to do? You can still re-render substructures without cursors; so long as you use the right API to modify your data (e.g. immutable API).

One approach is this: https://medium.com/@gilbox/an-elegant-functional-architecture-for-react-faa3fb42b75b where the global app state is in the root component.

I've since modified the demo in the article to be able to re-render the sub-components, instead of top-down approach: http://codepen.io/anon/pen/oXeyxR

The refactor is based on my experiments with higher-order component with cursors: https://github.com/Dashed/orwell


Cursors seems to work fine to shortcircuit some React component renderings, but as far as I know it's a bit harder when the list size / cursor paths are dynamic.

With 'orwell', I'm able to observe cursors (custom implementation) directly, and the wrapped component will eagerly re-render whenever the cursor emits a change. If need be, I'm able to add some validation step to cancel/continue the re-render.


Also, using cursors, if instead of modifying one value of the list I push a new value at the end of the list, how would you do so that you don't have to iterate on the 10000 elements of the list and call shouldComponentUpdate on all the 10000 iterated items?

This would need a specialized solution. Doing array.map(mapper) isn't sufficient b/c you're still iterating N element.

Something like would be a better approach:

  1. cache the list/tree of components on initial render (probably on Immutable list)
  2. observe changes and construct splice steps and apply to cached component list/tree to update the necessary components
  3. return list/tree as children (no need for toJS(), since it's an iterable)

Observation in step 2 may depend whether your primary API of your global app state supports this.

@slorber
Copy link
Author

slorber commented Jul 16, 2015

@dashed while interesting I think this discussion is out of the scope of this issue.

You propose a "specialized solution" to solve my problem, using cursors, caching and "splice steps", but did not provide any concrete implementation. Using cursors ties the implementation to a specific cursor framework and here we are on ImmutableJS so please provide an implementation of your idea with ImmutableJS cursors :)

I propose a solution that solves the same problem efficiently without introducing anything new, just using React and a render function optimized to render persistent data structures.

I'll take this quote from @travisbrown that I really like:

It's just a solid development practice to use the least powerful abstraction that will get the job done. In principle this may allow optimizations that wouldn't otherwise be possible, but more importantly it makes the code we write more reusable.

http://stackoverflow.com/questions/12307965/method-parameters-validation-in-scala-with-for-comprehension-and-monads/12309023#12309023

@dashed
Copy link
Contributor

dashed commented Jul 16, 2015

The 'specialized solution' I proposed is independent of cursors. And the reason it's specialized is because ImmutableJS doesn't provide any API to access the previous versions (e.g. no API for it to be partially persistent); nor are there any API to know exactly which parts of the new immutable version that was modified.

This implies the need to wrap ImmutableJS objects in some fashion with 3rd-party libraries, unless it's somehow baked into ImmutableJS.

Hence, the need to create splice steps. Assuming you're using a flux-model, you can create splice steps at the same store that is creating/modifying the immutable data structure (b/c only at this point you know where in the data to modify).

This has to be adapted for every type of immutable collection.


You're overall goal is to minimize data traversals, and only have it at the area where a change occurred.

IMO, reliance on internal implementation (e.g. list._root) of the persistent data structure is not the solution.

@slorber
Copy link
Author

slorber commented Jul 16, 2015

Sorry @dashed please provide code example of what you are talking about because I can't understand. I don't know what you mean by "splice steps" nor why would you need an API to be able to know which "parts" of the immutable list has changed nor why you would need to wrap ImmutableJs.

Please tell me what's your problem with my initial idea because you seem to propose another solution to the same problem (that I don't understand) without giving any implementation and without really explaining why my solution does not satisfy you.

IMO, reliance on internal implementation (e.g. list._root) of the persistent data structure is not the solution.

I totally agree and this is the whole point of this issue: ImmutableJS could offer a stable/public API to support my usecase, instead of me having to rely on internal and unsafe implementation details.
The problem is already solved on my side, it is just I don't get any guarantee from ImmutableJs that the "unsafe contract" will not be broken in next release.

@slorber
Copy link
Author

slorber commented Sep 18, 2015

@leebyron did you have time to take a look at my jsfiddle?

@leebyron
Copy link
Collaborator

leebyron commented Oct 2, 2015

Sorry for the delay. I think this is really interesting and obviously the performance benefits are really interesting, but access to the implementation details is pretty dangerous as those are likely to change in the future as Immutable.js encodes more optimizations.

One thing that we should be able to leverage despite future changes is that Immutable collections are always structured as Trees. While exposing the structure of these trees are potentially fragile, perhaps there's some sort of memoizing map/fold we could support.

It seems like what you're finding is that when you have a very long list of components in React, a similar structural sharing approach with memoization is a powerful one. However It would be really interesting to see if a List -> Tree mapping transform could be useful to not just React but many other environments as well.

@slorber
Copy link
Author

slorber commented Oct 7, 2015

Happy to see you think my idea is interesting @leebyron

Unfortunatly I don't really know what can ImmutableJs expose safely for this performance enhancement, nor how it can be generalised to other libraries (like Mori?).

@slorber
Copy link
Author

slorber commented Jan 15, 2016

Hi @leebyron

In response to this SO question: http://stackoverflow.com/a/34788570/82609

I've discovered surprising facts while hacking on the answer.

You can check this JsFiddle: http://jsfiddle.net/slorber/txfhzuey/28/

It seems for example that doing list1.concat(list2) where list1 is huge and list2 is small (like 10 items) seems less performant than doing multiple list1.push(). The internal structure seems like completly updates when using concat.

Also noticed that adding/removing items at a random list index makes the whole internal structure updates too. I understand that having a performant slice on a very large list can be quite hard to implement, but wouldn't it be possible to support efficient single inserts/deletes at a random index without changing all the internal nodes of the list structure? Couldn't we provide some parameters to the list, to say that insertions inside the list could happen, by reserving some "empty space" or something?

@polgfred
Copy link

+1

I'm doing exactly the same thing with Immutable and React — I need to maintain a growing, and potentially very long, List. I've tried to isolate the code that deals with the VNode internals as much as possible because, as you say, those details may very well change; but the performance win is too dramatic not to do it — 50-100x when the List grows to thousands of elements. Of course I understand that exposing the internals is "unnecessary" to the List API per se, but it would be awesome to get at this structure in a dependable future-proof way.

Thanks for any input! And thanks for this awesome library.

@polgfred
Copy link

@slorber You can't insert or remove from a List without rebalancing the entire tree after that point. The constant time access guarantee is based on the fact that all buckets and sub-buckets are constant size.

@slorber
Copy link
Author

slorber commented Mar 25, 2016

@polgfred it was really an experiment I did not dig much in ImmutableJS internals yet :) I don't use this trick in production as I don't render such huge lists. But happy to see it used by other people :)

Anyway @polgfred maybe you can build a separate NPM project to do that. And for example you say in that project that it should be used with the exact same immutablejs version or something, to avoid incompatibilities

@polgfred
Copy link

@slorber Well, the way I'm using the List, I can guarantee that it hasn't been modified from the front, so the traversal logic is trivial. But that means I can't really package up what I'm doing as a general solution. In a pinch I could easily replace my List with an explicit tree structure that doesn't need to worry about insertion and alteration. But ths is the easiest approach for now.

@leebyron
Copy link
Collaborator

leebyron commented Mar 8, 2017

Closing this aging issue.

See #38 for an example of how implementation details may change in the future, especially around operations like slice and concat

@leebyron leebyron closed this as completed Mar 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants