-
Notifications
You must be signed in to change notification settings - Fork 22.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
selection.select and data inheritance. #1443
Comments
I assume you’ve read them already, but this is covered pretty well in Working with Selections (and earlier, Nested Selections). The key graf being:
Further, if you look at the implementation of selection.append and selection.insert, you’ll see there are just simple wrappers on top of selection.select. And I assume you’re not arguing to remove data inheritance from selection.append and selection.insert. While anecdotal, in my usage I typically do want selection.select to propagate data automatically. This is because selection.select implies there is a 1-to-1 relationship between the parent and child, and so it is almost always the case that the parent and child are bound to the same data. Thus either the propagation of data is a noop (the child already has the same data) or it is desired, as when the parent’s data is updated and you then want to update a child: var parent = d3.selectAll("element") // select some existing elements
.data(data) // join them to new data
.attr("foo", function(d) { return d.foo; }); // update parent attr to reflect new data
parent.select("child")
.attr("bar", function(d) { return d.bar; }); // update child attr to reflect new data The idea is that parent.select is parallel to the parent.append you would do on enter: var parent = d3.selectAll("element") // assuming no existing elements
.data(data) // join them to new data
.enter().append("element")
.attr("foo", function(d) { return d.foo; }); // initialize parent attr
parent.append("child")
.attr("bar", function(d) { return d.bar; }); // initialize child attr If we removed data inheritance from selection.select, then you have to make the propagation of data to the child explicit, either by forcing a second data join on the child selection ( I think this discussion would be improved if you could describe a specific example where selection.select propagating data is harmful. I recall vaguely at least one instance in the past where I didn’t want selection.select to propagate data, but I’ve forgotten the specifics, and I’m pretty sure the fix was a straightforward switch to selection.selectAll and a more restrictive selector (e.g., "child:first-child", or "child.special"). I’m going to close this issue, if only because this particular ship sailed three years ago and changing selection.select now would be highly disruptive, even for a new major version. But, I look forward to your reply, and perhaps there are other less disruptive ways we can address this confusion. Or maybe you’ll still change my mind, but don’t count on it. :) |
Thanks a lot for the reply. You're right, I didn't really think there was any cheese down this tunnel but you never know what kind of ideas pop up when you talk it through so I thought it worthwhile. (Actually, as I reached the end of typing this I actually did think of one potential idea but it's buried at the end. :) ) I'll try to explain a little bit about the scenario we're working with to give some context around how we bump up against this. The first point that's somewhat special is that we don't control the DOM. We have a visualization library that renders into a provided container element. Each visualization instance should not know about anything higher in the tree than the container element. This means we need to be careful to always contextualize selections under this container. Inside the visualization we have a couple of our own levels in the DOM (although over time we've been progressively flattening our SVG as we fight against the lack of explicit z-ordering!). Each visualization can be comprised of multiple plots and each plot can be comprised of many marks. The code is factored to separate responsibilities so, for example, a particular mark renderer is passed a "plot" container So, therein, as they say, lies the rub. The following seemingly contradictory things (at least from the perspective of data inheritance) are true:
Here's an example: <svg>
<g class="viz">
<g class="plot">
<defs>
... linear gradients, masks, usable elements
</defs>
<g class="datapoint">
... mark shapes and so on
</g>
.... more datapoints
</g>
</g>
</svg> So, the plot One thing that is true is that we definitely know very clearly which levels in the DOM have a parent-child data relationship and which ones don't. And the type of relationship between nodes never changes. If there was some way to indicate on the node that it's the end-of-line in terms of data inheritance, I think that would work for us. Do you think that would cause problems elsewhere? I suppose there would need to be something special done for Of course I would be very interested in hearing any other ideas you have either for d3 enhancements or techniques we can use to work around the issue. The technique we're using currently is called "teach people to not use selection.select unless they want data inheritance". That works too. :) |
I think the main thing you’re doing that’s different than what I commonly do is how you treat data. As far as encapsulation goes, I think of data as something that’s public, like an input argument to a function, rather than the encapsulated private state of a component. To me this is close to the idea of “data-driven documents”, in that the data is driving the transformation of the document. That’s why in my piece on reusable charts I bind data to an element and then use selection.call to apply the component. There are still cases, like with the axis component, where the data is rebound. But this is when there is a one-to-many relationship, as in the relationship between the parent axis and its main ticks. Perhaps related, the axis component also has private state. But this is stored on Of course, I suspect changing how you store data in your components would be pretty disruptive as well, but I’m curious to know what you think. |
One thing I should clarify first is that when I talk about a developer making an error with There are essentially 2 kinds of data we deal with. There's the data fed in by the caller, which as you say is absolutely public and treated as an input argument. Then there's data used internally by the implementation for various supporting purposes. Maybe this second type of data is derived from the first type or maybe it's just made up to drive some supporting structure of the visualization. With the first type there is no problem. There is generally a clear mapping between data elements and visual artifacts. As you say, this is truly the idea of a data-driven document where the data and the document are (to some extent) a reflection of one another. The second type is a bit different, though. This data is really an internal implementation detail that the outside world doesn't (and shouldn't) know about. You could argue that driving DOM structure using this kind of data is somewhat artificial and that's probably technically true. But the data joining and selection mechanisms in d3 are so convenient and powerful that people want to use them everywhere. So if I want to create a Looking a bit at the implementation of the
This uses an arbitrary 0 value but you could imagine binding some internal data about the path here. And if you do |
var path = g.selectAll(".domain").data([0]); This data-join is more about creating the domain path if it doesn’t exist than it is about selecting the domain path without propagating data; none of the code depends on the data that is bound to the domain path. And the ticks have different bound data because it’s a 1-to-many relationship with the axis. You raise an interesting point about multiple levels of abstraction / encapsulation. In most of my usage I make very little effort at encapsulation because I’m typically building a one-off graphic. Even my example of the reusable chart component is somewhat theoretical as there’s little point in building a chart component for a one-off. The closest I get to encapsulation in practice is loose coupling via custom events. The axis component and the brush component are also (limited) encapsulations. Both of these have some internal state which is captured by closure (how the components are configured). The axis component has some additional state that is hidden in the DOM, but in the semi-private Anyway, it’s difficult to speculate in the abstract about how you should use data, so I’m simply describing alternative different approaches. If you were designing a library in the most D3-ish way, using data joins and selections rather than the conventional way of instantiating each chart separately, then I’d recommend storing internal data in a separate property rather than risk clobbering bound data. Or perhaps this would apply to the design of components within your library, even if your library doesn’t expose the underlying D3-based implementation. Regardless of what design you choose, if you’re using D3, then being cognizant of how data is bound to elements is essential. The implicit way in which selection.select propagates data from parent to child perhaps encourages people to overlook this relationship. But, I still think propagation of data in the 1-to-1 case is the more common behavior, and in addition to my aversion for changing a core API, I also feel like it would be more tedious to propagate the data explicitly, and trade one confusion for another. (I’ve seen confusion in the past where on binding data to a parent, people are surprised that the child data is not automatically updated. Implementing this automatically would require a DOM traversal whenever data is bound!) I could maybe see introducing a way to select the first matching element without propagating data, say as a psuedo-selector in the style of jQuery ("element:first") or as a new method (selection.selectFirst). Yet this wouldn’t really address the original confusion, since you’d still need to understand the effect of selection.select on data to know to avoid it. And if you know this, you could just as easily construct a selector more carefully than learn the new API. |
I agree with this. Understanding the details of selections and data binding is essential. A bug based on unexpected behaviour is really just a learning opportunity. :) That sounds sarcastic but there is definitely some truth in it. I didn't understand the nuances of the 1-to-1 replacement nature of
No, I totally agree that a new API would only shuffle the issue around (not to mention adding a subtle variant to an existing API). Since changing What do you think about the idea of marking data as "uninheritable" at data bind time? Maybe with an additional (optional) argument on the |
Not a great answer, but one way you can prevent data inheritance is to have an intermediate node without bound data. var intermediary = selection.append("div")
.datum(function() { return null; }); Then, any select from the intermediary wouldn’t propagate data from the parent selection. But of course, the intermediate node in the DOM is somewhat unfortunate. |
That's a clever workaround. Not sure I could live with the extra DOM node, though. :) |
In the d3-graphviz library there is never a 1-to-1 relationship between parents and children and it is never the case that the parent and child are bound to the same data. Thus the propagation of data is never desired. Since the application where I use d3-graphviz frequently wants to access nodes and edges of the generated graph without affecting bound data, I've added an extension to d3-selection, selection.selectWithoutDataPropagation which does I also added a note to the library documentation to avoid the use of selection.select when building applications with d3-graphviz in order to avoid propagating incorrect data by mistake. Would you we willing to accept a PR to d3-selection adding this function? If so, would you prefer a different name, e.g selectPure? Even if this function is simple enough to implement outside d3-selection, its pure existence would further highlight that selection.select does more than just pure selection. |
I'm hesitant to open this as an issue because I know the behaviour of having
selection.select
inherit data from corresponding nodes in the parent selection is totally deliberate. But I think it's at least worth discussing so here goes.This side effect of
select
is something that confused me when I was first starting out with d3. And I regularly find myself explaining the behaviour to colleagues who are surprised by it. These explanations never seems to stop at the "what happens" but invariably continue into queries of "why would it do that"? I take this repeating pattern as a red flag that there is a rough edge in the API.I think the confusion stems from 2 points:
select
has strong connotations of being read-only from popular usage in SQL and other query languages. This seems to run pretty deep for people (myself included).select
andselectAll
in d3 besidesselection.select
are read only.I understand that changing this would be a pretty severe backward compatibility break, but just as a thought experiment, do you see any other issues with the following?:
select/selectAll
operations to be read only only.selection.update(selector)
to handle the update workflow where you truly do want to propagate data across nodes. This would essentially work just likeselection.select
does today.It's maybe not as seamless but there is something nice (to me) about having mutating operations be explicit even if it means sacrificing some convenience.
BTW, I posted a StackOverflow question a little while ago, but that was more targeted at techniques for working around practical issues we were hitting. It adds some relevant context to this topic, though, so here's a link.
The text was updated successfully, but these errors were encountered: