loops in task graph #7

clarkfitzg · 2017-02-22T01:07:45Z

If tasks are top level expressions in a script then they shouldn't refer to themselves with a loop. It's ok for variable graphs and call graphs though.

In this code I would expect a single edge from node 1 to 2.

l = list(a = 1)
l$b = l$a + 5

But the task graph includes an additional edge from node 2 to itself.

tg = CodeDepends::makeTaskGraph("dollars.R")
plot(tg)

The text was updated successfully, but these errors were encountered:

clarkfitzg · 2017-02-23T01:38:20Z

This seems to happen when a variable is redefined. A simpler example:

x = 10
x = x + 5

duncantl · 2017-02-23T01:42:50Z

Yep, we touched on this when we talked yesterday. In some ways, that is a new variable x.
Getting the semantics right requires some thinking about use cases. I suspect we want think
clearly what we are trying to "model'/describe in this graph and whether it is variables, updates, or expressions.

clarkfitzg · 2017-02-23T02:09:58Z

The immediate use case I have in mind is looking at possibilities for parallel computation within the script through a use-define chain, so I'm interested in the dependencies only among expressions.

makeVariableGraph seems to handle the variables.

But you're right, that is a "new" x. And one might want to differentiate these, ie. x1, x2, ... to do more sophisticated things, ie. with control flow and conditional statements.

duncantl · 2017-02-23T02:13:00Z

Yes and x1, x2, etc. is essentially SSA - Single Static Assignment. You can talk to Nick about that.
As you say, makeVariableGraph seems to handle variables. If so, then change makeTaskGraph() do what you want. Key point is defining what it is that you want it to represent. It could be nuanced. But that's why it is fun :-)

clarkfitzg · 2017-02-23T02:31:40Z

Great I'll work on a patch for this tomorrow.

clarkfitzg · 2017-02-23T17:45:57Z

A few ways to do this:

Put a start_index argument in getPropagateChanges to allow control over which expression to start looking at.
Subset the expressions so that getPropagateChanges only looks at remaining expressions, excluding current and previous ones. Then recompute the original index.
Create the graph more directly with the ScriptInfo objects.

I'm favoring the last one, because I'd also like to remove edges that aren't constraints on the computation. Ie. edges A -> B and B -> C implies A must come before C, so there's no need for an edge A -> C.

This should be simple to implement for a single expression of class ScriptNodeInfo- for each input x find the most recent expression that had x as an output.

An object of class "ScriptNodeInfo"
...
Slot "inputs":
[1] "x"

Slot "outputs":
[1] "x"

clarkfitzg · 2017-02-27T19:32:52Z

Turns out I do want something nuanced and slightly different- a graph representing constraints on the order in which the code runs. So I'll work on that and hold off on making changes here.

nick-ulle · 2017-03-02T17:40:34Z

This paper might be relevant if you want go the SSA route: http://dl.acm.org/citation.cfm?doid=268946.268956

I'm thinking of adding array SSA to the ast package because I need something similar for type inference on list elements.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

loops in task graph #7

loops in task graph #7

clarkfitzg commented Feb 22, 2017

clarkfitzg commented Feb 23, 2017

duncantl commented Feb 23, 2017

clarkfitzg commented Feb 23, 2017

duncantl commented Feb 23, 2017

clarkfitzg commented Feb 23, 2017

clarkfitzg commented Feb 23, 2017

clarkfitzg commented Feb 27, 2017

nick-ulle commented Mar 2, 2017

loops in task graph #7

loops in task graph #7

Comments

clarkfitzg commented Feb 22, 2017

clarkfitzg commented Feb 23, 2017

duncantl commented Feb 23, 2017

clarkfitzg commented Feb 23, 2017

duncantl commented Feb 23, 2017

clarkfitzg commented Feb 23, 2017

clarkfitzg commented Feb 23, 2017

clarkfitzg commented Feb 27, 2017

nick-ulle commented Mar 2, 2017