Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loops in task graph #7

Open
clarkfitzg opened this issue Feb 22, 2017 · 8 comments
Open

loops in task graph #7

clarkfitzg opened this issue Feb 22, 2017 · 8 comments

Comments

@clarkfitzg
Copy link
Contributor

If tasks are top level expressions in a script then they shouldn't refer to themselves with a loop. It's ok for variable graphs and call graphs though.

In this code I would expect a single edge from node 1 to 2.

l = list(a = 1)
l$b = l$a + 5

But the task graph includes an additional edge from node 2 to itself.

tg = CodeDepends::makeTaskGraph("dollars.R")
plot(tg)

image

@clarkfitzg
Copy link
Contributor Author

This seems to happen when a variable is redefined. A simpler example:

x = 10
x = x + 5

@duncantl
Copy link
Owner

Yep, we touched on this when we talked yesterday. In some ways, that is a new variable x.
Getting the semantics right requires some thinking about use cases. I suspect we want think
clearly what we are trying to "model'/describe in this graph and whether it is variables, updates, or expressions.

@clarkfitzg
Copy link
Contributor Author

The immediate use case I have in mind is looking at possibilities for parallel computation within the script through a use-define chain, so I'm interested in the dependencies only among expressions.

makeVariableGraph seems to handle the variables.

But you're right, that is a "new" x. And one might want to differentiate these, ie. x1, x2, ... to do more sophisticated things, ie. with control flow and conditional statements.

@duncantl
Copy link
Owner

Yes and x1, x2, etc. is essentially SSA - Single Static Assignment. You can talk to Nick about that.
As you say, makeVariableGraph seems to handle variables. If so, then change makeTaskGraph() do what you want. Key point is defining what it is that you want it to represent. It could be nuanced. But that's why it is fun :-)

@clarkfitzg
Copy link
Contributor Author

Great I'll work on a patch for this tomorrow.

@clarkfitzg
Copy link
Contributor Author

A few ways to do this:

  1. Put a start_index argument in getPropagateChanges to allow control over which expression to start looking at.
  2. Subset the expressions so that getPropagateChanges only looks at remaining expressions, excluding current and previous ones. Then recompute the original index.
  3. Create the graph more directly with the ScriptInfo objects.

I'm favoring the last one, because I'd also like to remove edges that aren't constraints on the computation. Ie. edges A -> B and B -> C implies A must come before C, so there's no need for an edge A -> C.

This should be simple to implement for a single expression of class ScriptNodeInfo- for each input x find the most recent expression that had x as an output.

An object of class "ScriptNodeInfo"
...
Slot "inputs":
[1] "x"

Slot "outputs":
[1] "x"

@clarkfitzg
Copy link
Contributor Author

Turns out I do want something nuanced and slightly different- a graph representing constraints on the order in which the code runs. So I'll work on that and hold off on making changes here.

@nick-ulle
Copy link

This paper might be relevant if you want go the SSA route: http://dl.acm.org/citation.cfm?doid=268946.268956

I'm thinking of adding array SSA to the ast package because I need something similar for type inference on list elements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants