
Combinational Groups and Visible Results #207

Closed · sgpthomas opened this issue Sep 17, 2020 · 9 comments
Labels: Priority: High (Blocking issues)

@sgpthomas (Collaborator)

The current semantics don't clearly define how to interpret combinational groups (groups that have "static"=0 or have group[done] = 1), nor how to pass values between groups, as in the following example:

group do_mult {
  mult.right = 2;
  mult.left = 4;
  mult.go = 1;
  do_mult[done] = mult.done;
}

group save {
  res.in = mult.out;
  res.write_en = do_mult[done];
  save[done] = res.done;
}

control {
  seq {
    do_mult;
    save;
  }
}

The group save never gets to see the results of the multiply because the way that we compile control makes these two groups entirely disjoint.

I propose that we change the compilation so that a group executing sequentially after another gets to see the done signal of that group (a sketch of the compiled wiring follows the list):

  • seq { A; B; }: B gets to see the done signal of A and thus can use the results of the previous group.
  • while port with G { B }: B can see the done of G when port is true, and G can see the done of B.
  • if port with G { T } else { F }: T or F can see the done of G.
  • par { A; B; }: there is no guarantee about whether A or B can see each other's done signals.
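
As a rough sketch of what the seq rule could compile to (illustrative only, not the actual compile_control output; the fsm register, its encoding, and the guard shapes are all assumptions), B's go is additionally driven by A's done so that B can fire in the cycle A finishes:

// hypothetical compiled wiring for seq { A; B; }
A[go] = fsm.out == 2'd0 ? 1'd1;
B[go] = (fsm.out == 2'd0 & A[done]) | (fsm.out == 2'd1) ? 1'd1;
fsm.in = fsm.out == 2'd0 & A[done] ? 2'd1;
fsm.in = fsm.out == 2'd1 & B[done] ? 2'd2;
fsm.write_en = (fsm.out == 2'd0 & A[done]) | (fsm.out == 2'd1 & B[done]) ? 1'd1;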

These semantics make it clear how combinational groups should work. For example,

seq { add; save_add; }

will take the same amount of time as just save_add alone and save_add will be able to use results produced in add.

This proposal works with the static timing annotations. We just shift the generated schedule to overlap on the last cycle.
So for the program x = a + (b * c), we can use the schedule:

|      | 0 | 1 | 2 | 3 |
|------|---|---|---|---|
| mult | c | c | r |   |
| add  |   |   | r |   |
| save |   |   | c | r |

(c means computing, r means data is ready.)
The old schedule required each group to save its own results and looked like:

|      | 0 | 1 | 2 | 3 | 4 | 5 |
|------|---|---|---|---|---|---|
| mult | c | c | c | r |   |   |
| add  |   |   |   | c | r |   |
| save |   |   |   |   | c | r |

Additionally, this proposal makes it easier to break up long combinational chains, assuming they are specified in FuTIL control rather than inside a single group. Previously, an expression like a + b + c + d + e was compiled to a single group that computed the entire expression.
Now, because groups can see each other's results, each binary expression can be expressed as its own group and the schedule written as seq { AB; CD; ABCDE }; a sketch of those groups follows. It then becomes trivial to insert a register at any point in the chain while the schedule stays the same.
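
Here's a minimal sketch of those groups, assuming registers a–e hold the operands and add0–add3 are hypothetical adder cells; under the proposed semantics all three groups end up active in the same cycle, so the intermediate adder outputs are still live when ABCDE reads them:

group AB {
  add0.left = a.out; add0.right = b.out;
  AB[done] = AB[go];    // combinational: a + b is live while go is high
}

group CD {
  add1.left = c.out; add1.right = d.out;
  CD[done] = CD[go];
}

group ABCDE {
  add2.left = add0.out; add2.right = add1.out;  // (a + b) + (c + d)
  add3.left = add2.out; add3.right = e.out;     // ... + e
  ABCDE[done] = ABCDE[go];
}

To break the chain, any one of these groups can be swapped for a version that latches its adder's output into a register, without touching the seq schedule.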

@rachitnigam added the S: Discussion needed label Sep 18, 2020
@sampsyo (Contributor) commented Sep 19, 2020

This seems great to me. I don't think I fully understand the risks, so there may be downsides I'm not aware of, but the rule seems intuitive, and there is an important upside: being able to express "operator chaining" of combinational logic.

@sampsyo (Contributor) commented Sep 22, 2020

Perhaps this is ready for prototyping, at least speculatively?

@cgyurgyik (Collaborator)

This is really cool. I had one question about the following depiction of x = a + (b * c):

|      | 0 | 1 | 2 | 3 |
|------|---|---|---|---|
| mult | c | c | r |   |
| add  |   |   | r |   |
| save |   |   | c | r |

Cycles 0, 1 are used to compute the mult. Why don't we need at least a single cycle to compute the add? It seems like we are skipping the compute time for the add step here, but I may be misinterpreting how to read this.

@sgpthomas (Collaborator, Author)

Didn't see that you commented this, sorry about that. The thing about add is that it can be done combinationally, so it can add the output from mult and have its result ready in the same cycle.

@cgyurgyik (Collaborator)

Ok that makes sense, thanks.

@sgpthomas (Collaborator, Author)

Putting this here as a note:

We can't do this compilation without changing combinational groups to use [done] = [go] (whereas previously it was [done] = 1'b1). The reason is that with [done] = 1'b1, a group following a combinational group fires immediately rather than after the completion of the group before it. Consider this example:

seq {
  write_reg; // takes 1 cycle
  do_add;  // combinational
  save_add;  // takes 1 cycle
}

We want write_reg to happen in the first cycle, and then both do_add and save_add to happen in the second cycle. However, because save_add starts on the previous group's done signal (do_add's), which is always high because do_add is combinational, save_add instead runs in the first cycle.

If we instead follow the policy that [done] = [go] for combinational groups, this problem is avoided: save_add is correctly sequenced after write_reg because do_add's done isn't high until the second cycle.
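
Concretely, a combinational do_add under this policy would look something like this (a sketch; the adder cell and register names are made up):

group do_add {
  add.left = my_reg.out;      // value produced by write_reg
  add.right = 32'd1;
  do_add[done] = do_add[go];  // done tracks go instead of being tied to 1'b1
}

Now do_add's done can only go high once write_reg's completion raises do_add[go], so save_add is pushed to the second cycle as intended.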

This issue would also be avoided by always guarding done holes with the go hole. The problem with this, and the reason we currently don't do it, is that it prevents conditioning the execution of a group on its own done signal. For example:

group A {
  ...
  A[done] = A[go] ? some_reg.done;
}

group seq {
  A[go] = cond & !A[done];
}

When holes are inlined, this becomes:

group seq {
  A[go] = cond & !(A[go] & some_reg.done);
}

which can't be inlined further because there's a cycle.

We use this in a few places, but maybe we should think hard about whether we really need to condition a group's execution on its own done signal.

@sampsyo (Contributor) commented Oct 7, 2020

This seems like a strong argument to me! [done] = [go] is pretty sensible as a way to distinguish combinational groups.

@sgpthomas (Collaborator, Author)

After a discussion with Adrian, it seems that we have a choice between making compile_control aware of combinational groups and having a separate, scheduling-like pass that removes combinational groups by packing them into groups that take a single cycle. The second approach is advantageous for modularity: it allows compile_control to be a little simpler and means that "scheduling" happens in its own pass. However, it creates an invariant that compile_control cannot accept programs with combinational groups.

For the record, here's a sketch of how a combinational-aware compile_control would work:

  • we change the compilation so that done signals are visible to the next group
  • we use !group[done] in the group[go] assignment to ensure that go and done are always disjoint
  • for combinational groups, where this[done] = this[go], we latch the done signal and use that instead of using group[done] directly (expanded in the sketch after this list):

    done_reg = group[done];
    group[go] = ... & !done_reg;
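
Spelled out with an explicit register (a sketch; done_reg is a hypothetical std_reg-style cell, and the ... stands for the elided FSM guard from above):

done_reg.in = group[done];
done_reg.write_en = 1'd1;         // sample the combinational done every cycle
group[go] = ... & !done_reg.out;  // go drops the cycle after done fired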

and a sketch of the "latching" pass approach:

  • we change the compilation as before
  • for every combinational group we do the stupid thing of introducing a latch:

    group comb {
      add.right = 1; add.left = 2;
      latch.in = comb[go];
      latch.write_en = !latch.done;
      comb[done] = comb[go] & !latch.done;
    }

  • additionally, we can write a pass that tries to remove unnecessary latching by collapsing multiple combinational groups in sequence into a single combinational group, as sketched below.
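
For instance, such a pass might fuse adjacent combinational groups in a seq into one before any latch is introduced (group and cell names here are hypothetical):

// before: seq { AB; CD; }, both combinational
// after collapsing:
group AB_CD {
  add0.left = a.out; add0.right = b.out;
  add1.left = c.out; add1.right = d.out;
  AB_CD[done] = AB_CD[go];
}

Only the single fused group then pays the one-cycle latching cost.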

After writing these out in detail, it seems clear to me that the two options are not drastically different, and the second is slightly simpler to implement. It's basically a question of whether we do this transformation during compilation or in a separate pass before compilation.

@rachitnigam (Contributor)

Discussion moved to #270
