Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend dead-assign-removal to work with static groups #1655

Open
rachitnigam opened this issue Aug 11, 2023 · 3 comments
Open

Extend dead-assign-removal to work with static groups #1655

rachitnigam opened this issue Aug 11, 2023 · 3 comments
Labels
C: calyx-opt Optimization or analysis pass good first issue Good issue to start contributing on Calyx

Comments

@rachitnigam
Copy link
Contributor

dead-assign-removal gets rid of assignments to combinational cells that are never read from. The current pass only works for normal groups. Let's extend it to work on static groups. #1653 documents one particular effort that didn't work out.

The specific program is that static par, unlike par, allow groups to synchronize using timing information, making the following program valid:

import "primitives/compile.futil";
component main() -> () {
  cells {
    add = std_add(32);
    r = std_reg(32);
  }
  wires {
    static<1> group do_add {
      add.left = 32'd1;
      add.right = 32'd0;
    }
    static<1> group do_write {
      r.in = add.out;
      r.write_en = 1'd1;
    }
  }
  control {
    static par { do_write; do_add; }
  }
}

The dynamic version of this program is not valid.

Implementation Approach

Instead of focusing only on the groups, we need to traverse the control program and track all the combinational cells that are read from any group in a particular par context. Once that is done, we can go and remove assignments that are not read from any static group in any thread context. The challenging part is doing this in one pass: threads can create complex cascades or combinational chains where they assign to combinational cells. For example:

static par {
  do_add; // assigns to add which uses the output from sub
  do_sub; // assigns to sub which uses the output from lt
  do_lt; // assigns to lt which uses the output from register
}
@rachitnigam rachitnigam added C: Calyx Extension or change to the Calyx IL good first issue Good issue to start contributing on Calyx labels Aug 11, 2023
@rachitnigam
Copy link
Contributor Author

@calebmkim @paili0628 @sampsyo This seems like a good example of places where static disable optimizations that are otherwise easy to use in the dynamic context. The "free synchronization" semantics come at a cost: optimizations are less compositional because they need to be control sensitive.

A different argument that can be made here is that the only reason this optimization is "simple" in the dynamic case is because it relies on dynamic par-based programs defining this kind of synchronization as UB which means it would lead to unpredictable and hard-to-debug behavior.

Regardless, we should probably discuss this in the paper so that our arguments about the abstractions come off as a bit more nuanced.

@sampsyo
Copy link
Contributor

sampsyo commented Aug 17, 2023

Indeed; this is a good observation about the (compiler optimization) reason that the dynamic version of Calyx exists: by giving the programmer fewer guarantees, it lets the compiler make stronger assumptions that can potentially turn into better optimizations. That is the power of UB and why it is "actually a good thing" when used carefully.

One general principle this makes me think of is that, when doing optimizations, there is a fundamental question that needs to be answered differently in dynamic vs. static contexts: what is the ordering among group executions?

  • In dynamic land: It's a partial order (happens-before graph) where the edges come from control statements. Or, equivalently, a CFG.
  • In static land: It's more like a timeline, where we lay out the cycles and put an interval for each group on the timeline. To know whether a group comes before another one, see if the end cycle of group A is less than or equal to the start cycle of group B.

I think this difference is what underpins the way @calebmkim's sharing pass works for static stuff. It could also drive the way that this DCE pass works too. The algorithm could look like:

mark all assignments as dead
for each assignment s:
  g = the group containing s
  c = the combinational cell that s assigns into [abort if it's not combinational]
  for every group g' whose interval overlaps with g (including g itself):
    for each assignment s' in g':
      if s' reads from c:
        mark s as live
remove all assignments still marked as dead

Maybe that's the same as the current dead-assign-removal except that we look at every overlapping g' on the timeline, not just g itself.

@rachitnigam
Copy link
Contributor Author

Ah right! I forgot we have access to StaticParAnalysis which would give this information to us and enalbe optimizing these programs.

@paili0628 paili0628 linked a pull request Sep 11, 2023 that will close this issue
@paili0628 paili0628 removed a link to a pull request Sep 11, 2023
@rachitnigam rachitnigam added C: calyx-opt Optimization or analysis pass and removed C: Calyx Extension or change to the Calyx IL labels Dec 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: calyx-opt Optimization or analysis pass good first issue Good issue to start contributing on Calyx
Projects
None yet
Development

No branches or pull requests

2 participants