-
Notifications
You must be signed in to change notification settings - Fork 308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC][Sim] Add triggered simulation procedures #7676
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, thank you for working on this! Introducing triggered to core dialects might be controversial since it essentially represents behavioral constructs. I have couple questions:
- How does this relate do LLHD? LLHD is I think really good at this kind of representation and is more flexible, is it possible to promote LLHD to core dialect and use it for behavioral.
on_init
as an operation seems a bit weird to me. Also whenon_init
is provided to TriggeredSeqeneceOp it must be the first element, correct? it may be more reasonable to puton_init
as an attribute onTriggeredOp
.- TriggeredOp could capture values outside I think it's fairly easy to cause race conditions. If the two triggered ops are trigged at the same edge and one triggered op depend another triggered op results, what is expected behavior? Also I think there is a same problem as what we talked about seq.to_immutable. If a triggered op operand is a port, there is initialization ordering problem.
Really cool 😎! I'm wondering how this relates to |
Thank you both for your feedback, yet another time. Let's see if I can defend my design decisions - apologies if it is getting a bit longer:
I would argue that %init = sim.on_init
%isSimulation = sim.triggered () on (%init : !sim.trigger.init) tieoff [0 : i1] {
%true = hw.constant true
sim.yield_seq %true : i1
} : () -> i1 In SV this would become: logic isSimulation = 1'b0;
`ifndef SYNTHESIS
initial isSimulation <= 1'b1;
`endif I'm not saying that you should do that, but at least the difference clearly originates from a
I have to shamefully admit that I only have superficial knowledge of LLHD. But from what I have picked up so far, it is mostly aimed at event queue and time based simulation. It's great that we can do that if we must. But for frontends like FIRRTL, which don't really have a concept of time, it seems like overkill to me. For
So, I guess the body of a TriggeredOp is pretty much the same as a "function" in LLHD. Thinking of arcilator, I recon the difference between using Sim vs. LLHD would be like the difference between using
TriggeredOps simultaneously capture their argument at the occurrence of their root event. A chain of TriggeredOps on the same clock/event would behave like a shift register or a clocked pipeline. This is meant to avoid race conditions. If I ended up creating them, I did something wrong. 😬
Yes. It is frustrating but at least for SV I'm afraid we cannot avoid it. As I mentioned in the other PR, I think our best option here is some sort of interface contract, either encoded by type or by an attribute, promising that any initialization of the port has occurred before the
There is definitely a functional overlap with
If I understand you correctly you are suggesting to schedule operations via the topological order of their arguments and results, like @uenoku did for TL;DR: I like trees. 🌲 |
0ed5c45
to
27e6769
Compare
After letting this settle for a month I still think it is a viable approach (which, for me, is somewhat unusual 😅). So, let me nudge it out of draft mode. My condensed argument in favor of trigger trees instead of a token-flow approach would be that they structurally guarantee deadlock freedom even through opaque interfaces, while clinging to the familiar concept of clock trees. The only new concept that is added is I think it is worth highlighting that The problem of non-deterministic initialization order in SV remains a pain point. I don't know how much of that should bleed into the core dialects. At the moment my gut-feeling is to avoid being overly restrictive in the middle-end and then have a legalization/sanitizer pass try to convert it into deterministic behavior during SV lowering. I haven't gotten around writing a lowering for the new arc passes, yet. I've been eyeing the discussion on |
Starting to go through this. Is it simpler to say that the ordering of tasks is the in-order traversal of the tree from the root of trigger-sequences with multiple leaves on the same edge being unordered? |
Going by my nomenclature above leaves on the same edge would be "concurrent". "Unordered" would imply that they are not part of the same tree. But other than that, yes. In-order traversal always provides a legal order. Iff no trigger value has more than one user, it is the only legal order. |
27e6769
to
b03365d
Compare
include/circt/Dialect/Sim/SimOps.td
Outdated
Pure, | ||
DeclareOpInterfaceMethods<InferTypeOpInterface, ["inferReturnTypes"]> | ||
]> { | ||
let summary = "Invoke a trigger on a clock edge event."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: invoke -> create ?
include/circt/Dialect/Sim/SimOps.td
Outdated
def YieldSeqOp : SimOp<"yield_seq",[ | ||
Terminator, HasParent<"circt::sim::TriggeredOp"> | ||
]> { | ||
let summary = [{Yield results form a triggerd region with 'seq' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit form -> from
(i.e. register-like) semantics."}]; | ||
let description = [{ | ||
Terminates a triggered region and produces the given list of values. | ||
The results only become visible after all triggers and register updates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does register updates mean here? Registers in seq are not tied to these triggers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This relates to what I've tried to formulate in my comment above:
The exact conditions and time at which an event 'occurs' are determined by the simulation environment. The only requirement is that it is the same mechanism which is used to trigger the sampling and updating of registers.
That point is crucial to ensure we don't get any race conditions when mixing registers and the results of TriggeredOps. In practice this means for a a legal SV lowering the results must be produced via a non-blocking assignment that is evaluated at the same timestep and in the same scheduling region as registers updated on the clock that the given trigger is sensitive to.
^bb0(%arg0: i8): | ||
%cst1 = hw.constant 1 : i8 | ||
%inc = comb.add bin %arg0, %cst1 : i8 | ||
sim.yield_seq %inc : i8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yield significantly overlaps with registers. It makes sense to be able to get values out of triggered regions, but do we want trigger regions to have implicit storage?
%r = reg %v
%v = sim.triggered() on ... {
%inc = comb.add bin $r, const(1)
sim.yield %inc
}
performs the same thing without introducing redundant ways to store values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, it would make for tidier semantics. But I'm afraid the register-like behavior of yield
is required to not turn sim.triggered
into a massive footgun. My points of concern are:
- Since TriggeredOp only provides a defined value at the point(s) in time when the trigger's root event occurs, what would be the observed result values outside of these points. I.e., what happens if the clock of the trigger does not match the clock of the register? It would be possible to use the tieoff values here, but I'm not sure that's a good idea.
- What would be the appropriate clock for a register at the output of an
initial
ly triggered procedure? - Without an appropriately formed register at the output, a valid SV lowering may become next to impossible. We'd have to explicitly sequence/synchronize every single use of the result value with the producing procedure. How would we possibly do this with e.g., non-procedural continuous assignments?
include/circt/Dialect/Sim/SimOps.td
Outdated
For non-simulation flows the results are replaced by their tie-off values. | ||
}]; | ||
let arguments = (ins AnyTriggerType:$trigger, | ||
Optional<I1>:$condition, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the condition redundant with an if inside the body?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But without the condition on the outside, we'd have to loop back the results of the TriggeredOp to its arguments, so the if on the inside can switch between the old and the new value. It's basically the same discussion as having an explicit enable on a register vs. muxing the current value into the input. At this level of abstraction I'd prefer to avoid back-edges when possible.
Anyhow - I've removed the condition for now. But I intend to replace it with something else. See my comment below.
root event occurs. | ||
The body region must complete without 'consuming' simulation time. It | ||
may not run indefinitely or wait for any simulation event. It is allowed to | ||
have side-effects and produce results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does side-effect mean here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the million-dollar question. 😬
To quote myself from above:
Side-effects of operations in a triggered operation must not be observable by operations outside of the same operation, unless they are passed as a result.
That requirement is both confusingly phrased and overly conservative. Maybe a better way of thinking about this is to separate between the hardware model (i.e., anything written in hw
, comb
, seq
) and the environment (everything else, including the body of TriggeredOps, the simulator, the OS, etc.) The requirement would then be, that all information flow between the model and the environment has to be relayed either through the top-level IOs, or the results and arguments of sim
operations (maybe also inner symbols ?!?). At that point the body regions of TriggerdOps would be pretty much free to do whatever they want, unless it causes an effect that is observable from within the model.
But I cannot say I've nailed this down, yet.
may not run indefinitely or wait for any simulation event. It is allowed to | ||
have side-effects and produce results. | ||
For every result a 'tieoff' constant must be provided. It specifies the | ||
respective result's value before the body is first invoked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is a tie off different from an on_init trigger? This also seems to be a mechanism to make these sequences encode registers.
Can we have a clean separation of state and triggers or do we really need to go down the verilog style inferred registers path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tie-offs provide a pre-initial value, i.e., the value an on_init
triggered procedure observes when looking at its own results. They are closely coupled to the "initialization at declaration" in SV (Section 10.5 of the spec). The tie-off and implied state make sure we have a defined value for the results at any point of execution. IMHO that's the lesser of evils.
@@ -40,6 +41,12 @@ Operation *SimDialect::materializeConstant(::mlir::OpBuilder &builder, | |||
::mlir::Type type, | |||
::mlir::Location loc) { | |||
|
|||
// Delegate non 'sim' types to the HW dialect materializer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why this is necessary? Are hw constant requests really winding up here? Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary materialize the tie-off integer attributes of TriggeredOps as hw.constant
when they get folded.
I've just removed the fold method for the moment, as it relied on the condition
argument. But I'd keep this section for future use, if you don't mind.
// leave it to the parent's canonicalization. | ||
if (auto parentSeq = op.getParent().getDefiningOp<TriggerSequenceOp>()) { | ||
if (parentSeq == op) { | ||
op.emitWarning("Recursive trigger sequence."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a failure? Shouldn't it be covered by the verifier? What about multi-op sequences which are circular?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice a cyclic trigger graph is almost certainly a bug. But theoretically it is still well-formed: It has no root event, thus it is never invoked. There is no good way of handling this right now, but that should change with the soon to be added sim.never
op, which just creates a constant "dead" trigger.
Multi-op circular graphs will be opportunistically handled by the flattening DFS. I'm not sure it is worth spending any effort on making sure all cycles are detected and replaced.
if (!canBeChanged) | ||
return failure(); | ||
|
||
// DFS for inlinable values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be done instead on an op-by-op canonicalizer which doesn't walk the IR? having DFS in a canonicalizer is a good way to have n^2 behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how I've implemented it originally, but I think the current variant is more efficient. Note that:
- The first thing the canonicalizer does it to check, whether the current op can be inlined by the parent. If so, it bails out immediately. That way we can guarantee that the DFS only is performed from the root of a collapsible sub-tree and the entire sub-tree is collapsed with a single op rewrite.
- The DFS will only enter a child node if it will be inlined. Thus, it should not do any more traversal than an op-by-op canonicalizer would also need to do.
- The DFS itself is pretty cheap. Since all nodes are guaranteed to have a single predecessor we do not need to maintain a set of visited nodes to prevent walking into a cycle.
My motivation to replace the original implementation was to minimize the number of op rewrites. Since a TriggerSequenceOp can have a lot of results and users, I didn't want to substitute them repeatedly. The DFS makes sure this happens in one shot. Given all this, I do not see how this would cause a performance regression. But maybe I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is on a good track. My biggest concern is the implicit register representation being redundant with seq.
Minor questions about n^2 behavior in canonicalization.
Flatten nested sequences in a single rewrite rather than iteratively.
ad81565
to
54a2f26
Compare
Thanks a lot for your review. I hope, I could provide a somewhat convincing motivation for my decisions in the comments. I have removed the condition argument for TriggeredOps for the moment. My intent is to replace them with the |
include "mlir/Interfaces/SideEffectInterfaces.td" | ||
include "mlir/Interfaces/FunctionInterfaces.td" | ||
include "circt/Dialect/Sim/SimDialect.td" | ||
include "circt/Dialect/Sim/SimTypes.td" | ||
include "circt/Dialect/Seq/SeqTypes.td" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicates
let summary = [{Yield results from a triggerd region with 'seq' | ||
(i.e. register-like) semantics."}]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra "
at the end. Summary should probably also be short enough to fit in one 80 col line. Everything else can go in 'description'.
def EdgeTriggerType : SimTypeDef<"EdgeTrigger"> { | ||
let summary = "Trigger derived from an edge event."; | ||
let parameters = (ins "::circt::hw::EventControl":$edgeEvent); | ||
let mnemonic = "trigger.edge"; | ||
let assemblyFormat = "`<` $edgeEvent `>`"; | ||
} | ||
|
||
def InitTriggerType : SimTypeDef<"InitTrigger"> { | ||
let summary = "Trigger derived from the simulation start event."; | ||
let mnemonic = "trigger.init"; | ||
} | ||
|
||
def AnyTriggerType : AnyTypeOf<[EdgeTriggerType, InitTriggerType]>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need separate types for init and edge and why does edge need an event control attr? Why would a procedural region care by which trigger it was invoked?
} | ||
|
||
def TriggeredOp : SimOp<"triggered", [ | ||
IsolatedFromAbove, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering about why you decided to isolate it from above and pass through inputs explicitly?
Continuing the series of #7314 and #7335 (and hoping to finally get to lower the
sim.proc.print
operation) this PR adds trigger-related types and operations to the Sim Dialect. The primary point is to be able to express the execution order of side-effecting ops and procedures without having to rely on operation order within a HWModule's graph region. As added benefits, triggers allow us to:Triggers span virtual clock trees. Their root node is either an edge event of a "real" clock (
sim.on_edge
) or the start of simulation (sim.on_init
). When the root event occurs, all leaf operations of the given tree are triggered. In contrast to normal clock trees, trigger trees impose a partial order on their leaf nodes from which we can derive their execution order. Two leaf nodes are unordered (incomparable) if they are not part of the same trigger tree. They are concurrent (equal) if their lowest common ancestor operation is not aTriggerSequenceOp
. If the lowest common ancestor is aTriggerSequenceOp
the order depends on the result indices of the sequence op.So... in practice:
sim.triggered
provides a region in which we can place procedural operations. These operations can have side-effects. However, they are required to make forward progress and eventually terminate independently of all other procedures and simulation events. This means that concurrent procedures are not actually required to be run in parallel. Any chosen serialization should be legal / dead-lock free. Note that during lowering previously unordered procedures can become concurrent, e.g., by CSEing their root triggers.TriggeredOps can also produce results via the
sim.yield_seq
terminator. The "seq" is to indicate an implicit register at the output. I.e., results are produced in a clock synchronous fashion. At some point we"ll probably need an asynchronoussim.yield_comb
. But this can create all sorts of complex interactions, so I try to put it off as long as I can 😅.All results of
sim.triggered
must have an explicit tie-off constant specified. These are used both as results outside of simulation contexts (i.e., synthesis), and as (pre)initial value of the implicit register.I have a very much proof-of-conceptish implementation of an arcilator lowering in my github fork. It can compile this little gadget, showing how to do sequenced calls to a side-effecting procedure during initialization across module instances (and print stuff).