Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move beu interrupt crossing source register into Tile #2623

Closed
wants to merge 1 commit into from

Conversation

ernie-sifive
Copy link
Contributor

@ernie-sifive ernie-sifive commented Aug 30, 2020

Related issue:

Type of change: bug report

Impact: no functional change

Development Phase: implementation

Release Notes

A DFT error in the Tile interrupt output (from BEU) was detected. The problem is the Diplomatic interrupt crossing uses an AsyncReset flop on the source side of the crossing and this flop gets its reset from the default Chisel reset in effect (which is the stretched core_reset input in an RA or RF system). DFT rules prevent any internally-generated signals from being used as async reset. This PR removes the flop from the crossing itself and puts it into the Tile, where it can get its reset from the raw core_reset.

An IntAdapterNode was added to allow inserting a flop in the interrupt path within RocketTile. Default Diplomatic crossing was changed to use alreadyRegistered=true which prevents a register from being inserted on the source side of the crossing.

@aswaterman
Copy link
Member

I agree with the technical solution but would like @hcook to review the code.

@hcook
Copy link
Member

hcook commented Aug 31, 2020

No strong objection to the code as written but two high level concerns:

  1. is this going to be a recurring problem for all devices that are ever added to a tile that generate interrupts? if so, we probably need a more generic InterruptAdapter module that applies this fix to any number of interrupt edges, not just the 0th interrupt of the 0th device, and gets instantiated with its node then defined as val intOutwardNode in RocketTile. Then we wont need a new node per device and if more interrupts are added they will all get handled the same way.
  2. Given this change, it now seems weird to me that the "the default Chisel reset" inside TilePRCIDomain is the stretched reset and not the raw reset. I might have implemented things that way the first time around, but is that really what we want? Needing this change would suggest it was not... I am worried about all the other cross products of protocol x crossing type. Are there other crossing that will want the tile half of the crossing to have the raw reset not the stretched one?

Furthermore, it looks like @aswaterman and I both missed the addition of a separate clock sink node port for handing the raw reset and driving some certain tile ports using it... It suggests to me that this single "PRCI domain wrapper" is not a good abstraction, since there are apparently actually two reset domains propagating through subcomponents of the tile, both of which are being used? I feel like we might need to step back and examine which components are really expected to be on which reset domain and whether the current Scala code representation of single domain wrapper is suitable...

@hcook
Copy link
Member

hcook commented Aug 31, 2020

It seems to me like we want all the registers that want the unstretched reset to live in an outer wrapper (either TilePRCIDomain rewritten to not use the stretched reset for anything but driving the tile's reset? or a second ClockDomain wrapper added outside the first that doesn't contain the stretcher but doest contain the clock crossings?). Like, the notifications aren't "clock crossed" right now, but they need to be "reset crossed" to have this BlockDuringReset register added in the right place hierarchically?

@ernie-sifive
Copy link
Contributor Author

@hcook I'm all for a more general solution. I double-checked and the reset feeding the intsource block (containing the AsyncReset register for the source side of the crossing) is indeed fed by the output of the reset stretcher.

  assign intsource_clock = rs_auto_reset_stretcher_out_clock;
  assign intsource_reset = rs_auto_reset_stretcher_out_reset;
  assign intsource_auto_in_0 = tile_auto_int_outward_out_0; // @[LazyModule.scala 301:16]   

It seems the general need here is for any outputs sourced from the Tile to get their reset from the external input rather than the stretcher and also to include a BlockDuringReset so that they remain inactive through the first clock edge and don't falsely assert at that point.

@hcook
Copy link
Member

hcook commented Sep 1, 2020

@ernie-sifive Yeah, the TilePRCIDomain has clockSinkNode which drives all child LazyModule clocks (including crossing halves), which right now gets driven like

domain.clockSinkNode := crossingParams.injectClockNode(context) := domain.clockNode

and thus whatever get injected applies itself to all crossing halves, both inputs and outputs, and also the tile. But it seems like we want to distinguish reset strategy between either the crossing halves(/output registers) and the tile contents, or alternatively between all children explicitly on a case by case basis. Put another way, if certain clock crossings have registers that are driven by inputs to the tile, which reset domain should those registers be a part of? Stretched or no? Does it matter? If they can/should all be unstretched, then I think we should figure out how to make crossingParams.injectClockNode(context) apply to the tile child module only.
Then we could add a little more support for plopping down BlockDuringResets on any BundleBridges that go through the TilePRCIDomain wrapper (probably will look something like how the clock crossings get added), and everything in TilePRCIDomain other than the tile will get tile clock but unstretched reset and everything in tile will get stretched reset.

If we go down this path, adding a whole separate module layer via another ClockDomain to add the stretcher to the tile is probably the most expedient in terms of source code changes, and it has a clear separation of concerns like

ClockDomain {
  ResetDomain { tile } := stretcher
  instantiate crossing halves, blockDuringResets
}

but it also seems a little egregious to add a whole other layer if the only child with stretched reset really is the tile... We could probably figure out how to do it all flat in one wrapper layer too, with a judicious use of something like a domain { InModuleBody { tile.module.reset := params.chisetResetFunction() } }

@ernie-sifive
Copy link
Contributor Author

@hcook I think we can live with tile input registers getting their reset from either the external or the stretched reset.

In an async reset system, everything outside gets reset first, then that reset ends, and a while later, the clock starts and the Tile is reset. Things crossing out of the Tile need the async reset so that they are quiescent before and on the first clock edge when whatever they are driving wakes up. The BlockDuringReset ensures that the async reset output registers themselves don't act on randomized outputs from the core on the first clock edge, before the core is reset.

For inputs crossing into the Tile, at some point, the input crosses from an async reset domain to the Tile's sync reset domain, and I don't think it matters whether the reset domain crossing occurs between the two halves of the crossing logic or between the crossing sink and the rest of the Tile. Interrupts aren't going to be acted upon, and the Rational Crossing's BlockDuringReset will keep it quiet until both sides have emerged from reset.

@hcook
Copy link
Member

hcook commented Sep 8, 2020

@ernie-sifive Apologies fro the delay, but I'm going to take a crack at the generalization of this today. I will keep you posted and submit my changes as a PR against this PR.

@hcook
Copy link
Member

hcook commented Sep 8, 2020

@ernie-sifive As I fiddle around, some follow-up questions I have to this change:

  • Would this change still be necessary if the rational crossing sink inside TilePRCIDomain were getting its reset from the raw reset instead of the stretched reset? Would changing it back to raw reset be a sufficient solution on its own? Or would we still have to block valid into the tile child for the duration of stretched reset?
  • If the goal is actually to reset the two side of the RationalCrossing independently from one another, are we sure that this change is sufficient for correct operation of the crossing thereafter? I don't believe the crossing was designed to support that use case, in general. Definitely overlooked that during the first round of implementation, sorry...
  • Why doesn't the enq.ready output of the RationalCrossingSource need the same treatment? Is it just that nothing right now happens to be monitoring that ready signal and asserting if it goes high on that first cycle?
  • Should the other "clock crossing" type widgets like AsyncQueueSink or Queue do the same thing to their output control signals? Or is the problematic behavior somehow covered by their implementations already? (e.g. I know AsyncQueue already supports both sides being reset independently.)
  • Could we make a separate valid-squashing adapter that can be put next to any clock crossing adapter, or is this an innate problem with RationalCrossing specifically?

@ernie-sifive
Copy link
Contributor Author

ernie-sifive commented Sep 8, 2020

@hcook

  • For rational crossings out of the tile, the rational crossing source is in PRCI (stretched reset) and the sink is in SystemBus (async reset). Changing the source side in PRCI to async reset should solve the problem, I think, without needing the BlockDuringReset change I made to the sink side. There is already a BlockDuringReset on the source side to keep it from acting on random signal values coming from the Tile prior to the first clock edge.
  • The goal wasn't to reset the two sides separately, but to accommodate an async reset on one side and sync reset on the other. Async reset ends before the first clock edge on the sync reset side resets that side. I would agree that resetting both sides the same way is likely a better solution and is what you are already proposing for the IntCrossings.
  • Other crossing types such as AsyncCrossing might need a similar BlockDuringReset treatment -- keep the source from acting on random outputs from the Tile before the first clock edge.
  • A general-purpose adapter that does the BlockDuringReset function seems like a good idea. This could be put in front of the IntCrossingSource being used for Tile notifications, too.

@hcook
Copy link
Member

hcook commented Nov 18, 2020

Closed in favor of #2641

@hcook hcook closed this Nov 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants