
ME reads the wrong stub in Vivado simulation #188

Closed
meisonlikesicecream opened this issue Aug 6, 2021 · 10 comments
@meisonlikesicecream
Contributor

The remaining errors in the PRMEMC chain are caused by the ME. It seems the ME reads the wrong VMSME stubs when the previous event continues reading from the VMSME memories after the current event has started (thanks to rewind?). We don't see this issue in the HLS simulation, as it adds a few clock cycles between events for some reason.

I have attached a screenshot of the waveforms for one of the inconsistent events (event 27 in ME_L3PHIC19) to try and demonstrate this:
The event starts where the blue vertical marker is positioned.
The read address for the VMPROJ memories (red row) is correct, and two clock cycles later, marked by the yellow vertical marker, it reads the correct VMPROJ stub (magenta row).
In that same clock cycle (yellow vertical marker) the ME then uses this VMPROJ stub to determine the read address for the VMSME stub (yellow rows; I have split it into BX, BIN, and STUB NUMBER, i.e. which stub we read within a bin). This address is incorrect. The BX and BIN bits are correct, but the read should start with the 0th stub and continue up to the number of entries in that bin. In the events where it fails, STUB NUMBER is non-zero (7 in this case) and we end up reading the wrong ME stub. Instead of resetting STUB NUMBER to 0 when reading from a new bin, the ME keeps incrementing the stub number from the previous event, which had more than 7 stubs in that bin.
This then leads to the wrong VMSME stub being read, which in turn leads to a missing stub in the output memory.
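The failure mode described above can be sketched in plain C++ (an illustrative toy, not the firmware code; the function and parameter names are hypothetical):

```cpp
// Illustrative toy (not the firmware code): the within-bin stub counter,
// "STUB NUMBER", survives across events instead of being reset to 0 when a
// new bin is read, so the new event starts reading at a stale offset.
int firstStubNumberOfNewEvent(int stubsReadInPrevEvent, bool resetOnNewBin) {
    int stubNumber = 0;
    for (int i = 0; i < stubsReadInPrevEvent; ++i)
        ++stubNumber;           // previous event drains its VMSME bin
    if (resetOnNewBin)
        stubNumber = 0;         // correct behaviour: restart at the 0th stub
    return stubNumber;          // address of the first read in the new event
}
```

With 7 stubs read by the previous event, the correct first address for the new event is 0, but without the reset the stale value 7 is used, matching the waveform above.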

[Screenshot: waveforms for event 27 in ME_L3PHIC19]

There is an ugly bodge to avoid this issue by not allowing the ME to read VMSME stubs after clock cycle 106 (i.e. changing this line to `if (bufferNotEmpty && istep < kMaxProc - 2) {`: https://github.com/cms-L1TK/firmware-hls/blob/master/TrackletAlgorithm/MatchEngine.cc#L140), but then it would of course truncate some stubs further down the chain unless we change the test vectors. I don't know whether that bodge works only for our specific PRMEMC chain or in general.
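For reference, a minimal sketch of what that guard amounts to (using the `bufferNotEmpty`/`istep`/`kMaxProc` names from the linked MatchEngine.cc; the helper function itself is hypothetical):

```cpp
// Sketch of the bodge: suppress VMSME reads for the last two processing steps
// so a long event cannot spill reads into the next event. kMaxProc = 108 is
// assumed to be the number of processing iterations per event.
constexpr int kMaxProc = 108;

bool allowVMSMERead(bool bufferNotEmpty, int istep) {
    return bufferNotEmpty && istep < kMaxProc - 2;  // blocks reads from cycle 106 on
}
```

The cost, as noted above, is that any stub the buffer would have drained in those last two cycles is truncated.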

I have added the ME VHDL testbench to my repo; for debugging purposes, it can be run using the same instructions as for the PRMEMC in the README: https://github.com/meisonlikesicecream/firmware-hls/tree/me_vhdl_tb/IntegrationTests/ME

@aehart
Contributor

aehart commented Aug 6, 2021

Thanks for revisiting this @meisonlikesicecream. I at least had lost track of this issue.

I still wonder if this is related to the following warning that we get during C-synthesis of the ME:

WARNING: [SCHED 204-63] Unable to schedule the whole 3 cycles 'load' operation ('projectiondatatmp.data_.V', ../TrackletAlgorithm/MemoryTemplate.h:42->../TrackletAlgorithm/MatchEngine.cc:101->../TrackletAlgorithm/MatchEngine.cc:239) on array 'inputProjectionData_dataarray_data_V' within the first 0 cycles (II = 1).
Please consider 1) removing the 'rewind' option from the pipeline, or 2) increasing the target initiation interval of the pipeline.

which I think is related to this warning that we get during C/RTL cosimulation:

WARNING: [HLS 200-626] This design is unable to schedule all read ports in the first II cycle. The RTL testbench may treat the design as non-pipelined

If these warnings are relevant, maybe we will have to introduce some gap between events, similar to what HLS does in the cosimulation.
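For concreteness, the two remedies the SCHED warning suggests would look roughly like this on a placeholder loop (a sketch, not the actual MatchEngine code; only one pipeline pragma would be active at a time):

```cpp
// Placeholder event-processing loop illustrating the two suggested remedies.
void processEvent(const int inputProjections[108], int output[108]) {
    for (int istep = 0; istep < 108; ++istep) {
        // Current style (triggers the warning when combined with a 3-cycle load):
        //   #pragma HLS PIPELINE II=1 rewind
        // Remedy 1: drop 'rewind' so the pipeline flushes between events:
        #pragma HLS PIPELINE II=1
        // Remedy 2: keep 'rewind' but relax the initiation interval:
        //   #pragma HLS PIPELINE II=2 rewind
        output[istep] = inputProjections[istep];  // stand-in for the memory load
    }
}
```

Removing `rewind` would insert the flush cycles between events that the Vivado simulation currently lacks, while `II=2` would halve the throughput.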

@meisonlikesicecream
Contributor Author

> Thanks for revisiting this @meisonlikesicecream. I at least had lost track of this issue.
>
> I still wonder if this is related to the following warning that we get during C-synthesis of the ME:
>
> WARNING: [SCHED 204-63] Unable to schedule the whole 3 cycles 'load' operation ('projectiondatatmp.data_.V', ../TrackletAlgorithm/MemoryTemplate.h:42->../TrackletAlgorithm/MatchEngine.cc:101->../TrackletAlgorithm/MatchEngine.cc:239) on array 'inputProjectionData_dataarray_data_V' within the first 0 cycles (II = 1).
> Please consider 1) removing the 'rewind' option from the pipeline, or 2) increasing the target initiation interval of the pipeline.
>
> which I think is related to this warning that we get during C/RTL cosimulation:
>
> WARNING: [HLS 200-626] This design is unable to schedule all read ports in the first II cycle. The RTL testbench may treat the design as non-pipelined
>
> If these warnings are relevant, maybe we will have to introduce some gap between events, similar to what HLS does in the cosimulation.

Hmm maybe? Although I get the same warnings from the VMR and that one seems to do alright?

@aehart
Contributor

aehart commented Aug 9, 2021

> Hmm maybe? Although I get the same warnings from the VMR and that one seems to do alright?

Do we know that though? The ME seems to fail when one event uses all processing iterations and the following event produces output from the first (few?) inputs. Have we confirmed that the other modules with these warnings, which I think might be all of them, succeed in these situations?

@meisonlikesicecream
Contributor Author

meisonlikesicecream commented Aug 9, 2021

> > Hmm maybe? Although I get the same warnings from the VMR and that one seems to do alright?
>
> Do we know that though? The ME seems to fail when one event uses all processing iterations and the following event produces output from the first (few?) inputs. Have we confirmed that the other modules with these warnings, which I think might be all of them, succeed in these situations?

Good question. I just tested a single VMR_L2PHIC in Vivado as it uses all processing iterations for some events (unlike the one used for the IRVMR tb), and something is definitely wrong: the 108th stub in the output is always missing. Will look into it.

@meisonlikesicecream
Contributor Author

meisonlikesicecream commented Aug 11, 2021

> Good question. I just tested a single VMR_L2PHIC in Vivado as it uses all processing iterations for some events (unlike the one used for the IRVMR tb), and something is definitely wrong: the 108th stub in the output is always missing. Will look into it.

Update: Turns out the missing 108th stub was due to some bugs in the FileWriterFromRAM, which are unrelated to this issue. There are still inconsistencies for that specific VMR, though, which may or may not have something to do with those HLS warnings... I'll keep investigating.

@aperloff

Is there a reason why it takes 3 clock cycles to load a projection from memory? I thought that would take 1-2 clock cycles. Is this normal?

@meisonlikesicecream
Contributor Author

> Update: Turns out the missing 108th stub was due to some bugs in the FileWriterFromRAM, which are unrelated to this. Although there are still inconsistencies for that specific VMR which might or might not have to do something with those HLS warnings... I'll keep investigating.

I can't figure out what's wrong with the remaining VMR_L2PHIC inconsistencies... for some reason the VMR IP sends out the wrong write address for some output stubs. The actual output stub and nentries look good, as do the input stubs. It is more likely to happen when the VMR uses all 108 processing iterations, but it happens when we have fewer than 108 as well. It also seems to happen relatively close to either the start or the end of an event. Whether this is related to the incorrect read addresses in the ME, I don't know.

The weird 3-clock-cycle warning disappears if one removes the HLS RESOURCE pragma (or changes the latency from 2 to 1), but then the VMR fails cosim...
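For context, the kind of pragma being discussed looks roughly like this (a hedged sketch with illustrative names; the real pragma sits in the VMR source, and `RAM_2P_BRAM` is an assumption about the core in use):

```cpp
// Toy memory-read loop carrying a legacy Vivado HLS RESOURCE pragma.
// Changing 'latency=2' to 'latency=1' is the modification described above
// that silences the 3-cycle load warning but breaks cosimulation.
void readMemory(const int mem[8], int buf[8]) {
#pragma HLS RESOURCE variable=mem core=RAM_2P_BRAM latency=2
    for (int i = 0; i < 8; ++i) {
        buf[i] = mem[i];
    }
}
```

The pragma only affects how the tool schedules the memory access; the functional behavior of the loop is unchanged.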

@meisonlikesicecream
Contributor Author

meisonlikesicecream commented Oct 22, 2021

Okay, I think I have made some progress on this. It seems the issue might be related to resetting the variables that keep track of the read address (for the ME) and the write address (for the VMR) at the start of the main function. Currently, the reset is performed before the main loop, and Vivado simulation doesn't seem to like that? (I'm referring to https://github.com/cms-L1TK/firmware-hls/blob/master/TrackletAlgorithm/MatchEngine.h#L208 and https://github.com/cms-L1TK/firmware-hls/blob/master/TrackletAlgorithm/VMRouter.h#L653-L664.)

Maybe this is related to the passage in the HLS rewind documentation stating that the code segment before the loop:

> - Is considered as initialization.
> - Is executed only once in the pipeline.

I have no idea.

If one instead resets the read/write address variables inside the loop, it seems to work better; see:
https://github.com/cms-L1TK/firmware-hls/blob/vmr_vhdl_debug/TrackletAlgorithm/MatchEngine.h#L304-L306
https://github.com/cms-L1TK/firmware-hls/blob/vmr_vhdl_debug/TrackletAlgorithm/VMRouter.h#L662-L678
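The difference between the two reset placements can be mimicked in plain C++ (an illustrative toy, not the HLS code: with 'rewind' the segment before the pipelined loop runs only once, modeled here by resetting the address outside versus inside the per-event loop):

```cpp
// Toy model of the reset placement. With the reset outside the loop (as in
// the current firmware), only the first event starts from address 0; with the
// reset on the first iteration of each event, every event starts from 0.
int lastAddrAfterTwoEvents(int nStepsPerEvent, bool resetInsideLoop) {
    int writeAddr = 0;              // reset before the loop: happens only once
    for (int event = 0; event < 2; ++event) {
        for (int istep = 0; istep < nStepsPerEvent; ++istep) {
            if (resetInsideLoop && istep == 0)
                writeAddr = 0;      // reset on the first iteration of each event
            ++writeAddr;            // write one stub per step
        }
    }
    return writeAddr;
}
```

With 5 steps per event, the inside-the-loop reset leaves the address at 5 after the second event, while the outside-the-loop reset lets it run on to 10.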

The changes in the ME reduced the number of errors in the PRMEMC chain from 3 to 1. I don't know what the last error is, though (it's not a new one; it's one of the 3 errors that were already there). I will have to investigate that further.
The changes in the VMR seem to work as well; however, I had to sacrifice one loop iteration or it didn't pipeline properly. Maybe I can find some way of fixing this too, or the VMR will only be able to process 107 stubs...

@pwittich
Contributor

pwittich commented Apr 4, 2024

Is this issue still open, @meisonlikesicecream @aehart, or can we close it?

@meisonlikesicecream
Contributor Author

I guess this is irrelevant now since we're using the combined modules :-)

@pwittich pwittich closed this as completed Apr 4, 2024