Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #800 and #814 #825

Closed
wants to merge 3 commits into from
Closed

Fix #800 and #814 #825

wants to merge 3 commits into from

Conversation

davidmallasen
Copy link

@davidmallasen davidmallasen commented Mar 31, 2023

@davideschiavone and @Silabs-ArjanB I tried to fix #800 and #814. Here is my attempt, let me know if it also works for you guys.

@davideschiavone davideschiavone added the Component:RTL For issues in the RTL (e.g. for files in the rtl directory) label Mar 31, 2023
davidmallasen added a commit to davidmallasen/x-heep that referenced this pull request Apr 3, 2023
rtl/cv32e40x_controller.sv Outdated Show resolved Hide resolved
@@ -1492,7 +1494,7 @@ module cv32e40x_controller_fsm import cv32e40x_pkg::*;
commit_valid_q <= 1'b0;
commit_kill_q <= 1'b0;
end else begin
if ((ex_valid_i && wb_ready_i) || ctrl_fsm_o.kill_ex) begin
if ((ex_valid_i && wb_ready_i) || ctrl_fsm_o.kill_ex || (xif_mem_if.mem_valid && xif_mem_if.mem_ready)) begin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original intention of this code was to not clear the flags until the offloaded instruction (which is also tracked by the pipeline) leaves EX, or the EX stage is killed. With this change, the flag may be cleared early, even when WB is not ready to accept a new instruction.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That commit_valid_q signal is negated in the xif_commit_if.commit_valid condition, so if you don't clear it in this case, you get a deadlock (see the red signals and the XIF signals):
commit_valid

rtl/cv32e40x_ex_stage.sv Outdated Show resolved Hide resolved
@@ -683,6 +684,10 @@ module cv32e40x_id_stage import cv32e40x_pkg::*;

end else if (ex_ready_i) begin
id_ex_pipe_o.instr_valid <= 1'b0;
id_ex_pipe_o.xif_en <= 1'b0;
end else if (xif_mem_if.mem_valid && xif_mem_if.mem_ready) begin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point in the code, id_valid==0 and ex_ready == 0. This implies that the EX stage is not yet ready to accept a new instruction (id_ex_pipe_o must be kept stable), but the added code will insert a bubble into EX. What would happen if the offloaded load/store (in EX) must stay in EX for multiple cycles while waiting for WB to become ready (load/store waiting for data_rvalid_i for instance)? Then the EX stage would be cleared by the inserted bubble, and the offloaded instruction would no longer be tracked by the pipeline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LSU which drives the mem_ready signal is essential split between EX (address phase) and WB (response phase)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that case in particular. However, without this the issue in #800 occurs. Maybe there is a better fix but we couldn't find it.

rtl/cv32e40x_id_stage.sv Outdated Show resolved Hide resolved
@silabs-oysteink
Copy link
Contributor

Please see my review comments.

Also, please file one PR per issue if possible.

@davidmallasen
Copy link
Author

Thank you @silabs-oysteink . I replied to your comments and applied some changes. I cannot seem to find a barrier between these two issues (they must be linked) and this PR fixes them both.

@davideschiavone
Copy link
Contributor

@silabs-oysteink what is the status of this PR?

@silabs-oysteink
Copy link
Contributor

@davideschiavone
I still think it would be nice to have one PR per issue for easier tracking.

For the change below, I am really hesitant to merge it for the following reasons:

  • It seems like this can basically kill the instruction in EX by clearing id_ex_pipe.instr_valid (even when EX is not ready for new inputs), essentially inserting a bubble.
  • What would happen if the offloaded instruction causes multiple memory transactions? I guess the clearing of id_ex_pipe.instr_valid would cause issues on the next memory request as the EX(or EX part of the LSU) wouldn't know that it is tracking a offloaded instruction (id_ex_pipe.instr_valid == 0).
  • The WB stage is qualifying ex_wb_pipe.xif_en with ex_wb_pipe.instr_valid in multiple places. Wouldn't the mentioned clearing of id_ex_pipe.instr_valid propagate to the WB stage and cause loss of handshakes from WB to the coprocessor?
    image

In issue #800 @Silabs-ArjanB mentioned that the root cause seemed to be lack of backpressure from WB to the LSU. It seems like a fix would need to handle this and probably insert some sort of skidbuffer to ensure no data is lost.

If you want, we can arrange a call to discuss this PR further.

@davidmallasen
Copy link
Author

To help with this debugging process, I set up an environment integrating FPNew and cv32e40x with the XIF using X-HEEP. I also have a test application that fails to execute properly. You can find the repo branch here: https://github.com/davidmallasen/F-HEEP/tree/bad-xif

I left some instructions and what to expect from the output here: davidmallasen/F-HEEP#1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component:RTL For issues in the RTL (e.g. for files in the rtl directory)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[XIF] Redundant mem_result when using CORE-V-XIF
3 participants