Working Calyx Implementation of AXI Read channels #1820

nathanielnrn · 2023-12-21T03:10:14Z

This PR adds a working implementation of the read portion of an AXI interface according to the latest spec and RTL kernel specifications for XRT.

Currently the tests are a bit of a pain to run, requiring some replacement of the verilog code generated. To this end here is a zipped .fst showing the waveforms of the three tests, in case anyone is interested.

The calyx is heavily commented, so hopefully not too bad to go through. However I know this is a lot so happy to hop on a call sometime to go over things.

Some thoughts about this so far:

There are definitely some inefficiencies in our interface. The immediate one that comes
to mind is that handshakes require 2 cycles instead of one. Servicing reads (writing them to the seq_mem_d1 in the calyx program takes ~9 cycles. And we also utilize a data_bus only as wide as our memory (32 bits for an eventual integer dot-product test case).
Some of the grouping we were initially trying to have, a blocking group, a servicing group, an assert and deassert group, had to be combined together due to the time it takes to transfer between groups. @sampsyo and I spoke very broadly about there possibly being future work here to enable faster transitions between groups (1 cycle as opposed to 2 right now IIRC).

These will be added to #1733 and tracked there but for now some TODOs:

Make cocotb tests interface with runt
Create tests that perform more than 1 transaction
Work on write channels.

Hopefully this last one will be a bit more straightforward now that there is some infrastructure in place and working channels to pattern match off of.

cc @andrewb1999 @rachitnigam @sampsyo

sampsyo · 2023-12-21T14:00:11Z

Awesome!!!!! Really cool that this is minimally working for the read side.

To slightly expand on this:

Some of the grouping we were initially trying to have, a blocking group, a servicing group, an assert and deassert group, had to be combined together due to the time it takes to transfer between groups.

I actually think this is a super interesting problem that has implications for static/dynamic Calyx assignments. Artistically, we would like to be able to manage an AXI transaction with a control program like this:

while <more transactions to do> {
  seq {
    assert_ready;  // tell the remote side we are ready for their data
    wait_for_valid;  // wait for them to send us stuff
    par {
      do_transfer;  // actually do things the data on the bus
      deassert_ready;  // tell them to wait while we process the data
    }
  }
}

The problem is that AXI, of course, is super cycle-timing-sensitive. Remember that one of the core invariants is:

On any cycle when both ready=1 and valid=1, a transaction occurs.

So the remote side can schedule these transaction cycles back-to-back, i.e., 3 transactions can happen in exactly 3 cycles. Most saliently for us, say the remote side just keeps its valid signal high all the time and relies on us to assert/deassert ready to do transfers. Then, if we need to take some time to process a transfer, we have to be absolutely sure we only assert ready for one cycle at a time.

The problem with dynamic Calyx is that it offers no guarantees about how long it takes to go from wait_for_valid above to deassert_ready. And in practice, it actually does waste at least one cycle, meaning that the remote side sees us do two transfers when we only wanted one.

The big upshot of static Calyx is that it does offer these guarantees, so we would like to be able to just sprinkle the static everywhere get the timing we want. There is a critical problem, however: the wait_for_valid group is fundamentally dynamic. Our "dynamic islands" approach does not support sequencing a dynamic group with a static one with a zero-cycle transition latency.

So the big design question that I think this opens up is: is there some rule we can make up that would allow this kind of sequencing? That is, to allow some version of seq { a ; b } where a is dynamic and b is static where we can guarantee that b runs in the same cycle where a says it is done. It seems hard to do in general but maybe there is something we can do for restricted cases?

sampsyo

This all looks great. For others reading this, the current thing is a hand-written, standalone Calyx program; getting this tested and working is a stepping stone toward "wiring it up" to core Calyx logic.

What do you think about renaming the top-level directory from fpgas to yxi to reflect the broader scope of work here?

TBD if this actually implements AXI correctly. There are currently some hacks in place (marked with todos) to get this to compile, namely some splicing that doesn't consider what we actually want to splice (it just takes [31:0]) as opposed to dynamically considering actual bits we want. A few other things that should be cleaned up eventually Need to create a cocotb testbench to test correctness

Maybe this shouldn't be here, but for now (having deleted my working directory earlier) putting it here

Simply run make from the cocotb directory and axi-read-tests will be executed

We tie ARID low in our manager

Prefixes should not contain trailing "_"

Got rid of "assert_val" and "block_transfer" groups and instead perform these things inside "do_ar_transfer", this is required because we cant assert valid before we drive the data correctly, so needs to happen in parallel. Currently: This seems to write 16 times to same place, this is due to hardcoding of 16 in ar transfer, not sure why address doesn't increment this is tbd (and next TODO)

This is part of read channel control sequence

Also reduces data bus width to 32

rachitnigam · 2023-12-26T17:29:13Z

@sampsyo Interesting point! Do you want to move this into an issue? Seems like there is some level of deeper semantics tweaking needed to make this work because we'd need to think about providing timing guarantees in a dynamic context.

sampsyo · 2024-01-04T17:24:21Z

FYI, I think this is good to merge whenever you're ready, @nathanielnrn!

* init commit of hardcoded axi wrapper for a 'main' kernel * add axi-reads-calyx * hook up inputs to channels in the wrapper. tbd if this works * Working calyx verison of AR and R TBD if this actually implements AXI correctly. There are currently some hacks in place (marked with todos) to get this to compile, namely some splicing that doesn't consider what we actually want to splice (it just takes [31:0]) as opposed to dynamically considering actual bits we want. A few other things that should be cleaned up eventually Need to create a cocotb testbench to test correctness * Track output of compiled calyx read channel Maybe this shouldn't be here, but for now (having deleted my working directory earlier) putting it here * update gitignore to get rid of sim_build and other cocotb artifacts * Working make files for running cocotb tests Simply run make from the cocotb directory and axi-read-tests will be executed * Add xID signals for cocotb compatability We tie ARID low in our manager * Fix prefix issue on cocotb axi test bench Prefixes should not contain trailing "_" * commit to repro 'make WAVES=1' cocotb error from axi-reads-calyx.futil * axi-reads patch * sync debug * Add txn_len initialization to 16 in calyx program * AXI Read fixed to get to read channel start Got rid of "assert_val" and "block_transfer" groups and instead perform these things inside "do_ar_transfer", this is required because we cant assert valid before we drive the data correctly, so needs to happen in parallel. Currently: This seems to write 16 times to same place, this is due to hardcoding of 16 in ar transfer, not sure why address doesn't increment this is tbd (and next TODO) * Add integer byte conversion for tests on Calyx AXI testharness * WIP get reads to work. Add incr_curr_addr group This is part of read channel control sequence * remove .fst from tracking * Add more data to testbench to make waveform viewing easier * Reads seem to be terminating correctly at RLAST * AR transfers seem to work, valid is high for 1 cycle * Unreduced axi-reads-calyx.futil Also reduces data bus width to 32 * Cocotb testbench now passes * Formatted and passing axi-read-tests * Reduce and comment axi-reads-calyx.futil * remove axi-reads.v from being tracked * add a todo * add required ARPROT signal. This is hardcoded to be priviliged * rename directories to yxi/axi-calyx * remove a guard in favor of 1'b1 to simplify reading of source code --------- Co-authored-by: Rachit Nigam <rachit.nigam12@gmail.com>

nathanielnrn added the C: FPGA Changes for the FPGA backend label Dec 21, 2023

nathanielnrn requested review from sampsyo and rachitnigam December 21, 2023 03:10

nathanielnrn mentioned this pull request Dec 21, 2023

Create an AXI-interface generator implemented Calyx #1733

Open

30 tasks

sampsyo reviewed Dec 21, 2023

View reviewed changes

nathanielnrn force-pushed the axi-calyx-gen branch from 45b67b3 to 985052d Compare December 21, 2023 19:59

nathanielnrn and others added 23 commits December 22, 2023 11:46

init commit of hardcoded axi wrapper for a 'main' kernel

369f443

add axi-reads-calix

22b77fd

hook up inputs to channels in the wrapper. tbd if this works

f9a35d5

Track output of compiled calyx read channel

af62349

Maybe this shouldn't be here, but for now (having deleted my working directory earlier) putting it here

update gitignore to get rid of sim_build and other cocotb artifacts

3b8c435

Working make files for running cocotb tests

556ffc7

Simply run make from the cocotb directory and axi-read-tests will be executed

Add xID signals for cocotb compatability

aba6ac9

We tie ARID low in our manager

Fix prefix issue on cocotb axi test bench

662ec67

Prefixes should not contain trailing "_"

commit to repro 'make WAVES=1' cocotb error from axi-reads-calyx.futil

345f2db

axi-reads patch

c303644

sync debug

1f98f41

Add txn_len initialization to 16 in calyx program

85e889e

Add integer byte conversion for tests on Calyx AXI testharness

a33f4d8

WIP get reads to work. Add incr_curr_addr group

3aadee2

This is part of read channel control sequence

remove .fst from tracking

2cc2b17

Add more data to testbench to make waveform viewing easier

66c6f58

Reads seem to be terminating correctly at RLAST

f6d6608

AR transfers seem to work, valid is high for 1 cycle

5b9fa21

Unreduced axi-reads-calyx.futil

a907fc4

Also reduces data bus width to 32

Cocotb testbench now passes

ece6bdd

Formatted and passing axi-read-tests

f7cecf3

nathanielnrn added 6 commits December 22, 2023 11:46

Reduce and comment axi-reads-calyx.futil

581f8cf

remove axi-reads.v from being tracked

5d87b2a

add a todo

af0f8af

add required ARPROT signal. This is hardcoded to be priviliged

acd0bb4

rename directories to yxi/axi-calyx

9787aba

remove a guard in favor of 1'b1 to simplify reading of source code

f78063f

nathanielnrn force-pushed the axi-calyx-gen branch from 985052d to f78063f Compare December 22, 2023 16:47

rachitnigam mentioned this pull request Dec 26, 2023

Zero-cycle transitions from dynamic to static control #1828

Closed

nathanielnrn merged commit 662cb88 into main Jan 6, 2024
5 checks passed

nathanielnrn deleted the axi-calyx-gen branch January 6, 2024 15:45

nathanielnrn mentioned this pull request Jan 10, 2024

Promote AXI-Read implementation to a complete AXI wrapper #1842

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Working Calyx Implementation of AXI Read channels #1820

Working Calyx Implementation of AXI Read channels #1820

nathanielnrn commented Dec 21, 2023

sampsyo commented Dec 21, 2023

sampsyo left a comment

rachitnigam commented Dec 26, 2023

sampsyo commented Jan 4, 2024

Working Calyx Implementation of AXI Read channels #1820

Working Calyx Implementation of AXI Read channels #1820

Conversation

nathanielnrn commented Dec 21, 2023

sampsyo commented Dec 21, 2023

sampsyo left a comment

Choose a reason for hiding this comment

rachitnigam commented Dec 26, 2023

sampsyo commented Jan 4, 2024