[xilinx] Use a single AXI interface to communicate with all memories #853

Closed
sampsyo opened this issue Jan 3, 2022 · 8 comments · Fixed by #958
Labels: C: Calyx (Extension or change to the Calyx IL), Type: Bug (Bug in the implementation)

sampsyo commented Jan 3, 2022

Using Vivado/Vitis to compile many Calyx-generated Verilog programs currently produces this totally inscrutable error, first reported by @yn224:

ERROR: [VPL 60-773] In '/tmp/tmp4lfho6qg/_x/link/vivado/vpl/runme.log', caught Tcl error:
ERROR: [Common 17-161] Invalid option value '' specified for 'objects'.

I did a lot of digging into the generated Tcl from which this error arises. It appears that the problem comes up when the design has multiple AXI interfaces, which happens when the Calyx program has multiple @external memory declarations. You can see this by comparing these programs:

$ fud e examples/futil/dot-product.futil -s xclbin.save_temps true -o one.xclbin
$ fud e examples/tutorial/language-tutorial-iterate.futil -s xclbin.save_temps true -o one.xclbin

The dot-product example has three external memories, while language-tutorial-iterate has only one. And only the former triggers the problem.

The solution will be to multiplex all the memories on a single AXI port. The memories will just be mapped into different address ranges on the same general-purpose port. Here’s where we declare one AXI port per memory:
https://github.com/cucapra/calyx/blob/9d02b33c03ac5b96bde1b930c515a095bb7c0ad3/src/backend/xilinx/xml.rs#L149-L156

We will also need to change the Verilog generation for the AXI interfaces to use a single port, somewhere around here:
https://github.com/cucapra/calyx/blob/9d02b33c03ac5b96bde1b930c515a095bb7c0ad3/src/backend/xilinx/memory_axi.rs#L27

It would also be good to set up some automated (Runt-based) testing of the whole Xilinx toolchain so we don’t break anything while making this change.


Details about the debugging process, for posterity & the morbidly curious

The error arose from dr.bd.tcl, which I got from the runme.log referenced in the error. This is the Tcl script that Vivado generates according to kernel.xml, the file we generate that describes the interface to the RTL blob. The XML file contains directives like this describing AXI interfaces:

    <ports>
      <port name="s_axi_control" mode="slave" range="0x1000" dataWidth="32" portType="addressable" base="0x0"/>
      <port name="m0_axi" mode="master" range="0xFFFFFFFFFFFFFFFF" dataWidth="64" portType="addressable" base="0x0"/>
      <port name="m1_axi" mode="master" range="0xFFFFFFFFFFFFFFFF" dataWidth="64" portType="addressable" base="0x0"/>
      <port name="m2_axi" mode="master" range="0xFFFFFFFFFFFFFFFF" dataWidth="64" portType="addressable" base="0x0"/>
    </ports>
    <args>
      <arg name="timeout" addressQualifier="0" id="0" port="s_axi_control" size="0x4" offset="0x010" hostOffset="0x0" hostSize="0x4" type="uint"/>
      <arg name="A0" addressQualifier="1" id="1" port="m0_axi" size="0x8" offset="0x18" hostOffset="0x0" hostSize="0x8" type="int*"/>
      <arg name="B0" addressQualifier="1" id="2" port="m1_axi" size="0x8" offset="0x20" hostOffset="0x0" hostSize="0x8" type="int*"/>
      <arg name="v0" addressQualifier="1" id="3" port="m2_axi" size="0x8" offset="0x28" hostOffset="0x0" hostSize="0x8" type="int*"/>
    </args>

(This is a snippet from the XML we generate for dot-product.futil, FWIW.) As you can see, there are four AXI ports for three memories: one is a control port. Then, there are three arg declarations for the externalized memories in the Calyx program (and a timeout argument also).

These AXI port declarations get translated to hbm_memory_subsystem::map_memory_resource calls in the Vivado-generated Tcl. I noticed that the error was coming from the second one, i.e., the call for m1_axi. I got curious and compared this against the XML for one of the examples that comes with the Vitis RTL kernel tutorials. Once packaged up, that kernel.xml looks different:

    <ports>
      <port name="m_axi_gmem" mode="master" range="0xFFFFFFFF" dataWidth="32" portType="addressable" base="0x0"/>
      <port name="s_axi_control" mode="slave" range="0x1000" dataWidth="32" portType="addressable" base="0x0"/>
    </ports>
    <args>
      <arg name="a" addressQualifier="1" id="0" port="m_axi_gmem" size="0x8" offset="0x10" hostOffset="0x0" hostSize="0x8" type="void*"/>
      <arg name="b" addressQualifier="1" id="1" port="m_axi_gmem" size="0x8" offset="0x1c" hostOffset="0x0" hostSize="0x8" type="void*"/>
      <arg name="c" addressQualifier="1" id="2" port="m_axi_gmem" size="0x8" offset="0x28" hostOffset="0x0" hostSize="0x8" type="void*"/>
      <arg name="length_r" addressQualifier="0" id="3" port="s_axi_control" size="0x4" offset="0x34" hostOffset="0x0" hostSize="0x4" type="ap_uint&lt;32>"/>
    </args>

Namely, while there are still several arg declarations, they all use a single AXI port: m_axi_gmem. They just use different offsets within it.

That’s when I formed the hypothesis that the Xilinx toolchain just doesn’t support multiple full-sized AXI ports. I validated this hypothesis by trying a different example program with a single external memory, which turned out to be language-tutorial-iterate.futil, referenced above.

sampsyo added the Type: Bug and C: Calyx labels on Jan 3, 2022

cgyurgyik commented:

The solution will be to multiplex all the memories on a single AXI port. The memories will just be mapped into different address ranges on the same, general port.

From the perspective of a Calyx user, it seems the onus is on the backend to ensure this constraint is met?


sampsyo commented Jan 5, 2022

Ah, yes! To clarify, that's the intent in this issue: to make the xilinx and xilinx-xml backends use this single-port, multiple-offset strategy. (Without requiring any changes to the Calyx program, which will still just use multiple @external-annotated memories.)

rachitnigam commented:

@sampsyo did you get a chance to see what happens if you just delete the XML lines that instantiate the m1_axi and m2_axi interfaces? If the only difference between the example and our generated XML is those unnecessary lines, maybe deleting them would magically work (I might be too hopeful here).


sampsyo commented Jan 5, 2022

Just following up based on a synchronous conversation, for posterity: although it could be interesting, I didn't end up trying this at the time because I believe it would need to be a coordinated change between the XML and the Verilog that implements the AXI interfaces. I worried I would just trigger more inscrutable errors by creating a misalignment between the signals in the Verilog and the declared AXI interfaces that refer to them.


sampsyo commented Jan 6, 2022

An alternative explanation that I need to investigate: it looks like we redundantly declare the AXI interfaces in the Tcl script that produces .xo files. Here are those lines in #855:
https://github.com/cucapra/calyx/blob/13c5d9c0c1593532fb4065f6df22f818597838f8/fud/bitstream/gen_xo.tcl#L23-L25

So maybe we can keep the multi-interface style but just properly hook them up in the .xo and everything will work.


yn224 commented Jan 6, 2022

An alternative explanation that I need to investigate: it looks like we redundantly declare the AXI interfaces in the Tcl script that produces .xo files. Here are those lines in #855:

https://github.com/cucapra/calyx/blob/13c5d9c0c1593532fb4065f6df22f818597838f8/fud/bitstream/gen_xo.tcl#L23-L25

So maybe we can keep the multi-interface style but just properly hook them up in the .xo and everything will work.

I'm not sure whether the "redundancy" refers to the fact that these lines essentially do the same thing for s_axi and m0_axi. But at my level of understanding, they are not redundant, since they associate the same CLK with the slave and the master AXI interconnect (going by the naming convention found on page 62 of this document), which are different interfaces?


sampsyo commented Jan 6, 2022

Ah, no, what I meant was that these lines seem to be redundant with what we do in kernel.xml: i.e., they both declare the interfaces. In an ideal world, I would like to imagine that all the information Vivado needs to know about the interface ports could come from kernel.xml. But the hypothesis here is that we need to tell it about the ports once in kernel.xml and then again via Tcl when packaging up the RTL.

sampsyo mentioned this issue on Jan 15, 2022

sampsyo commented Jan 27, 2022

In response to @yn224's comment at #876 (comment):

Great! This is such a relief!! It should be so much easier to address this way than by multiplexing multiple memories on a single AXI bus.

My suggestion for a next step is to put a Tcl for loop right in the Tcl script. This way, the bit of fud that invokes the script, here:
https://github.com/cucapra/calyx/blob/78c71a0d7b68d7ee2358fadab1d7898a2c091a1f/fud/fud/stages/xilinx/xclbin.py#L105-L112

…can just pass all the names of the AXI interfaces after -tclargs. So for the one-memory example we've been looking at, it would pass -tclargs xclbin/kernel.xo m0_axi. Then, the Tcl script will loop over all the arguments (except the first one, which is still the output file) and execute:

ipx::associate_bus_interfaces -busif $busname -clock ap_clk [ipx::current_core]

Let's do that first! @yn224, can you give that a shot and make a PR? In this initial version, we can hard-code the fud support to just pass m0_axi every time. This simple version should be fast and will set us up for the next phase.
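
For concreteness, here is a minimal sketch of what that hard-coded first version of the fud side might look like, assuming fud shells out to Vivado in batch mode; the function name and command construction are illustrative assumptions, not the actual xclbin.py code:

    # Hypothetical sketch (not the real fud stage): run gen_xo.tcl through
    # Vivado and pass the output .xo path plus the AXI interface names after
    # -tclargs, hard-coded to m0_axi for now.
    import subprocess

    def package_xo(xo_path, axi_interfaces=("m0_axi",)):
        cmd = [
            "vivado", "-mode", "batch", "-source", "gen_xo.tcl",
            "-tclargs", xo_path, *axi_interfaces,
        ]
        subprocess.run(cmd, check=True)

The Tcl script can then read these extra arguments from Tcl's $argv and run the ipx::associate_bus_interfaces line above once per interface name.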

Of course, the next step will be to figure out how to get that list of AXI interface names. We will either have to change the Calyx compiler to dump a list, or we can parse the XML it already generates to get the list.
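
If we go the XML-parsing route, a rough sketch of how fud could recover the interface names from the kernel.xml we already emit (the function name is made up, purely for illustration):

    # Hypothetical sketch: collect the AXI master port names from the
    # kernel.xml generated by the Calyx xilinx-xml backend.
    import xml.etree.ElementTree as ET

    def axi_master_ports(kernel_xml_path):
        root = ET.parse(kernel_xml_path).getroot()
        return [
            port.get("name")
            for port in root.iter("port")
            if port.get("mode") == "master"
        ]

    # For the dot-product kernel.xml shown above, this would return
    # ["m0_axi", "m1_axi", "m2_axi"].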
