This tutorial shows you how to use the AXI ACP on the UltraZed-EG IOCC board under bare-metal and Linux.
- Vivado 2017.2
- UltraZed-EG IOCC (xczu3eg-sfva625-1-i)
- QuestaSim (Vivado xsim does not work yet)
- A Linux computer
source ${VIVADO_INSTALL_DIR}/settings64.sh
- Start Vivado with:
vivado
- Create a new project for the UltraZed-EG IOCC (xczu3eg-sfva625-1-i) (at: ${VIVADO_PROJECT_ROOT}, name: ${VIVADO_PROJECT_NAME})
- Create a new AXI4 IP by going to Tools -> Create and Package New IP...
- Click Next >
- Choose Create AXI4 Peripheral and click Next >
- Choose a name (here: ultrazed_acp_example) and a path (${IP_DIR})
- Click Next >
- Edit the Interface S00_AXI such that it has 32 Registers instead of 4
- Add another Interface using the +-sign
- Set the Interface Type to Full and the Interface Mode to Master
- Click Next >
- Choose Edit IP and click Next >. A new AXI IP block has now been created and can be edited
- Open the file acp_dummy_v1_0 in the Sources pane in Vivado
- Copy the file contents from acp_dummy_v1_0.v into the file you just opened in Vivado
- Open the file acp_dummy_v1_0_M00_AXI in the Sources pane in Vivado
- Copy the file contents from acp_dummy_v1_0_M00_AXI.v into the file you just opened in Vivado
- Open the file acp_dummy_v1_0_S00_AXI in the Sources pane in Vivado
- Copy the file contents from acp_dummy_v1_0_S00_AXI.v into the file you just opened in Vivado
- Press the +-Button in the Sources pane to add new sources
- Click Add Files in the Add Sources dialog
- Select xilinx_bram.v, bram.v, bram_output_fifo.v
- Press Finish to close the dialog
- The newly added files will now appear in the source tree under acp_dummy_v1_0
- Open the directory where you created the acp_dummy IP and navigate to the hdl directory
${IP_DIR}/hdl
- Copy the testbench files acp_dummy_v1_0_tb.sv and acp_dummy_v1_0_tb_questa_rtl.do into the hdl directory
- In Vivado, open the Package IP - acp_dummy tab, navigate to File Groups in Packaging Steps, and click on Merge changes from File Groups Wizard
- In Packaging Steps go to Customization Parameters, right-click on C_M00_AXI_DATA_WIDTH in the table view, and select Edit Parameter
- In the Dialog edit the List of Values and change 32 to 128
- Click OK to close the dialog
- Click on Merge changes from Customization Parameter Wizard
- In Packaging Steps go to Review and Package and click on Re-Package IP. Now the new AXI4-IP has been created and can be used in Vivado block designs
The file acp_dummy_v1_0.v instantiates the Verilog modules for the BRAM, AXI-Slave and AXI-Master. acp_dummy_v1_0.v is a submodule of the testbench in acp_dummy_v1_0_tb.sv. The testbench implements an AXI-Lite Master to communicate with the AXI-Lite Slave in acp_dummy_v1_0_S00_AXI.v and an AXI-Full Slave (only the necessary parts are implemented) to send data to and receive data from acp_dummy_v1_0_M00_AXI.v. The testbench sends commands to acp_dummy_v1_0.v to initiate an ACP burst transfer. At the top of acp_dummy_v1_0_tb.sv are several defines with which you can change how much data is sent over the ACP.
// all transactions over the AXI-Busses will be printed onto the command line
`define AXI_VERBOSE
// AX_CACHE value
`define AX_CACHE 32'h0000000f
// AX_USER value
`define AX_USER 32'h00000002
// from where in the DDR should data be read by the AXI-Master
`define SOURCE_ADDRESS 32'h21000000
// to which location in the DDR should the AXI-Master write
`define TARGET_ADDRESS 32'h28000000
// to which and from which address should the AXI-Master write in the BRAM
`define BRAM_ADDRESS 1024
// how many consecutive burst should be executed
`define NUM_BURSTS 3
// how many 128 Bit values should be transmitted in each burst
`define BURST_LENGTH 4
The testbench code starts at initial begin and is pretty straightforward:
- The memory/DDR (mem) in the testbench is initialized with 32-bit data words counting up from 0, starting at the source address.
- Next, the source address, burst length, number of bursts, and the BRAM address are sent to the acp_dummy_v1_0.v slave registers.
- Afterwards, the read_data bit is set in the AXI slave register slv_reg1. This tells the AXI master to initiate a read burst from the DDR to the BRAM.
- As long as an AXI transaction (which can consist of several consecutive bursts) is in progress, the axi_bus_ready bit of the AXI slave register slv_reg0 is 0. The testbench polls this bit over the AXI-Lite bus until it is 1.
- When the axi_bus_ready bit is 1, the AXI read transaction is finished. The testbench sets the clear_interrupts bit to 1 to acknowledge that the transaction is finished; this also clears the bits which indicate that a read or write is finished or in progress.
- When the read transaction is finished and acknowledged, the target address, burst length, number of bursts, and the BRAM address for a write transaction are sent to the acp_dummy_v1_0.v slave registers.
- Afterwards, the write_data bit is set in the AXI slave register slv_reg1. This tells the AXI master to initiate a write burst from the BRAM to the DDR.
- Again, the axi_bus_ready bit is 0 as long as the AXI transaction is not finished.
- When the axi_bus_ready bit becomes 1, the testbench acknowledges the finished transaction by setting the clear_interrupts bit.
During the simulation of the testbench you can see a log in the simulator which tells you about the AXI-Lite and ACP transactions and also prints the content of the DDR (mem) and the BRAM at each relevant step. At the end of the simulation you should see a log which looks like this:
Check data in DDR:
SOURCE: TARGET:
21000000 : 0x0000000100000000 28000000 : 0x0000000100000000 OK
21000008 : 0x0000000300000002 28000008 : 0x0000000300000002 OK
21000010 : 0x0000000500000004 28000010 : 0x0000000500000004 OK
21000018 : 0x0000000700000006 28000018 : 0x0000000700000006 OK
21000020 : 0x0000000900000008 28000020 : 0x0000000900000008 OK
21000028 : 0x0000000b0000000a 28000028 : 0x0000000b0000000a OK
21000030 : 0x0000000d0000000c 28000030 : 0x0000000d0000000c OK
21000038 : 0x0000000f0000000e 28000038 : 0x0000000f0000000e OK
21000040 : 0x0000001100000010 28000040 : 0x0000001100000010 OK
21000048 : 0x0000001300000012 28000048 : 0x0000001300000012 OK
21000050 : 0x0000001500000014 28000050 : 0x0000001500000014 OK
21000058 : 0x0000001700000016 28000058 : 0x0000001700000016 OK
21000060 : 0x0000001900000018 28000060 : 0x0000001900000018 OK
21000068 : 0x0000001b0000001a 28000068 : 0x0000001b0000001a OK
21000070 : 0x0000001d0000001c 28000070 : 0x0000001d0000001c OK
21000078 : 0x0000001f0000001e 28000078 : 0x0000001f0000001e OK
21000080 : 0x0000002100000020 28000080 : 0x0000002100000020 OK
21000088 : 0x0000002300000022 28000088 : 0x0000002300000022 OK
21000090 : 0x0000002500000024 28000090 : 0x0000002500000024 OK
21000098 : 0x0000002700000026 28000098 : 0x0000002700000026 OK
210000a0 : 0x0000002900000028 280000a0 : 0x0000002900000028 OK
210000a8 : 0x0000002b0000002a 280000a8 : 0x0000002b0000002a OK
210000b0 : 0x0000002d0000002c 280000b0 : 0x0000002d0000002c OK
210000b8 : 0x0000002f0000002e 280000b8 : 0x0000002f0000002e OK
This log shows that the data from DDR address 0x21000000 (SOURCE_ADDRESS) was successfully transferred to the BRAM and from there written to DDR address 0x28000000 (TARGET_ADDRESS).
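The size and content of this log follow from the testbench defines. As a minimal sketch (the helper names are hypothetical, and it assumes the 32-bit counting words are packed little-endian into the 64-bit values the log prints), the expected numbers can be reproduced like this:

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BURSTS    3u   /* value of the NUM_BURSTS define in the testbench */
#define BURST_LENGTH  4u   /* value of the BURST_LENGTH define (128-bit beats per burst) */
#define BEAT_BYTES    16u  /* one 128-bit beat is 16 bytes */

/* Total number of bytes moved by one transaction. */
static size_t transfer_bytes(void)
{
    return (size_t)NUM_BURSTS * BURST_LENGTH * BEAT_BYTES;
}

/* Number of 64-bit lines the log prints for one transaction. */
static size_t log_lines(void)
{
    return transfer_bytes() / sizeof(uint64_t);
}

/* Expected 64-bit value on log line `line` (0-based).
 * Assumption: word 2*line is the lower half, word 2*line+1 the upper half. */
static uint64_t expected_line(size_t line)
{
    uint32_t low = (uint32_t)(2 * line);
    uint32_t high = low + 1u;
    return ((uint64_t)high << 32) | low;
}
```

With the default defines this gives 3 x 4 x 16 = 192 bytes, that is 24 log lines, the last one at offset 0xb8 from the start address.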
The following picture shows the structure of the testbench and the connection to the other Verilog files.
- slv_reg0: status register (read-only)
  - [0:0]: axi_bus_ready
- slv_reg1: command register
  - [1:1]: read_data
  - [2:2]: write_data
  - [31:31]: clear_interrupts
- slv_reg2: ddr_start_address
- slv_reg3: burst_length
- slv_reg4: num_bursts
- slv_reg5: bram_start_address
- slv_reg29: AX_CACHE
  - [3:0]: axcache_value
- slv_reg30: AX_USER
  - [1:0]: axuser_value
- slv_reg31: DEADBEFF (read-only)
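The register map and the testbench sequence described earlier can be sketched in C as follows. This is only an illustration, not the code from acp_dummy_test.c: the names are hypothetical, and it assumes that slv_regN is reachable at 32-bit word index N from the slave base address, as in the AXI4-Lite template that Vivado generates.

```c
#include <stdint.h>

/* Word indices of the slave registers (assumption: slv_regN is word N). */
enum {
    REG_STATUS     = 0,   /* slv_reg0, read-only */
    REG_COMMAND    = 1,   /* slv_reg1 */
    REG_DDR_ADDR   = 2,   /* slv_reg2 */
    REG_BURST_LEN  = 3,   /* slv_reg3 */
    REG_NUM_BURSTS = 4,   /* slv_reg4 */
    REG_BRAM_ADDR  = 5,   /* slv_reg5 */
    REG_AX_CACHE   = 29,  /* slv_reg29 */
    REG_AX_USER    = 30   /* slv_reg30 */
};

#define STATUS_AXI_BUS_READY  (1u << 0)
#define CMD_READ_DATA         (1u << 1)
#define CMD_WRITE_DATA        (1u << 2)
#define CMD_CLEAR_INTERRUPTS  (1u << 31)

/* Run one transaction: program the parameters, issue the command,
 * poll axi_bus_ready, then acknowledge with clear_interrupts. */
static void acp_transfer(volatile uint32_t *regs, uint32_t command,
                         uint32_t ddr_addr, uint32_t burst_len,
                         uint32_t num_bursts, uint32_t bram_addr)
{
    regs[REG_DDR_ADDR]   = ddr_addr;
    regs[REG_BURST_LEN]  = burst_len;
    regs[REG_NUM_BURSTS] = num_bursts;
    regs[REG_BRAM_ADDR]  = bram_addr;
    regs[REG_COMMAND]    = command;   /* CMD_READ_DATA or CMD_WRITE_DATA */

    /* axi_bus_ready stays 0 while the transaction is in progress. */
    while ((regs[REG_STATUS] & STATUS_AXI_BUS_READY) == 0u) {
        /* busy-wait */
    }

    regs[REG_COMMAND] = CMD_CLEAR_INTERRUPTS;
}
```

A read from DDR into the BRAM would then be acp_transfer(regs, CMD_READ_DATA, 0x21000000, 4, 3, 1024) with the default testbench values, followed by the same call with CMD_WRITE_DATA and the target address.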
- Navigate to the AXI IP directory of the acp_dummy
- Launch QuestaSim
- Execute the acp_dummy_v1_0_tb_questa_rtl.do
cd ${IP_DIR}/hdl
vsim
do acp_dummy_v1_0_tb_questa_rtl.do
In the previously created Vivado project ultrazed_acp_dummy:
- In the Flow Navigator pane under IP INTEGRATOR click on Create Block Design
- In the appearing dialog click OK
- In the Diagram window right-click and choose Add IP
- Search for Zynq and double-click on Zynq UltraScale+ MPSoC
- Above the Diagram window click on Run Block Automation
- Click on OK
- Double-click on the Zynq UltraScale+ in the Diagram window
- Go to Page Navigator -> I/O Configuration
- Unfold High Speed in the I/O Configuration window
- Uncheck Display Port
- Go to Page Navigator -> PS-PL Configuration
- Unfold PS-PL Interfaces -> Slave Interface -> S AXI ACP
- Select 1 in the drop-down menu.
- Click on OK
- In the Diagram window right-click and choose Add IP
- Search for acp dummy and double-click on acp_dummy_v1.0
- Above the Diagram window click on Run Connection Automation
- Check All Automation
- Click on OK
- Go to the Sources tab and right-click on design_1 (design_1.bd) and choose Create HDL Wrapper
- In the Diagram window right-click and choose Add IP
- Search for constant and double-click on Constant
- Double-click on Constant
- Set Const Val to 0
- Hover with the cursor over the right-hand connection of Constant until a pencil appears
- Click and draw a line to the pl_acpinact connection of the Zynq UltraScale+
- Go to the Sources tab and right-click on design_1 (design_1.bd) and choose Create HDL Wrapper
- Click on OK
- Ignore the Warning and click on OK
- Go to Flow Navigator -> Project Manager -> PROGRAM AND DEBUG and click Generate Bitstream
- Click on Save
- Click on Yes
- Click on OK
- When synthesis, implementation, and bitstream generation are completed, click on OK
- Go to File -> Export -> Export Hardware...
- Check Include bitstream and click on OK
After the synthesis and implementation steps are successfully completed you can transfer the bitstream onto the PL (FPGA) and run a C program on the APU (CPU).
Make sure your UltraZed-EG IOCC is connected correctly and is set to JTAG-Mode. Turn it on and open a new terminal window and use picocom to connect to the UART with:
picocom /dev/ttyUSB1 -b 115200 -d 8 -y n -p 1
- In Vivado go to File -> Launch SDK
- In the SDK go to File -> New -> Application Project
- Choose a name (here: acp_dummy_test), leave everything else at the default setting, and click on Next >
- Under Available Templates choose Hello World and click on Finish
- In the dialog click on Yes
- In the Project Explorer pane navigate to acp_dummy_test -> src -> helloworld.c and double-click on helloworld.c
- In the toolbar click on the hammer to build the Hello World program
- In the menu go to Xilinx -> Program FPGA
- In the menu go to Run -> Run
- In the dialog choose Launch on Hardware (System Debugger)
- If the message Error while launching program ... Cannot read r0 ... appears click on Okay
- In the menu go to Run -> Run Configurations...
- In Target Setup, scroll down, check Reset entire system, and click Run
- Now you should see Hello World in the terminal window where picocom is running
- Replace the entire content of helloworld.c with the content of acp_dummy_test.c
- Again, go to Run -> Run in the menu
- Now the same procedure as in the simulation is performed but this time on the real hardware
The log in the picocom terminal window should look like this:
Check data in DDR:
SOURCE: TARGET:
21000000 : 0x0000000100000000 28000000 : 0x0000000100000000 OK
21000008 : 0x0000000300000002 28000008 : 0x0000000300000002 OK
21000010 : 0x0000000500000004 28000010 : 0x0000000500000004 OK
21000018 : 0x0000000700000006 28000018 : 0x0000000700000006 OK
21000020 : 0x0000000900000008 28000020 : 0x0000000900000008 OK
21000028 : 0x0000000b0000000a 28000028 : 0x0000000b0000000a OK
21000030 : 0x0000000d0000000c 28000030 : 0x0000000d0000000c OK
21000038 : 0x0000000f0000000e 28000038 : 0x0000000f0000000e OK
21000040 : 0x0000001100000010 28000040 : 0x0000001100000010 OK
21000048 : 0x0000001300000012 28000048 : 0x0000001300000012 OK
21000050 : 0x0000001500000014 28000050 : 0x0000001500000014 OK
21000058 : 0x0000001700000016 28000058 : 0x0000001700000016 OK
21000060 : 0x0000001900000018 28000060 : 0x0000001900000018 OK
21000068 : 0x0000001b0000001a 28000068 : 0x0000001b0000001a OK
21000070 : 0x0000001d0000001c 28000070 : 0x0000001d0000001c OK
21000078 : 0x0000001f0000001e 28000078 : 0x0000001f0000001e OK
21000080 : 0x0000002100000020 28000080 : 0x0000002100000020 OK
21000088 : 0x0000002300000022 28000088 : 0x0000002300000022 OK
21000090 : 0x0000002500000024 28000090 : 0x0000002500000024 OK
21000098 : 0x0000002700000026 28000098 : 0x0000002700000026 OK
210000a0 : 0x0000002900000028 280000a0 : 0x0000002900000028 OK
210000a8 : 0x0000002b0000002a 280000a8 : 0x0000002b0000002a OK
210000b0 : 0x0000002d0000002c 280000b0 : 0x0000002d0000002c OK
210000b8 : 0x0000002f0000002e 280000b8 : 0x0000002f0000002e OK
In order to use the AXI ACP under Linux, take a look at this tutorial: Boot Linux on UltraZed-EG. Instead of creating a new Vivado project, use the Vivado project created in this tutorial. You can skip the steps where a Petalinux application is created. Follow the steps to set up an Ubuntu Linux system, using the bitstream created in this tutorial, on which you are able to install software via apt. Boot the system and install the build-essential package:
sudo apt install build-essential
- Copy the generated bitstream file ${BITSTREAM_NAME}.bit.bin into your home directory on the UltraZed-EG IOCC.
- Switch to root user:
sudo su
- Create a directory for the bitstream:
mkdir -p /lib/firmware
- Move the bitstream file from your home directory into /lib/firmware with:
mv -f ${BITSTREAM_NAME}.bit.bin /lib/firmware
- Load the bitstream onto the PL/FPGA with:
echo ${BITSTREAM_NAME}.bit.bin > /sys/class/fpga_manager/fpga0/firmware
- The successful loading of the bitstream is indicated by the blue LED on the UltraZed-SOM.
After you have successfully set up an Ubuntu system and loaded the correct bitstream onto the PL/FPGA, you can copy and compile the userland driver. The C code for the userland driver can be found in acp_dummy_test.c.
- On the Ubuntu system open a new file with your favorite editor
vim acp_dummy_test.c
- Copy the contents of acp_dummy_test.c into the newly created file
- Compile the program with:
gcc -o acp_dummy_test acp_dummy_test.c
- Switch to root user:
sudo su
- Run the program with:
./acp_dummy_test
After running the program (twice), you should see output like this:
Check data in DDR:
SOURCE: TARGET:
21000000 : 0x0000000100000000 28000000 : 0x0000000100000000 OK
21000008 : 0x0000000300000002 28000008 : 0x0000000300000002 OK
21000010 : 0x0000000500000004 28000010 : 0x0000000500000004 OK
21000018 : 0x0000000700000006 28000018 : 0x0000000700000006 OK
21000020 : 0x0000000900000008 28000020 : 0x0000000900000008 OK
21000028 : 0x0000000b0000000a 28000028 : 0x0000000b0000000a OK
21000030 : 0x0000000d0000000c 28000030 : 0x0000000d0000000c OK
21000038 : 0x0000000f0000000e 28000038 : 0x0000000f0000000e OK
21000040 : 0x0000001100000010 28000040 : 0x0000001100000010 OK
21000048 : 0x0000001300000012 28000048 : 0x0000001300000012 OK
21000050 : 0x0000001500000014 28000050 : 0x0000001500000014 OK
21000058 : 0x0000001700000016 28000058 : 0x0000001700000016 OK
21000060 : 0x0000001900000018 28000060 : 0x0000001900000018 OK
21000068 : 0x0000001b0000001a 28000068 : 0x0000001b0000001a OK
21000070 : 0x0000001d0000001c 28000070 : 0x0000001d0000001c OK
21000078 : 0x0000001f0000001e 28000078 : 0x0000001f0000001e OK
21000080 : 0x0000002100000020 28000080 : 0x0000002100000020 OK
21000088 : 0x0000002300000022 28000088 : 0x0000002300000022 OK
21000090 : 0x0000002500000024 28000090 : 0x0000002500000024 OK
21000098 : 0x0000002700000026 28000098 : 0x0000002700000026 OK
210000a0 : 0x0000002900000028 280000a0 : 0x0000002900000028 OK
210000a8 : 0x0000002b0000002a 280000a8 : 0x0000002b0000002a OK
210000b0 : 0x0000002d0000002c 280000b0 : 0x0000002d0000002c OK
210000b8 : 0x0000002f0000002e 280000b8 : 0x0000002f0000002e OK
The program uses /dev/mem to communicate with the AXI slave registers by mapping the physical addresses of the AXI slave into virtual memory. Additionally, two other memory maps are created, each mapping 4 kB of physical memory at SOURCE_ADDRESS and TARGET_ADDRESS into virtual memory. As in the bare-metal example, the acp_dummy reads data from SOURCE_ADDRESS into its BRAM and writes the data to TARGET_ADDRESS.
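The mapping step can be sketched as below. This is a hedged illustration rather than the code of acp_dummy_test.c; the helper names page_base, page_offset and map_phys are hypothetical. It relies only on the standard mmap and sysconf calls and aligns the physical address to a page boundary, because mmap requires a page-aligned file offset.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Round a physical address down to the start of its page.
 * page_size must be a power of two. */
static uint64_t page_base(uint64_t phys, uint64_t page_size)
{
    return phys & ~(page_size - 1u);
}

/* Offset of a physical address inside its page. */
static uint64_t page_offset(uint64_t phys, uint64_t page_size)
{
    return phys & (page_size - 1u);
}

/* Map `len` bytes starting at physical address `phys` through an
 * already opened /dev/mem descriptor. Returns NULL on failure. */
static volatile void *map_phys(int fd, uint64_t phys, size_t len)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t offset = page_offset(phys, page_size);
    void *base = mmap(NULL, len + (size_t)offset, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, (off_t)page_base(phys, page_size));

    if (base == MAP_FAILED)
        return NULL;
    return (volatile uint8_t *)base + offset;
}
```

The descriptor would typically come from open("/dev/mem", O_RDWR | O_SYNC), which requires root privileges; map_phys would then be called once for the AXI slave registers and once each for SOURCE_ADDRESS and TARGET_ADDRESS with a length of 4096.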
To develop a kernel driver you can't use the installed Ubuntu on the UltraZed-EG IOCC since the libraries to compile the kernel drivers are not present on the UltraZed-EG IOCC Ubuntu Linux installation. You have to use the Petalinux environment to develop and build a kernel driver/module.
- Navigate to the root directory of the formerly created Petalinux project
cd ${PETALINUX_PROJECT_ROOT}
- Create a new module (kernel driver) with:
petalinux-create -t modules --name acpdummytest --enable
- Navigate to the newly created module
cd ${PETALINUX_PROJECT_ROOT}/project-spec/meta-user/recipes-modules/acpdummytest/files
- Open the file acpdummytest.c with your favorite editor.
- Copy the contents of acp_dummy_test_driver.c into the file acpdummytest.c
- Compile the kernel module with:
petalinux-build
- After the compilation is finished the compiled kernel module is in
${PETALINUX_PROJECT_ROOT}/build/tmp/sysroots/plnx_aarch64/lib/modules/4.9.0-xilinx-v2017.2/extra/acpdummytest.ko
- Copy acpdummytest.ko to the Ubuntu filesystem running on the board:
scp acpdummytest.ko ${USER}@${BOARD_IP}:~
- Connect to the board via
ssh ${USER}@${BOARD_IP}
- Switch to root user:
sudo su
- Move acpdummytest.ko to the kernel modules directory:
mv /home/${USER}/acpdummytest.ko /lib/modules/4.9.0-xilinx-v2017.2/extra
- Install the kernel module:
insmod /lib/modules/4.9.0-xilinx-v2017.2/extra/acpdummytest.ko
- Get the major device number ${MAJOR} with:
dmesg
- Create a new device (using minor number 0):
mknod /dev/acp_dummy c ${MAJOR} 0
Now the acp_dummy is accessible through the Linux file system as the character device /dev/acp_dummy.
(To unload the kernel module, either reboot or delete the device with rm -rf /dev/acp_dummy and remove the kernel module with rmmod /lib/modules/4.9.0-xilinx-v2017.2/extra/acpdummytest.ko.)
To use the acp_dummy character device you can use a short C program. Log in to the Ubuntu Linux on the UltraZed-EG IOCC as a user.
- On the Ubuntu system open a new file with your favorite editor
vim acp_dummy_test_kernel.c
- Copy the contents of acp_dummy_test.c into the newly created file
- Compile the program with:
gcc -o acp_dummy_test acp_dummy_test_kernel.c
- Run the program with:
sudo ./acp_dummy_test
After running the program (twice), you should see output like this:
Check data in DDR:
SOURCE: TARGET:
0 : 0x0000000100000000 0 : 0x0000000100000000 OK
1 : 0x0000000300000002 1 : 0x0000000300000002 OK
2 : 0x0000000500000004 2 : 0x0000000500000004 OK
3 : 0x0000000700000006 3 : 0x0000000700000006 OK
4 : 0x0000000900000008 4 : 0x0000000900000008 OK
5 : 0x0000000b0000000a 5 : 0x0000000b0000000a OK
6 : 0x0000000d0000000c 6 : 0x0000000d0000000c OK
7 : 0x0000000f0000000e 7 : 0x0000000f0000000e OK
8 : 0x0000001100000010 8 : 0x0000001100000010 OK
9 : 0x0000001300000012 9 : 0x0000001300000012 OK
10 : 0x0000001500000014 10 : 0x0000001500000014 OK
11 : 0x0000001700000016 11 : 0x0000001700000016 OK
12 : 0x0000001900000018 12 : 0x0000001900000018 OK
13 : 0x0000001b0000001a 13 : 0x0000001b0000001a OK
14 : 0x0000001d0000001c 14 : 0x0000001d0000001c OK
15 : 0x0000001f0000001e 15 : 0x0000001f0000001e OK
16 : 0x0000002100000020 16 : 0x0000002100000020 OK
17 : 0x0000002300000022 17 : 0x0000002300000022 OK
18 : 0x0000002500000024 18 : 0x0000002500000024 OK
19 : 0x0000002700000026 19 : 0x0000002700000026 OK
20 : 0x0000002900000028 20 : 0x0000002900000028 OK
21 : 0x0000002b0000002a 21 : 0x0000002b0000002a OK
22 : 0x0000002d0000002c 22 : 0x0000002d0000002c OK
23 : 0x0000002f0000002e 23 : 0x0000002f0000002e OK
The whole AXI/ACP communication is now abstracted into the kernel module. The C program has only two char buffers. First, data is written into buffer_0; then the whole content of the buffer is written into the BRAM of the acp_dummy (write(fd, buffer_0, BUFFER_LENGTH)). After that, the data from the BRAM is read into buffer_1 (read(fd, buffer_1, BUFFER_LENGTH)).
In the previous examples you had to run the C program twice to see the correct data at the target memory location. This is because the source and target memory for the communication with the ACP are mapped to memory addresses which are part of the System RAM. You can see that by executing sudo cat /proc/iomem:
00000000-7fffffff : System RAM
00080000-00c0ffff : Kernel code
00c90000-00d87fff : Kernel data
An explanation of what happens when ioremap is called on a System RAM address is given here on Stack Overflow. In order to exclude a part of the RAM from the administration of the Linux kernel you have to shrink the memory map (or edit the device tree). The easiest way to exclude a part of the RAM is to set custom bootargs while booting.
- Reboot the UltraZed-EG IOCC.
- While booting, the following messages will be shown:
ethernet@ff0e0000 Waiting for PHY auto negotiation to complete...... done
BOOTP broadcast 1
BOOTP broadcast 2
DHCP client bound to address 10.42.0.47 (257 ms)
Hit any key to stop autoboot: 0
- Hit any key to stop the autoboot process.
- Now you can see a prompt:
ZynqMP>
- Type the following into the prompt and press enter:
setenv bootargs 'earlycon clk_ignore_unused root=/dev/mmcblk1p2 mem=1920M rw rootwait'
This shrinks the kernel memory map from 2048 MB to 1920 MB.
- Type boot and press enter to boot Ubuntu.
- Once you are logged in to Ubuntu Linux, execute:
sudo cat /proc/iomem
This shows the now shrunken memory map:
00000000-77ffffff : System RAM
00080000-00c0ffff : Kernel code
00c90000-00d87fff : Kernel data
- Set the values SOURCE_ADDRESS and TARGET_ADDRESS in the kernel module to addresses above 0x77ffffff.
- Compile, copy and install the kernel module on the UltraZed-EG IOCC.
- Run the C program which uses the kernel module; you should see the correct data at the target memory address after only one execution.
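The boundary 0x77ffffff follows directly from mem=1920M. The following sketch (hypothetical helper names, assuming RAM starts at physical address 0 as in the /proc/iomem output above) computes the last kernel-managed address and checks whether a buffer lies in the excluded part of the RAM:

```c
#include <stdbool.h>
#include <stdint.h>

#define MIB (1024ull * 1024ull)

/* Last physical address managed by the kernel when booted with
 * mem=<mem_mib>M, assuming RAM starts at physical address 0. */
static uint64_t kernel_ram_end(uint64_t mem_mib)
{
    return mem_mib * MIB - 1u;
}

/* True if [addr, addr + len) lies entirely above the kernel-managed RAM
 * and still inside the physical RAM of total_mib MiB. */
static bool outside_kernel_ram(uint64_t addr, uint64_t len,
                               uint64_t mem_mib, uint64_t total_mib)
{
    return addr > kernel_ram_end(mem_mib) && addr + len <= total_mib * MIB;
}
```

With mem=1920M and 2048 MB of RAM, the excluded region is 0x78000000 to 0x7fffffff (128 MB), so the original addresses 0x21000000 and 0x28000000 are still inside System RAM and have to be moved.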
If you would like to adjust the values of AX_CACHE and AX_USER, please take a look at these references to choose the correct settings: