A floating-point matrix multiplication implemented in hardware.
NOTE
This project has been refactored and updated to work with Vivado 2020.2. The previous implementation can be found in the `vivado-2019.2` branch.
This repo describes the implementation of a floating-point matrix multiplication on a Xilinx FPGA.
The hardware module implements the matrix product C = AB, where A, B, and C are 128 x 128 floating-point matrices.
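As a point of reference, the product the module computes can be written out in software. The following is a minimal sketch, not the accelerator code: it assumes single-precision (`float32`) data, which this README does not state explicitly, and the name `matmul_reference` is hypothetical.

```python
import numpy as np

N = 128  # matrix dimension stated in this README


def matmul_reference(a, b):
    """Compute C = AB from the definition C[i, j] = sum_k A[i, k] * B[k, j]."""
    n, m = a.shape
    m2, p = b.shape
    assert m == m2, "inner dimensions must match"
    c = np.zeros((n, p), dtype=np.float32)  # float32 is an assumption
    for i in range(n):
        for j in range(p):
            # Dot product of row i of A with column j of B.
            c[i, j] = np.dot(a[i, :], b[:, j])
    return c


rng = np.random.default_rng(0)
A = rng.random((N, N), dtype=np.float32)
B = rng.random((N, N), dtype=np.float32)
C = matmul_reference(A, B)

# The loop version should agree with NumPy's built-in product up to rounding.
assert np.allclose(C, A @ B, rtol=1e-4)
```

A check of this kind is also a convenient way to validate the hardware result against a software result computed from the same inputs.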
This hardware accelerator provides a 3.4x speedup compared to NumPy.
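This README does not describe how the speedup was measured. One plausible way to obtain the software side of such a comparison is sketched below; the helper names `time_numpy_matmul` and `speedup` are hypothetical, `float32` data is an assumption, and the hardware time would have to be measured separately on the board.

```python
import time

import numpy as np


def time_numpy_matmul(n=128, repeats=100):
    """Return the best wall-clock time in seconds for one n x n float32 product."""
    rng = np.random.default_rng(0)
    a = rng.random((n, n), dtype=np.float32)
    b = rng.random((n, n), dtype=np.float32)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.matmul(a, b)
        # Keep the fastest run to reduce the effect of system noise.
        best = min(best, time.perf_counter() - start)
    return best


def speedup(t_software, t_hardware):
    """Ratio of software time to hardware time (values above 1 favor hardware)."""
    return t_software / t_hardware


t_sw = time_numpy_matmul()
print(f"NumPy baseline: {t_sw * 1e3:.3f} ms per product")
```

For a meaningful comparison, the baseline should run on the same platform as the accelerator's host CPU, and the hardware timing should include any data transfer to and from the accelerator.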
- [hls] contains the accelerator C++ source code for high-level synthesis.
- [boards/Pynq-Z1/] contains the Vivado project and overlays generated with `vivado` and `vitis_hls` version 2020.2.
- [notebooks] contains the Jupyter Notebook to evaluate the design.
- Copy the Jupyter notebook and the contents of the corresponding overlays folder to the Jupyter notebooks area on the FPGA board (e.g. under `/home/xilinx/jupyter_notebooks/matmult`).
Requires Xilinx `vivado` and `vitis_hls` version 2020.2. If necessary, a different version can be configured in `matmult.tcl`.
- Build the `matmult` module: `vitis_hls script.tcl`
- Build the Vivado project: run `make clean && make all` from `boards/Pynq-Z1/matmult`.
- The original implementation borrowed ideas and code from this application note (Copyright (c) 2016, Xilinx, Inc.), and the PYNQ hello world example.
- Schematic of matrix multiplication taken from Wikipedia