This artifact includes 20 hardware bugs, each of which can be reproduced with Verilator in a push-button manner. It also includes the five tools we designed to help with bug localization (SignalCat, FSM Monitor, Statistics Monitor, Dependency Monitor, and LossCheck), examples of using these tools, and instructions for reproducing the figures in the paper.
The full list of 68 bugs we studied can be found here.
If you have an interesting bug that you can reproduce, feel free to submit a pull request and we will add it to this repo. If you notice a bug that is not reproducible but still want to share it with others, you may request edit access to the spreadsheet and add it there.
Use the following command to download the artifact repository:
git clone --recursive https://github.com/efeslab/asplos22-hardware-debugging-artifact
After this command, you are expected to see the following directory hierarchy:
asplos22-hardware-debugging-artifact
├── hardware-bugbase
│ ├── c1-dead-lock-sdspi
│ ├── c2-producer-consumer-mismatch-optimus
│ ├── c3-signal-asynchrony-sdspi
│ ├── c4-signal-asynchrony-axi-stream-fifo
│ ├── common
│ ├── d10-failure-to-update-sha512
│ ├── d11-failure-to-update-frame-fifo
│ ├── d12-failure-to-update-frame-fifo
│ ├── d13-failure-to-update-frame-len
│ ├── d1-buffer-overflow-rsd
│ ├── d2-buffer-overflow-grayscale
│ ├── d3-buffer-overflow-optimus
│ ├── d4-buffer-overflow-frame-buffer
│ ├── d5-bit-truncation-sha512
│ ├── d6-bit-truncation-fft
│ ├── d7-misindexing-fadd
│ ├── d8-misindexing-axis-switch
│ ├── d9-endianness-mismatch-sdspi
│ ├── manual_debug_log
│ ├── n1-frame-len-failure-to-update
│ ├── n3-frame-fifo-fail-to-update
│ ├── n8-axis-adapter-incomplete-implementation
│ ├── n9-frame-fifo-failure-to-update
│ ├── s1-protocol-violation-axi-lite
│ ├── s2-protocol-violation-axi-stream
│ ├── s3-incomplete-implementation-axis-adapter
│ └── scripts
└── veripass
  ├── dbgtools
  ├── model
  ├── passes
  ├── Pyverilog
  ├── recording
  ├── utils
  └── verilator
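A quick way to confirm that the clone produced the hierarchy above is to check that the key directories exist and are not empty. The following is a minimal sketch, not part of the artifact; the helper name `missing_or_empty` is hypothetical, and treating an empty directory as a failed recursive clone is an assumption suggested by the `--recursive` flag.

```python
from pathlib import Path

# Directories taken from the hierarchy above; the two nested ones are
# the modified Verilator and Pyverilog trees under veripass.
EXPECTED_DIRS = [
    "hardware-bugbase",
    "veripass",
    "veripass/verilator",
    "veripass/Pyverilog",
]


def missing_or_empty(repo_root):
    """Return the expected directories that are absent or contain no entries."""
    root = Path(repo_root)
    problems = []
    for rel in EXPECTED_DIRS:
        path = root / rel
        # An empty directory is reported too (assumption: it indicates an
        # incomplete recursive clone).
        if not path.is_dir() or not any(path.iterdir()):
            problems.append(rel)
    return problems


if __name__ == "__main__":
    print(missing_or_empty("asplos22-hardware-debugging-artifact"))
```

An empty list means all four directories are present and populated.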
The hardware-bugbase directory contains all the reproducible bugs. Each bug is located in its own directory, together with a simplified code snippet that helps in understanding it.
You will need to compile Verilator to reproduce these bugs. Verilator is located under the veripass directory.
Before compilation, you will need to install a few dependencies:
sudo apt-get install perl python3 make autoconf g++ flex bison ccache
sudo apt-get install libgoogle-perftools-dev numactl perl-doc
sudo apt-get install libfl2 libfl-dev # Ubuntu only (ignore if it gives an error)
sudo apt-get install zlibc zlib1g zlib1g-dev # Ubuntu only (ignore if it gives an error)
Then compile Verilator:
cd asplos22-hardware-debugging-artifact/veripass/verilator
autoconf
./configure
make -j8
Compilation is enough; you do not need to install Verilator. Scripts in the bug database will locate Verilator themselves.
Bugs are listed in the table below. You may cd into the directory of each bug to reproduce it and read its documentation.
To reproduce a specific bug:
cd asplos22-hardware-debugging-artifact/hardware-bugbase/<bug-dir>
make -j8 # compile the verilog code for simulation
make sim # run the simulation
make wave # open the generated waveform with GTKWave
You should see an error message after make sim. make wave requires a GUI, or a correctly set DISPLAY environment variable. After simulation, you can also find a .fst file or a .vcd file under the directory. These are the waveforms generated by Verilator. You can copy the file to another computer and open it with GTKWave or other waveform-viewing software.
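The reproduction steps above can also be scripted, for example to collect the waveforms on a headless machine before copying them elsewhere. The following is a minimal sketch under stated assumptions: the helper names `find_waveforms` and `reproduce` are hypothetical, the search is recursive because the exact location of the waveform inside the bug directory is not specified here, and the exit status of make sim is ignored because it is not documented whether a triggered bug makes it non-zero.

```python
import subprocess
from pathlib import Path

WAVEFORM_SUFFIXES = {".fst", ".vcd"}


def find_waveforms(bug_dir):
    """Return all .fst and .vcd files under bug_dir, sorted by path."""
    root = Path(bug_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in WAVEFORM_SUFFIXES
    )


def reproduce(bug_dir, jobs=8):
    """Run the documented make targets in bug_dir and return the waveforms found."""
    # Compile the Verilog code for simulation; stop if compilation fails.
    subprocess.run(["make", f"-j{jobs}"], cwd=bug_dir, check=True)
    # Run the simulation. check=False is an assumption: the bug is expected
    # to print an error message, and the resulting exit status is unknown.
    subprocess.run(["make", "sim"], cwd=bug_dir, check=False)
    return find_waveforms(bug_dir)
```

Calling `reproduce("hardware-bugbase/<bug-dir>")` with a real bug directory returns the waveform paths, which can then be opened with GTKWave on another machine.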
Our debugging tools are located in the veripass directory. In the hardware-bugbase directory, we provide make scripts to invoke these debugging tools.
Warning: A full evaluation of this part takes days because FPGA synthesis is slow (up to several hours per run). We encourage you to evaluate the non-synthesis part (e.g., 2.3.1) first.
To run the debugging tools, you will need to compile Verilator and Pyverilog if you have not done so already:
cd asplos22-hardware-debugging-artifact/veripass
make -j8
Then install the following Python packages:
pip3 install jinja2 sympy ply gephistreamer
Add the following lines to your .bashrc or .zshrc to help the scripts find Vivado, Quartus, and VCS. Vivado must be the Design Suite edition, Quartus must be the Pro edition (version 17.0), and VCS must be the MX edition.
# Quartus Pro
export QUARTUS_HOME=<your-quartus-home>/17.0/quartus
export PATH=$QUARTUS_HOME/bin:$PATH
export LM_LICENSE_FILE=<your-quartus-license>
# Vivado
export XILINX_VIVADO=<your-vivado-home>/Vivado/2020.2
export PATH=$XILINX_VIVADO/bin:$PATH
export XILINXD_LICENSE_FILE=<your-vivado-license>
# VCS MX
export VCS_HOME=<your-vcs-home>
export PATH=$VCS_HOME/bin:$PATH
export SNPSLMD_LICENSE_FILE=<your-vcs-license>
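Because a missing variable typically only surfaces hours into a synthesis run, it can help to check the environment up front. The following is a minimal sketch, not part of the artifact: the variable names come from the export lines above, while the helper names and the launcher names quartus_sh, vivado, and vcs are assumptions about a typical installation.

```python
import os
import shutil

# Variable names taken from the export lines above.
REQUIRED_VARS = [
    "QUARTUS_HOME", "LM_LICENSE_FILE",
    "XILINX_VIVADO", "XILINXD_LICENSE_FILE",
    "VCS_HOME", "SNPSLMD_LICENSE_FILE",
]

# Assumed launcher names for each tool; adjust if your installation differs.
ASSUMED_BINARIES = ["quartus_sh", "vivado", "vcs"]


def missing_vars(env):
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def missing_binaries(path=None):
    """Return the assumed tool binaries that cannot be found on PATH."""
    return [b for b in ASSUMED_BINARIES if shutil.which(b, path=path) is None]


if __name__ == "__main__":
    print("unset variables:", missing_vars(os.environ))
    print("binaries not on PATH:", missing_binaries())
```

Both lists should be empty in a shell that has sourced the updated .bashrc or .zshrc.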
To synthesize projects for Intel HARP, you will need to download a supported version of Intel FPGA Basic Building Blocks and a set of platform files for HARP, and add the following lines to .bashrc or .zshrc. You can ask your Intel contact for BBS_6.4.0. You may want to read this to understand the interface of the HARP platform. It is theoretically possible to compile these HARP projects for the PAC platform (which is more widely available); however, we did not evaluate it.
export OPAE_PLATFORM_ROOT=<your-opae-platform-root-location>/BBS_6.4.0
export PATH=$OPAE_PLATFORM_ROOT/bin:$PATH
The original framework for HARP simulation requires Python 2 as the default python command. As a result, you may need to set up a virtualenv with the following command:
virtualenv --python=/usr/bin/python2 <path-to-virtualenv>
In Section 6.2 of the paper, we demonstrated that a developer can use SignalCat and the Monitors to localize all 20 bugs in this artifact. We provide the mental debugging logs of a developer localizing these bugs in this sheet. For each bug, the sheet includes the tools the developer would use at each step. The configurations for invoking these tools are located in a .cfg file under each bug's directory; you can invoke the tools using the following command under each bug's directory:
make withtask.v
After running this command, a file called withtask.v will be generated. This file contains the flattened Verilog code with the debugging instrumentation described in the configuration.
To synthesize the instrumented circuit, you may run the following command:
source <path-to-virtualenv>/bin/activate # switch to a python virtualenv where python2 is the default
make sweep_depth
This command generates a number of files (e.g., instrumented circuits with different buffer sizes and the TCL scripts that invoke synthesis), then runs one synthesis per recording buffer size. The command will appear to hang for a long time because each synthesis takes hours.
After the command finishes, you can run the following command to report resource utilization.
make report_depth_sweep
For D4, D6, D7, D8, D9, D11, D12, D13, C1, C3, C4, S1, S2, and S3, you will see something like the following.
log2(Depth),10,11,12,13
Total LUTs,2225,2208,2191,2287
FFs,2870,2881,2892,2905
RAMB36,4,7,15,30
RAMB18,0,1,0,0
build_notask: Total LUTs,FFs,RAMB36,RAMB18
858;516;0;0
The upper block shows the resource utilization of the instrumented circuit, and the bottom block shows the resource utilization of the uninstrumented circuit. In the paper, we use the word Logic for LUT and Register for FF, and calculate the total number of bits from RAMB36 (36Kbit per instance) and RAMB18 (18Kbit per instance). In the above example, the register overhead of an instrumented circuit with a 1024-depth buffer is 2870-516=2354.
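The overhead arithmetic described above can be applied to every column of the report at once. The following is a minimal sketch that parses the sample output shown above and subtracts the build_notask baseline; the function names are hypothetical, and treating 1 Kbit as 1024 bits is an assumption about how the BRAM totals are counted.

```python
SAMPLE = """\
log2(Depth),10,11,12,13
Total LUTs,2225,2208,2191,2287
FFs,2870,2881,2892,2905
RAMB36,4,7,15,30
RAMB18,0,1,0,0
build_notask: Total LUTs,FFs,RAMB36,RAMB18
858;516;0;0
"""

KBIT = 1024  # assumption: 1 Kbit = 1024 bits


def parse_sweep(text):
    """Split the report into {row name: values per depth} and a baseline dict."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    rows = {}
    for ln in lines[:-2]:
        name, *values = ln.split(",")
        rows[name] = [int(v) for v in values]
    header = lines[-2].split(":", 1)[1].strip().split(",")
    baseline = dict(zip(header, (int(v) for v in lines[-1].split(";"))))
    return rows, baseline


def bram_bits(ramb36, ramb18):
    """Total BRAM bits: 36 Kbit per RAMB36 plus 18 Kbit per RAMB18."""
    return (36 * ramb36 + 18 * ramb18) * KBIT


def overheads(text):
    """Return {buffer depth: (logic, register, BRAM bits)} over the baseline."""
    rows, base = parse_sweep(text)
    base_bits = bram_bits(base["RAMB36"], base["RAMB18"])
    out = {}
    for i, log2_depth in enumerate(rows["log2(Depth)"]):
        out[2 ** log2_depth] = (
            rows["Total LUTs"][i] - base["Total LUTs"],
            rows["FFs"][i] - base["FFs"],
            bram_bits(rows["RAMB36"][i], rows["RAMB18"][i]) - base_bits,
        )
    return out


if __name__ == "__main__":
    for depth, (logic, regs, bits) in overheads(SAMPLE).items():
        print(depth, logic, regs, bits)
```

For the 1024-depth column, the register entry is 2870 minus 516, which matches the worked example in the text.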
For D1, D2, D3, D5, D10, and C2, you will see something like the following. We use Logic for ALM and Register for FF, and use the number of BRAM Blocks to calculate BRAM size (each block contains 20Kbits).
log2(Depth),10,11,12,13
ALM,101170,101173,101185,101191
BRAM#B,326,343,376,477
BRAMbit,3989920,4332960,5019040,6391200
FFs,111356,111371,111397,111447
build_notask: ALM BRAM#B BRAMbit FFs
100245;309;3646880;108734
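The same subtraction applies to the Intel report, with the BRAM size derived from the block count rather than from the BRAMbit row, as described above. The following is a minimal sketch over the sample output; the function name `intel_overheads` and the result keys are hypothetical, and the result is reported in Kbit so that no assumption about the size of a Kbit is needed.

```python
SAMPLE = """\
log2(Depth),10,11,12,13
ALM,101170,101173,101185,101191
BRAM#B,326,343,376,477
BRAMbit,3989920,4332960,5019040,6391200
FFs,111356,111371,111397,111447
build_notask: ALM BRAM#B BRAMbit FFs
100245;309;3646880;108734
"""

BLOCK_KBIT = 20  # each BRAM block contains 20 Kbit, as stated above


def intel_overheads(text):
    """Return {depth: overhead dict} relative to the build_notask baseline."""
    lines = text.strip().splitlines()
    rows = {}
    for ln in lines[:-2]:
        name, *vals = ln.split(",")
        rows[name] = [int(v) for v in vals]
    # The baseline header is space-separated, the values semicolon-separated.
    names = lines[-2].split(":", 1)[1].split()
    base = dict(zip(names, map(int, lines[-1].split(";"))))
    result = {}
    for i, log2_depth in enumerate(rows["log2(Depth)"]):
        result[2 ** log2_depth] = {
            "Logic": rows["ALM"][i] - base["ALM"],
            "Register": rows["FFs"][i] - base["FFs"],
            "BRAM Kbit": (rows["BRAM#B"][i] - base["BRAM#B"]) * BLOCK_KBIT,
        }
    return result


if __name__ == "__main__":
    for depth, overhead in intel_overheads(SAMPLE).items():
        print(depth, overhead)
```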
Bugs D1, D2, D3, D4, C2, and C4 are the six data loss bugs that can be localized by LossCheck.
You can use the following command to invoke LossCheck under the directories of these six bugs.
make -f Makefile.lc
For D1, D2, D3, and C2:
This will generate two .v files (e.g., a <benchmark>.losscheck.0.v and a <benchmark>.losscheck.1.v). <benchmark>.losscheck.0.v is the first instrumentation, which does not filter false positives (as discussed in Section 4.5.3). Our scripts run the original testbench of the circuit on the first instrumentation and generate a list of signals that should be filtered out (stored in filter.txt). Then, our scripts invoke LossCheck again, generating the second instrumentation (i.e., <benchmark>.losscheck.1.v), with the signals in filter.txt filtered out.
For D4 and C4:
This will generate a test.v file, which is the flattened design with LossCheck's instrumentation. These two bugs do not need false positive filtering. As a result, no filter.txt file will be generated.
To verify that the second instrumentation actually detects the data loss, run the following command:
make -f Makefile.lc sim
You should see error messages about data loss. For D2, D3, D4, C2, and C4, there should be no false positives. For D1, you should see one register that is misidentified. (You will see several rows misreporting the same register.)
You can use the following command to synthesize the circuit with and without LossCheck instrumentation. Please note each synthesis can take hours.
make -f Makefile.lc synth
And use the following command to report resource utilization.
make -f Makefile.lc report_util
For D1, D2, D3, and C2, you will get something like this:
build_withlosscheck: ALM BRAM#B BRAMbit FFs
115428;775;11146672;139645
build_notask: ALM BRAM#B BRAMbit FFs
109694;413;5238192;130390
These four bugs are on the Intel HARP platform. This platform contains a vendor-provided shell and a user-implemented accelerator. Because the shell is a fixed region and is not usable by the accelerator, the resource overhead in Figure 3 is normalized to the total available resources of the accelerator region (i.e., without the shell). You may use the following data as the available resources of the accelerator-usable region.
ALM | FFs |
---|---|
327029 | 1600141 |
In the above example, the uninstrumented accelerator uses 9523 ALMs, and the instrumented accelerator uses 15257 ALMs. As a result, the ALM (logic) overhead is (15257-9523)/327029≈1.75%.
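The normalization above can be written as a small helper and applied to both ALMs and FFs. The following is a minimal sketch; the function name `normalized_overhead` is hypothetical. It uses the totals from the sample report directly, since only the difference between the two builds enters the formula, and 115428-109694 equals the 15257-9523 used in the text.

```python
ACCEL_ALM = 327029   # ALMs available in the accelerator region (table above)
ACCEL_FFS = 1600141  # FFs available in the accelerator region (table above)


def normalized_overhead(instrumented, baseline, available):
    """Extra resources used by the instrumentation, as a fraction of `available`."""
    return (instrumented - baseline) / available


# Counts taken from the build_withlosscheck and build_notask lines above.
alm = normalized_overhead(115428, 109694, ACCEL_ALM)
ffs = normalized_overhead(139645, 130390, ACCEL_FFS)

print(f"ALM (logic) overhead:   {alm:.2%}")
print(f"FF (register) overhead: {ffs:.2%}")
```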
As mentioned in our paper, the frequency of D3 and C2 (i.e., the Optimus bugs) is reduced from 400MHz to 200MHz after LossCheck's instrumentation. As a result, we need to add an asynchronous FIFO to handle clock domain crossing. The makefile adds this FIFO when generating the Verilog files for compilation.
For D4 and C4, you will get something like this:
build_withlosscheck: Total LUTs,FFs,RAMB36,RAMB18
1435;2415;16;1
build_notask: Total LUTs,FFs,RAMB36,RAMB18
45;83;0;0
These two bugs are on the Xilinx platform. There is no shell in this platform, so the accelerator can use all resources on the FPGA. You may use the following data as the available resources.
LUT | FFs |
---|---|
203800 | 407600 |
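For the Xilinx bugs, the report and the table above are enough to compute the normalized LUT and FF overhead. The following is a minimal sketch that parses the sample report_util output; the function names are hypothetical, and the parser assumes the report always consists of a header line followed by a semicolon-separated value line, as in the sample.

```python
REPORT = """\
build_withlosscheck: Total LUTs,FFs,RAMB36,RAMB18
1435;2415;16;1
build_notask: Total LUTs,FFs,RAMB36,RAMB18
45;83;0;0
"""

# Available resources on the FPGA, from the table above.
AVAILABLE = {"Total LUTs": 203800, "FFs": 407600}


def parse_report(text):
    """Return {build name: {resource: count}} from header/value line pairs."""
    lines = text.strip().splitlines()
    builds = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        build, names = header.split(":", 1)
        counts = [int(v) for v in values.split(";")]
        builds[build.strip()] = dict(zip(names.strip().split(","), counts))
    return builds


def overhead_percent(text):
    """Percent of the device's LUTs and FFs added by LossCheck."""
    builds = parse_report(text)
    with_lc = builds["build_withlosscheck"]
    without = builds["build_notask"]
    return {
        res: 100 * (with_lc[res] - without[res]) / total
        for res, total in AVAILABLE.items()
    }


if __name__ == "__main__":
    for res, pct in overhead_percent(REPORT).items():
        print(f"{res}: {pct:.2f}%")
```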
This artifact includes modified versions of Pyverilog (veripass/Pyverilog) and Verilator (veripass/verilator), which are released under their original licenses. Bugs in the hardware-bugbase directory are collected (and organized) from different sources and are also released under the licenses of their original implementations.
Our debugging tools under the veripass directory are released under the GPLv3 license, whatever it means. Please also note that these tools are academic prototypes and may not be stable, reliable, or always correct; use them at your own risk.
By downloading/cloning/forking the veripass repository, you acknowledge and agree to all terms included in GPLv3, and that the developers/authors of these tools will not be responsible for any of your losses and/or damages, including but not limited to the tools not working as expected and your loved ones being unhappy about you working/hacking at 3am.
If you find our work interesting, please cite our paper.
@inproceedings{ma2022debugging,
title={Debugging in the Brave New World of Reconfigurable Hardware},
author={Ma, Jiacheng and Zuo, Gefei and Loughlin, Kevin and Zhang, Haoyang and Quinn, Andrew and Kasikci, Baris},
booktitle={Proceedings of the Twenty-Seventh International Conference on Architectural Support for Programming Languages and Operating Systems},
year={2022}
}