[WIP][VTA] Support Intel FPGA in VTA #1694
Thank you @liangfu, this is a promising start. A couple of points: I think it would be sufficient to rename the path from intel_fpga to simply intel, since being under the hardware path already implies FPGA hardware. In addition, there are leftover copied files that you can safely remove, such as compile_designs.py (which should also be eliminated from the main branch) and vivado.tcl/hsi.tcl/hls.tcl, which are all Xilinx-specific.
Also, I just ordered a couple of DE10-Nano boards so I can test out the Intel FPGA backend (I should have those by early next week). @liangfu Our first step should be to generate a complete VTA design with all of the HLS modules, connected via FIFOs, BRAMs, and the Avalon bus to the memory controller / ACP port of the ARM SoC. From there on we can build unit tests in C to test basic functionality such as data transfer and single tensor ops. |
@tmoreau89 the latest commit removes those Xilinx-specific script files, and all the instruction components as well as the simulation program now compile successfully (though they are not functional yet). I'm working on debugging the modules to make them functional in simulation mode. However, would you kindly provide a full debug log from a functional run? This would be helpful for checking errors in migrating the Xilinx HLS based implementation to Intel HLS. In the meantime, I agree on what our first step should be at this stage. |
@liangfu I see, what you would like are unit tests for each of the HLS modules to test the functions in isolation? I've been planning to do a simulation infrastructure revamp, but this could take a few days. In the meantime, can you reproduce the simulations using the Xilinx toolchains? There's some guidance on how to run the simulation test. You can turn on a DEBUG flag before compiling the design, or insert your own printf statements to obtain a more detailed trace. Let me know if you run into problems. |
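As an illustration of the DEBUG flag / printf suggestion above, here is a minimal sketch of a compile-time trace macro. The names TRACE_DEBUG, TRACE, and traced_add are hypothetical and are not taken from the VTA sources; the function is only a toy stand-in for a module under test.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical flag name (an assumption, not VTA's own flag).
// Enable with -DTRACE_DEBUG=1 when compiling.
#ifndef TRACE_DEBUG
#define TRACE_DEBUG 0
#endif

// Prints to stderr only when TRACE_DEBUG is non-zero.
// With TRACE_DEBUG == 0 the branch is dead code and is optimized away.
#define TRACE(...)                                        \
  do {                                                    \
    if (TRACE_DEBUG) std::fprintf(stderr, __VA_ARGS__);   \
  } while (0)

// Toy stand-in for a module under test: element-wise add over a buffer.
// Returns the sum of all outputs so two runs can be compared quickly.
int32_t traced_add(const int32_t* a, const int32_t* b, int32_t* out, int n) {
  int32_t checksum = 0;
  for (int i = 0; i < n; ++i) {
    out[i] = a[i] + b[i];
    TRACE("i=%d a=%d b=%d out=%d\n", i, static_cast<int>(a[i]),
          static_cast<int>(b[i]), static_cast<int>(out[i]));
    checksum += out[i];
  }
  return checksum;
}
```

Running the same inputs through two toolchains with tracing enabled and diffing the stderr output is one way to find the first element where the results diverge.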
@tmoreau89 I wasn't expecting tests of the HLS modules in isolation; instead, I'm currently running the existing simulation infrastructure. Thanks to your debugging guidance, I've just installed the Xilinx toolchains and started to compare the output results, which is helpful for debugging the migrated version. The good news is that I have successfully migrated the ALU modules in simulation mode. However, as Intel HLS doesn't support |
@liangfu thanks for the update, I'm glad you've been able to test the compute module. Volatile may not be necessary for the Intel toolchains - it was necessary for Vivado since simulation would not behave correctly if the |
@tmoreau89 I've recently performed GEMM successfully in simulation and cleaned up unused code. However, when I generate hardware from the same design, there is a small section that consistently causes hardware generation to fail:
    // Store to accum memory/store buffer
    if (alu_opcode == VTA_ALU_OPCODE_MIN ||
        alu_opcode == VTA_ALU_OPCODE_MAX) {
      acc_mem[dst_idx][i] = cmp_res;
      out_mem[dst_idx][i] = short_cmp_res;
    } else if (alu_opcode == VTA_ALU_OPCODE_ADD) {
      acc_mem[dst_idx][i] = add_res;
      out_mem[dst_idx][i] = short_add_res;
    } else if (alu_opcode == VTA_ALU_OPCODE_SHR) {
      acc_mem[dst_idx][i] = shr_res;
      out_mem[dst_idx][i] = short_shr_res;
    }
The debug-level compilation error reports:
If we can ignore this section temporarily, the generated hardware looks fine. Here is the estimated resource allocation with the current hardware design (targeting DE10-Nano):
|
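For comparing simulation output against expected values, the store logic quoted above can be mirrored by a small software reference model. This is only a sketch under stated assumptions: the enum values, the AluResult struct, and alu_ref are hypothetical names for illustration, the narrowed output is assumed to be 8 bits wide, and the shift amount is assumed to be in the range 0 to 31. It is not the VTA implementation and is not proposed as a fix for the hardware generation failure.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical opcode values for illustration only; the real constants
// (VTA_ALU_OPCODE_*) are defined in the VTA hardware headers.
enum AluOpcode { ALU_MIN = 0, ALU_MAX = 1, ALU_ADD = 2, ALU_SHR = 3 };

struct AluResult {
  int32_t acc;  // full-width value, as would be written to acc_mem
  int8_t out;   // narrowed value, as would be written to out_mem (assumed 8-bit)
};

// Computes one element of an ALU op on a destination/source pair.
AluResult alu_ref(AluOpcode op, int32_t dst, int32_t src) {
  int32_t r = dst;
  switch (op) {
    case ALU_MIN:
      r = std::min(dst, src);
      break;
    case ALU_MAX:
      r = std::max(dst, src);
      break;
    case ALU_ADD:
      // Add in unsigned arithmetic so overflow wraps instead of being undefined.
      r = static_cast<int32_t>(static_cast<uint32_t>(dst) +
                               static_cast<uint32_t>(src));
      break;
    case ALU_SHR:
      // Assumes a shift amount in 0..31; the mask guards against larger values.
      r = dst >> (src & 31);
      break;
  }
  // Keep the low 8 bits for the narrowed output (an assumption about width).
  return AluResult{r, static_cast<int8_t>(static_cast<uint8_t>(r))};
}
```

Each branch selects a single result that is then narrowed once, which makes it straightforward to check both the accumulator value and the narrowed value for a given opcode against what the simulated hardware wrote.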
@liangfu thank you for the update, this is looking promising. Would you mind summarizing the commands needed to run the synthesis and simulation for your WIP HLS modules with the Intel toolchains? |
Just enable MODE=sim in the Makefile; it will then use i++ to compile the HLS modules. On the other hand, I'm a bit worried about how to drive the generated hardware from software. I'm not very familiar with this at the moment. |
…f each test function);
Conflicts: vta/tests/hardware/metal_test/Makefile_PYNQ.mk vta/tests/hardware/metal_test/metal_test.cc
Conflicts: 3rdparty/HalideIR 3rdparty/dlpack 3rdparty/dmlc-core
@nhynes This PR is still WIP. I will reply to your comments one by one, make the requested changes, and request another round of review when I think this is ready. |
There seems to be a concurrent effort at #3258; closing this PR for now. |
This is an initial work-in-progress port of the HLS-based instruction design for Intel FPGA.
Please refer to RFC #1656 for more details.