[WIP][VTA] Support Intel FPGA in VTA #1694

liangfu · 2018-09-07T12:30:01Z

This is an initial work-in-progress port of the HLS-based instruction design to Intel FPGA.

Please refer to RFC #1656 for more details.

tmoreau89

Thank you @liangfu, this is a promising start. A couple of points: I think it might be sufficient to rename the path from intel_fpga to simply intel, since being under the hardware path already implies FPGA hardware. In addition, there are leftover copied files that you can safely remove, such as compile_designs.py (which should also be eliminated from the main branch) and vivado.tcl/hsi.tcl/hls.tcl, which are all Xilinx-specific.

tmoreau89 · 2018-09-11T19:03:56Z

Also I just ordered a couple DE10-nano boards so I can test out the Intel FPGA backend (I should have those by early next week).

@liangfu Our first step should be to generate a complete VTA design with all of the HLS modules, connected via FIFOs, BRAMs, and the Avalon bus to the memory controller / ACP port of the ARM SoC. From there on we can build unit tests in C to test basic functionality such as data transfer and single tensor ops.

liangfu · 2018-09-12T02:48:35Z

@tmoreau89 the latest commit has removed those Xilinx-specific script files, and all the instruction components as well as the simulation program now compile successfully (they are not functional yet). I'm now debugging the modules to make them functional in simulation mode. However, would you kindly provide a full debug log from a functional run? This would help track down errors in migrating the Xilinx HLS-based implementation to Intel HLS.

In the meantime, I agree on what our first step should be at this stage.

tmoreau89 · 2018-09-12T21:47:16Z

@liangfu I see, what you would like are unit tests for each of the HLS modules to test the functions in isolation? I've been planning to do a simulation infrastructure revamp, but this could take a few days. In the meantime, can you reproduce the simulations using the Xilinx toolchains?

There's some guidance on how to run the simulation test. You can turn on a DEBUG flag before compiling the design, or insert your own printf statements to obtain a more detailed trace. Let me know if you run into problems.

liangfu · 2018-09-13T12:11:44Z

@tmoreau89 I wasn't expecting isolated tests for the HLS modules; instead, I'm currently working with the existing simulation infrastructure. Thanks to your debugging guidance, I've just installed the Xilinx toolchains and started comparing the output results, which is helpful for debugging the migrated version. The good news is that I have successfully migrated the ALU modules in simulation mode. However, as Intel HLS doesn't support volatile in the ac_int copy constructor, I've removed the volatile keyword everywhere in the code for now.

tmoreau89 · 2018-09-14T05:53:11Z

@liangfu thanks for the update, I'm glad you've been able to test the compute module. Volatile may not be necessary for the Intel toolchains - it was necessary for Vivado since simulation would not behave correctly if the volatile keyword wasn't specified. That being said, I don't think it affected the behavior of the synthesized hardware.
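The volatile limitation discussed above can be illustrated in plain C++. This is a minimal sketch, assuming the failure comes from a copy constructor that only accepts non-volatile sources; `Word`, `VTA_USE_VOLATILE`, `VTA_VOLATILE`, and `read_word` are hypothetical names introduced for illustration and are not part of ac_int or VTA.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for an HLS integer type such as ac_int. It declares
// a copy constructor that accepts only non-volatile sources.
struct Word {
  std::uint32_t bits;
  explicit Word(std::uint32_t b = 0) : bits(b) {}
  Word(const Word& other) : bits(other.bits) {}
};

// A `const Word&` parameter cannot bind to a volatile object, so copying
// from a volatile Word is ill-formed while a normal copy is fine.
static_assert(!std::is_constructible<Word, volatile Word&>::value,
              "copy from a volatile Word is expected to be rejected");
static_assert(std::is_constructible<Word, const Word&>::value,
              "copy from a non-volatile Word is expected to work");

// Hypothetical macro (an assumption, not an existing VTA flag): expands to
// `volatile` only for toolchains whose types support it.
#ifdef VTA_USE_VOLATILE
#define VTA_VOLATILE volatile
#else
#define VTA_VOLATILE
#endif

// With the qualifier enabled, the copy below would fail to compile for
// Word, which mirrors the limitation reported for Intel HLS.
inline std::uint32_t read_word(VTA_VOLATILE Word* mem, int idx) {
  assert(idx >= 0);
  Word w = mem[idx];
  return w.bits;
}
```

Keeping the qualifier behind a single macro would let the Vivado build keep volatile for simulation while the Intel build drops it, instead of deleting the keyword throughout the code.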

liangfu · 2018-09-18T02:43:12Z

@tmoreau89 I've recently run GEMM successfully in simulation and cleaned up unused code. However, when I generate hardware from the same design, one small section consistently causes hardware generation to fail:

// Store to accum memory/store buffer
if (alu_opcode == VTA_ALU_OPCODE_MIN ||
    alu_opcode == VTA_ALU_OPCODE_MAX) {
  acc_mem[dst_idx][i] = cmp_res;
  out_mem[dst_idx][i] = short_cmp_res;
} else if (alu_opcode == VTA_ALU_OPCODE_ADD) {
  acc_mem[dst_idx][i] = add_res;
  out_mem[dst_idx][i] = short_add_res;
} else if (alu_opcode == VTA_ALU_OPCODE_SHR) {
  acc_mem[dst_idx][i] = shr_res;
  out_mem[dst_idx][i] = short_shr_res;
}

The debug-level compilation output reports:

Optimizing component(s) and generating Verilog files
PHINode should have one entry for each predecessor of its parent basic block!
  %cmp_res.0.0.0.7 = phi i512 [ %cmp_res.0.0.0.11675, %if.else275 ], [ %cmp_res.0.0.0.11675, %if.else275 ], [ %cmp_res.0.0.0.11675, %if.then123 ], [ %cmp_res.0.0.0.11675, %if.end431.loopexit ], [ %or.i.i220, %if.end431.loopexit35528 ], !dbg !12755
Broken module found, compilation aborted!
0  libLLVM-3.0.so  0x00007fe98ce8532f
1  libLLVM-3.0.so  0x00007fe98ce872a2
2  libpthread.so.0 0x00007fe98c38c330
3  libc.so.6       0x00007fe98b3a3c37 gsignal + 55
4  libc.so.6       0x00007fe98b3a7028 abort + 328
5  libLLVM-3.0.so  0x00007fe98dbd9446
6  libLLVM-3.0.so  0x00007fe98dbb75ef llvm::FPPassManager::runOnFunction(llvm::Function&) + 527
7  libLLVM-3.0.so  0x00007fe98dbb7750 llvm::FPPassManager::runOnModule(llvm::Module&) + 80
8  libLLVM-3.0.so  0x00007fe98dbb7111 llvm::MPPassManager::runOnModule(llvm::Module&) + 577
9  libLLVM-3.0.so  0x00007fe98dbb72bb llvm::PassManagerImpl::run(llvm::Module&) + 187
10 aocl-opt        0x00000000004194dd main + 4765
11 libc.so.6       0x00007fe98b38ef45 __libc_start_main + 245
12 aocl-opt        0x000000000040ccc9
Stack dump:
0.  Program arguments: /DATA2/liangfu/intelFPGA_lite/18.0/hls/linux64/bin/aocl-opt -HLS --grif --soft-elementary-math=
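
One restructuring that might be worth trying is to select the result first and then write through a single store site, so the branchy part no longer contains memory writes. Whether this avoids the PHINode error has not been verified against Intel HLS. The sketch below uses plain integers in place of the wide ac_int vectors, hypothetical opcode values, and assumes the short_* values are truncations of the corresponding results; `AluOpcode`, `AluResult`, `select_result`, and `store_result` are illustrative names only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opcode values, for illustration only. The real
// VTA_ALU_OPCODE_* constants come from the VTA hardware headers.
enum AluOpcode { OP_MIN = 0, OP_MAX = 1, OP_ADD = 2, OP_SHR = 3, OP_OTHER = 4 };

// Plain integers stand in for the wide vectors used in the real design.
struct AluResult {
  bool write;        // false when the opcode matches none of the cases
  std::int32_t acc;  // value destined for acc_mem
  std::int8_t out;   // narrowed value destined for out_mem
};

// Pick the result for the given opcode without touching memory.
inline AluResult select_result(AluOpcode op, std::int32_t cmp_res,
                               std::int32_t add_res, std::int32_t shr_res) {
  AluResult r = {true, 0, 0};
  switch (op) {
    case OP_MIN:
    case OP_MAX: r.acc = cmp_res; break;
    case OP_ADD: r.acc = add_res; break;
    case OP_SHR: r.acc = shr_res; break;
    default:     r.write = false; break;  // original code stores nothing here
  }
  // Assumption: the short_* values are narrowed copies of the wide result.
  r.out = static_cast<std::int8_t>(r.acc);
  return r;
}

// Single store site, guarded by one flag instead of an if/else-if chain.
inline void store_result(std::int32_t* acc_row, std::int8_t* out_row, int i,
                         const AluResult& r) {
  assert(i >= 0);
  if (r.write) {
    acc_row[i] = r.acc;
    out_row[i] = r.out;
  }
}
```

As in the original chain, an opcode outside MIN/MAX/ADD/SHR leaves both memories unchanged.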

If we ignore this section for now, the generated hardware looks fine. Here is the estimated resource allocation with the current hardware design (targeting the DE10-Nano):

Component Name	ALUTs	FFs	RAMs	DSPs
compute	50981	49720	285	56
fetch	1326	1047	4	0
load	36949	18013	85	0
store	4101	5355	40	0
Total	93357 (85%)	74135 (34%)	414 (81%)	56 (50%)
Available	109572	219144	514	112
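
The totals and percentages in the table can be re-derived from the per-component rows. The following is a small sketch of that arithmetic; `Usage`, `add`, `percent_used`, and `total_usage` are hypothetical helper names, and the numbers are copied from the table above.

```cpp
#include <cassert>

// Resource counts for one component, in the table's column order.
struct Usage {
  long aluts;
  long ffs;
  long rams;
  long dsps;
};

// Column-wise sum of two rows.
inline Usage add(const Usage& a, const Usage& b) {
  Usage r = {a.aluts + b.aluts, a.ffs + b.ffs, a.rams + b.rams,
             a.dsps + b.dsps};
  return r;
}

// Utilization as a whole-number percentage, rounded to nearest.
inline long percent_used(long used, long available) {
  assert(available > 0);
  return (used * 100 + available / 2) / available;
}

// Sum of the compute, fetch, load, and store rows from the table.
inline Usage total_usage() {
  const Usage compute = {50981, 49720, 285, 56};
  const Usage fetch = {1326, 1047, 4, 0};
  const Usage load = {36949, 18013, 85, 0};
  const Usage store = {4101, 5355, 40, 0};
  return add(add(compute, fetch), add(load, store));
}
```

With the available counts from the last row (109572 ALUTs, 219144 FFs, 514 RAMs, 112 DSPs), ALUTs and RAMs come out as the tightest resources, at 85% and 81% respectively.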

tmoreau89 · 2018-09-18T15:52:53Z

@liangfu thank you for the update, this is looking promising. Would you mind summarizing the commands needed to run the synthesis and simulation for your WIP HLS modules with the Intel toolchains?

liangfu · 2018-09-18T17:01:19Z

Just enable MODE=sim in the Makefile; it will then use i++ to compile the HLS modules. On the other hand, I'm a bit worried about how to drive the generated hardware from software. I'm not very familiar with this at the moment.

liangfu · 2019-04-10T02:21:36Z

@nhynes This PR is still WIP. I will reply to your comments one by one, make the requested changes, and request another round of review when I think this is ready.

liangfu · 2019-05-30T11:01:26Z

There seems to be a concurrent effort at #3258, so I'm closing this PR for now.

liangfu added 4 commits August 31, 2018 13:59

initial intel hls based vta backend;

b2465b9

Merge branch 'master' into ihc

73eca3c

working on load module;

5df589f

temp solution using custom memcpy;

cded6fa

liangfu mentioned this pull request Sep 7, 2018

[RFC][VTA] Support Intel FPGA in VTA #1656

Closed

13 tasks

tqchen assigned tmoreau89 Sep 9, 2018

yzhliu added the status: need review label Sep 10, 2018

tmoreau89 reviewed Sep 11, 2018

View reviewed changes

first successful compilation of the functions and simulation code;

593b2d2

fixed a few bugs in simulation mode;

05df101

successful on simulation with ALU modules;

ed4aa32

tmoreau89 added status: WIP status: need review and removed status: need review labels Sep 14, 2018

liangfu added 4 commits September 14, 2018 15:14

minor bug fix;

ff9584e

first successful GEMM simulation;

4b3964f

clean up unused code and optimize generated hardware targeting de10-nano;

9b3f17c

uncomment the section that causes hardware generation error;

5d5117d

liangfu added 4 commits November 1, 2018 18:21

add cma driver for de10-nano;

dfae6e8

enable de10-nano driver and metal_test application;

799d729

update test_lib;

f07720e

add de10-nano support in hardware/common/test_lib

c0f2b60

liangfu added 15 commits March 11, 2019 20:00

Merge remote-tracking branch 'origin/devel' into ihc

da67b65

bug fix in SHR operator definition, and remove generated verilog files;

f6dd095

Merge remote-tracking branch 'origin/devel' into ihc

bdb8e6d

bug fix in computing gemm;

3cd674d

bug fix in computing GEMM in intel hls;

76d6b24

[IMPORTANT] Successful computation of GEMM using Intel hls;

fe98977

initial implement of load module;

052191f

finish implement and testing with load module;

e491e2f

add more assertions in testing with load module;

61948bf

rename variables in compute module test script (to show the meaning of each test function);

1a2e072

Merge remote-tracking branch 'origin/devel' into ihc

1594611

Merge branch 'master' into devel

135c4b4

Conflicts: vta/tests/hardware/metal_test/Makefile_PYNQ.mk vta/tests/hardware/metal_test/metal_test.cc

Merge branch 'devel' into ihc

94e0233

Merge remote-tracking branch 'origin/devel' into devel

beb54eb

Conflicts: 3rdparty/HalideIR 3rdparty/dlpack 3rdparty/dmlc-core

Merge remote-tracking branch 'origin/devel' into ihc

d7fe885

nhynes self-requested a review April 9, 2019 16:11

liangfu added 2 commits April 10, 2019 09:28

add 3rdparty dependencies;

cad8af7

Merge remote-tracking branch 'origin/devel' into ihc

7f16f28

liangfu and others added 7 commits April 22, 2019 11:20

use for-loops to define ALU test-cases;

49b8c23

processing with limited out_mem fifo size;

0b99532

improved out_mem_fifo;

44516b5

improved out_mem_fifo;

45bd818

improved out_mem_fifo;

776d8c6

Merge remote-tracking branch 'origin/devel' into ihc

fdc4270

Merge remote-tracking branch 'origin/master' into ihc

0b5c035

liangfu closed this May 30, 2019

liangfu mentioned this pull request Jun 19, 2019

[VTA] de10-nano driver #3394

Merged
