
Dev Board Setup

Julian Kemmerer edited this page Jan 21, 2025 · 73 revisions

Video

pcoverview

Setup


The recommended way of getting started is to

  1. begin with the Verilog or VHDL 'blink an LED' example that comes with your development board.
    • You likely will not need to instantiate a PLL and can use a clock source provided by your board.
  2. Once that entire flow is confirmed working and you have your LED blinking in hardware, swap out the hand written HDL for PipelineC generated code.
    • PipelineC generates a single top level module. Try running pipelinec examples/blink.c to generate one from blink.c.
    • The top level module name is set by --top and defaults to top.

The below sections detail getting started on a few common development platforms:

Before You Buy

You may be able to find old FPGA dev boards on eBay and similar sites. Be aware that some FPGAs require paid licenses to use their tools, and that old FPGAs may no longer be supported by current versions of the required tooling. You might then need to find an archived version of the tool and run it on an old OS in a VM, which is not recommended. Instead, double check that the FPGA you want to buy has a current version of tooling that you can download and use on your modern OS.

  • Lattice FPGAs: OSS CAD Suite, or Lattice tooling
  • Xilinx FPGAs: Vivado tool

Your FPGA dev board should come with a pinout listing what each FPGA pin is connected to on the board. A nice to have is the actual board schematic itself, but the pinout list is a must.

  • Lattice FPGAs: .pcf files with pin locations
  • Xilinx FPGAs: .xdc files with pin locations

Finally, your dev board should come with some kind of 'from the factory' HDL (Verilog or VHDL) demo that you can build and upload to the board. Typically this simply blinks an LED.

Download Tools

Download the tools needed:

Then see instructions for telling PipelineC about your install setup.

Gather Dev Board Files

Working with an FPGA requires HDL to describe the hardware, and pin constraints to say how the design maps onto the FPGA on your board. Finally, you will want to be familiar with the flow for going from HDL+constraints to the final bitstream you intend to upload to the board for test.

Pico-Ice Files

The pico-ice board has a Lattice ice40 FPGA. The ice_makefile_blinky example is a perfect starting point. It provides the FPGA pinout file ice40.pcf and example Verilog for blinking an LED.

Building the HDL to bitstream is simple using the provided Makefile based flow described in the README. Critically, simply have the OSS CAD Suite installed and set the OSS_CAD_SUITE environment variable. The output of the build is a bitstream file you can upload to the FPGA.

Arty Files

The Digilent Arty boards have Xilinx 7 Series FPGAs. Their GitHub provides many HDL tutorials+examples and includes the required pin constraint .xdc files.

For instructions on how to incorporate HDL and constraints into a Vivado project you can build, the Digilent Vivado tutorial is recommended.

Everything not-PipelineC

It is recommended to get your dev board's build flow working completely separately from PipelineC before starting. Occasionally needing to look at Verilog or VHDL will hopefully be similar to how you occasionally need to write assembly for your microcontroller. Getting this working proves that the whole flow, power supply, power cable, programming cable, etc. all work.

You want to pay attention to how the HDL source files and pin IO constraints are handled in the example projects you find.

  • For the pico-ice make flow: HDL(.sv) is processed by yosys for synthesis and pin constraints (.pcf) are used by nextpnr for place and route.
  • For the Arty flow with Vivado: HDL(.vhd) and constraint(.xdc) files are added to the project before running synthesis or place and route.

Power and Programming

Both the pico-ice and Arty boards can be powered and programmed with a single USB cable.

  • The pico-ice build flow via make prog uses dfu-util to upload the gateware.bin bitstream to the FPGA.
  • Vivado's Generate Bitstream button and Hardware Manager can be used to produce the final bitstream .bit file and upload to the FPGA.

Clocks and PLL Generated Clocks

Almost all dev boards will provide at least one on-board always running clock for your FPGA to use.

  • On the pico-ice board this is the Pin 35 default 12MHz clock provided by the Raspberry Pi as noted in the pin IO .pcf file.
  • On Arty boards there is typically a CLK100MHZ 100MHz clock defined in the constraints .xdc file.

If other clock rates are needed then a PLL must be configured and instantiated:

In both cases, the resulting setup uses the on-board always running clock as input to the PLL. The PLL will be instantiated in the same area of VHDL or Verilog where your original blinking demo was. The output clock(s) from the PLL can be used once the PLL 'lock' signal is asserted. Typically 'not locked' is used as a reset condition, holding the design in reset until the PLL is stable.
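The 'not locked as reset' idea can be sketched as a small software model in plain C. This is neither PipelineC input nor wrapper HDL: `design_regs_t`, `design_clock_edge`, and the counter are hypothetical names standing in for whatever user logic runs on the PLL output clock.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical stand-in for the registers of any user design
typedef struct design_regs_t {
  uint32_t counter;
} design_regs_t;

// Models one rising edge of the PLL output clock
void design_clock_edge(design_regs_t* regs, bool pll_locked)
{
  // 'Not locked' is used as the reset condition
  bool reset = !pll_locked;
  if (reset) {
    regs->counter = 0;  // held in reset until the PLL is stable
  } else {
    regs->counter += 1; // normal operation once locked
  }
}
```

In this model the design only starts counting on the first clock edge where `pll_locked` is true, which mirrors holding the real design in reset until the PLL output is usable.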

PipelineC Intro

Once you are confident your VHDL/Verilog blinking/basic board setup is working, it is time to switch over to incorporating PipelineC into your design. Now would be the time to ensure the PipelineC tool can find your install locations. Feel free to also check out some initial basic digital logic examples in addition to what's shown below.

  • The ice_makefile_pipelinec example is based on the original ice_makefile_blinky example, but has been modified to include PipelineC in the build flow as described in the README file.
    • top.h configures the top level IO to be used (ex. which pin is LED). This needs to match the .sv wrapper and .pcf files mentioned below.
    • top.c is where the 'main' logic for counting off to blink LEDs lives.
    • top.sv is the wrapper around PipelineC where things like PLL modules are instantiated.
    • ice40.pcf is the modified version of the pin out for this design.
    • The Makefile orchestrates the build and can upload to the board, inside you will see:
      • The pipelinec tool is run on the input top.c file. This produces a list of generated VHDL files pipelinec_output/vhdl_files.txt
      • icepll is run to generate the pll.v file instantiated in top.sv
      • GHDL and yosys are invoked with the list of VHDL files and top.sv wrapper. This produces a .json file netlist.
      • nextpnr uses the netlist .json file along with pin constraint .pcf file to produce an output .asc file
      • icepack is used to convert the .asc file to a .bin file.
      • make prog: invokes bin2uf2 and dfu-util to first convert the binary file to a format that can be uploaded to the pico-ice board, and then finally upload it to the device and boot the image.
    • The full build and program can be done like so make clean all PIPELINEC_REPO=/path/to/PipelineC OSS_CAD_SUITE=/path/to/oss-cad-suite && make prog
  • Arty board example files can be found in the arty examples dir.
    • top.h configures the top level IO to be used (ex. which pin is LED). This must match the .vhd wrapper and .xdc files mentioned below.
    • top.c is where the 'main' logic for counting off to blink LEDs lives.
    • board.vhd is the top level wrapper around PipelineC used for many Arty demos. It includes places where various IO has been used and since commented out, as well as several clock wizard PLL instances producing various clock rates.
    • Master.xdc is a file copied from Digilent and has since had various IO commented in/out.
    • The board.vhd and .xdc constraints are included inside a Xilinx Project File arty/_100t.xpr. This project file will easily become old/stale, so it is recommended to just make a new project in your version of Vivado if needed.
    • Uncomment ports inside board.vhd and the .xdc file. Connect these signals to the PipelineC top level module called 'top'.
    • Proceed with the standard Vivado build flow, ex. click Generate Bitstream.

Top Level Inputs and Outputs

The easiest way to declare top level inputs and outputs for your design is to use the DECL_INPUT and DECL_OUTPUT helper macros from "compiler.h". Otherwise, top level IO is derived from the MAIN functions you write.

#include "compiler.h"
DECL_INPUT(uint1_t, my_input)
DECL_OUTPUT(uint1_t, my_output)

The above PipelineC generates VHDL like:

my_input_val_input : in unsigned(0 downto 0);
my_output_return_output : out unsigned(0 downto 0);

By default PipelineC names clock ports with the rate included, ex.

clk_12p0 : in std_logic; -- A 12MHz clock

Override this behavior by creating an input with a constant name and telling the tool that the input signal is a clock of a specific rate:

#include "compiler.h"
DECL_INPUT(uint1_t, pll_clk)
CLK_MHZ(pll_clk, PLL_CLK_MHZ)
DECL_INPUT(uint1_t, pll_clk_reset)

will produce VHDL ports like:

-- Clock input from PLL
pll_clk_val_input : in unsigned(0 downto 0);
-- 'Not locked' signal from PLL generating the clock
pll_clk_reset_val_input : in unsigned(0 downto 0);

Tri-State and Other IO

Often a specialized network or memory controller IP will want 'direct control' or to be 'directly connected' to the FPGA top level IO signal. This can often be for using specialized IO like DDR or SERDES, as well as for tri-state high impedance signalling. These modules should be instantiated outside of PipelineC in the wrapper HDL where PLLs and such also exist. The interface exposed by those modules can be connected to 'regular' unidirectional PipelineC inputs and outputs.

Counter Example

counter.c example:

#include "uintN_t.h"  // uintN_t types for any N
#include "debug_port.h" // For sim and hardware debug

// Install+configure synthesis tool then specify part here
#pragma PART "ICE40UP5K-SG48" // ice40 (pico-ice)
//#pragma PART "xc7a100tcsg324-1" // Artix 7 100T (Arty)

DEBUG_OUTPUT_DECL(uint32_t, counter_debug)

// 'Called'/'Executing' every 40ns (25MHz)
#pragma MAIN_MHZ main 25.0
uint32_t main()
{
  // static = registers
  static uint32_t the_counter_reg;
  // debug signals connect to extra top level ports
  counter_debug = the_counter_reg;
  // printf's work in simulation
  printf("Counter register value: %d\n", the_counter_reg);
  // an adder
  the_counter_reg += 1;
  // connection to output port
  return the_counter_reg;
}

Notice how the_counter_reg goes through an adder += before going out the output return port.

The below version instead uses the_counter_reg as the output register directly (no adder in path to output port):

  // Version where register connects directly to output
  static uint32_t the_counter_reg;
  uint32_t output = the_counter_reg;
  // an adder
  the_counter_reg += 1;
  // connection to output port
  return output;

A similar example to try next is using a counter to blink an LED.
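As a rough illustration of the counter-to-LED idea, the plain C model below toggles an LED after every half second's worth of clock cycles. It is a software sketch and not the repository's blink example: `CLK_HZ` (assumed to be 25MHz to match the MAIN_MHZ used above), `blink_regs_t`, and `blink_step` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define CLK_HZ 25000000u                // assumed 25MHz clock
#define HALF_PERIOD_CYCLES (CLK_HZ / 2) // toggle every 0.5s -> 1Hz blink

// Registers of the blinker (would be 'static' locals in PipelineC)
typedef struct blink_regs_t {
  uint32_t counter; // cycles counted since the last toggle
  bool led;         // current LED value
} blink_regs_t;

// Models one clock cycle and returns the LED output for that cycle
bool blink_step(blink_regs_t* regs)
{
  // register drives the output directly (no logic in path to the port)
  bool led_out = regs->led;
  if (regs->counter == HALF_PERIOD_CYCLES - 1) {
    regs->counter = 0;      // start counting the next half period
    regs->led = !regs->led; // toggle the LED register
  } else {
    regs->counter += 1;     // an adder
  }
  return led_out;
}
```

The toggle count is just clock rate times half the blink period, so a board with a different clock (ex. the 12MHz pico-ice clock) only needs a different `CLK_HZ`.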

FSM Example

fsm.c is a simple finite state machine example. It moves from the first state A, to state B, then finally to state C.

#include "uintN_t.h"  // uintN_t types for any N

// Install+configure synthesis tool then specify part here
#pragma PART "ICE40UP5K-SG48" // ice40 (pico-ice)
//#pragma PART "xc7a100tcsg324-1" // Artix 7 100T (Arty)

// State enum definition
typedef enum my_state_t{
  STATE_A,
  STATE_B,
  STATE_C
}my_state_t;

// 'Called'/'Executing' every 40ns (25MHz)
#pragma MAIN_MHZ main 25.0
typedef struct my_fsm_outputs_t{
  // Module output signals
  char some_output_signal;
}my_fsm_outputs_t;
my_fsm_outputs_t main(
  // Module input signals
  uint1_t some_input_signal
){
  // static = registers
  static my_state_t state; // state register
  // output wires
  my_fsm_outputs_t outputs;
  // State machine logic
  if(state==STATE_A){
    printf("State A!\n");
    outputs.some_output_signal = 'A';
    state = STATE_B;
  }else if(state==STATE_B){
    printf("State B!\n");
    outputs.some_output_signal = 'B';
    state = STATE_C;
  }else{// if(state==STATE_C){
    printf("State C!\n");
    outputs.some_output_signal = 'C';
    if(some_input_signal){
      state = STATE_A;
    }else{
      printf("Not starting over yet!\n");
    }
  }
  return outputs;
}

A real world example of using FSMs can be found in receiving and transmitting UART. After that, the state machine to produce a VGA signal is another good example.
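Because the FSM above is ordinary C apart from the pragmas and uintN_t types, its behavior can be checked in software before any hardware build. The sketch below is a plain C restatement and not PipelineC input: `fsm_step` is a hypothetical name, and the state register is passed by pointer instead of being a static local so that a test can inspect it.

```c
#include <stdbool.h>

// Same states as the fsm.c example
typedef enum my_state_t {
  STATE_A,
  STATE_B,
  STATE_C
} my_state_t;

// Models one clock cycle: returns the output for the current state
// and updates the state register for the next cycle
char fsm_step(my_state_t* state, bool some_input_signal)
{
  char some_output_signal;
  if (*state == STATE_A) {
    some_output_signal = 'A';
    *state = STATE_B;
  } else if (*state == STATE_B) {
    some_output_signal = 'B';
    *state = STATE_C;
  } else { // STATE_C
    some_output_signal = 'C';
    if (some_input_signal) {
      *state = STATE_A; // start over only when the input is asserted
    }
  }
  return some_output_signal;
}
```

Calling `fsm_step` repeatedly with the input held low should yield 'A', 'B', then 'C' forever, and asserting the input while in state C should return the machine to state A on the following cycle.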

Pipeline Example

pipeline.c is a simple pure stateless function that adds two floating point numbers. If a synthesis tool is installed, the floating point addition will be pipelined to meet the MAIN_MHZ FMAX target on the FPGA PART specified.

// A II=1 pipeline from a pure C function
// (no globals or local static vars)

// Set FPGA part/synthesis tool see
#pragma PART "LFE5UM5G-85F-8BG756C" // Example Lattice ECP5 part

#include "intN_t.h"
#include "uintN_t.h"

#pragma MAIN_MHZ my_pipeline 100.0
float my_pipeline(float x, float y)
{
  return x + y;
}

For more information see the pipeline section of the Quick Start.

Most of the time you will need to describe more than just pipelines. Ex. where do pipeline inputs come from? Where do pipeline outputs go? Because you are often talking to state machines, RAMs, and other parts of your design, it is convenient to wrap up a pipeline instance so that it can be easily connected to any other area:

Using the GLOBAL_VALID_READY_PIPELINE_INST macro from global_func_inst.h you can declare a pipeline with globally visible data, valid, ready streaming handshake wires.

#define GLOBAL_VALID_READY_PIPELINE_INST(inst_name, out_type, func_name, in_type, MAX_IN_FLIGHT)
...

// Ex. an instance of a function to compute the square root of a number
// my_sqrt: float sqrt(float) pipeline with 16 max in flight operations at once
GLOBAL_VALID_READY_PIPELINE_INST(my_sqrt, float, sqrt, float, 16)

// Declares global wires to interface with pipeline like
stream(float) my_sqrt_in; // input to pipeline, data+valid
uint1_t my_sqrt_in_ready; // output from pipeline, ready
stream(float) my_sqrt_out; // output from pipeline, data+valid
uint1_t my_sqrt_out_ready; // input to pipeline, ready
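The data, valid, ready handshake can be modeled in plain C to see when an item actually moves. The sketch below assumes the usual convention that a transfer happens only in a cycle where valid and ready are both high. `float_stream_t`, `stream_transfer`, and `producer_step` are hypothetical names for illustration and are not part of global_func_inst.h.

```c
#include <stdbool.h>

// Hypothetical stand-in for stream(float): data plus a valid flag
typedef struct float_stream_t {
  float data;
  bool valid;
} float_stream_t;

// A transfer occurs in a cycle only when valid and ready are both high
bool stream_transfer(float_stream_t s, bool ready)
{
  return s.valid && ready;
}

// Producer side for one cycle: present items[index] if any remain,
// and return the index to use next cycle (advances only on a transfer)
int producer_step(const float* items, int count, int index,
                  bool ready, float_stream_t* stream_out)
{
  stream_out->valid = index < count;
  stream_out->data = stream_out->valid ? items[index] : 0.0f;
  return stream_transfer(*stream_out, ready) ? index + 1 : index;
}
```

While ready is low the producer keeps presenting the same item, so nothing is lost if the pipeline cannot accept new inputs in a given cycle.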

Simulation

The VHDL 2008 generated by PipelineC will work with any VHDL simulator. See more about running the tool for other output options.

PipelineC Template Simulations

To showcase basic simulator support and guide you in creating testbenches, PipelineC is able to create and run simple template simulations for some tools. By default a clock cycle counter, any DEBUG connected signals, and printf's show up in the console output.

For example, using OSS CAD Suite to run a GHDL simulation via a Python described testbench in cocotb: ./src/pipelinec examples/counter.c --comb --sim --ghdl --cocotb:

Clock:  0
counter_debug = 00000000000000000000000000000000
Counter register value: 0

Clock:  1
counter_debug = 00000000000000000000000000000001
Counter register value: 1

Clock:  2
counter_debug = 00000000000000000000000000000010
Counter register value: 2

...

Standard waveform .vcd files are output into the pipelinec_output/cocotb directory and can be viewed with programs like GTKWave or Surfer.

Modelsim can be started with --comb --sim --modelsim. Only printf's show up in the console here:

force -freeze sim:/top/clk_25p0 1 0, 0 {50 ps} -r 100
add wave -position end  sim:/top/clk_25p0
add wave -position end  sim:/top/main_0CLK_23f04728/the_counter_reg
run
# Counter register value: 0
run
# Counter register value: 1
run
# Counter register value: 2
run
# Counter register value: 3


Verilator can be run with --comb --sim --verilator and internally first uses GHDL and yosys to convert PipelineC output VHDL to Verilog. In doing so, printf's are lost, and only DEBUG connected signals are printed to console:

cycle 0: counter_debug: 0 
cycle 1: counter_debug: 1 
cycle 2: counter_debug: 2 
...

Synthesis and Place and Route

Synthesis converts your HDL into a netlist of FPGA elements. Place and route finds positions for and connects the elements together. The VHDL 2008 generated by PipelineC will work with any synthesis tool. See more about running the tool for other output options.

Resources

Synthesizing a design will tell you how your HDL has been compiled down into FPGA resources, and you should be familiar with how to read these logs. In the case of synthesizing the counter example, a handful of lookup tables (LUTs) and flip-flops (FFs) are used:

  • yosys output includes a resource summary section like so:
   Number of wires:                 91
   Number of wire bits:            973
   Number of public wires:          91
   Number of public wire bits:     973
   Number of memories:               0
   Number of memory bits:            0
   Number of processes:              0
   Number of cells:                158
     SB_CARRY                       30
     SB_DFF                         96
     SB_LUT4                        32

and nextpnr prints something similar:

Info: Device utilisation:
Info: 	         ICESTORM_LC:   131/ 5280     2%
Info: 	        ICESTORM_RAM:     0/   30     0%
Info: 	               SB_IO:    65/   96    67%
Info: 	               SB_GB:     1/    8    12%
Info: 	        ICESTORM_PLL:     0/    1     0%
Info: 	         SB_WARMBOOT:     0/    1     0%
Info: 	        ICESTORM_DSP:     0/    8     0%
Info: 	      ICESTORM_HFOSC:     0/    1     0%
Info: 	      ICESTORM_LFOSC:     0/    1     0%
Info: 	              SB_I2C:     0/    2     0%
Info: 	              SB_SPI:     0/    2     0%
Info: 	              IO_I3C:     0/    2     0%
Info: 	         SB_LEDDA_IP:     0/    1     0%
Info: 	         SB_RGBA_DRV:     0/    1     0%
Info: 	      ICESTORM_SPRAM:     0/    4     0%
  • Vivado can be asked to report utilization, generating full reports with sections like so:
Report Cell Usage: 
+------+-------+------+
|      |Cell   |Count |
+------+-------+------+
|1     |CARRY4 |    16|
|2     |LUT1   |     2|
|3     |FDRE   |    96|
+------+-------+------+

Timing

Part of solving the place and route problem is making sure the signals propagating between FPGA elements still meet the timing requirements imposed by your selected operating frequency (the FMAX target).

Sometimes this problem is not possible to solve. You will have 'failed to meet timing'.

  • For example setting nextpnr to an impossible timing goal (Ex. 400MHz on an ice40) will show a log file message like: ERROR: Max frequency for clock 'clk_400p0$SB_IO_IN_$glb_clk': XX.XX MHz (FAIL at 400.00 MHz) followed by a report of which path in the design had the worst timing:
Info: Critical path report for clock 'clk_400p0$SB_IO_IN_$glb_clk' (posedge -> posedge):
Info: curr total
Info:  1.4  1.4  Source main_0clk_23f04728.bin_op_plus_counter_c_l21_c3_55ac.left_SB_DFF_Q_31_DFFLC.O
Info:  1.8  3.2    Net main_0clk_23f04728.bin_op_plus_counter_c_l21_c3_55ac_left[0] budget -5.007000 ns (12,1) -> (13,1)
Info:                Sink $nextpnr_ICESTORM_LC_0.I1
...
Info:  1.8 17.1    Net main_return_output_SB_DFF_Q_D budget -5.006000 ns (13,4) -> (14,4)
Info:                Sink main_0clk_23f04728.bin_op_plus_counter_c_l21_c3_55ac.left_SB_DFF_Q_DFFLC.I0
Info:  1.2 18.4  Setup main_0clk_23f04728.bin_op_plus_counter_c_l21_c3_55ac.left_SB_DFF_Q_DFFLC.I0
Info: 12.5 ns logic, 5.9 ns routing
  • When Vivado fails to meet the timing requirements you will see implementation log messages like so:
CRITICAL WARNING: [Timing 38-282] The design failed to meet the timing requirements. Please see the timing summary report for details on the timing violations.

and a failed timing value shown in red in the GUI.

Again, the tool can be asked to report timing, which includes a breakdown of the path in the design that had the worst timing:

Slack (VIOLATED) : -0.553ns  (required time - arrival time)
Source: main_0CLK_23f04728/the_counter_reg_reg[1]/C
  (rising edge-triggered cell FDRE clocked by clk_400p0  {rise@0.000ns fall@1.250ns period=2.500ns})
Destination: main_0CLK_23f04728/the_counter_reg_reg[29]/D
  (rising edge-triggered cell FDRE clocked by clk_400p0  {rise@0.000ns fall@1.250ns period=2.500ns})
Requirement:            2.500ns  (clk_400p0 rise@2.500ns - clk_400p0 rise@0.000ns)
Data Path Delay:        3.045ns  (logic 2.174ns (71.396%)  route 0.871ns (28.604%))
Logic Levels:           8  (CARRY4=8)
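The numbers in a report like the one above can be sanity checked with a little arithmetic, sketched here in plain C. The helper names `period_ns`, `simple_slack_ns`, and `path_fmax_mhz` are hypothetical, and the slack estimate is deliberately simplified: it ignores clock skew and clock uncertainty, which is presumably why it gives about -0.545ns for this path while Vivado reports -0.553ns.

```c
// Clock period in ns for a clock rate in MHz (ex. 400MHz -> 2.5ns)
double period_ns(double mhz)
{
  return 1000.0 / mhz;
}

// Simplified setup slack: clock period minus data path delay.
// Negative means the path is too slow for the requested clock.
// Assumption: clock skew and uncertainty are ignored.
double simple_slack_ns(double mhz, double data_path_delay_ns)
{
  return period_ns(mhz) - data_path_delay_ns;
}

// Highest clock rate (MHz) this one path could support on its own,
// under the same simplifying assumption
double path_fmax_mhz(double data_path_delay_ns)
{
  return 1000.0 / data_path_delay_ns;
}
```

For the 3.045ns data path delay reported above, `path_fmax_mhz` gives roughly 328MHz, well short of the 400MHz target. The same `period_ns` helper gives the 40ns period noted for the 25MHz examples.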

PipelineC Internal Synthesis Runs

Pipelining is one solution for making a large state-less combinatorial function meet timing. By default the PipelineC tool will try to pipeline areas of your design that are not stateful functions.

To disable pipelining and leave combinatorial logic untouched use the --comb flag.

To disable internal synthesis runs and just quickly produce output VHDL use --comb --no_synth.

When the PipelineC tool is configured to see installed synthesis tools it will use them internally as part of its pipelining process. The outputs of these internal synthesis runs are parsed by the PipelineC tool and used to make design changes and iterate in an attempt to meet the timing requirements.

If the tool fails to meet the timing requirements you will see a print out like so of the troublesome path from the PipelineC tool:

Cannot pipeline path to meet timing:
START:  main_0CLK_23f04728/the_counter_reg_reg[1] =>
 ~ 3.053 ns of logic+routing ~
END: => main_0CLK_23f04728/the_counter_reg_reg[29]
Giving up...

For more information see the pipeline section of the Quick Start.

For more information on how to configure the tool and where to find output files see the running the tool page.

More Examples, Next Steps, Questions?

See a bigger list of examples here.

Have questions? Want to chat? Stop by the PipelineC Discord or start a discussion, always happy to help :) -Julian
