Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PolarFire FPGA support #4474

Merged
merged 25 commits into from
Jul 29, 2024
Merged

Add PolarFire FPGA support #4474

merged 25 commits into from
Jul 29, 2024

Conversation

tony-min-1
Copy link
Contributor

What are the reasons/motivation for this change?

  • Adding new support for PolarFire FPGA as synth_mchp pass

Explain how this is achieved.

  • The overall flow is very similar to synth_xilinx.
  • Added DSP and RAM inference based on PolarFire macro library guide and user fabric guide
    • DSP inference achieved using pmgen, RAM inference uses memory_libmap pass
  • Rest of simulation model taken from existing SmartFusion2 support because it has similar architecture to PolarFire

If applicable, please suggest to reviewers how they can test the change.

  • Added some basic verilog tests. Ran equivalence check on the synthesized output Verilog

Credits to Damon Ma for adding the Verilog tests + code cleanups

Copy link
Collaborator

@Ravenslofty Ravenslofty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, there's a lot of code here. Here are some general thoughts before going over this with a fine-toothed comb.

  • Yosys always spells out the name of vendors and families; synth_lattice, synth_ice40. so, the mchp and pf abbreviation should be microchip and polarfire.
  • architecture-specific tests go in tests/arch/XXX, rather than techlibs/XXX/tests. I suggest referencing the generic testbench files in tests/arch/common, and also looking at how synthesis testbenches are idiomatically formed.

techlibs/mchp/cells_sim.v Outdated Show resolved Hide resolved
@tony-min-1
Copy link
Contributor Author

So, there's a lot of code here. Here are some general thoughts before going over this with a fine-toothed comb.

  • Yosys always spells out the name of vendors and families; synth_lattice, synth_ice40. so, the mchp and pf abbreviation should be microchip and polarfire.
  • architecture-specific tests go in tests/arch/XXX, rather than techlibs/XXX/tests. I suggest referencing the generic testbench files in tests/arch/common, and also looking at how synthesis testbenches are idiomatically formed.
  • changed all references of mchp to microchip and pf to polarfire
  • Moved all tests into tests/arch/microchip and setup the tests in same way as other architecture specific tests
    • enabled microchip tests in Makefile

@tony-min-1 tony-min-1 requested a review from Ravenslofty July 4, 2024 20:52
Copy link
Collaborator

@Ravenslofty Ravenslofty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all the DSP testcases should be merged together to avoid too much litter.

But here's a first pass review.

endspecify
endmodule

(* abc9_lut=2 *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right to me, though I'm unfamiliar with the PolarFire. If you have a LUT4 architecture I would expect everything to have an area of 1, which implicitly tries to make ABC9 use LUT4s for as much logic as possible. Does it really require twice as much area for a LUT2 than a LUT1, and three times as much area for a LUT3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it should just be an area of 1 for all LUTs in this architecture.

input A, B, C, D,
output Y
);
assign Y = A & B & C & D;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there dedicated hardware for this, or are you expecting this to map to soft logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from the macro library, I would expect this to map to a LUT4.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but as implemented, it won't; you will need to techmap it into soft logic, or alternatively directly into a LUT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Libero actually recognizes these macros. You can manually instantiate these cells.

techlibs/microchip/cells_sim.v Show resolved Hide resolved
if (it_CE != bit_to_lut.end())
worthy_post_ce = true;
} else if (sig_CE.data != State::S1) {
// Strange. Should not happen in a reasonable flow, so bail.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a log_assert on this to fail noisily if this occurs?

log(" synth_microchip [options]\n");
log("\n");
log("This command runs synthesis for Microchip FPGAs. Operating on\n");
log("partly selected designs is not supported (you must submit a fully-selected \n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we imply this and check it with an assert; it makes the help a little easier to understand.

Copy link
Contributor Author

@tony-min-1 tony-min-1 Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove this line. There is already a check for fully selected design + error msg if design is not fully selected

OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/

module large_mult(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use tests/arch/common/mul.v

OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/

module ram_SDP(data,waddr,we,clk,q);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test semantics aren't exactly the same, but this looks like sync_ram_sdp from tests/arch/common/blockram.v

OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/

module ram_TDP (clka,clkb,wea,addra,dataina,qa,web,addrb,datainb,qb);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sync_ram_tdp from tests/arch/common/blockram.v

assign dout = mem[addr_reg];

always@(posedge clk) begin
addr_reg <= addr;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's...an odd way to write this. I'm a little confused about the intended semantics here.

Copy link
Contributor Author

@tony-min-1 tony-min-1 Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is intended to be functionally same as sync_ram_sp from tests/arch/common/blockram.v . I've changed to use sync_ram_sp instead.

OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/

module widemux(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mux8 from tests/arch/common/mux.v

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mux4 generated using case is a little different. Having a bit of trouble inferring MX4 macro

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the trouble, may I ask?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm interested in $_MUX4_ cells. To discover this cell structure, I use muxcover to pack $_MUX_ trees into $_MUX4_. With the case statement, you get $pmux instead of a tree of $_MUX_. I couldn't find a way to cleanly convert $pmux into $mux tree. I have tried using pmuxtree and techmap -map +/pmux2mux.v but there is always additional logic (i.e. $eq) driving the select control of $mux, which I assume makes it difficult for muxcover to pack into $_MUX4_

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the answer is that you're calling muxcover too early. The reason you're not getting any $_MUX4_ cells is because you have no $_MUX_ cells which you get from techmapping $mux cells.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before muxcover, I call simplemap t:$mux to do the conversion from $mux to $_MUX_ cells. And before simplemap t:$mux is where I experimented with the $pmux -> $mux conversions.

* area should be 1 for all LUTs

* clean up macros

* add log_assert to fail noisily when encountering oddly configured DFF

* clean help msg

* flatten set to true by default

* update

* merge mult tests

* remove redundant test

* move all dsp tests to single file and remove redundant tests

* update ram tests

* add more dff tests

* fix c++20 compile errors

* add option to dump verilog

* default to use abc9

* remove -abc9 option since its the default now

---------

Co-authored-by: tony <minchunlin@gmail.com>
@tony-min-1
Copy link
Contributor Author

I think all the DSP testcases should be merged together to avoid too much litter.

But here's a first pass review.

Thanks for the feedbacks, I have made the following changes:

  • Organize similar tests together in single file to reduce clutter
  • Re-use existing tests from tests/arch/common
  • Fix c++20 compile error
  • use abc9 by default
  • flatten hierarchy by default
  • add vlog option to dump Verilog
  • various cleanups

@tony-min-1 tony-min-1 requested a review from Ravenslofty July 8, 2024 15:05
Copy link
Collaborator

@Ravenslofty Ravenslofty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last few things I see. maybe the arch-specific tests could be inlined into the yosys scripts that use them; that would cut down the PR line count a bit more.

log(" Run 'abc' with '-D 1' option to enable flip-flop retiming.\n");
log(" implies -dff.\n");
log("\n");
log(" -abc\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is conventionally -noabc9.

else if (abc9) {

std::string abc9_opts;
// for the if command in abc to specify wire delay between adjects LUTs (default = 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// for the if command in abc to specify wire delay between adjects LUTs (default = 0)
// for the if command in abc to specify wire delay between adjacent LUTs (default = 0)

OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*/

module widemux(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the trouble, may I ask?

@tony-min-1 tony-min-1 requested a review from Ravenslofty July 8, 2024 21:05
@tony-min-1
Copy link
Contributor Author

  • Inlined all tests into .ys script.
  • Realized that PolarFire DFF does not support init. Added switch to remove init attribute, this was done the same way as in SmartFusion2 pass.

@tony-min-1
Copy link
Contributor Author

Is there any other changes we need to make?

@Ravenslofty Ravenslofty added the merge-after-jf Merge: PR will be merged after the next Dev JF unless concerns are raised label Jul 18, 2024
@nakengelhardt nakengelhardt merged commit 9f869b2 into YosysHQ:main Jul 29, 2024
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-after-jf Merge: PR will be merged after the next Dev JF unless concerns are raised
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants