gowin: Himbaechel arch #1184

Merged
merged 39 commits into from
Aug 31, 2023

Contributor

yrabbit commented Jul 1, 2023

gowin P&R implementation as Himbaechel sub-architecture.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- wires, nodes and bels are generated from the apicula databases;
- routing of SN and EW bidirectional wires is supported;
- supports "wrapping" the wires at the edges of the chip;
- LUT1-4 and two types of DFF(R) are supported;
- simple IO is supported.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- VSS and VCC sources in each cell are used;
- constant LUT inputs are disabled;
- putting the class declaration into a header file.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Himbaechel-gowin has learned how to place DFFs of all types by tracking
the compatibility of CLK, CE and LSR inputs, as well as placing mutually
compatible flip-flops in adjacent slices.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
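
The control-input tracking described in the commit above could be sketched roughly as follows. This is an illustration only: the `FFControl` struct and `controls_compatible` function are hypothetical names, and the rule that flip-flops sharing a slice must use identical CLK, CE and LSR nets is an assumption about what "compatibility" means here, not code taken from the PR.

```cpp
#include <string>

// Hypothetical summary of the control nets driving one flip-flop.
// Net names are plain strings here; nextpnr itself uses its own net types.
struct FFControl
{
    std::string clk; // clock net
    std::string ce;  // clock-enable net
    std::string lsr; // local set/reset net
};

// Assumption: two flip-flops can share a slice only if all three
// control inputs are driven by the same nets.
inline bool controls_compatible(const FFControl &a, const FFControl &b)
{
    return a.clk == b.clk && a.ce == b.ce && a.lsr == b.lsr;
}
```

A placer validity check would call something like this for each pair of flip-flops that end up in the same slice and reject the placement when it returns false.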
Run clang-format

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
assign_cell_info();
void GowinImpl::prePlace()
{
ctx->cells.at(ctx->id("leds_OBUF_O"))->setAttr(ctx->id("BEL"), std::string("X0Y14/IOBA"));
Member

I know this is test code, but testing with multiple designs would probably be easier if you manually instantiate the IO buffers and then set the bel attribute in the Verilog.

Contributor Author

I know :) will be fixed


// Constrain directly connected LUTs and FFs together to use dedicated resources
int lutffs = h.constrain_cell_pairs(
pool<CellTypePort>{{id_LUT1, id_F}, {id_LUT2, id_F}, {id_LUT3, id_F}, {id_LUT4, id_F}},
Member

As a style point, probably make these pool<CellTypePort> variables rather than having them inline as long initialiser lists in the function parameters.
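
A minimal sketch of that style suggestion, using stand-in types so that it compiles on its own. The `pool` and `CellTypePort` aliases and the `constrain_cell_pairs` function below are simplified assumptions that only mimic the shape of the call; the flip-flop side of the pair list is not visible in the quoted snippet, so the `dff_ins` contents are purely illustrative.

```cpp
#include <set>
#include <string>
#include <utility>

// Stand-in for nextpnr's CellTypePort (cell type, port name); an assumption.
using CellTypePort = std::pair<std::string, std::string>;
// Stand-in for nextpnr's pool<> container.
template <typename T> using pool = std::set<T>;

// Hypothetical helper with the same call shape as the real one.
// It only counts candidate (source, sink) combinations; the real
// function walks the netlist and constrains matching cells together.
int constrain_cell_pairs(const pool<CellTypePort> &src_ports, const pool<CellTypePort> &sink_ports)
{
    return static_cast<int>(src_ports.size() * sink_ports.size());
}

int pack_lut_ffs()
{
    // Named pools keep the call site short and make the sets reusable.
    const pool<CellTypePort> lut_outs{{"LUT1", "F"}, {"LUT2", "F"}, {"LUT3", "F"}, {"LUT4", "F"}};
    // Illustrative only: the real FF-side list is not shown in this thread.
    const pool<CellTypePort> dff_ins{{"DFF", "D"}, {"DFFR", "D"}};
    return constrain_cell_pairs(lut_outs, dff_ins);
}
```

With the pools named, the call itself fits on one line and the same sets can be passed to other packing steps.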

Member

gatecat commented Jul 1, 2023

Some tiny comments, but overall I'm really happy with the direction this is going in!

Contributor Author

yrabbit commented Jul 1, 2023

> Some tiny comments, but overall I'm really happy with the direction this is going in!

I'll need to get to pins, banks, and packages soon.

And I would really like to know the name of the loaded chipdb sometime.

Member

gatecat commented Jul 1, 2023

> And would really like to know the name of the loaded chipdb sometime

chip_info->name.get() should work

Contributor Author

yrabbit commented Jul 1, 2023

Ok. I missed that.

- MUX2_LUT5, MUX2_LUT6, MUX2_LUT7 and MUX2_LUT8 support;
- storing a common class of files in extra_data;
- misc fixes.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Member

gatecat commented Jul 4, 2023

I would suggest planning to split gowin.cc into multiple files eventually, in particular a pack.h/pack.cc that contain the packing-related steps. The structural approach of the packer helper class taken in the FABulous Viaduct arch: https://github.com/YosysHQ/nextpnr/blob/master/generic/viaduct/fabulous/pack.cc should be applicable here too.

yrabbit added 16 commits July 5, 2023 12:49
- Added support for ALU running in "2" ADDSUB mode, the mode that yosys generates for gowin;
- Supports specifying an arbitrary input carry as well as passing the output carry to logic;
- A small restructuring of the source files.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
    - RAM16SDP1, RAM16SDP2 and RAM16SDP4 support;
    - Reading in these primitives is asynchronous, but we have taken
      measures so that DFF Bels remain unoccupied and they can be used
      to implement synchronous reading.
    - misc fixes.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Shamelessly adapted gatecat's router.
This is a very early version that does not yet deal with recognizing clock
sources or controlling the types of wires involved.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
The pin data is only being populated so far, but not used.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- minor fixes for pinout saving;
- CST parser taken from generic-based apicula;
- $nextpnr IOB detachment is copied here because it is necessary to copy
  attributes from deleted bels.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
On GW2A chips, the global set/reset is in its own cell.
Also corrections to ALU generation.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Use information about pin functions in the clock router.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- The global router is modified to work out the routing of PLL outputs and inputs;
- Added API function to change wire type after its creation - there was
  a need to unify all wires included in the node at the stage of node
  creation, when all wires have already been created.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
 To implement unusual IOs that have a dynamically changing configuration
 it is convenient to store the switching method in the additional chip
 data.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Add processing of IO located on the sides of some chips. These are IOBUFs,
which are converted into IBUF and OBUF not by fuses, but by driving a
signal to OE.

Also add the creation of a Global Set / Reset for all chips; information
from the apicula database is used instead of a list of tile types.
Minor fixes.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- experiment with notifyBelChange as an auxiliary cells reservation mechanism;
- since HCLK pips depend on the coordinates, and not on the tile type,
  the tile type is copied if necessary;
- information about supported types of differential IO primitives has
  been added to the extra information of the chip;

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
As well as the implementation of all OSC primitives.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
A single mechanism for creating a new type of tile if special functions
are found in the chip database that depend on the coordinates of the
tile.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Information about what function (main or auxiliary) the cell performs in
these primitives is transmitted through the tile's extra data. And this
also allows us to remove the calculation of the coordinates of the
auxiliary cell on the go.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
And also fix the clock router to allow (with a warning) non-dedicated
routing in case of false detection of clock wires.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
}
if (adj_ff) {
// DFFs must be same type or compatible
if (ff->type != adj_ff->type && ((ff->type.in(id_DFFS) && !adj_ff->type.in(id_DFFR)) ||
Member

maybe it would be neater to put this in a ffs_compatible inline function or similar?

Contributor Author

> maybe it would be neater to put this in a ffs_compatible inline function or similar?

This is really a one-time piece ... well, I'll move it tomorrow.

Contributor Author

> maybe it would be neater to put this in a ffs_compatible inline function or similar?

Done
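
One possible shape for such a helper, as a sketch only. The quoted condition is cut off, so the full compatibility table is not visible in this thread; the version below assumes that flip-flops are compatible when they have the same type or form the DFFS/DFFR pair hinted at by the visible fragment, and it models cell types as plain strings rather than nextpnr identifiers.

```cpp
#include <string>

// Hypothetical helper, not the code merged in the PR.
// Assumption: two flip-flops may sit in adjacent slices if they are the
// same type, or if one is DFFS and the other DFFR. Any further compatible
// pairs from the real condition are not reproduced here.
inline bool ffs_compatible(const std::string &a, const std::string &b)
{
    if (a == b)
        return true;
    return (a == "DFFS" && b == "DFFR") || (a == "DFFR" && b == "DFFS");
}
```

The caller then reduces to a single readable test such as `if (!ffs_compatible(ff_type, adj_ff_type)) return false;`.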

- OSER4 can be located in neighboring IOs;
- PLLVR also needs to rename the inputs.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
OSER16/IDES16 placement issue reports now indicate which location is
having trouble.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
In these chips, the midline IOs are still simple, but are no longer just
IOBUF - that is, unlike on the GW1N-1, IBUF and OBUF are not obtained by
applying a signal to the OEN input.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Using extra cell functions to mark disabled units using the PLL example.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Install the Himbaechel gowin chipdb .bin files to
/usr/local/nextpnr/himbaehel/gowin

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
yrabbit marked this pull request as ready for review August 23, 2023 08:41
yrabbit requested a review from gatecat August 24, 2023 02:14
Member

gatecat left a comment

This is looking very good! Most of the comments are really small changes. I'm happy to merge them straight away, then I think there are a few changes on my side next:

  • timing support in himbächel
  • better user experience for building and finding chipdbs at runtime (similar to how the yosys share directory finding approach works)
  • improved ux for deduplicated global constants


find_package(Python3 3.5 REQUIRED COMPONENTS Interpreter)
set(ALL_HIMBAECHEL_GOWIN_DEVICES GW1N-1 GW1NZ-1 GW1NS-2 GW1N-4 GW1N-9 GW1N-9C GW1NS-4 GW2A-18 GW2A-18C)
set(HIMBAECHEL_GOWIN_DEVICES ${ALL_HIMBAECHEL_GOWIN_DEVICES} CACHE STRING
Member

can we have this default to none for now so building nextpnr himbaechel doesn't require the gowin binary?

I still want to work a bit more on the user experience for building nextpnr-himbaechel and device databases, then maybe we can revisit this when nextpnr-himbaechel has replaced the old nextpnr-gowin.

Member

Could you tell me more about the build process of nextpnr-himbaechel and device databases? This is something I care about in YoWASP. It is not feasible to ship one binary that includes every binary and every chipdb due to size constraints (about 100 MB) and I would rather avoid configuring and building nextpnr several times if at all feasible.

Member

The idea is to provide a unified framework for arches to share one nextpnr binary with a reduced amount of per-arch code, and with a relatively compact deduplicated representation. Gowin will move there first and I will also look at xo2/3 moving to it. Longer term if it works out there will be more migrations and new arches using it.

It is going to have the chipdb binaries as data files (which should in every case be significantly less than 100MB). These will be configurable in the build process and in the next few days there should be a nice discovery process for those so you can pass the name of a device on the command line and it will load the right subarch and chipdb automatically (it'll be a relative path based approach similar to Yosys; I still need to look at exactly what we do there for different arches and build environments).

I would guess, once it reaches a point of being end user ready (i.e. the legacy nextpnr-gowin is formally deprecated and then removed in favour of this), we'd have one nextpnr-himbaechel package and then another containing the chip databases per arch.

Member

whitequark Aug 31, 2023

Got it. Can you make sure every architecture (~ user-visible package) gets its own directory under share/ and nothing outside of this directory is installed? Then I will be able to ship this nicely in YoWASP.

Member

Yeah, that matches what I was planning

Member

Fantastic! Then what I will do in yowasp-nextpnr-himbaechel is to pull in all of the chipdbs that yowasp-nextpnr-${ARCH} have registered using the standard Python "entry points" (plugin) mechanism and mount them under share/. Ideally that in itself will be enough.

Do you plan to have e.g. a nextpnr-ecp5 runner that proxies to nextpnr-himbaechel --uarch ecp5? This could be done by snooping argv[0] (checking if it ends with -ecp5, etc) and it would greatly simplify migration for Amaranth.
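
A rough sketch of the argv[0] check described above. The function name `uarch_from_argv0` and the list of recognised suffixes are hypothetical; nothing in this thread says how, or whether, nextpnr implements this.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of nextpnr): guess the micro-architecture
// from the name the program was invoked under, e.g. "nextpnr-ecp5" -> "ecp5".
// Returns an empty string when argv[0] gives no hint.
std::string uarch_from_argv0(const std::string &argv0)
{
    // Keep only the file name, dropping any directory part.
    const std::size_t slash = argv0.find_last_of("/\\");
    const std::string base = (slash == std::string::npos) ? argv0 : argv0.substr(slash + 1);

    // Illustrative list; which legacy names to honour is an assumption.
    static const std::vector<std::string> known = {"ecp5", "gowin"};
    for (const std::string &name : known) {
        const std::string suffix = "-" + name;
        // True when the file name ends with "-<name>".
        if (base.size() > suffix.size() &&
            base.compare(base.size() - suffix.size(), suffix.size(), suffix) == 0)
            return name;
    }
    return std::string();
}
```

A runner installed as `nextpnr-ecp5` could call this with `argv[0]` and, on a non-empty result, behave as if `--uarch ecp5` had been passed. Handling of platform-specific extensions such as `.exe` is left out of this sketch.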

Member

Yes, this could definitely be done as part of the migration path. Although I realise a bit more work around the himbaechel startup and API is needed to support arbitrary command line arguments, which is needed both for backwards compatibility and a friendly CLI in general.

BelId io_bel = gwu.get_io_bel_from_iologic(bel);
if (!ctx->checkBelAvail(io_bel)) {
if (!is_diff_io(io_bel)) {
log_error("Can't place %s at %s because of %s\n", ctx->nameOf(&ci), ctx->nameOfBel(bel),
Member

can you add a short reason to this error message, something like because of a conflict with %s rather than just because of %s?

ci->renamePort(id_Q1, id_Q7);
ci->renamePort(id_Q0, id_Q6);
break;
case ID_IVIDEO:
Member

nitpick: I think for loops might be tidier here?
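
As an illustration of the loop form, here is a sketch with a stand-in port map. Only the two renames visible above (Q1 to Q7, Q0 to Q6) come from the quoted code; the `Ports` alias, the `rename_port` and `shift_q_ports` helpers, and the idea of a uniform offset are assumptions made for the example.

```cpp
#include <map>
#include <string>
#include <utility>

// Stand-in for a cell's port table: port name -> connected net name.
using Ports = std::map<std::string, std::string>;

// Simplified analogue of renaming a port; does nothing if `from` is absent.
void rename_port(Ports &ports, const std::string &from, const std::string &to)
{
    auto it = ports.find(from);
    if (it == ports.end())
        return;
    std::string net = std::move(it->second);
    ports.erase(it);
    ports[to] = std::move(net);
}

// Hypothetical loop replacing a run of hand-written renames:
// Q<count-1>..Q0 become Q<count-1+offset>..Q<offset>.
// Iterating from the highest index down avoids overwriting a port
// that has not been moved yet.
void shift_q_ports(Ports &ports, int count, int offset)
{
    for (int i = count - 1; i >= 0; --i)
        rename_port(ports, "Q" + std::to_string(i), "Q" + std::to_string(i + offset));
}
```

Calling `shift_q_ports(ports, 2, 6)` reproduces the two renames shown in the snippet.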

// XXX I2 is pin C which must be set to 1 for all ALU modes except MUL
// we use only mode 2 ADDSUB so create and connect this pin
ci->addInput(id_I2);
ci->connectPort(id_I2, ctx->nets[ctx->id("$PACKER_VCC")].get());
Member

nitpick: ctx->nets.at(ctx->id("$PACKER_VCC")).get() rather than [] - this way it will fail if for some reason the net doesn't exist rather than silently creating a nullptr net and then failing in a harder to debug way
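
The difference can be shown with a standard container. This sketch uses `std::unordered_map` and a dummy `NetInfo` struct as stand-ins for nextpnr's own net dictionary, so it demonstrates the general C++ behaviour rather than nextpnr's exact types.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Stand-in for nextpnr's net object; only the name is modelled.
struct NetInfo
{
    std::string name;
};

using NetMap = std::unordered_map<std::string, std::unique_ptr<NetInfo>>;

// operator[] default-constructs a missing entry, so a wrong or missing
// name silently inserts a null unique_ptr and returns nullptr.
NetInfo *lookup_unchecked(NetMap &nets, const std::string &name) { return nets[name].get(); }

// at() throws std::out_of_range for a missing key, failing at the lookup
// itself rather than at a later null dereference.
NetInfo *lookup_checked(const NetMap &nets, const std::string &name) { return nets.at(name).get(); }
```

Note that the unchecked variant also changes the map: after a failed lookup the bogus key stays behind with a null value.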

chip.add_node(node)

# VCC and VSS sources in the all tiles
global_nodes.setdefault('GND', []).append(NodeWire(x, y, 'VSS'))
Member

as a heads up to myself - having a node that fans out to all tiles will cause problems with current deduplication. you don't need to change anything right now, but beware the database will be bigger than expected for this reason (the same is possibly true for global nodes, I will need to see how well the apicula structure compacts)

Contributor Author

Fixed.

Contributor Author

The database sizes are still acceptable :)

[image: bases]

Member

That's really nice - I suspect this is probably only a big problem when talking about the millions of LUTs then ^^

Will put this as more of a slightly background thing compared to the other TODOs on my side.

Contributor Author

yrabbit commented Aug 30, 2023

Apicula release 0.9.0a1 with Himbaechel support is in place, so what about merging this one?
What did you decide about HIMBAECHEL_GOWIN_DEVICES?

yrabbit requested a review from gatecat August 30, 2023 22:31
Member

gatecat commented Aug 31, 2023

Yes let's fly with this for now. The way in which nextpnr-himbaechel is started (in particular how chipdb paths work) will probably change in a week or two, so just bear that in mind.

gatecat merged commit 3e0b982 into YosysHQ:master Aug 31, 2023
8 checks passed
yrabbit deleted the h-gw branch September 2, 2023 21:26