gowin: Himbaechel arch #1184
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- wires, nodes and whites are generated from bases - apicula; - routing of SN and EW bidirectional wires is supported; - supports "wrapping" the wires at the edges of the chip; - LUT1-4 and two types of DFF(R) are supported; - simple IO is supported. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- VSS and VCC sources in each cell are used; - constant LUT inputs are disabled; - putting the class declaration into a header file. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Himbaechel-gowin has learned how to place DFFs of all types by tracking the compatibility of CLK, CE and LSR inputs, as well as placing mutually compatible flip-flops in adjacent slices. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Run clang-format Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
himbaechel/uarch/gowin/gowin.cc
assign_cell_info();
void GowinImpl::prePlace()
{
    ctx->cells.at(ctx->id("leds_OBUF_O"))->setAttr(ctx->id("BEL"), std::string("X0Y14/IOBA"));
I know this is test code, but testing with multiple designs would probably be easier manually instantiating the IO buffers and then setting the bel attribute in the Verilog
I know :) will be fixed
himbaechel/uarch/gowin/gowin.cc
// Constrain directly connected LUTs and FFs together to use dedicated resources
int lutffs = h.constrain_cell_pairs(
        pool<CellTypePort>{{id_LUT1, id_F}, {id_LUT2, id_F}, {id_LUT3, id_F}, {id_LUT4, id_F}},
As a style point, probably make these `pool<CellTypePort>` variables rather than having them inline as long initialiser lists in the function parameters.
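A minimal sketch of what that style point could look like. `CellTypePort`, `pool`, `Link` and `constrain_cell_pairs` below are simplified stand-ins written for this example, not nextpnr's real types or helper, and the flip-flop sink ports (`DFF`/`DFFR` port `D`) are assumptions, since the second argument is cut off in the quoted diff.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Stand-in for nextpnr's CellTypePort: a (cell type, port) pair.
using CellTypePort = std::pair<std::string, std::string>;
// Stand-in for nextpnr's pool<> container.
template <typename T> using pool = std::set<T>;

// One driver -> sink connection in a toy netlist.
struct Link
{
    CellTypePort src, dst;
};

// Hypothetical stand-in for the helper: counts links whose driver is in
// src_ports and whose sink is in sink_ports.
int constrain_cell_pairs(const pool<CellTypePort> &src_ports, const pool<CellTypePort> &sink_ports,
                         const std::vector<Link> &links)
{
    int count = 0;
    for (const auto &l : links)
        if (src_ports.count(l.src) && sink_ports.count(l.dst))
            ++count;
    return count;
}

int count_lutff_pairs(const std::vector<Link> &links)
{
    // Named pools instead of long inline initialiser lists at the call site.
    const pool<CellTypePort> lut_outs{{"LUT1", "F"}, {"LUT2", "F"}, {"LUT3", "F"}, {"LUT4", "F"}};
    const pool<CellTypePort> dff_ins{{"DFF", "D"}, {"DFFR", "D"}}; // assumed sink ports
    return constrain_cell_pairs(lut_outs, dff_ins, links);
}
```

With the pools named, the call itself fits on one line and the same pools can be reused by other constraints.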
Some tiny comments, but overall I'm really happy with the direction this is going in!
I'll need to get to pins, banks, and packages soon. And would really like to know the name of the loaded chipdb sometime
Ok. I missed that.
- MUX2_LUT5, MUX2_LUT6, MUX2_LUT7 and MUX2_LUT8 support; - storing a common class of files in extra_data; - misc fixes. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
I would suggest planning to split |
- Added support for ALU running in "2" ADDSUB mode, the mode that yosys generates for gowin; - Supports specifying an arbitrary input carry as well as passing the output carry to logic; - A small restructuring of the source files. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- RAM16SDP1, RAM16SDP2 and RAM16SDP4 support; - Reading in these primitives is asynchronous, but we have taken measures so that DFF Bels remain unoccupied and they can be used to implement synchronous reading. - misc fixes. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Shamelessly adapted gatecat's router. Very early version, not yet puzzled with recognizing clock sources and controlling the type of wires involved. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
The pin data is only being populated so far, but not used. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- minor fixes for pinout saving; - CST parser taken from generic-based apicula; - $nextpnr IOB detachment is copied here because it is necessary to copy attributes from deleted bels. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
On GW2A chips, the global set/reset is in its own cell. Also corrections to ALU generation. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Use information about pin functions in the clock router. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- The global router is modified to work out the routing of PLL outputs and inputs; - Added API function to change wire type after its creation - there was a need to unify all wires included in the node at the stage of node creation, when all wires have already been created. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
To implement unusual IOs that have a dynamically changing configuration it is convenient to store the switching method in the additional chip data. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Add processing IO located on the sides of some chips. These are IOBUF, which are converted into IBUF and OBUF not by fuses, but by signaling to OE. Also added the creation of a Global Set / Reset for all chips, instead of a list of tile types, information from the apicula database is used, and minor fixes. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
- experiment with notifyBelChange as an auxiliary cells reservation mechanism; - since HCLK pips depend on the coordinates, and not on the tile type, the tile type is copied if necessary; - information about supported types of differential IO primitives has been added to the extra information of the chip; Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
As well as the implementation of all OSC primitives. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
A single mechanism for creating a new type of tile if special functions are found in the chip database that depend on the coordinates of the tile. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Information about what function (main or auxiliary) the cell performs in these primitives is transmitted through the tile's extra data. And this also allows us to remove the calculation of the coordinates of the auxiliary cell on the go. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
And also fix the clock router to allow (with a warning) non-dedicated routing in case of false detection of clock wires. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
himbaechel/uarch/gowin/gowin.cc
}
if (adj_ff) {
    // DFFs must be same type or compatible
    if (ff->type != adj_ff->type && ((ff->type.in(id_DFFS) && !adj_ff->type.in(id_DFFR)) ||
maybe it would be neater to put this in a `ffs_compatible` inline function or similar?
> maybe it would be neater to put this in a `ffs_compatible` inline function or similar?
This is really a one-time piece ... well, I'll move it tomorrow.
> maybe it would be neater to put this in a `ffs_compatible` inline function or similar?
Done
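For illustration, a helper along the lines suggested in this thread might look like the sketch below. It uses plain strings instead of nextpnr's `IdString`, and only the DFFS/DFFR pairing is taken from the quoted condition; the DFFP/DFFC row is an assumption added to show how further pairs would be listed. The actual function in the PR may differ.

```cpp
#include <cassert>
#include <string>

// Returns the one other type that may sit next to `type`, or "" if only an
// identical type is allowed. Only DFFS <-> DFFR comes from the quoted code;
// the DFFP <-> DFFC row is an assumed example.
inline std::string compatible_partner(const std::string &type)
{
    if (type == "DFFS") return "DFFR";
    if (type == "DFFR") return "DFFS";
    if (type == "DFFP") return "DFFC"; // assumption
    if (type == "DFFC") return "DFFP"; // assumption
    return "";
}

// Two flip-flops are compatible when their types match or form a listed pair.
inline bool ffs_compatible(const std::string &ff_type, const std::string &adj_ff_type)
{
    if (ff_type == adj_ff_type)
        return true;
    const std::string partner = compatible_partner(ff_type);
    return !partner.empty() && partner == adj_ff_type;
}
```

The placement check then reduces to something like `if (!ffs_compatible(ff_type, adj_ff_type)) return false;`, which keeps the long boolean expression out of the validity function.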
- OSER4 can be located in neighboring IOs; - PLLVR also needs to rename the inputs. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
OSER16/IDES16 placement issue reports now indicate which location is having trouble. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
In these chips, the midline IOs are still simple, but are no longer just IOBUF - that is, unlike the GW1N-1 IBUF and OBUF are not obtained by applying a signal to the OEN input. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Using extra cell functions to mark disabled units using the PLL example. Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Install the Himbaechel gowin chipdb .bin files to /usr/local/nextpnr/himbaehel/gowin Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
This is looking very good! Most of the comments are really small changes. I'm happy to merge them straight away, then I think there are a few changes on my side next:
- timing support in himbächel
- better user experience for building and finding chipdbs at runtime (similar to how the yosys `share` directory finding approach works)
- improved ux for deduplicated global constants
find_package(Python3 3.5 REQUIRED COMPONENTS Interpreter)
set(ALL_HIMBAECHEL_GOWIN_DEVICES GW1N-1 GW1NZ-1 GW1NS-2 GW1N-4 GW1N-9 GW1N-9C GW1NS-4 GW2A-18 GW2A-18C)
set(HIMBAECHEL_GOWIN_DEVICES ${ALL_HIMBAECHEL_GOWIN_DEVICES} CACHE STRING
can we have this default to none for now so building nextpnr himbaechel doesn't require the gowin binary?
I still want to work a bit more on the user experience for building nextpnr-himbaechel and device databases, then maybe we can revisit this when nextpnr-himbaechel has replaced the old nextpnr-gowin.
Could you tell me more about the build process of nextpnr-himbaechel and device databases? This is something I care about in YoWASP. It is not feasible to ship one binary that includes every binary and every chipdb due to size constraints (about 100 MB) and I would rather avoid configuring and building nextpnr several times if at all feasible.
The idea is to provide a unified framework for arches to share one nextpnr binary with a reduced amount of per-arch code, and with a relatively compact deduplicated representation. Gowin will move there first and I will also look at xo2/3 moving to it. Longer term if it works out there will be more migrations and new arches using it.
It is going to have the chipdb binaries as data files (which should in every case be significantly less than 100MB). These will be configurable in the build process and in the next few days there should be a nice discovery process for those so you can pass the name of a device on the command line and it will load the right subarch and chipdb automatically (it'll be a relative path based approach similar to Yosys, I need to look exactly what we do for different arches and build environments there still).
I would guess, once it reaches a point of being end user ready (i.e. the legacy nextpnr-gowin is formally deprecated and then removed in favour of this), we'd have one nextpnr-himbaechel package and then another containing the chip databases per arch.
Got it. Can you make sure every architecture (~ user-visible package) gets its own directory under `share/` and nothing outside of this directory is installed? Then I will be able to ship this nicely in YoWASP.
Yeah, that matches what I was planning
Fantastic! Then what I will do in `yowasp-nextpnr-himbaechel` is to pull in all of the chipdbs that `yowasp-nextpnr-${ARCH}` have registered using the standard Python "entry points" (plugin) mechanism and mount them under `share/`. Ideally that in itself will be enough.
Do you plan to have e.g. a `nextpnr-ecp5` runner that proxies to `nextpnr-himbaechel --uarch ecp5`? This could be done by snooping `argv[0]` (checking if it ends with `-ecp5`, etc) and it would greatly simplify migration for Amaranth.
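The `argv[0]` snooping idea could be sketched roughly as follows. The function name `uarch_from_argv0` and the `nextpnr-<uarch>` naming scheme are assumptions drawn from this comment, not an existing nextpnr API.

```cpp
#include <cassert>
#include <string>

// Given argv[0], return the micro-architecture implied by a "nextpnr-<uarch>"
// program name, or "" when no uarch can be inferred (for example when the
// binary is invoked as plain "nextpnr-himbaechel").
std::string uarch_from_argv0(const std::string &argv0)
{
    // Strip any directory part (both '/' and '\' separators).
    const auto slash = argv0.find_last_of("/\\");
    std::string name = (slash == std::string::npos) ? argv0 : argv0.substr(slash + 1);

    // Strip a trailing ".exe" if present.
    const std::string exe = ".exe";
    if (name.size() > exe.size() && name.compare(name.size() - exe.size(), exe.size(), exe) == 0)
        name.erase(name.size() - exe.size());

    // Only names of the form "nextpnr-<something>" are considered.
    const std::string prefix = "nextpnr-";
    if (name.compare(0, prefix.size(), prefix) != 0)
        return "";
    const std::string suffix = name.substr(prefix.size());
    return suffix == "himbaechel" ? "" : suffix;
}
```

A runner installed (or symlinked) as `nextpnr-ecp5` would get `"ecp5"` back and could then behave as if `--uarch ecp5` had been passed.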
Yes, this could definitely be done as part of the migration path. Although I realise a bit more work around the himbaechel startup and API is needed to support arbitrary command line arguments, which is needed both for backwards compatibility and a friendly CLI in general.
himbaechel/uarch/gowin/pack.cc
BelId io_bel = gwu.get_io_bel_from_iologic(bel);
if (!ctx->checkBelAvail(io_bel)) {
    if (!is_diff_io(io_bel)) {
        log_error("Can't place %s at %s because of %s\n", ctx->nameOf(&ci), ctx->nameOfBel(bel),
can you add a short reason to this error message, something like `because of a conflict with %s` rather than just `because of %s`?
    ci->renamePort(id_Q1, id_Q7);
    ci->renamePort(id_Q0, id_Q6);
    break;
case ID_IVIDEO:
nitpick: I think `for` loops might be tidier here?
himbaechel/uarch/gowin/pack.cc
// XXX I2 is pin C which must be set to 1 for all ALU modes except MUL
// we use only mode 2 ADDSUB so create and connect this pin
ci->addInput(id_I2);
ci->connectPort(id_I2, ctx->nets[ctx->id("$PACKER_VCC")].get());
nitpick: `ctx->nets.at(ctx->id("$PACKER_VCC")).get()` rather than `[]` - this way it will fail if for some reason the net doesn't exist rather than silently creating a `nullptr` net and then failing in a harder to debug way
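The difference can be seen in a small standalone sketch. It uses `std::unordered_map` and a toy `NetInfo` as stand-ins for nextpnr's own containers and types, so it only illustrates the general `at()` versus `operator[]` behaviour rather than nextpnr itself.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Toy stand-in for a net object.
struct NetInfo
{
    std::string name;
};
using NetMap = std::unordered_map<std::string, std::unique_ptr<NetInfo>>;

// operator[] default-constructs a missing entry, so the caller gets nullptr
// back and the map silently grows by one element.
NetInfo *lookup_with_brackets(NetMap &nets, const std::string &name) { return nets[name].get(); }

// at() throws std::out_of_range for a missing key instead.
NetInfo *lookup_with_at(const NetMap &nets, const std::string &name) { return nets.at(name).get(); }

// Helper for demonstration: reports whether at() rejects the key.
bool at_throws_for(const NetMap &nets, const std::string &name)
{
    try {
        lookup_with_at(nets, name);
    } catch (const std::out_of_range &) {
        return true;
    }
    return false;
}
```

With `at()`, a missing `$PACKER_VCC` net is reported at the lookup itself, not later when the null pointer is dereferenced.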
chip.add_node(node)

# VCC and VSS sources in the all tiles
global_nodes.setdefault('GND', []).append(NodeWire(x, y, 'VSS'))
as a heads up to myself - having a node that fans out to all tiles will cause problems with current deduplication. you don't need to change anything right now, but beware the database will be bigger than expected for this reason (the same is possibly true for global nodes, I will need to see how well the apicula structure compacts)
Fixed.
That's really nice - I suspect this is probably only a big problem when talking about the millions of LUTs then ^^
Will put this as more of a slightly background thing compared to the other TODOs on my side.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Apicula release 0.9.0a1 with Himbaechel support is in place; what about merging this one?
Yes let's fly with this for now. The way in which nextpnr-himbaechel is started (in particular how chipdb paths work) will probably change in a week or two, so just bear that in mind.
gowin P&R implementation as Himbaechel sub-architecture.