New ungrouping logic #2552

Open
wants to merge 3 commits into
base: master
from

Conversation

openroad-robot
Contributor

Get area estimates from Yosys for unsynthesized modules and use those to make ungrouping decisions, replacing the existing solution of doing synthesis in two passes (first fully hierarchical to get area numbers, then with selective flattening)
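As a rough illustration of the idea described above (not the code in this PR), a single-pass ungrouping decision could look like the sketch below. It assumes the per-module area estimates are already available as plain numbers, and that modules with an estimated area below a threshold such as MAX_UNGROUP_SIZE are flattened while larger ones are kept. `ModuleEstimate` and `partition_by_area` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class ModuleEstimate:
    """Hypothetical record: a module name and its estimated area."""
    name: str
    area: float  # estimated area in arbitrary units


def partition_by_area(estimates, max_ungroup_size):
    """Split modules into (keep, ungroup) lists using a size threshold.

    Assumption: modules whose estimated area is strictly below
    max_ungroup_size are ungrouped (flattened); all others are kept
    as hierarchy.
    """
    keep, ungroup = [], []
    for module in estimates:
        if module.area < max_ungroup_size:
            ungroup.append(module.name)
        else:
            keep.append(module.name)
    return keep, ungroup


if __name__ == "__main__":
    modules = [ModuleEstimate("alu", 1200.0), ModuleEstimate("mux4", 40.0)]
    print(partition_by_area(modules, max_ungroup_size=100.0))
```

Because the estimates are available before synthesis, the decision needs only one synthesis pass instead of a fully hierarchical pass followed by a selectively flattened one.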

Signed-off-by: Martin Povišer <povik@cutebit.org>
Signed-off-by: Martin Povišer <povik@cutebit.org>
Signed-off-by: Martin Povišer <povik@cutebit.org>
@@ -1,6 +1,6 @@
 [submodule "tools/yosys"]
 path = tools/yosys
-url = ../../The-OpenROAD-Project/yosys.git
+url = https://github.com/povik/yosys.git

Merging this is blocked on getting a new Yosys release with PRs 4682, 4678, 4706.

For the time being I switched to a personal branch which cherry-picks those PRs.

Member


Will #2550 suffice?


Unfortunately, no

@oharboe
Collaborator

oharboe commented Nov 6, 2024

Not to be a party pooper, but is this now moot?

With the latest fixes to macro placement, can we find a design with autoplaced macros that is better off with SYNTH_HIERARCHICAL=1 than 0?

If macro placement is equally good with SYNTH_HIERARCHICAL=0, what does turning it on help with?

Running time of synthesis?

megaboom is a happy case for SYNTH_HIERARCHICAL=0: it reduces the number of instances by 50%.

@povik

povik commented Nov 6, 2024

@maliberty please comment

@oharboe
Collaborator

oharboe commented Nov 6, 2024

> @maliberty please comment

Just to be clear: I think this is a MUCH better solution than the two phase synthesis to implement a keep/flatten policy and I think we should merge it.

I am just wondering what the problem we have today with quality of results and running time is.

@povik

povik commented Nov 6, 2024

Sure, my understanding from Matt is that SYNTH_HIERARCHICAL=1 is still a very important mode for good macro placement, in addition to being useful for applying constraints. Synthesis runtime is less of a concern.

@maliberty
Member

It enables constraints on hierarchical ports once we have that enabled in sta. It also allows for GUI visualization. As for the 50% saving, I think that is the result of using a poor size setting.

@oharboe
Collaborator

oharboe commented Nov 6, 2024

> It enables constraints on hierarchical ports once we have that enabled in sta.

Silly questions:

what is this used for and is there an example in ORFS?

Also, now that flattened register names are stable (it seems to me), why is this unattainable together with flattened synthesis?

> It also allows for GUI visualization.

Agreed, looks nice, but I am curious how this helps quality of results?

> As for 50% saving, I think that is the result of using a poor size setting.

For now, I think I have dispelled such hopes for megaboom.

I did a MAX_UNGROUP_SIZE sweep: the maximum saving and the WNS improvement only materialize when flattening everything and using macro placement from hierarchical synthesis:

The-OpenROAD-Project/megaboom#191

I think doing a fresh sweep of parameters might as well await fixes in global placement to problems that @jeffng-or identified as well as the macro placement fixes.

Even so, my hopes are not high that tinkering with MAX_UNGROUP_SIZE will be meaningful (that is, keep hierarchical synthesis and still compete on quality of results).

Possibly a manual flattening/ungroup/keep policy could do the trick. A manual flattening policy together with a manual .sdc could flatten everything that is not needed by the .sdc, and it would be single pass (possibly after a first pass to get some statistics). Depending on the use case, the policy is possibly trivial to articulate by the time one has written the .sdc, and as stable as the .sdc once one works on RTL as much as one does at this stage of development.
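A minimal sketch of such a manual policy, assuming the designer supplies the list of modules that the .sdc refers to and that the full module list is known from the design. `manual_flatten_policy` is a hypothetical helper, not an existing ORFS or Yosys function.

```python
def manual_flatten_policy(all_modules, keep_for_sdc):
    """Return the modules to flatten: everything not explicitly kept.

    all_modules:  every module name in the design (assumed known up front).
    keep_for_sdc: module names the .sdc depends on, listed by hand.
    """
    keep = set(keep_for_sdc)
    unknown = keep - set(all_modules)
    if unknown:
        # A stale keep list is easy to miss, so fail loudly instead.
        raise ValueError(f"keep list names unknown modules: {sorted(unknown)}")
    # Preserve the original ordering of all_modules in the result.
    return [name for name in all_modules if name not in keep]


if __name__ == "__main__":
    design = ["top", "async_fifo", "alu", "decoder"]
    print(manual_flatten_policy(design, keep_for_sdc=["top", "async_fifo"]))
```

The keep list plays the same role as the .sdc itself: it is written once by hand and only changes when the constrained hierarchy changes.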

@maliberty
Member

> what is this used for and is there an example in ORFS?

There is no example today as we don't support it. It is common to write SDC in terms of RTL ports as they are stable across synthesis and other optimizations. Think false-paths as an example. It is a regular request as proprietary tools do support this so people have SDCs requiring such.

@maliberty
Member

> For now, I think I have dispelled such hopes for megaboom.

It's only one test case and there is a lot of work on macro placement still to do. Even so you show a small amount of ungrouping can have a small effect (<50%).
image

@oharboe
Collaborator

oharboe commented Nov 6, 2024

> what is this used for and is there an example in ORFS?

> There is no example today as we don't support it. It is common to write SDC in terms of RTL ports as they are stable across synthesis and other optimizations. Think false-paths as an example. It is a regular request as proprietary tools do support this so people have SDCs requiring such.

ORFS mock-cpu has (more like is) a clock-crossing async fifo that could use this. The .sdc file is missing some sophistication in the articulation of the false paths for the fifo clock crossing.

Though I think it could be articulated today using flip-flops and flattening instead of ports, at the cost of a less stable, harder to articulate .sdc file (it uses post-synthesis names; the names would vary between flattening and not, though that applies to ports of instances too).

Right?

Anyway: a specific test case in ORFS as it becomes possible will be clarifying.

@maliberty
Member

Not all constraints are easily formulated in terms of flops (one pin could lead to an arbitrary number of flops). Requiring people to rewrite their SDC wins few friends.

@oharboe
Collaborator

oharboe commented Nov 7, 2024

> Not all constraints are easily formulated in terms of flops (one pin could lead to an arbitrary number of flops). Requiring people to rewrite their SDC wins few friends.

I see. What do you think about allowing a manual policy?

If you are writing an .sdc file, you know what modules you care to keep?

@maliberty
Member

I think a manual policy is fine. Ideally yosys would read the sdc and automatically keep necessary modules but that is too far off for now.

@oharboe
Collaborator

oharboe commented Nov 7, 2024

> I think a manual policy is fine. Ideally yosys would read the sdc and automatically keep necessary modules but that is too far off for now.

Sounds good. I think working through a few examples in ORFS will be illuminating on the use case and that the implementation should be trivial once the use cases are clear.

@oharboe
Collaborator

oharboe commented Nov 7, 2024

Just to be clear: the best megaboom results require synthesis three times today.

The flow below is automated in megaboom upon changes to RTL or updates to ORFS/OpenROAD/yosys:

  • synthesis twice to get best macro placement, which is written out to a file via write_macro_placement
  • flattened synthesis after which the macro placement from above is used

It seems to me that this flow should give the best quality of results.

However, the WNS is just a hair (2%) better with macro placement based on hierarchical synthesis vs flattened synthesis. 2% is probably below the "inconclusive" threshold.

Again: once global and macro placement fixes are in, I will do a new sweep in megaboom.

The placement density of megaboom, at 0.25 (due to macro and global placement problems), is also so low that I am loath to conclude much. We could be looking at pathological routing, caused by the added wirelength alone, that is muddying the picture.

Also, megaboom is documented to require retiming, without which synthesis creates pathological logic depths that now clearly appear in WNS: a combinational multiplication with three pipeline stages, which commercial tools rewrite into a pipelined multiplier in the 28nm 1000ps tapeout.

@oharboe
Collaborator

oharboe commented Nov 18, 2024

Just FYI, macro placement on megaboom with flattened synthesis gives much worse results than hierarchical synthesis and macro placement: The-OpenROAD-Project/megaboom#206 (comment)
