Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_convert_fragment() has polynomial time complexity as signals are added to design #711

Closed
antonblanchard opened this issue Jul 18, 2022 · 5 comments

Comments

@antonblanchard
Copy link
Contributor

While creating a large gate level multiplier in amaranth I noticed that it took a long time to write out the Verilog.

A simple test case:

#!/usr/bin/python3

import sys

from amaranth import Elaboratable, Module, Signal
from amaranth.back import verilog

class Test(Elaboratable):
    def __init__(self, nr_sigs=1000):
        self.i = Signal()
        self.o = Signal()
        self.nr_sigs=nr_sigs

    def elaborate(self, platform):
        m = Module()

        s = Signal()
        m.d.comb += s.eq(self.i)

        for i in range(self.nr_sigs):
            s_new = Signal(name="s_%i" % i)
            m.d.comb += s_new.eq(s)
            s = s_new

        m.d.comb += self.o.eq(s)

        return m

nr_sigs=int(sys.argv[1])
test = Test(nr_sigs=nr_sigs)
print(verilog.convert(test, ports=[test.i, test.o], strip_internal_attrs=True))

Things start going bad at 1000 signals.

Almost all the time is in the code that groups statements - for each signal we then iterate over each statement. I don't understand everything we are doing here (eg why we have to group signals vs just putting each one in their own process), but it seems we could make a single pass through the signals and statements to form them into the groups we need.

        lhs_grouper = xfrm.LHSGroupAnalyzer()
        lhs_grouper.on_statements(fragment.statements)

        for group, group_signals in lhs_grouper.groups().items():
            lhs_group_filter = xfrm.LHSGroupFilter(group_signals)
            group_stmts = lhs_group_filter(fragment.statements)

            with module.process(name="$group_{}".format(group)) as process:
                with process.case() as case:
                    # For every signal in comb domain, assign \sig$next to the reset value.
                    # For every signal in sync domains, assign \sig$next to the current
                    # value (\sig).
                    for domain, signal in fragment.iter_drivers():
                        if signal not in group_signals:
                            continue
                        if domain is None:
                            prev_value = ast.Const(signal.reset, signal.width)
                        else:
                            prev_value = signal
                        case.assign(lhs_compiler(signal), rhs_compiler(prev_value))

                    # Convert statements into decision trees.
                    stmt_compiler._case = case
                    stmt_compiler._has_rhs = False
                    stmt_compiler._wrap_assign = False
                    stmt_compiler(group_stmts)
@whitequark
Copy link
Member

I'll take a look, thanks!

@whitequark
Copy link
Member

Could you prepare a PR with this change, please?

@whitequark whitequark added this to the 0.4 milestone Jan 31, 2023
@whitequark
Copy link
Member

Could you prepare a PR with this change, please?

Sorry, I misunderstood. For some reason I thought you posted a patch in the PR, not just a reference to the code.

@whitequark
Copy link
Member

eg why we have to group signals vs just putting each one in their own process

We have to group signals because the target of an assignment can be a concatenation. Consider something like:

a = Signal(16)
b = Signal(16)
c = Signal(16)
d = Signal(16)
m.d.comb += Cat(a, b).eq(c * d)

Unless you stuff the two of them into the same process, you have to duplicate the right-hand side, which can get really unwieldy for certain kinds of code. (We already duplicate

The reason we can't just stuff everything into the same process is that you can have something like:

m.d.comb += a.eq(b)
m.d.comb += b.eq(c)

which confuses Verilog simulators if you translate it to:

always @(*) begin
  a = b;
  b = c;
end

I think it should work fine if you do it like:

always @(*) begin
  a$next = b;
  b$next = c;
end
always @(*) begin
  a = a$next;
  b = b$next;
end

but that's ugly.

The right approach here is to convert RTL to netlist on Amaranth side and then convert netlist back to readable-ish Verilog (this is feasible), but it'll take a bit of time to do that.

@whitequark whitequark removed this from the 0.4 milestone Feb 16, 2023
@whitequark whitequark added this to the 0.5 milestone Feb 13, 2024
@whitequark
Copy link
Member

This is fixed by the new IR in the main branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants