
On MIR dataflow convergence #990

Open
felix91gr opened this issue Dec 16, 2020 · 9 comments
Labels
A-dataflow Area: dataflow analysis A-MIR Area: mid-level intermediate representation (MIR) A-mir-opt Area: MIR optimizations E-hard Difficulty: might require advanced knowledge E-needs-investigation Call for participation: this issue needs further investigation T-compiler Relevant to compiler team

Comments

@felix91gr
Contributor

felix91gr commented Dec 16, 2020

So I have a nit to pick with this part of the MIR dataflow guide. In particular:

  • I don't think it's quite right, and
  • It might suffer from slight ambiguity.

Lemme explain. This is the text:


Convergence

Your analysis must converge to "fixpoint", otherwise it will run forever. Converging to fixpoint is just another way of saying "reaching equilibrium". In order to reach equilibrium, your analysis must obey some laws. One of the laws it must obey is that the bottom value¹ joined with some other value equals the second value. Or, as an equation:

bottom join x = x

Another law is that your analysis must have a "top value" such that

top join x = top

Having a top value ensures that your semilattice has a finite height, and the law stated above ensures that once the dataflow state reaches top, it will no longer change (the fixpoint will be top).
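
For concreteness, here is a minimal, hypothetical sketch (plain Rust, not the actual dataflow framework's traits) of a simple "flat" lattice on which both laws hold:

```rust
// A made-up flat lattice over some value type, with explicit Bottom and Top.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Flat<T> {
    Bottom,   // no information yet (the initial state of every block)
    Known(T), // exactly one known value
    Top,      // conflicting information: could be anything
}

impl<T: Eq + Copy> Flat<T> {
    /// Least upper bound. The two laws from the guide are the first two arms:
    /// `Bottom.join(x) == x` and `Top.join(x) == Top`.
    fn join(self, other: Self) -> Self {
        match (self, other) {
            (Flat::Bottom, x) | (x, Flat::Bottom) => x,
            (Flat::Top, _) | (_, Flat::Top) => Flat::Top,
            (Flat::Known(a), Flat::Known(b)) if a == b => Flat::Known(a),
            _ => Flat::Top,
        }
    }
}

fn main() {
    assert_eq!(Flat::Bottom.join(Flat::Known(3)), Flat::Known(3)); // bottom join x = x
    assert_eq!(Flat::Top.join(Flat::Known(3)), Flat::Top);         // top join x = top
    assert_eq!(Flat::Known(1).join(Flat::Known(2)), Flat::Top);    // disagreement goes to top
}
```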


Here are my issues:

Ambiguity

Your analysis must converge to "fixpoint", otherwise it will run forever.

Isn't this ambiguous? As I understand it, there are many ways to look for an optimal valid point in the lattice, and not all of them require a fixpoint.

I also feel weird about the phrase "your analysis must converge to 'fixpoint'". Isn't the framework the one that finds the fixpoint? Yet the phrasing makes it sound like we're asking the implementer to find it.

Stuff that seems wrong to me

Bottom's reason

One of the laws it must obey is that the bottom value[^bottom-purpose] joined with some other value equals the second value. Or, as an equation:

> *bottom* join *x* = *x*

As I understand it, the bottom value is important to give the process an initial state, but it's otherwise not really necessary for convergence.

Top's reason

Another law is that your analysis must have a "top value" such that

> *top* join *x* = *top*

Having a top value ensures that your semilattice has a finite height, and the law stated above ensures that once the dataflow state reaches top, it will no longer change (the fixpoint will be top).

This, on the other hand, I'm certain is not true. The top value ensures that the semilattice has a "common upper bound" for any pair of elements, but it does not force convergence.

I'm going to try to illustrate this with the following example, please let me know if something's not clear:

The interval lattice

Let's define the Interval Lattice as follows:

  • Set: let I be all pairs of integers [a, b] where a <= b. A pair like that represents the set of all integers x such that a <= x <= b. This is not enough to have a lattice, so let's expand I with the following intervals:
    • For every integer a, we will add the element [-infinity, a] and the element [a, infinity] to I, so that unbounded ranges can be expressed. In set terms, this means that the elements contained in [a, infinity] are all integers greater than or equal to a.
    • We will add the element [-infinity, infinity] to I so that an unrestricted range can be expressed. This range represents the entirety of the Integer set.
    • We will add the element [] so that an empty range can be expressed. This range represents the Empty set.
  • Order: we will use the partial order A <= B to mean exactly "A is fully contained in B". This is a partial order, so we're ready to see our lattice.
  • Join: we will use the operation A join B to mean "A intersected with B".

It can be proven that all of this constitutes a valid lattice.
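
To make this concrete, here is a small, hypothetical Rust sketch of the lattice (using i128::MIN and i128::MAX as stand-ins for the infinities, and the framework's orientation, in which join is intersection and climbing the lattice means getting more precise):

```rust
// Hypothetical sketch of the interval lattice above, in the framework's
// orientation: bottom = [-inf, inf], top = [] (the empty interval), and
// join = intersection, so climbing the lattice means getting more precise.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Interval {
    lo: i128, // i128::MIN stands in for -infinity
    hi: i128, // i128::MAX stands in for +infinity
}

impl Interval {
    const BOTTOM: Interval = Interval { lo: i128::MIN, hi: i128::MAX };

    fn is_top(self) -> bool {
        self.lo > self.hi // an empty interval plays the role of top
    }

    /// "Join" in this orientation: intersection of the two ranges.
    fn join(self, other: Self) -> Self {
        Interval { lo: self.lo.max(other.lo), hi: self.hi.min(other.hi) }
    }
}

fn main() {
    // A strictly ascending chain that never reaches top:
    // [-inf, inf] < [0, inf] < [1, inf] < [2, inf] < ...
    let mut state = Interval::BOTTOM;
    for n in 0..5 {
        let next = state.join(Interval { lo: n as i128, hi: i128::MAX });
        assert_ne!(next, state); // every step is a strict increase
        state = next;
    }
    assert!(!state.is_top()); // and yet top is still nowhere in sight
}
```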

Here is an illustration of it. Note that it's drawn with the opposite orientation (its bottom is our top), but orientation is just a convention in lattices, so it is equivalent to how we're using top and bottom in our framework.

[image: Hasse diagram of the interval lattice]

In our Interval Lattice, we use [-infinity, infinity] to mean bottom and [] to mean top. However, unlike many of the lattices that are likely in use in the compiler today, this lattice has infinite height, even though it has both a top and a bottom value.

Here are more details on this lattice and on how to work with it to actually get convergence. They're part of Static Program Analysis, one of the resources cited in the docs :3

So what's the reason for top, then?

Top is needed to have a complete lattice. That just helps prove certain properties; it does not help convergence. We need something stronger than that, otherwise there will be lattices like the one above which do not converge even though our stated requirements are satisfied.

Convergence, then

The main property that's used for proving convergence of fixpoint algorithms over common lattices is the Ascending Chain Condition (or the Descending Chain Condition, depending on the convention; in our convention we start at the bottom element, so we use the ACC). It basically states that any strictly ascending chain of elements in the lattice must be finite, or in other words, that any ascending chain of elements in the lattice must reach an n-th element after which it doesn't go up anymore.

This is definitely strong enough for our fixpoint context, since a fixpoint algorithm will converge, at the latest, once it reaches the end of the ascending chain it is climbing.
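
For reference, here is a rough, hypothetical sketch of the usual worklist fixpoint loop (not the compiler's actual implementation); the ACC is precisely the property that bounds how many times the `while` loop can re-enqueue a block:

```rust
// A made-up join-semilattice trait and worklist loop. We only re-enqueue a
// block when its entry state strictly grew, i.e. when it climbed one step up
// an ascending chain, so the ACC is exactly what guarantees termination.
trait JoinSemiLattice {
    /// Joins `other` into `self`; returns `true` iff `self` changed.
    fn join(&mut self, other: &Self) -> bool;
}

fn fixpoint<L: JoinSemiLattice>(
    entry_states: &mut [L],            // one state per basic block
    successors: &[Vec<usize>],         // CFG edges
    transfer: impl Fn(usize, &L) -> L, // effect of executing a block
) {
    let mut worklist: Vec<usize> = (0..entry_states.len()).collect();
    while let Some(block) = worklist.pop() {
        let exit_state = transfer(block, &entry_states[block]);
        for &succ in &successors[block] {
            // Only revisit a successor whose entry state actually changed.
            // With the ACC, that can happen only finitely many times per block.
            if entry_states[succ].join(&exit_state) {
                worklist.push(succ);
            }
        }
    }
}

// Tiny usage example: "reachable from block 0" as a boolean lattice.
impl JoinSemiLattice for bool {
    fn join(&mut self, other: &Self) -> bool {
        let old = *self;
        *self |= *other;
        *self != old
    }
}

fn main() {
    // CFG: 0 -> 1 -> 2, 2 -> 1 (a loop).
    let successors = vec![vec![1], vec![2], vec![1]];
    let mut reachable = vec![true, false, false];
    fixpoint(&mut reachable, &successors, |_, state| *state);
    assert_eq!(reachable, vec![true, true, true]);
}
```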

This also rejects the lattice I described above, and for good reason: it cannot be searched directly with fixpoint algorithms, because they may never converge to a value on it. But for lattices like that, approximations can be made that let fixpoint algorithms converge, and maybe that can be pointed out in the docs if/when the framework is capable of working with such approximations.
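
As an illustration of such an approximation, here is a hypothetical widening operator for intervals, in the book's presentation (inclusion order, join = interval hull): whenever a bound keeps moving between iterations, it is sent straight to "infinity" instead of creeping towards it one step at a time, which collapses the infinite ascending chains and lets the fixpoint loop terminate:

```rust
// A made-up widening operator for intervals.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Interval {
    lo: i128, // i128::MIN stands in for -infinity
    hi: i128, // i128::MAX stands in for +infinity
}

impl Interval {
    /// `self` is the previous state, `next` is the newly joined state.
    /// Any bound that is still moving jumps straight to its infinity.
    fn widen(self, next: Self) -> Self {
        Interval {
            lo: if next.lo < self.lo { i128::MIN } else { next.lo },
            hi: if next.hi > self.hi { i128::MAX } else { next.hi },
        }
    }
}

fn main() {
    // Analysing a counting loop naively would produce [0,0], [0,1], [0,2], ...
    // With widening, the growing upper bound is sent to +infinity at once.
    let old = Interval { lo: 0, hi: 0 };
    let next = Interval { lo: 0, hi: 1 };
    assert_eq!(old.widen(next), Interval { lo: 0, hi: i128::MAX });
}
```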

Footnotes

  1. The bottom value's primary purpose is as the initial dataflow
    state. Each basic block's entry state is initialized to bottom before the
    analysis starts.

@camelid
Member

camelid commented Dec 16, 2020

cc @oli-obk since you know mir-opt

@oli-obk
Contributor

oli-obk commented Dec 26, 2020

infinite height.

Since we are working with u128 at maximum, infinity on the right side is actually u128::max_value() (let's ignore i128 for now, all reasoning works the same for it). Infinity on the left side is just 0. So the height of the lattice is not infinite.

We should probably still treat that lattice as infinitely sized, so we don't end up having an algorithm walk a loop until overflow or something weird (that would require infinite time for all practical purposes).
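
For a sense of scale: even the collapsed lattice admits strictly climbing chains such as [0, u128::MAX], [1, u128::MAX], [2, u128::MAX], …, which have on the order of 2^128 ≈ 3.4 × 10^38 steps, so its height is finite only in a rather theoretical sense.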

In our Interval Lattice, we use [-infinity, infinity] to mean bottom and [] to mean top.

your graph shows exactly the opposite afaict.

I cannot tell whether the terminology is wrong, imprecise or right. The original author is currently not available for asking questions. Anything that expands the docs here and gives more explanations is better though, and you bring up some valid points wrt termination.

@felix91gr
Contributor Author

felix91gr commented Dec 26, 2020

In our Interval Lattice, we use [-infinity, infinity] to mean bottom and [] to mean top.

your graph shows exactly the opposite afaict.

Of course. I even mentioned it would! ^^

Here is an illustration of it. It uses bottom as top, but orientation is a convention in lattices so these are equivalent to how we're using top and bottom in our framework.

@felix91gr
Contributor Author

felix91gr commented Dec 26, 2020

So the height of the lattice is not infinite.

True, but that is just an implementation detail. If we wanted to, or needed to, we could expand our engine to effectively reason about arbitrarily high numbers (sorta like Maple and Mathematica do with their symbolic algebra). On the other hand, most of the implications of an infinitely tall lattice still apply to the current maximum ranges, but you've covered that already 🙂
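
For illustration, a hypothetical way an engine could keep the infinities symbolic instead of collapsing them into sentinel values like u128::max_value():

```rust
// A made-up bound type: the infinities are explicit variants rather than
// sentinel numbers, so no real value ever gets mistaken for "infinity".
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Bound {
    NegInf,
    Finite(i128), // could be swapped for an arbitrary-precision integer
    PosInf,
}

fn main() {
    // The derived ordering puts NegInf < Finite(_) < PosInf, with finite
    // bounds compared by value, so interval operations never overflow into
    // a "fake" infinity.
    assert!(Bound::Finite(i128::MAX) < Bound::PosInf);
    assert!(Bound::NegInf < Bound::Finite(i128::MIN));
}
```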

I wonder... if there is an analysis that would suffer from not having arbitrarily high values at its disposal. I feel that there might be edge cases that do, but that's just my gut talking. I'd need to research it more to say for sure if it's actually needed.

EDIT: I think the Interval Lattice would actually need a proof that the reasoning is still valid on the collapsed (i.e. u128::MAX-height) lattice. This seems very doable, and I think there is already literature on this topic. I know that approximations are a common way to treat the convergence issues of dataflow analyses over infinitely tall lattices, at least.

EDIT2: some of these approximations can be seen from page 82 onwards ("Widening and Narrowing") in the static program analysis book.

@felix91gr
Contributor Author

felix91gr commented Dec 26, 2020

The original author is currently not available for asking questions.

It's just bad luck I guess 😅 but there's nothing we can do if they're not available.

Do you know if we have another expert on this topic in the group, someone we could ask?

@oli-obk
Contributor

oli-obk commented Dec 26, 2020

Do you know if we have another expert on this topic in the group, someone we could ask?

I do not

@camelid
Member

camelid commented Dec 26, 2020

The original author is currently not available for asking questions.

Who's the original author? I'm pretty sure I'm the one who wrote the convergence section (unless we're talking about something else). However, I'm not sure if I can help you since I'm still trying to understand data-flow analysis myself!

@oli-obk
Contributor

oli-obk commented Dec 26, 2020

Argh, sorry, words are hard. I meant the original author of the dataflow system in the compiler. ecstatic-morse wrote that and improved it over the last year. I can use dataflow at all thanks to their rework, but I have no understanding of it (read: the theoretical background) beyond the implementation in the compiler.

@camelid
Member

camelid commented Dec 27, 2020

Argh, sorry, words are hard. I meant the original author of the dataflow system in the compiler. ecstatic-morse wrote that and improved it over the last year. I can use dataflow at all thanks to their rework, but I have no understanding of it (read: the theoretical background) beyond the implementation in the compiler.

Makes sense, thanks for explaining :)

@jieyouxu jieyouxu added E-hard Difficulty: might require advanced knowledge A-MIR Area: mid-level intermediate representation (MIR) T-compiler Relevant to compiler team A-dataflow Area: dataflow analysis A-mir-opt Area: MIR optimizations E-needs-investigation Call for participation: this issue needs further investigation labels Nov 2, 2024