Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[analysis] Add a generic powerset lattice #6059

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from
Draft

Conversation

tlively
Copy link
Member

@tlively tlively commented Oct 27, 2023

Add Powerset2<Set> that implements the lattice of sets concretely represented
as Set ordered by subset as well as the full lattice FinitePowerset2<Set>
that is additionally configurted with the universe of set elements so it can
produce a top element, which is the set of everything in the universe.

The "2" in the name serves to temporarily differentiate these new powerset
lattices from the existing powerset lattices, which are less flexible and will
be replaced with the new powerset lattices in a future PR.

Also add a new BitSet utility that implements a set interface matching e.g.
std::set<size_t> or std::unordered_set<size_t> in terms of a
std::vector<bool> bit vector. Use BitSet as the concrete set implementation
used to fuzz Powerset2 and FinitePowerset2.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe pull out bitset to its own PR and add dedicated testing for it?

namespace wasm::analysis {

// A powerset lattice whose elements are sets (represented concretely with type
// `Set`) ordered by subset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could call this a "Set Lattice", that is, a lattice over sets and ordered by the natural subset relation? "Powerset" is a specific mathematical operation to generate all subsets but this lattice here does not create a powerset nor does it require that the elements actually be a powerset IIANM.

If so then the one below could be "Powerset Lattice" (without "finite" in the name, which feels odd there to me).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true we never materialize the full powerset, but this lattice does represent a powerset. The lattice itself is the powerset, since its elements are subsets of the set of all T. For that reason, and to be consistent with the literature, we should continue calling this a powerset lattice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, but why is the latter called "finite" then, if both are finite?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first one is not necessarily finite. There is no bound on the size of the elements it supports.

};

// A powerset lattice initialized with a list of all elements in the universe,
// making it possible to produce a top elements that contains all of them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// making it possible to produce a top elements that contains all of them.
// making it possible to produce a top element that contains all of them.


namespace wasm {

// Represent a set of integers in [0, N) with a bitvector of size N, where a 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Represent a set of integers in [0, N) with a bitvector of size N, where a 1
// Represent a set of integers 0 <= k < N with a bitvector of size N, where a 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the notation [0, N) is used in math to imply a half-open interval of real numbers. It's technically fine either way though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't know the interval notation typically implied real numbers. Will update.

Copy link
Member

@kripken kripken Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto size = parent->size();
while (index < size && !(*parent)[index]) {
++index;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this code advances to the first item in the set? A comment might help.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll pull this out into a separate PR and add comments, as you suggest.

size_t size() const noexcept {
size_t result = 0;
for (auto it = begin(); it != end(); ++it, ++result) {
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be possibly simpler and faster to iterate on the underlying std::vector<bool>?

auto* super = (std::vector<bool>*)this;
for (auto x : super) result += x;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(possibly simpler & faster since the current code will have nested loops)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would hope it would optimize to the same thing, but yes, I can rewrite the code to use the underlying vector directly.

The elements of `Array<L, N>` lattice are arrays of length `N` of elements of
`L`, compared pairwise with each other. This lattice is a concrete
implementation of what would be written L^N with pen and paper.
The vector lattice is nearly identical to the array lattice, except that the
size of the elements is specified at runtime when the lattice object is created
rather than at compile time. The code and tests are largely copy-pasted and
fixed up from the array implementation, but there are a couple differences.
First, initializing vector elements does not need the template magic used to
initialize array elements. Second, the obvious implementations of join and meet
do not work for vectors of bools because they might be specialized to be bit
vectors, so we need workarounds for that particular case.
This lattice combines any number of other lattices into a single lattice whose
elements are tuples of elements of the other lattices. This will be one of the
most important lattices in the analysis framework because it will be used to
combine information about different parts of the program, e.g. locals and the
value stack, into a single lattice.
@tlively tlively changed the base branch from vector-lattice to tuple-lattice October 30, 2023 21:15
@tlively tlively marked this pull request as draft October 30, 2023 21:20
Add `Powerset2<Set>` that implements the lattice of sets concretely represented
as `Set` ordered by subset as well as the full lattice `FinitePowerset2<Set>`
that is additionally configurted with the universe of set elements so it can
produce a top element, which is the set of everything in the universe.

The "2" in the name serves to temporarily differentiate these new powerset
lattices from the existing powerset lattices, which are less flexible and will
be replaced with the new powerset lattices in a future PR.

Also add a new `BitSet` utility that implements a set interface matching e.g.
`std::set<size_t>` or `std::unordered_set<size_t>` in terms of a
`std::vector<bool>` bit vector. Use `BitSet` as the concrete set implementation
used to fuzz `Powerset2` and `FinitePowerset2`.
@tlively tlively marked this pull request as ready for review October 31, 2023 00:35
Base automatically changed from tuple-lattice to main October 31, 2023 04:10
@tlively tlively marked this pull request as draft November 1, 2023 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants