Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove uniqueness invariant from MastForestBuilder #1473

Merged
merged 57 commits into from
Sep 4, 2024

Conversation

plafer
Copy link
Contributor

@plafer plafer commented Aug 27, 2024

We now use MastNodeId instead of MAST root as an identifier for procedures.

This is because 2 procedures can have identical MAST roots, but different decorators (resulting in different MastNodeIds).

@plafer plafer requested a review from bobbinth September 3, 2024 15:59
@plafer
Copy link
Contributor Author

plafer commented Sep 3, 2024

@bobbinth ready for re-review. Only thing not done yet is adding the warning when multiple procedures have the same MAST root (depending on if/how we want to do it)

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Looks good! I left a few more comments/questions inline.

let mast_root = {
let proc_body_id =
self.resolve_target(InvokeKind::ProcRef, callee, proc_ctx, mast_forest_builder)?;
mast_forest_builder.get_mast_node(proc_body_id).unwrap().digest()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we add a comment here explaining why unwrap() is OK?

Copy link
Contributor Author

@plafer plafer Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment but honestly the unwrap() is true by construction: MastNodeId is only created by a MastForest, so all MastNodeIds are associated with a MastNode in a MastForest (unless you use it with the wrong mast forest or whatnot, but that's an internal error where I believe a panic! is appropriate).

We used to implement Index on MastForestBuilder for this purpose, and I'm not sure why we removed it. We had added a MastForest::get_mast_node() to be used in deserialization code, but it was not intended to be used anywhere else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was surprised we removed Index as well, since I thought it was more-common-than-not that we were accessing node data from the forest, by id, with an id that we know a priori is in the forest, such as this case. Might be worth adding it back if the thinking was that it wasn't being used.

assembly/src/assembler/instruction/procedures.rs Outdated Show resolved Hide resolved
assembly/src/assembler/mod.rs Outdated Show resolved Hide resolved
assembly/src/assembler/mod.rs Outdated Show resolved Hide resolved
@plafer plafer requested a review from bobbinth September 3, 2024 22:32
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you! I left one small comment inline.

Also, would be good to get @bitwalker to look through this as well.

Comment on lines 538 to 540
// insert external node into the MAST forest for this procedure; if a procedure
// with the same MAST rood had been previously added to the builder, this will
// with the same MAST root had been previously added to the builder, this will
// have no effect
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment should go above line 529 - no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually removed this comment (in #1466) since it's now redundant with the Assembler::resolve_target() docs

/// [`MastNode`]s have equal [`MastNodeId`]s.
fn ensure_node(&mut self, node: MastNode) -> Result<MastNodeId, AssemblyError> {
let node_digest = node.digest();
/// Note adding the same [`MastNode`] twice will result in two different [`MastNodeId`]s being
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't true if the MAST nodes are exactly identical right? i.e. same hash, same decorators? I think the only condition where duplicate nodes are required/expected, are when the MAST root is the same, but the decorators differ, otherwise we can collapse duplicates into a single node as before. That way we still largely avoid duplicating a bunch of identical nodes, and only duplicate on an as-needed basis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, this is what the implementation does, but forgot to fix the docstring.

Fixed in #1466

/// calls to library procedures will be compiled down to a [`vm_core::mast::ExternalNode`] (i.e.
/// a reference to the procedure's MAST root). This means that when executing a program compiled
/// against a library, the processor will not be able to differentiate procedures with the same
/// MAST root but different decorators. Hence, it is not recommended to export two procedures
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, I think this is unlikely to ever be an explicit choice, except in the specific case of APIs being renamed/moved between modules, where during a deprecation period, the same procedure is exported under both its old and new paths. This wouldn't have any practical effect on debugging and such, since in both cases the procedure hasn't changed.

It'll be interesting to see how frequently this actually occurs in practice. I do think it would be a good idea for us to have a pedantic warning mode, or some other tooling that would let us analyze things like this on an opt-in basis, but I suspect that in general we're unlikely to see scenarios in which this causes confusion except for in large programs.

Just to be clear, I think this is a good note to have, but I would maybe suggest illustrating the circumstances in which this can occur, and the expected worst-case scenarios. Otherwise it sounds a bit dire, but also isn't clear to the reader whether or not this is something they need to be actively worried about or not, or how to even vet/prevent this from happening.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added to the comment, and created #1479

// ================================================================================================

/// Generic container for a choice between two types.
#[derive(Debug, Clone, Copy)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like this is used, so I would suggest removing it for now. I would pull in the either crate in the future if we determine having an Either type would be useful, as it provides a lot of conveniences you'd expect on a monadic type like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Note: I defined it instead of pulling in the either crate, since @bobbinth typically pushes for us to limit the amount of dependencies we bring in. We can decide on whether we implement it ourselves or pull in the either crate next time we need it.

Copy link
Contributor

@bitwalker bitwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few small notes, but overall this looks good to me! Nice work!

@bitwalker bitwalker merged commit f7d179e into next Sep 4, 2024
9 checks passed
@bitwalker bitwalker deleted the plafer-mast-forest-builder-remove-uniqueness branch September 4, 2024 04:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants