Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding type annotations to intermediate representation #320

Merged
merged 46 commits into from
Jan 25, 2024

Conversation

yihozhang
Copy link
Collaborator

@yihozhang yihozhang commented Dec 8, 2023

This PR implements some important refactoring. At a high level, this PR gets rid of many legacy code and use a set of generic IRs which allows greater code sharing (thus reduced codebase size).

  • Fully remove the old NormExpr and NormAction, use the new CoreIR instead.
  • Generalizes front-end IR (e.g., Rule) and backend-IR (e.g., CoreRule) to be parametric over the head and leaf symbols.
  • Add type resolution so that typechecking returns a type-annotated expression.
  • Replace the type checker for actions to use the constraint-based one.
  • Refactor the backend Action->Program compiler so that it no longer does unnecessary type checking and works with CoreAction instead of Action.
  • desugar and resugar flags are replaced with a --show desugared-egglog flag, which prints an egglog program produced by the desugaring phase.

The current compilation workflow is

  1. Command -[desugar]-> NCommand: does some lightweight desugaring, e.g., turns a function into a relation
  2. Rule in NCommand -[lower]-> CoreRule: lowering to CoreRule for typechecking
  3. (Rule, CoreRule) -[typecheck]-> ResolvedRule: ResolvedRule is a Rule with all its function symbols and variables type annotated.
  4. Other planned transformations (e.g. proofs) over ResolvedRule
  5. ResolvedRule -[lower]-> ResolvedCoreRule
  6. ResolvedCoreRule -[compilation]-> Program

Fixes #80, fixes #113, fixes #196.

yihozhang and others added 30 commits November 21, 2023 00:14
However, the type constraint part is almost done
2. expr_to_norm_actions -> expr.to_norm_actions
Some remaining work in main.rs, test.rs, to_resolved, and others.
* Change the specification of accept
* Use indexmaps more to enforce a consistent variable ordering for Load::Subst
* Refactor `compile_exprs`
Fix clippy errors and extraction in yihong's typechecking branch
src/ast/desugar.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@ezrosent ezrosent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments (this is the first round. I think there's more to do but this may be a good place to start)


#[derive(Debug, Clone, Eq, PartialEq, Hash)]
pub enum NCommand {
pub enum GenericNCommand<Head, Leaf, Ann> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the context is fresh, let's add a doc comment explaining what a GenericNCommand is.

  • Also, offline discussed that different names would be better here.

  • optional: Let's think about naming things Generic.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -40,232 +43,110 @@ impl Display for Id {
}
}

pub type CommandId = usize;
pub type NCommand = GenericNCommand<Symbol, Symbol, ()>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More descriptive name and/or doc

(SourceCommand? UntypedCommand? Happy to have the one that the user sees just be Command or similar, but then we'll want good docs explaining the relationship that other commands (e.g. ResolvedCommand have to Command).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Same above: I will make a separate PR to use the DCommand/RawDCommand/ResolvedDCommand scheme.

src/core.rs Outdated
@@ -0,0 +1,755 @@
/// This file implements the core IR of the language, which is called CoreRule.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc comments for a module should have //! as their prefix.

))
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: consider shorter names for the return types in to_query:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a MappedExpr type (with brief documentation) as a shorthand

src/core.rs Outdated
}

#[derive(Debug, Clone)]
pub struct GenericCoreRule<BodyF, HeadF, Leaf> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a doc comment here (perhaps something that sheds light on the F at the end).

}
var
});
pub fn map_def_use(&self, fvar: &mut impl FnMut(&Leaf, bool) -> Leaf) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nits:

  • Why is this pub and not pub(crate) ?
  • Document this method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is actually funny that every method in this impl are dead code if not pub. I left a note here. I think map_def_use is referred to in term.rs so it might be needed in the near future. All these functions are either from the very original codebase or from the NormFact refactor. I'm also happy to just delete all of them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can delete them then I think we should :)

Up to you though

rule: &CoreRule,
typeinfo: &TypeInfo,
) -> Result<(), TypeError> {
let CoreRule { head, body } = rule;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: you can move this to the function parameter, I think:

CoreRule { head, body } : &CoreRule

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

}
}

// Different from `from_resolution`, this function only considers function types and not primitives.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think this comment can be a /// comment
  • Why is it pub ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; I changed them to pub(crate)

types: &[ArcSort],
typeinfo: &TypeInfo,
) -> Option<ResolvedCall> {
if let Some(ty) = typeinfo.func_types.get(head) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, this can be:

let ty = typeinfo.func_types.get(head)?;
let expected = ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
}

assert!(resolved_call.len() == 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_eq!(resolved_call.len(), 1)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator Author

@yihozhang yihozhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezrosent Thanks for the awesome review. The review is super helpful for improving this PR and I really appreciate it. I should have addressed most of your comments.

Union(usize),
Extract(usize),
Panic(String),
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

instructions: Vec::new(),
};

for a in actions.0.iter() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely right (also changed a few other places)

// Store the parser because it takes some time
// on startup for some reason
parser: ast::parse::ProgramParser,
pub(crate) expr_parser: ast::parse::ExprParser,
pub(crate) action_parser: ast::parse::ActionParser,
// yz (dec 5): Comment out since they are only used in terms.rs which are deleted
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think leaving them here makes Oliver's life easier later when he wants to add terms back, but I'm open to deleting them.

}
}

pub type Expr = GenericExpr<Symbol, Symbol, ()>;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

pub(crate) type ResolvedExpr = GenericExpr<ResolvedCall, ResolvedVar, ()>;

#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Hash, Clone)]
pub enum GenericExpr<Head, Leaf, Ann> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expr/RawExpr/ResolvedExpr seems like a good idea. I will make a separate PR for the renaming.

rule: &CoreRule,
typeinfo: &TypeInfo,
) -> Result<(), TypeError> {
let CoreRule { head, body } = rule;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

}
}

// Different from `from_resolution`, this function only considers function types and not primitives.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; I changed them to pub(crate)

types: &[ArcSort],
typeinfo: &TypeInfo,
) -> Option<ResolvedCall> {
if let Some(ty) = typeinfo.func_types.get(head) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
}

assert!(resolved_call.len() == 1);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

))
}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a MappedExpr type (with brief documentation) as a shorthand

Comment on lines -205 to -218

/// Returns true if the name is in the form v{digits}__
/// like v78___
///
/// Checks for pattern created by Desugar.get_fresh
fn is_temp_name(&self, name: String) -> bool {
let number_underscores = self.desugar.number_underscores;
let res = name.starts_with('v')
&& name.ends_with("_".repeat(number_underscores).as_str())
&& name[1..name.len() - number_underscores]
.parse::<u32>()
.is_ok();
res
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking to see if temporary functions are filtered out in another way now for serialization.

If they are included, it makes the visualization rather noisy!

tests/files.rs Outdated Show resolved Hide resolved
}
var
});
pub fn map_def_use(&self, fvar: &mut impl FnMut(&Leaf, bool) -> Leaf) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can delete them then I think we should :)

Up to you though

src/core.rs Outdated Show resolved Hide resolved
src/core.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated
self.run_command(processed, should_run)?;
for processed in self.process_command(command)? {
if self.run_mode.show_egglog() {
// This is a hack: we have to do this because process_commands are indeed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having trouble following what's happening here.

My read here is that show_egglog() is essentially a "dry run" mode where the only commands we want to run are the ones that we need to make everything well-scoped (Push/Pop).

But then how does that relate to typechecking a function twice? When would we do this twice? And aren't we still typechecking in process_command? ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I was specifically talking about "typechecking a function twice" because, in our test, we have code like

(sort Expr)
(push)
(function f () Expr)
(pop)
(push)
(function f () Expr)
(pop)

It is only when (push) and (pop) are run that this does not lead to an error. But I agree that "typechecking twice" is not fundamental to the issue here. I'll update the comment to reflect this.

src/main.rs Outdated Show resolved Hide resolved
}

impl PrimitiveLike for ValueEq {
fn name(&self) -> Symbol {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional:

A while back, I noticed some performance problems related to calling "string constant".into() all over the place. It was faster / cheaper to have a lazy_static! that cached the intern value and then loading that here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@yihozhang yihozhang requested a review from oflatt January 23, 2024 19:31
Copy link
Member

@oflatt oflatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready to merge! Amazing work 🦸

src/actions.rs Show resolved Hide resolved
@yihozhang yihozhang merged commit 3b83e39 into main Jan 25, 2024
6 checks passed
@saulshanabrook saulshanabrook deleted the yihozhang-type-annotation branch January 26, 2024 01:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Query atoms are order-sensitive Created map type is always first defined map type Type Inference Failure
4 participants