Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Lean 4 support #76

Merged
merged 31 commits into from
Jun 12, 2022
Merged

Implement Lean 4 support #76

merged 31 commits into from
Jun 12, 2022

Conversation

insightmind
Copy link
Contributor

@insightmind insightmind commented Dec 9, 2021

General

This pull request adds support for Lean 4 using LeanInk.

In order to do so, it introduces a new driver lean4.py for handling communication between Alectryon and LeanInk.
LeanInk takes a .lean file as its input and analyzes it using direct access to the Lean 4 compilers AST. It returns an output file that contains all fragments (Text and Sentences) in Alectryon's json format.

Usage

Lean code files

Lean 4 usage is equivalent to any other supported language. You can use the lean4 or lean4+rst driver to explicitly specify Lean 4.
With this PR .lean files will also automatically be inferred as Lean 4 files instead of Lean 3.

alectryon --frontend lean4 plain-lean4.lean

To print the debug output of LeanInk you can do the following:

alectryon --frontend lean4 plain-lean4.lean --debug

If you want to use Alectryon with a Lean 4 file that uses the Lake package manager you can provide the lakefile to Alectryon:

alectryon --frontend lean4 plain-lean4.lean --lake lakefile.lean

Literate files

Lean 4 is also supported in literate files using the lean4:: directive for reStructuredText and {lean4} for MyST markdown files.

Finally we also added a short explainer section about Lean 4 support in the readme.

New Recipes

We have added a few new recipes to test and showcase the Lean 4 support:

  • plain-lean4.lean
  • literate-lean4.lean
  • lean4-tactics.rst
  • lean4-tactics-myst.md

These files make up most of the added files.

Important notes

To support Lean 4 we had to make minor changes to the implementation of Lean 3 support.
We renamed the recipe plain.lean to plain-lean3.lean.
Additionally .lean files will automatically be inferred as Lean 4 files instead of Lean 3 because it has a higher precedence.

Feedback

Feedback on the implementation is always welcome. Especially if you can spot rough issues or find features that are not yet supported.

Most line additions are due to new recipe files. If you take a look at the python files, the actual implementation is very small.

@insightmind insightmind changed the title WIP: Implement Lean 4 support Implement Lean 4 support Dec 9, 2021
@cpitclaudel
Copy link
Owner

Excellent work @insightmind! I will make a detailed review this week-end, most likely on Sunday (please ping me Saturday evening if you haven't heard back by then). I just had a quick read-through and everything looks very reasonable, so I don't think there will be much to change before merging :)

@cpitclaudel
Copy link
Owner

Btw, since Lean support line comments, we could consider using line comments for literate programming in Lean. That's how I did it in F* a while back, see https://github.com/FStarLang/fstar-mode.el/tree/master/etc/fslit/ . If you think that would be a good fit for Lean then we could import the state machine from fslit.

Oh, and feel free to suggest something better than /-|, I chose the | in Coq because it was relatively easy to type with (* and because (**, (*!, and (*+ were already taken.

@Kha
Copy link

Kha commented Dec 9, 2021

Oh, and feel free to suggest something better than /-|, I chose the | in Coq because it was relatively easy to type with (* and because (**, (*!, and (*+ were already taken.

I don't mind /-|, though for consistency with docstrings /-- ... -/ and module docstrings /-! ... -/, it would be nice to use plain -/ as the terminator.

@insightmind insightmind marked this pull request as ready for review December 15, 2021 19:19
@insightmind
Copy link
Contributor Author

Btw, since Lean support line comments, we could consider using line comments for literate programming in Lean. That's how I did it in F* a while back, see https://github.com/FStarLang/fstar-mode.el/tree/master/etc/fslit/ . If you think that would be a good fit for Lean then we could import the state machine from fslit.

Oh, and feel free to suggest something better than /-|, I chose the | in Coq because it was relatively easy to type with (* and because (**, (*!, and (*+ were already taken.

We discussed the idea recently and came to the conclusion that multiline comments are a better fit for Lean. The reason we think so is that Lean already supports docstrings and module docstrings, as described above, using multiline comments. Literate programming using multiline comments is therefore more consistent with the current language.

@Kha
Copy link

Kha commented Feb 11, 2022

please ping me Saturday evening if you haven't heard back by then

Slightly later ping :)

Copy link
Owner

@cpitclaudel cpitclaudel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks excellent! I've been delaying this review for much too long because I thought it'd need quite a bit of time, and it took just 30 minutes to go through everything. Congrats!!

I left a few small comments. None of them should be time consuming to fix up.

Have you run the code through make lint to see if any complaints pop up?

Also, let's update the changelog, and make a release soon after we merge this.

Again, exceptional work! Thanks :)

alectryon/cli.py Outdated
@@ -378,6 +378,7 @@ def write_file(ext, strip):

EXTENSIONS_BY_LANGUAGE = {
"coq": (".v",),
"lean4": (".lean",),
"lean3": (".lean", ".lean3"),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With .lean appearing under both lean3 and lean4 lists the CODE_EXTENSIONS list below will contain .lean twice, which may cause issues. A quick review suggests that CODE_EXTENSIONS could simply be turned into a set. We should review other uses of EXTENSIONS_BY_LANGUAGE, too, like in the definition of FRONTENDS_BY_EXTENSION. It may be enough to remove the lean3 entry from that explicitly after constructing the list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would removing the .lean extension from the .lean3 case be enough?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, adjusted it.

.. lean4::

#check Nat

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove these three examples; that will be one less spot to update when we add other languages

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.


def annotate(self, chunks):
document = EncodedDocument(chunks, "\n", encoding="utf-8")
self.resolve_lake_arg()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this function needed now that there is a top-level argument this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it is required in order for us to figure out the working directory of LeanInk so all external references are loaded correctly. This behavior is partially handled by the Lean 4 driver.

self.resolve_lake_arg()
result = self.run_leanInk_document(document)

if not result:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this case correspond to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a guard statement to prevent an error that would happen when accessing the last element of an empty array.

last = result[-1]

# Sometimes we require an additonal \n and sometimes not. I wasn't really able to
# find out exactly when, but this workaround seems to work for almost all cases.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it work to ensure that there's a newline at the end of every fragment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if that is enough or correct tbh because we can't be sure if a single newline at the end of a fragment is enough, for example if we have a fragment that has two or more new lines at the end.
However LeanInk should ensure that fragments are generated so that there are no missing characters in between them, so these cases shouldn't be an issue.

I suspect that either the encoding or decoding process of LeanInks input or output causes the issue, or the Lean compiler trims the end of the file in some cases, which would cause the missing new line character.

@insightmind
Copy link
Contributor Author

Thank you so much for your thorough review.

I resolved the linter warnings, and adjusted the code based on your review + responded to the comments.

I have also added two new transforms in the transform.py file for Lean 4.
This is because of a recent change within LeanInk to support nested tactics.
Therefore I also regenerated the output files of all Lean 4 recipe examples.

Also let me know if there's anything else that might need some adjustments.

P.S.: Do you want to update the changelog after the merge or should I already update it in this PR?

Copy link
Owner

@cpitclaudel cpitclaudel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, a few very minor things, and I think we're ready to merge :) Thanks again for this excellent work!

@cpitclaudel cpitclaudel merged commit 032166f into cpitclaudel:master Jun 12, 2022
@cpitclaudel
Copy link
Owner

Hi folks, I finally found the time to go through the PR for a final cleanup, and to merge it! :) I had to rebase a few things, so I saved a copy of the unrebased branch to lean4-before-rebase in this repository.

@insightmind , big congratulations! :) And @Kha and @insightmind , thanks both a lot for your patience.

@insightmind , I've sent you an invitation to get you the commit bit, so if you want to make further changes to the lean part you should feel free to either make a PR or push directly to master.

Cheers!

@cpitclaudel
Copy link
Owner

I've installed Lean4 locally to test things, so I'm fairly confident that they work, but I'd love a last go-ahead from either of you before I make a release ^^

@Kha
Copy link

Kha commented Jun 18, 2022

diff says output is unchanged on the Lean 4 doc examples, looks good :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants