Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stub page on NL translation #279

Closed
wants to merge 1 commit into from
Closed

stub page on NL translation #279

wants to merge 1 commit into from

Conversation

robertylewis
Copy link
Member

This will be linked to from leanprover-community/doc-gen#168

cc @zhangir-azerbayev , any suggestions for the wording?

@zhangir-azerbayev
Copy link

Here's a rough suggestion. Let me know if you want it to be shorter/longer or if there are parts that are unclear.

The mathlib docs is testing a beta feature that generates informal translation of mathlib theorem statements using OpenAI's Codex language model. To see the informal version of a theorem, click on the "informal statement" drop down below each lemma and theorem. These informal translations are automatically generated by a machine learning algorithm and should not be trusted.

We would greatly appreciate your feedback on the quality of informal translations so that we can improve this system in the future.

Codex is a causal language model, which means that given a string of text, its objective is to predict the next word. The model is trained to maximize next word prediction accuracy on the corpus of all Github source code. By iterating the next-word prediction task on a user's input, Codex can generate coherent text and code. To instruct Codex to perform the informal translation task, we used a technique called few shot prompting. Suppose we want the model to translate your favorite informal theorem into natural language. We would show the model a few (in our case, four) examples of formal-informal pairs, followed by the formal statement of your favorite theorem as:

Lean mathlib version:
theorem exists_le_sylow {p : ℕ} {G : Type*} [group G] {P : subgroup G}
(hP : is_p_group p P) :
∃ (Q : sylow p G), P ≤ Q :=
Translate the Lean mathlib version to a natural language version:
"Let $P$ be a $p$-subgroup of $G$. Then $P$ is contained in a Sylow $p$-subgroup of $G$."

Lean mathlib version:
theorem subset_of_open_subset_is_open (X : Type*) [topological_space X]
(A : set X) (hA : ∀ x ∈ A, ∃ U : set X, is_open U ∧ x ∈ U ∧ U ⊆ A):
is_open A :=
Translate the Lean mathlib version to a natural language version:
"Let $X$ be a topological space; let $A$ be a subset of $X$. Suppose that for each $x\in A$ there is an open set $U$ containing $x$ such that $U\subset A$. Then $A$ is open in $X$."

-- A few more examples go here

Lean mathlib version:
-- Your favorite formal theorem goes here

Picking up on the pattern, Codex will try to complete the text with an informal translation of your favorite formal theorem.

@PatrickMassot
Copy link
Member

It seems this PR did not converge and is no longer relevant, so I will close it. But do not hesitate to reopen if this project is still moving.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants