
Quick question #3150

Closed
GinoGiotto opened this issue Apr 29, 2023 · 16 comments

Comments

@GinoGiotto
Contributor

As you've probably noticed, I've lately been shortening proofs with the minimizer, and I've been splitting the job into multiple pull requests to keep the review process humanly feasible.

My question is: would you like a big pull request nicely separated into multiple commits? Each of them would correspond to an analysed section of set.mm, and they would be nicely ordered without intermediate commits (such as fixes or mistake corrections). For me this would make the job quicker.

As usual the commits would only contain proofs automatically minimized with the command MINIMIZE_WITH * /ALLOW_NEW_AXIOMS * /NO_NEW_AXIOMS_FROM ax-* and nothing else. Let me know if you like this proposal.

@david-a-wheeler
Member

Overall, yes. In general it's best to divide things into multiple commits. That makes it easier to review, for one thing. Also, if someone changes something else, that may create a merge conflict; merge conflicts are easier to deal with when the changes are smaller.

Thanks for being willing to work on this!

@GinoGiotto
Contributor Author

Thanks for the answer! But maybe I did not express myself properly.

I meant doing fewer pull requests, but with many commits inside them. Each pull request would be very big (much bigger than the ones I did recently), but since they would contain many commits they would still be reviewable.

The reason I ask is that, at the current rate, the estimated number of pull requests would be around 40, which is a bit time-consuming; with the proposed method I would reduce them to only 2 or 3.

@david-a-wheeler
Member

I'd much rather have many separate pull requests, each of which is smaller. Smaller changes are easier to process. A PR with many changes increases the likelihood of a merge conflict. Also, even if there's a merge conflict, with smaller changesets we can easily merge the PRs that don't conflict.

PRs don't cost anything. You can have 40 PRs instead of 2; that sounds great and I'd prefer it.

Anyone else?

@wlammen
Contributor

wlammen commented Apr 30, 2023

Apart from technical problems like merge conflicts, I'd like to add: I do not work as a professional for Metamath, which means I need to find spare time to do a review, for instance. It is far easier for me to set 15 min aside for a small task than several hours for a big one. In addition a big task may cause my attention to decrease over time, so the scrutiny will suffer in the end.

I am much in favor of smaller PRs.

@GinoGiotto
Contributor Author

Alright then. By the way, I was looking at this pull request #3049 and found this suggestion #3049 (review). My pull requests are becoming quite big too despite the splitting, so maybe an automatic check for axiom usage could be helpful? I don't know how to implement it, though.

@david-a-wheeler
Member

Automated checks for changes to axiom use sound like a good idea. However, as I noted:

We definitely need better tooling for this. The verifiers currently only look at "current state" instead of comparing state. There's probably a way to do that, I just haven't checked.

The CI/CD tools can only see "current state", not "previous state". We work around this with "discouraged" uses by storing, in the repo itself, the currently-allowed uses of discouraged statements (and thus we can see when it changes). We could use the same trick for axioms, but I don't think we should, because in most cases we expect the axiom list to change.

We could create a simple system that outputs, for each statement, the set of axioms ax-* that it depends on. Run that command before and after the simplification, and the results should be the same. Then we'd have confidence that nothing accidentally went wrong. Sound reasonable?

@GinoGiotto
Contributor Author

We could create a simple system that output, for each statement, the set of axioms ax-* that it depends on.

I just discovered that with the match option we could do exactly that. So the command:
SHOW TRACE_BACK xxx /AXIOM /MATCH ax-* will only return the true axioms used by the xxx statement without showing definitions, which is quite handy.

Overall this sounds good to me. There are some steps in the procedure that I'm not sure how to implement, but that's probably because of my ignorance/inexperience.

I have three beginner questions:

  • How can we check axiom usage before the simplification? (Can we run the commands for the previous commit?)
  • How will newly added or newly removed theorems be handled?
  • How can we tell the tools to run the axiom check only on the theorems modified in the PR? (I think using the wildcard * would be quite slow.)
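For the third question, one approach is to scrape the labels of changed statements out of the PR's diff. A rough sketch (assuming set.mm's usual layout where a statement line begins with its label followed by `$p`; `changed_proof_labels` is a hypothetical helper, not an existing tool):

```python
import re

# Matches added/removed "label $p" lines in a unified diff of set.mm.
# Assumption: the label begins the statement line, as in set.mm's usual style.
LABEL_RE = re.compile(r'^[+-]\s*([A-Za-z0-9._-]+)\s+\$p\b')

def changed_proof_labels(diff_text):
    """Return the sorted labels of $p statements touched by a unified diff."""
    labels = set()
    for line in diff_text.splitlines():
        m = LABEL_RE.match(line)
        if m:
            labels.add(m.group(1))
    return sorted(labels)
```

The output could then be fed to the axiom check, e.g. by piping `git diff develop -- set.mm` into this function instead of checking every statement with the wildcard *.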

@david-a-wheeler
Member

On a POSIX system (Linux or macOS), do this.

Before the change, run this command to store results in ,old:

 metamath 'read set.mm' 'set scroll continuous' 'show trace_back * /axiom /match ax-*' > ,old

After the change, run the same command but store it in ,new:

 metamath 'read set.mm' 'set scroll continuous' 'show trace_back * /axiom /match ax-*' > ,new

Now show the differences:

diff -u ,old ,new

Warning: in metamath-exe, running show trace_back on every statement takes a long time. It starts from each statement and works backward. That process could be made much faster by caching intermediate results, but metamath doesn't currently do that. But if you don't mind it taking a while, it would certainly work, and we could optimize it in the future if desired.
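The ,old/,new comparison can also be done per statement rather than with a raw diff. A sketch, assuming the dumps have first been condensed to one `label: ax-... ax-...` line per statement (an assumed intermediate format, not metamath-exe's literal trace_back output):

```python
def parse_dump(text):
    """Parse 'label: ax-1 ax-2 ...' lines into {label: frozenset(axioms)}."""
    deps = {}
    for line in text.splitlines():
        if ':' not in line:
            continue  # skip anything that isn't a label line
        label, axioms = line.split(':', 1)
        deps[label.strip()] = frozenset(axioms.split())
    return deps

def axiom_diff(old_text, new_text):
    """Return {label: (old_axioms, new_axioms)} for statements whose
    axiom set changed between the two dumps (None = statement absent)."""
    old, new = parse_dump(old_text), parse_dump(new_text)
    changed = {}
    for label in old.keys() | new.keys():
        if old.get(label) != new.get(label):
            changed[label] = (old.get(label), new.get(label))
    return changed
```

An empty result would mean the minimization changed no statement's axiom dependencies, which is exactly the property the review wants to confirm.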

@david-a-wheeler
Member

How can we tell the tools to run the axiom check to only the theorems modified in the PR? (I think using the wildcard * would be quite slow)

You'd need to identify the modified theorems. You could then restrict the check to just those, for instance by specifying a range of statements.

Another approach is to not worry about it, and just let it run a long time (the "What, me worry?" approach).

@icecream17
Contributor

icecream17 commented Apr 30, 2023

For curiosity's sake I ran time metamath "set scroll c" "read set.mm" "show trace * /match ax-* /ax" exit and got

real    28m48.232s
user    21m30.267s
sys     0m4.019s

I was doing other stuff while waiting, and this is an outdated version of metamath-exe, but that's about 25.17 theorems per second.

@david-a-wheeler
Member

david-a-wheeler commented Apr 30, 2023

Sounds right, it takes about 30 minutes.

Since this is not an analysis we need to do often, I think we can live with it. Do a before & after, verify it didn't change anything important, and off we go. It doesn't need to be part of the CI/CD pipeline; in fact, it can't be, because it's too slow.

We could make this nearly instantaneous in metamath-exe by caching results as we went. Currently each theorem has to re-figure-out the answers for everything it depends on, so the same answers get recomputed many times. That's not hard to cache, but it's an extra step (along with a little more memory), so that's only worth doing if we do it often.
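The caching idea can be sketched with memoization. A toy model (the `direct_refs` map and names here are illustrative, not metamath-exe internals), where each statement maps to the labels its proof directly references:

```python
from functools import lru_cache

def axiom_closure(direct_refs):
    """direct_refs: {label: [labels its proof references directly]}.
    Axioms (and other leaves) reference nothing; only ax-* leaves count.
    Returns {label: frozenset of ax-* labels it ultimately depends on}."""
    @lru_cache(maxsize=None)
    def axioms_of(label):
        refs = direct_refs.get(label, [])
        if not refs:
            # A leaf: count it only if it's an axiom (definitions etc. don't count).
            return frozenset({label}) if label.startswith('ax-') else frozenset()
        out = set()
        for ref in refs:
            out |= axioms_of(ref)  # shared subtrees are computed only once
        return frozenset(out)
    return {label: axioms_of(label) for label in direct_refs}
```

Because `axioms_of` is cached, each statement's answer is computed once and reused, instead of re-walking the same dependency subtrees for every theorem, which is why the cached version can be orders of magnitude faster than restarting the trace from each statement.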

@digama0
Member

digama0 commented May 1, 2023

This is very trivial to implement in metamath-knife, I would suggest doing it there if you care about the performance cost.

30 minutes is IMO absolutely unacceptable; the axiom use list for every theorem in set.mm can easily be done in <1s.

@digama0
Member

digama0 commented May 1, 2023

I added a mode to do this to metamath-knife. As predicted it is quite efficient: I can generate the whole file in about 200ms, resulting in a 9MB axiom use file for set.mm.

@GinoGiotto
Contributor Author

I added a mode to do this to metamath-knife. As predicted it is quite efficient: I can generate the whole file in about 200ms, resulting in a 9MB axiom use file for set.mm.

Thanks! This will be very helpful. I will be able to quickly check axiom usage myself without wasting reviewers' time. I'll think about a system to report the results on GitHub so you won't have to "trust" me; I'll let you know.

@GinoGiotto
Contributor Author

An idea could be to collect all minimization commits in a single branch of my repository, so that there won't be any need to check axiom usage 40 times in the future. It would suffice to scan only that branch, so all future pull requests would already be verified.

I could also add to my fork the axiom list after all minimizations and compare it with the file generated from the set.mm database in my develop branch. What do you think?

@GinoGiotto
Contributor Author

I am closing this issue since the questions I raised have been addressed and solutions have been provided. It also seems unlikely that this discussion needs further development.

Thanks for your participation!
