Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite makefile to be reliable #4034

Merged
merged 7 commits into from
Mar 9, 2021
Merged

Rewrite makefile to be reliable #4034

merged 7 commits into from
Mar 9, 2021

Conversation

Groxx
Copy link
Member

@Groxx Groxx commented Mar 3, 2021

Phase 2 of makefile cleanups!

I've added some simple docs to the makefile, hopefully they're easy to follow.
For higher-level docs and general Make education (e.g. why, exactly, I recommend so strongly about never "making" source files)... I need to find a location, and it'll take a little longer to write. I'm kinda fond of a makefile.md or architecture.md but I'm not yet sure where to put it.

But as a tl;dr, what this PR does is:

  • makes the high-level build steps explicit so they're easier to follow (right near the top)
  • switches to Make-only book-keeping files, so steps are reliably recalculated in all situations
  • moves tool-bins to a .bin folder so they are not deleted by default, and versions them more strictly (you can now switch between versions for "free", great for e.g. trying new protocs)
  • adds a reliable way to avoid codegen in buildkite / wherever it's needed or desired

As a bonus, this should now be safe to always run in parallel, which runs about 2x as fast for me:

> time make bins
...
make bins  104.18s user 38.68s system 310% cpu 45.941 total

> time make bins -j8
...
make bins -j8  108.43s user 53.00s system 717% cpu 22.489 total

I'll change Buildkite to do this in a bit, when it has been battle-tested a bit more.

Next (and final) phase is to make test-running a bit more normal, but that needs a bit more work.


As a brief summary about why book-keeping files are important, be aware of these two details:

  1. Make only looks at file modification times to determine what to do. If A depends on B and B is newer than A (or A does not exist), it'll re-build A.
  2. Git (and other tools) do not record modification times, to work better with Make. This is why touch go.mod doesn't lead to anything to commit - git doesn't record when you modified it, only what the modifications were.

Because of this, on e.g. a brand new git clone, all files are "the same" age... with some randomness about what ones are newer than others, within e.g. milliseconds. So as far as Make is concerned, a bunch of random files have been modified. If you had makefile rules like output.go: input.go \n do codegen, some of those output files may be newer than their inputs, so they may or may not be re-made.

That's pretty clearly wrong. The same kind of thing happens when you change branches - depending on what changed between your before/after commits, those files are now "new".

The way out of this madness is to rely on files that e.g. Git does not modify: these book-keeping files.

  • When cloning (or clean), they do not exist, so they must all be re-made.
  • When changing branches, any changed files are always newer than these book-keeping files, so they will always be re-made (when appropriate).

So that's what this PR does.

@coveralls
Copy link

coveralls commented Mar 3, 2021

Coverage Status

Coverage decreased (-0.02%) to 66.725% when pulling 950347a on rewrite_makefile into 1e8b738 on master.

@Groxx Groxx changed the title Rewrite makefile Rewrite makefile to be reliable Mar 6, 2021
@Groxx Groxx requested a review from a team March 6, 2021 03:49
@Groxx Groxx marked this pull request as ready for review March 6, 2021 03:50
Copy link
Contributor

@vytautas-karpavicius vytautas-karpavicius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Looking forward using this.

Makefile Outdated
# - Targets should never, EVER be *actual source files*.
# Always use book-keeping files in $(BUILD).
# Otherwise e.g. changing git branches could confuse Make about what it needs to do.
# - Simiarly, prerequisites should be those book-keeping files,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: Similarly

Makefile Outdated
#
# this is not fatal, we can just run 2x.
# to be fancier though, we can detect when *both* are run, and re-touch the book-keeping files to prevent the second run.
# this STRICTLY REQUIRES that `copyright` and `fmt` are mututally stable, and that copyright runs before fmt.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: mututally

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bleh. I should hunt down a spellchecker extension. kinda surprised vscode doesn't come with one tbh, or use the OS's.

Groxx added 7 commits March 9, 2021 00:21
Strict and correct and parallelizable.
I think it even reads better.

Next up: fix tests, and document.
I did a cleanup pass, tested a bunch, seems reliable and ready to share.

Just for funsies, on clean builds (though the go module cache is populated):
```
> time make bins
...
make bins  104.18s user 38.68s system 310% cpu 45.941 total

> time make bins -j8
...
make bins -j8  108.43s user 53.00s system 717% cpu 22.489 total
```
Twice as fast when parallelized!
@Groxx Groxx merged commit 1b3436c into master Mar 9, 2021
@Groxx Groxx deleted the rewrite_makefile branch March 9, 2021 01:00
vytautas-karpavicius pushed a commit that referenced this pull request Mar 15, 2021
Phase 2 of makefile cleanups!

As a tl;dr, what this PR does is:
- makes the high-level build steps explicit so they're easier to follow (right near the top)
- switches to Make-only book-keeping files, so steps are reliably recalculated in all situations
- moves tool-bins to a .bin folder so they are not deleted by default, and versions them more strictly (you can now switch between versions for "free", great for e.g. trying new protocs)
- adds a _reliable_ way to avoid codegen in buildkite / wherever it's needed or desired

As a bonus, this should now be safe to always run in parallel, which runs about 2x as fast for me:
```
> time make bins
...
make bins  104.18s user 38.68s system 310% cpu 45.941 total

> time make bins -j8
...
make bins -j8  108.43s user 53.00s system 717% cpu 22.489 total
```
I'll change Buildkite to do this in a bit, when it has been battle-tested a bit more.

Next (and final) phase is to make test-running a bit more normal, but that needs a bit more work.

---

As a brief summary about why book-keeping files are important, be aware of these two details:
1. Make _only_ looks at file modification times to determine what to do.  If A depends on B and B is newer than A (or A does not exist), it'll re-build A.
2. Git (and other tools) _do not_ record modification times, to work better with Make.  This is why `touch go.mod` doesn't lead to anything to commit - git doesn't record _when_ you modified it, only what the modifications were.

Because of this, on e.g. a brand new `git clone`, _all_ files are "the same" age... with some randomness about what ones are newer than others, within e.g. milliseconds.  So as far as Make is concerned, a bunch of _random_ files have been modified.  If you had makefile rules like `output.go: input.go \n do codegen`, some of those output files may be newer than their inputs, so they may or may not be re-made.

That's pretty clearly wrong.  The same kind of thing happens when you change branches - depending on what changed between your before/after commits, those files are now "new".

The way out of this madness is to rely on files that e.g. Git does _not_ modify: these book-keeping files.
- When cloning (or clean), they do not exist, so they must all be re-made.
- When changing branches, any changed files are _always_ newer than these book-keeping files, so they will _always_ be re-made (when appropriate).

So that's what this PR does.
yux0 pushed a commit to yux0/cadence that referenced this pull request May 4, 2021
Phase 2 of makefile cleanups!

As a tl;dr, what this PR does is:
- makes the high-level build steps explicit so they're easier to follow (right near the top)
- switches to Make-only book-keeping files, so steps are reliably recalculated in all situations
- moves tool-bins to a .bin folder so they are not deleted by default, and versions them more strictly (you can now switch between versions for "free", great for e.g. trying new protocs)
- adds a _reliable_ way to avoid codegen in buildkite / wherever it's needed or desired

As a bonus, this should now be safe to always run in parallel, which runs about 2x as fast for me:
```
> time make bins
...
make bins  104.18s user 38.68s system 310% cpu 45.941 total

> time make bins -j8
...
make bins -j8  108.43s user 53.00s system 717% cpu 22.489 total
```
I'll change Buildkite to do this in a bit, when it has been battle-tested a bit more.

Next (and final) phase is to make test-running a bit more normal, but that needs a bit more work.

---

As a brief summary about why book-keeping files are important, be aware of these two details:
1. Make _only_ looks at file modification times to determine what to do.  If A depends on B and B is newer than A (or A does not exist), it'll re-build A.
2. Git (and other tools) _do not_ record modification times, to work better with Make.  This is why `touch go.mod` doesn't lead to anything to commit - git doesn't record _when_ you modified it, only what the modifications were.

Because of this, on e.g. a brand new `git clone`, _all_ files are "the same" age... with some randomness about what ones are newer than others, within e.g. milliseconds.  So as far as Make is concerned, a bunch of _random_ files have been modified.  If you had makefile rules like `output.go: input.go \n do codegen`, some of those output files may be newer than their inputs, so they may or may not be re-made.

That's pretty clearly wrong.  The same kind of thing happens when you change branches - depending on what changed between your before/after commits, those files are now "new".

The way out of this madness is to rely on files that e.g. Git does _not_ modify: these book-keeping files.
- When cloning (or clean), they do not exist, so they must all be re-made.
- When changing branches, any changed files are _always_ newer than these book-keeping files, so they will _always_ be re-made (when appropriate).

So that's what this PR does.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants