Remove ModTime check during build (#5125) #5351

aschmois · 2020-07-27T19:55:50Z

As the issue (#5125) suggests, modified time makes using a stack cache in CI very difficult (if at all possible). This PR suggests making a change in how the build process marks a file as dirty.

Summary of changes:

- When checking the build cache, always create a digest and compare it against the cached entry, if one exists.
  - Note: let's discuss here whether to also check the size.
- When adding unlisted files to the build cache, don't check against the pre-build time.
  - This was a strange call to make to begin with. Looking into the commit history, it appears to have been a patch for an issue about Template Haskell (#838, "Dirtiness checking triggers unnecessary reinstalls WAS: stack build unnecessarily reinstalls packages from github"). Removing this check does not break that fix: I ran the two test cases mentioned in the issue with the latest stack version and with the modified one from this PR, and there was no regression.
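
For readers who want the first change spelled out, here is a minimal Python sketch of a digest-only dirtiness check. It is not Stack's code (Stack is written in Haskell): the names `file_digest`, `is_dirty`, and `demo`, the path-to-digest dictionary, and the choice of SHA-256 are all assumptions made for illustration. It shows that a file whose modification time changes, as can happen when a CI cache is restored, still counts as clean while its contents are unchanged.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents (hash choice is an assumption)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def is_dirty(path, cache):
    """A file is dirty if it is missing from the cache or its digest differs."""
    cached = cache.get(path)
    return cached is None or cached != file_digest(path)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "Lib.hs")
        with open(src, "w") as f:
            f.write("module Lib where\n")
        cache = {src: file_digest(src)}  # hypothetical build cache: path -> digest

        # Simulate a cache restore: same contents, different modification time.
        os.utime(src, (1_000_000_000, 1_000_000_000))
        after_restore = is_dirty(src, cache)

        # A real edit changes the contents and therefore the digest.
        with open(src, "a") as f:
            f.write("x = 1\n")
        after_edit = is_dirty(src, cache)
        return after_restore, after_edit
```

Calling `demo()` returns a pair of booleans: whether the file is dirty after the timestamp change, and whether it is dirty after the edit.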

Questions and comments:

- I wasn't able to figure out the styling; hindent seems to change too many lines. I tried to keep it as stable as possible.
- After this change, mod time becomes obsolete (no uses). Should we remove it? I wasn't sure how that would affect the JSON instance supplied for FileCacheInfo.
- If checking against size is not required, removing size would also clean up the process a lot more (no need for composed tuples).
- In terms of performance, I built our team's current library, with about 1500 modules, and there is a negligible time difference between the runs. Creating the digest is pretty fast! I do realize, however, that I ran this on a beefy machine, so it would be nice to get some insight here.

aschmois · 2020-07-28T12:01:34Z

I don't think the Windows integration test failure is related to this change. It looks like it failed in the dependencies section.

snoyberg · 2020-08-02T10:35:57Z

I would prefer to augment rather than remove the modtime check. I would want the heuristics to be something like:

- If the file modification time is older, assume that the file is the same
- If the file modification time is newer, then check the digest and make a determination

That should bypass most of the slowdown potentially introduced by digest checks, since they would only occur (1) when creating the cache and (2) when the file mod times are incorrect.
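
One possible reading of this heuristic, sketched in Python purely for illustration: it is not code from Stack or from this PR. The `CacheEntry` record, its field names, and the choice to compare against the modification time stored in the cache are assumptions.

```python
import hashlib
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    """Hypothetical cache record; the field names are illustrative."""
    mod_time: float
    digest: str


def file_digest(path):
    """SHA-256 of the file contents (hash choice is an assumption)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def is_dirty(path, entry):
    # Cheap path: timestamp is not newer than the cached one, so assume unchanged.
    if os.path.getmtime(path) <= entry.mod_time:
        return False
    # Expensive path: hash only when the timestamp looks newer.
    return file_digest(path) != entry.digest


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "Lib.hs")
        with open(src, "w") as f:
            f.write("module Lib where\n")
        os.utime(src, (1_000_000_000, 1_000_000_000))
        entry = CacheEntry(os.path.getmtime(src), file_digest(src))

        same_time = is_dirty(src, entry)  # timestamp not newer: no hashing

        os.utime(src, (1_000_000_100, 1_000_000_100))
        newer_same_bytes = is_dirty(src, entry)  # newer timestamp, same contents

        with open(src, "a") as f:
            f.write("x = 1\n")
        os.utime(src, (1_000_000_200, 1_000_000_200))
        newer_new_bytes = is_dirty(src, entry)  # newer timestamp, new contents
        return same_time, newer_same_bytes, newer_new_bytes
```

Under this reading, a file whose contents changed but whose timestamp is not newer than the cached one would be treated as unchanged. That is the trade-off of trusting the timestamp on the cheap path.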

tfausak · 2020-08-04T02:33:44Z

I worked with @aschmois on this. I expected there to be a slowdown from replacing the modification time check with a digest comparison. I was surprised to find that if there was a difference, it was lost in the noise (at least for our project). Is there a large project we could test the relative performance with?

Conceptually I like the idea of only relying on the file's contents rather than its metadata. But also of course a nice concept isn't any good if it's too slow.

snoyberg · 2020-08-04T02:38:51Z

I would test on a monorepo like yesodweb/yesod. But even if testing shows no major performance impact, I'd still be worried that some users, based on type of hard drive, file system, cache settings, etc, would end up having a negative impact.

aschmois · 2020-08-04T12:08:21Z

I can try to test on different configurations and post the findings. I have a few different devices I can build on.

aschmois · 2020-08-04T20:49:14Z

With the help of @gera-cameron I ran the tests below. Please let me know if something looks off about the testing methods.

AWS EC2 Testing

Ran on EC2 m5.large instances. We specifically avoided the burstable instance types to get stable build times. One test ran on spinning disks and another on SSD.

Up to 3.1 GHz Intel Xeon® Platinum 8175M processors with new Intel Advanced Vector Extension (AVX-512) instruction set.

https://aws.amazon.com/ec2/instance-types/

Setup:

$ curl -sSL https://get.haskellstack.org/ | sh # install 2.3.1
$ git clone --recurse-submodules http://github.com/yesodweb/yesod
$ cd yesod && stack test && cd - # download ghc and build dependencies
$ git clone http://github.com/aschmois/stack # clone PR
$ cd stack && stack test --copy-bins --local-bin-path bin --ghc-options '-O2' && cd - # build stack and copy binary into ./bin/
$ cp stack/bin/stack yesod/stackx && chmod u+x yesod/stackx # copy modified stack binary as stackx and make it executable
$ cd stack && git checkout b5d30906ebee25df1f2532255e245d329083b623 && stack test --copy-bins --local-bin-path bin --ghc-options '-O2' && cd - # build unmodified stack
$ cp stack/bin/stack yesod/stackz && chmod u+x yesod/stackz # copy unmodified stack binary as stackz and make it executable

Test 1

Each stack binary test started from a cold boot and then ran sequentially. We don't expect the numbers to change wildly here, since every build starts from a clean state.

Yesod Build Stack unmodified:

$ stack clean --full && TIMEFORMAT='%6R'; time ./stackz build

Yesod Build Stack modified:

$ stack clean --full && TIMEFORMAT='%6R'; time ./stackx build

Results

All results are in seconds

OS	EC2 instance	vCPU	RAM (GiB)	EBS volume	version	x1 (no I/O cache)	x2	x3	avg
Ubuntu 20.04	m5.large	2	8	standard (magnetic)	unmodified	130.105	107.735	108.894	115.578
Ubuntu 20.04	m5.large	2	8	standard (magnetic)	modified	128.374	109.875	105.994	114.738
Ubuntu 20.04	m5.large	2	8	gp2 (ssd) 100 iops	unmodified	109.004	103.735	104.614	105.785
Ubuntu 20.04	m5.large	2	8	gp2 (ssd) 100 iops	modified	108.405	104.605	104.024	105.678

Summary

All of these numbers are within the margin of error of each other, and we can assume that the same process is happening on each build. This is because the digest is calculated every time, since the project was cleaned before each build.

Test 2

This is where we expect things to show differences since no stack clean is done after the first one.

Yesod Build Stack unmodified:

$ stack clean --full && ./stackz build
$ export TIMEFORMAT='%6R'
$ time ./stackz build
$ time ./stackz build
$ time ./stackz build

Yesod Build Stack modified:

$ stack clean --full && ./stackx build
$ export TIMEFORMAT='%6R'
$ time ./stackx build
$ time ./stackx build
$ time ./stackx build

Results

All results are in seconds

OS	EC2 instance	vCPU	RAM (GiB)	EBS volume	version	x1	x2	x3	avg
Ubuntu 20.04	m5.large	2	8	standard (magnetic)	unmodified	0.614	0.614	0.604	0.611
Ubuntu 20.04	m5.large	2	8	standard (magnetic)	modified	0.626	0.635	0.634	0.632
Ubuntu 20.04	m5.large	2	8	gp2 (ssd) 100 iops	unmodified	0.594	0.594	0.594	0.594
Ubuntu 20.04	m5.large	2	8	gp2 (ssd) 100 iops	modified	0.606	0.604	0.604	0.605

Summary

We see a ~21 ms difference on the magnetic drive and a ~11 ms difference on the SSD when calculating digests on a very constrained machine.
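
The arithmetic behind those differences can be reproduced from the Test 2 averages. A small Python sketch follows; the dictionary layout and the `overhead` helper are illustrative, and only the four averages come from the results table.

```python
# Averages (seconds) copied from the Test 2 results table.
avg = {
    ("magnetic", "unmodified"): 0.611,
    ("magnetic", "modified"): 0.632,
    ("ssd", "unmodified"): 0.594,
    ("ssd", "modified"): 0.605,
}


def overhead(disk):
    """Return (absolute overhead in ms, relative overhead in %) for one disk type."""
    base = avg[(disk, "unmodified")]
    new = avg[(disk, "modified")]
    return round((new - base) * 1000), round((new - base) / base * 100, 1)


for disk in ("magnetic", "ssd"):
    ms, pct = overhead(disk)
    print(f"{disk}: +{ms} ms ({pct}%)")
# prints:
# magnetic: +21 ms (3.4%)
# ssd: +11 ms (1.9%)
```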

Based on these results, I think we should only use digests, since mod times can introduce more bugs (such as the CI one I ran into) and performance does not seem to be significantly affected.

snoyberg · 2020-08-05T03:53:55Z

Based on the massive communication disconnect, I think there's a fundamental misunderstanding here. I think I understand it, but before merging, let's confirm. I said above:

> I would prefer to augment rather than remove the modtime check.

followed by:

> But even if testing shows no major performance impact, I'd still be worried that some users, based on type of hard drive, file system, cache settings, etc, would end up having a negative impact.

Given that despite these comments, you've moved ahead with a whole bunch of performance testing, I think you're trying to imply:

> You're wrong Michael, augmenting is not an option, so let me try to convince you with overwhelming evidence that the performance impact is minimal.

Am I reading this conversation correctly?

tfausak · 2020-08-05T11:46:10Z

We don't think you're wrong, and we're not trying to overwhelm you with benchmarks.

Like you, we expected the performance of this digest-based approach to be worse than the current approach based on file modification times. After working on this patch we were pleasantly surprised to find that it made effectively no impact whatsoever for our use case. Since your primary concern appeared to be performance, we tried to run a benchmark where we stacked the deck against ourselves: a large project, shared hosting, an underpowered machine, and spinning disks. Even with those unfavorable conditions we only saw about a 20 ms penalty, which is about 3% of the build time.

Augmenting is very much an option. However, we thought that the approach presented in this PR is conceptually simpler and results in code that's easier to read, so we figured it was worth a shot. If you're saying that this approach is dead in the water even if the performance impact is minimal, fine.

snoyberg · 2020-08-05T12:03:50Z

Sorry, I didn't mean to imply you're overwhelming me, I just meant that the evidence is overwhelmingly in favor of what you're saying.

I am still slightly concerned, but I'm willing to take this as-is and see if anyone complains about their hard drives thrashing later. Thank you!

snoyberg · 2020-08-05T12:04:18Z

Sorry, just one more request: can you update the ChangeLog?

aschmois · 2020-08-05T12:13:13Z

I'd like to try to smooth out the situation. In no way do I want to attack anyone or do any harm to this code base. I apologise if I came across like that; I can be a little cold sometimes when discussing code. I thought making fancy benchmarks would make a better case for the code, not a worse one.

With that out of the way: if we are moving forward with this, I'd like to know whether you think we can also remove the mod time and size (if we don't need them as part of the digest check). I'd like to do so to clean up the tuple composing.

As an afterthought: augmenting the mod time check does seem to be the best option in terms of safety, but after looking at the benchmarks I think relying on the data itself can avoid some future bugs. I've always seen horror stories around mod time checks for caching, and I have honestly been battling with this bug for over a year; not directly, but it has been at the back of my mind for that long 😅. I really want what's best for stack, not just any random code! Please let me know what you think.

snoyberg · 2020-08-05T12:29:23Z

Really, honestly, nothing to apologize for. I just wanted to home in on whether there was a correctness issue I was missing, or if there was a different reason. Taylor clarified the situation to my satisfaction in the comment above.

IIUC, we're not using the filesize or timestamp at all, so please do feel free to update the PR by removing them. I'm also not a fan of anonymous tuples in general, so either removing the multiple-data-returns or creating a custom datatype if they are needed would be great.

Thanks!

snoyberg

Thanks!

Andres Schmois added 2 commits July 27, 2020 19:45

Remove ModTime check during build (#5125)

d6a6f52

Make hlint happy

e1e81ca

Andres Schmois added 2 commits August 5, 2020 13:16

Update changelog

151c39c

Remove modtime and size from build cache

4ced806

snoyberg approved these changes Aug 6, 2020

snoyberg merged commit 94ec44a into commercialhaskell:master Aug 6, 2020

aschmois mentioned this pull request Aug 6, 2020

Cache being busted after restoring directory #5125

Closed

aschmois mentioned this pull request Sep 11, 2020

Upgrade Stack EdutainmentLIVE/docker-stack#3

Closed

hdgarrood mentioned this pull request Feb 26, 2021

Update the build cache after building every module purescript/purescript#3996

Open
