tests use lots of disk space after recent CockroachDB upgrade #1004
@bnaecker had reported under #988 that he was seeing test runs fail with ENOSPC in tmpfs. In chat he reported:
@jgallagher reported several CI failures due to running out of disk space: one, two. I'd noticed the ballast file stuff in the new release but hadn't connected it to this problem. These files are supposed to be used to recover from out-of-disk scenarios. The idea is that by reserving space up front that can be freed later, one can avoid using every last byte of disk space with no way to recover. Ironically, they don't work on ZFS. We discovered this because Ben said
I suspected this might be related to sparse files, so I truss'd
That's not sparse at all. Nor do I see any use of
With compression on (of any kind), zfs compresses zero-filled blocks:
This explains why I didn't notice this problem while digging into Ben's report on #988. I was checking the file's size on ZFS, and I run my tests with TMPDIR set to a ZFS directory, so I didn't see this. We got distracted a bit looking at std::fs::copy, which unfortunately does not preserve sparseness of files. We thought maybe the file had been created sparse in the seed directory, then copied in a way that didn't preserve sparseness when running the actual tests. But this isn't a real sparse file, so I think that's unrelated. So the current thinking is:
tl;dr: Under #988 we updated CockroachDB to v20.2.9, which includes automatic creation of ballast files. Under some conditions (including our CI runners and tests run with a normal tmpfs), this causes disk space usage in tmpfs on the order of 1 GiB per concurrent test. People are seeing the test suite fail with ENOSPC both locally and in CI. @jgallagher is looking at disabling ballast files for the test suite, since they serve no purpose there.
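For anyone who wants to check how much space a ballast file really occupies on a given filesystem, here is a minimal Rust sketch that compares a file's apparent size with the space allocated to it. The names `FileUsage` and `file_usage` are hypothetical and not part of this repository. The sketch assumes a Unix platform and relies on `st_blocks` being reported in 512-byte units.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Apparent size and allocated size of a file, both in bytes.
/// (Hypothetical helper type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FileUsage {
    /// Logical length of the file (`st_size`).
    pub apparent: u64,
    /// Space actually allocated on disk (`st_blocks` * 512).
    pub allocated: u64,
}

impl FileUsage {
    /// Builds a `FileUsage` from raw `st_size` and `st_blocks` values.
    /// `st_blocks` counts 512-byte units regardless of the filesystem
    /// block size.
    pub fn from_stat(size: u64, blocks: u64) -> Self {
        FileUsage {
            apparent: size,
            allocated: blocks.saturating_mul(512),
        }
    }

    /// True when the file occupies less space than its length suggests,
    /// which is what a sparse file or a compressed run of zeros looks like.
    pub fn uses_less_than_apparent(&self) -> bool {
        self.allocated < self.apparent
    }
}

/// Reads the metadata for `path` and reports both sizes.
pub fn file_usage(path: &Path) -> io::Result<FileUsage> {
    let meta = fs::metadata(path)?;
    Ok(FileUsage::from_stat(meta.len(), meta.blocks()))
}
```

Calling `file_usage` on the same ballast file in a compressed ZFS dataset and in tmpfs should make the difference described above visible: the apparent size is the same in both places, but only tmpfs is expected to show an allocated size close to it.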