Commit 8ceea01

Auto merge of #88362 - pietroalbini:bump-stage0, r=Mark-Simulacrum
Pin bootstrap checksums and add a tool to update it automatically

:warning: :warning: This is just a proactive hardening we're performing on the build system, and it's not prompted by any known compromise. If you're aware of security issues being exploited please [check out our responsible disclosure page](https://www.rust-lang.org/policies/security). ⚠️ ⚠️

---

This PR aims to improve Rust's supply chain security by pinning the checksums of the bootstrap compiler downloaded by `x.py`, preventing a compromised `static.rust-lang.org` from affecting builds of the compiler. The checksums are stored in `src/stage0.json`, which replaces `src/stage0.txt`. This PR also adds a tool to automatically update the bootstrap compiler.

The changes in this PR were originally discussed in [Zulip](https://zulip-archive.rust-lang.org/stream/241545-t-release/topic/pinning.20stage0.20hashes.html).

## Potential attack

Before this PR, an attacker who wanted to compromise the bootstrap compiler would "just" need to:

1. Gain write access to `static.rust-lang.org`, either by compromising DNS or the underlying storage.
2. Upload compromised binaries and corresponding `.sha256` files to `static.rust-lang.org`.

There is no signature verification in `x.py` as we don't want the build system to depend on GPG. Also, since the checksums were not pinned inside the repository, they were downloaded from `static.rust-lang.org` too: this only protected against accidental changes in `static.rust-lang.org` that didn't change the `*.sha256` files. The attack would allow the attacker to compromise past and future invocations of `x.py`.

## Mitigations introduced in this PR

This PR adds pinned checksums for all the bootstrap components in `src/stage0.json` instead of downloading the checksums from `static.rust-lang.org`. This changes the attack scenario to:

1. Gain write access to `static.rust-lang.org`, either by compromising DNS or the underlying storage.
2. Upload compromised binaries to `static.rust-lang.org`.
3. Land a (reviewed) change in the `rust-lang/rust` repository changing the pinned hashes.

Even with a successful attack, existing clones of the Rust repository won't be affected, and once the attack is detected, reverting the pinned hash changes should be enough to be protected from the attack. This also enables further mitigations to be implemented in following PRs, such as verifying signatures when pinning new checksums (removing the trust-on-first-use aspect of this PR) and adding a check in CI making sure a PR updating the checksum has not been tampered with (see the future improvements section).

## Additional changes

There are additional changes implemented in this PR to enable the mitigation:

* The `src/stage0.txt` file has been replaced with `src/stage0.json`. The reasoning for the change is that there is existing tooling to read and manipulate JSON files, unlike the custom format we were using before, and the slight challenge of manually editing JSON files (no comments, no trailing commas) is not a problem thanks to the new `bump-stage0`.

* A new tool has been added to the repository, `bump-stage0`. When invoked, the tool automatically calculates which release should be used as the bootstrap compiler given the current version and channel, gathers all the relevant checksums and updates `src/stage0.json`. The tool can be invoked by running:

  ```
  ./x.py run src/tools/bump-stage0
  ```

* Support for downloading releases from `https://dev-static.rust-lang.org` has been removed, as it's not possible to verify checksums there (it's customary to replace existing artifacts there if a rebuild is warranted). This will require a change to the release process to avoid bumping the bootstrap compiler on beta before the stable release.

## Future improvements

* Add signature verification as part of `bump-stage0`, which would require the attacker to also obtain the release signing keys in order to successfully compromise the bootstrap compiler. This would be fine to add now, as the burden of installing the tool to verify signatures would only be placed on whoever updates the bootstrap compiler, instead of everyone compiling Rust.

* Add a check on CI that ensures the checksums in `src/stage0.json` are the expected ones. If a PR changes the stage0 file, CI should also run the `bump-stage0` tool and fail if the output in CI doesn't match the committed file. This prevents the PR author from tweaking the output of the tool manually, which would otherwise be close to impossible for a human to detect.

* Automate creating the PRs bumping the bootstrap compiler, by setting up a scheduled job in GitHub Actions that runs the tool and opens a PR.

* Investigate whether a similar mitigation can be done for "download from CI" components like the prebuilt LLVM.

r? `@Mark-Simulacrum`
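The difference between the two attack scenarios described above can be illustrated with a small, self-contained sketch. It is not the code from this PR: the helper names (`sha256_hex`, `verify_pinned`, `is_accepted`), the artifact path and the payload bytes are all hypothetical, and the real lookup lives in `get()` in `src/bootstrap/bootstrap.py`. The point is only that the expected hash comes from a mapping kept in the repository, so replacing the artifact on the server is not enough to pass verification.

```python
import hashlib


def sha256_hex(data):
    """Return the hex-encoded SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(url, payload, pinned):
    """Raise RuntimeError unless `payload` matches the checksum pinned for `url`.

    `pinned` plays the role of the `checksums_sha256` mapping in
    src/stage0.json: relative artifact path -> expected hex digest.
    """
    if url not in pinned:
        raise RuntimeError("no pinned checksum for {}".format(url))
    if sha256_hex(payload) != pinned[url]:
        raise RuntimeError("checksum mismatch for {}".format(url))


def is_accepted(url, payload, pinned):
    """Convenience wrapper returning True or False instead of raising."""
    try:
        verify_pinned(url, payload, pinned)
        return True
    except RuntimeError:
        return False


# Hypothetical artifact path and contents, used only for illustration.
URL = "dist/2021-01-01/example-component.tar.xz"
GENUINE = b"genuine bootstrap tarball"
PINNED = {URL: sha256_hex(GENUINE)}

if __name__ == "__main__":
    print(is_accepted(URL, GENUINE, PINNED))          # genuine artifact
    print(is_accepted(URL, b"tampered", PINNED))      # replaced on the server
    print(is_accepted("dist/other", GENUINE, PINNED)) # artifact never pinned
```

An attacker who controls only the download server can change the payload but not `PINNED`, which is why the third step (landing a reviewed change to the pinned hashes) becomes necessary.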
2 parents 13db844 + ea8b1ff commit 8ceea01

13 files changed, +677 -149 lines changed

Cargo.lock

+13
@@ -220,6 +220,18 @@ dependencies = [
 name = "build_helper"
 version = "0.1.0"

+[[package]]
+name = "bump-stage0"
+version = "0.1.0"
+dependencies = [
+ "anyhow",
+ "curl",
+ "indexmap",
+ "serde",
+ "serde_json",
+ "toml",
+]
+
 [[package]]
 name = "byte-tools"
 version = "0.3.1"
@@ -1663,6 +1675,7 @@ checksum = "bc633605454125dec4b66843673f01c7df2b89479b32e0ed634e43a91cff62a5"
 dependencies = [
  "autocfg",
  "hashbrown",
+ "serde",
 ]

 [[package]]

Cargo.toml

+1
@@ -35,6 +35,7 @@ members = [
   "src/tools/expand-yaml-anchors",
   "src/tools/jsondocck",
   "src/tools/html-checker",
+  "src/tools/bump-stage0",
 ]

 exclude = [

src/bootstrap/bootstrap.py

+70-69
@@ -4,6 +4,7 @@
 import datetime
 import distutils.version
 import hashlib
+import json
 import os
 import re
 import shutil
@@ -24,19 +25,17 @@ def support_xz():
     except tarfile.CompressionError:
         return False

-def get(url, path, verbose=False, do_verify=True):
-    suffix = '.sha256'
-    sha_url = url + suffix
+def get(base, url, path, checksums, verbose=False, do_verify=True):
     with tempfile.NamedTemporaryFile(delete=False) as temp_file:
         temp_path = temp_file.name
-    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as sha_file:
-        sha_path = sha_file.name

     try:
         if do_verify:
-            download(sha_path, sha_url, False, verbose)
+            if url not in checksums:
+                raise RuntimeError("src/stage0.json doesn't contain a checksum for {}".format(url))
+            sha256 = checksums[url]
             if os.path.exists(path):
-                if verify(path, sha_path, False):
+                if verify(path, sha256, False):
                     if verbose:
                         print("using already-download file", path)
                     return
@@ -45,23 +44,17 @@ def get(url, path, verbose=False, do_verify=True):
                         print("ignoring already-download file",
                               path, "due to failed verification")
                     os.unlink(path)
-        download(temp_path, url, True, verbose)
-        if do_verify and not verify(temp_path, sha_path, verbose):
+        download(temp_path, "{}/{}".format(base, url), True, verbose)
+        if do_verify and not verify(temp_path, sha256, verbose):
             raise RuntimeError("failed verification")
         if verbose:
             print("moving {} to {}".format(temp_path, path))
         shutil.move(temp_path, path)
     finally:
-        delete_if_present(sha_path, verbose)
-        delete_if_present(temp_path, verbose)
-
-
-def delete_if_present(path, verbose):
-    """Remove the given file if present"""
-    if os.path.isfile(path):
-        if verbose:
-            print("removing", path)
-        os.unlink(path)
+        if os.path.isfile(temp_path):
+            if verbose:
+                print("removing", temp_path)
+            os.unlink(temp_path)


 def download(path, url, probably_big, verbose):
@@ -98,14 +91,12 @@ def _download(path, url, probably_big, verbose, exception):
                 exception=exception)


-def verify(path, sha_path, verbose):
+def verify(path, expected, verbose):
     """Check if the sha256 sum of the given path is valid"""
     if verbose:
         print("verifying", path)
     with open(path, "rb") as source:
         found = hashlib.sha256(source.read()).hexdigest()
-    with open(sha_path, "r") as sha256sum:
-        expected = sha256sum.readline().split()[0]
     verified = found == expected
     if not verified:
         print("invalid checksum:\n"
@@ -176,15 +167,6 @@ def require(cmd, exit=True):
         sys.exit(1)


-def stage0_data(rust_root):
-    """Build a dictionary from stage0.txt"""
-    nightlies = os.path.join(rust_root, "src/stage0.txt")
-    with open(nightlies, 'r') as nightlies:
-        lines = [line.rstrip() for line in nightlies
-                 if not line.startswith("#")]
-        return dict([line.split(": ", 1) for line in lines if line])
-
-
 def format_build_time(duration):
     """Return a nicer format for build time

@@ -372,13 +354,22 @@ def output(filepath):
     os.rename(tmp, filepath)


+class Stage0Toolchain:
+    def __init__(self, stage0_payload):
+        self.date = stage0_payload["date"]
+        self.version = stage0_payload["version"]
+
+    def channel(self):
+        return self.version + "-" + self.date
+
+
 class RustBuild(object):
     """Provide all the methods required to build Rust"""
     def __init__(self):
-        self.date = ''
+        self.checksums_sha256 = {}
+        self.stage0_compiler = None
+        self.stage0_rustfmt = None
         self._download_url = ''
-        self.rustc_channel = ''
-        self.rustfmt_channel = ''
         self.build = ''
         self.build_dir = ''
         self.clean = False
@@ -402,11 +393,10 @@ def download_toolchain(self, stage0=True, rustc_channel=None):
         will move all the content to the right place.
         """
         if rustc_channel is None:
-            rustc_channel = self.rustc_channel
-        rustfmt_channel = self.rustfmt_channel
+            rustc_channel = self.stage0_compiler.version
         bin_root = self.bin_root(stage0)

-        key = self.date
+        key = self.stage0_compiler.date
         if not stage0:
             key += str(self.rustc_commit)
         if self.rustc(stage0).startswith(bin_root) and \
@@ -445,19 +435,23 @@ def download_toolchain(self, stage0=True, rustc_channel=None):

         if self.rustfmt() and self.rustfmt().startswith(bin_root) and (
             not os.path.exists(self.rustfmt())
-            or self.program_out_of_date(self.rustfmt_stamp(), self.rustfmt_channel)
+            or self.program_out_of_date(
+                self.rustfmt_stamp(),
+                "" if self.stage0_rustfmt is None else self.stage0_rustfmt.channel()
+            )
         ):
-            if rustfmt_channel:
+            if self.stage0_rustfmt is not None:
                 tarball_suffix = '.tar.xz' if support_xz() else '.tar.gz'
-                [channel, date] = rustfmt_channel.split('-', 1)
-                filename = "rustfmt-{}-{}{}".format(channel, self.build, tarball_suffix)
+                filename = "rustfmt-{}-{}{}".format(
+                    self.stage0_rustfmt.version, self.build, tarball_suffix,
+                )
                 self._download_component_helper(
-                    filename, "rustfmt-preview", tarball_suffix, key=date
+                    filename, "rustfmt-preview", tarball_suffix, key=self.stage0_rustfmt.date
                 )
                 self.fix_bin_or_dylib("{}/bin/rustfmt".format(bin_root))
                 self.fix_bin_or_dylib("{}/bin/cargo-fmt".format(bin_root))
                 with output(self.rustfmt_stamp()) as rustfmt_stamp:
-                    rustfmt_stamp.write(self.rustfmt_channel)
+                    rustfmt_stamp.write(self.stage0_rustfmt.channel())

         # Avoid downloading LLVM twice (once for stage0 and once for the master rustc)
         if self.downloading_llvm() and stage0:
@@ -518,7 +512,7 @@ def _download_component_helper(
     ):
         if key is None:
             if stage0:
-                key = self.date
+                key = self.stage0_compiler.date
             else:
                 key = self.rustc_commit
         cache_dst = os.path.join(self.build_dir, "cache")
@@ -527,12 +521,21 @@ def _download_component_helper(
             os.makedirs(rustc_cache)

         if stage0:
-            url = "{}/dist/{}".format(self._download_url, key)
+            base = self._download_url
+            url = "dist/{}".format(key)
         else:
-            url = "https://ci-artifacts.rust-lang.org/rustc-builds/{}".format(self.rustc_commit)
+            base = "https://ci-artifacts.rust-lang.org"
+            url = "rustc-builds/{}".format(self.rustc_commit)
         tarball = os.path.join(rustc_cache, filename)
         if not os.path.exists(tarball):
-            get("{}/{}".format(url, filename), tarball, verbose=self.verbose, do_verify=stage0)
+            get(
+                base,
+                "{}/{}".format(url, filename),
+                tarball,
+                self.checksums_sha256,
+                verbose=self.verbose,
+                do_verify=stage0,
+            )
         unpack(tarball, tarball_suffix, self.bin_root(stage0), match=pattern, verbose=self.verbose)

     def _download_ci_llvm(self, llvm_sha, llvm_assertions):
@@ -542,7 +545,8 @@ def _download_ci_llvm(self, llvm_sha, llvm_assertions):
         if not os.path.exists(rustc_cache):
             os.makedirs(rustc_cache)

-        url = "https://ci-artifacts.rust-lang.org/rustc-builds/{}".format(llvm_sha)
+        base = "https://ci-artifacts.rust-lang.org"
+        url = "rustc-builds/{}".format(llvm_sha)
         if llvm_assertions:
             url = url.replace('rustc-builds', 'rustc-builds-alt')
         # ci-artifacts are only stored as .xz, not .gz
@@ -554,7 +558,14 @@ def _download_ci_llvm(self, llvm_sha, llvm_assertions):
         filename = "rust-dev-nightly-" + self.build + tarball_suffix
         tarball = os.path.join(rustc_cache, filename)
         if not os.path.exists(tarball):
-            get("{}/{}".format(url, filename), tarball, verbose=self.verbose, do_verify=False)
+            get(
+                base,
+                "{}/{}".format(url, filename),
+                tarball,
+                self.checksums_sha256,
+                verbose=self.verbose,
+                do_verify=False,
+            )
         unpack(tarball, tarball_suffix, self.llvm_root(),
                 match="rust-dev",
                 verbose=self.verbose)
@@ -816,7 +827,7 @@ def rustc(self, stage0):

     def rustfmt(self):
         """Return config path for rustfmt"""
-        if not self.rustfmt_channel:
+        if self.stage0_rustfmt is None:
             return None
         return self.program_config('rustfmt')

@@ -1040,19 +1051,12 @@ def update_submodules(self):
             self.update_submodule(module[0], module[1], recorded_submodules)
         print("Submodules updated in %.2f seconds" % (time() - start_time))

-    def set_normal_environment(self):
+    def set_dist_environment(self, url):
         """Set download URL for normal environment"""
         if 'RUSTUP_DIST_SERVER' in os.environ:
             self._download_url = os.environ['RUSTUP_DIST_SERVER']
         else:
-            self._download_url = 'https://static.rust-lang.org'
-
-    def set_dev_environment(self):
-        """Set download URL for development environment"""
-        if 'RUSTUP_DEV_DIST_SERVER' in os.environ:
-            self._download_url = os.environ['RUSTUP_DEV_DIST_SERVER']
-        else:
-            self._download_url = 'https://dev-static.rust-lang.org'
+            self._download_url = url

     def check_vendored_status(self):
         """Check that vendoring is configured properly"""
@@ -1161,17 +1165,14 @@ def bootstrap(help_triggered):
     build_dir = build.get_toml('build-dir', 'build') or 'build'
     build.build_dir = os.path.abspath(build_dir.replace("$ROOT", build.rust_root))

-    data = stage0_data(build.rust_root)
-    build.date = data['date']
-    build.rustc_channel = data['rustc']
-
-    if "rustfmt" in data:
-        build.rustfmt_channel = data['rustfmt']
+    with open(os.path.join(build.rust_root, "src", "stage0.json")) as f:
+        data = json.load(f)
+    build.checksums_sha256 = data["checksums_sha256"]
+    build.stage0_compiler = Stage0Toolchain(data["compiler"])
+    if data.get("rustfmt") is not None:
+        build.stage0_rustfmt = Stage0Toolchain(data["rustfmt"])

-    if 'dev' in data:
-        build.set_dev_environment()
-    else:
-        build.set_normal_environment()
+    build.set_dist_environment(data["dist_server"])

     build.build = args.build or build.build_triple()
     build.update_submodules()
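
The loading code at the end of this file's diff reads four top-level keys from `src/stage0.json`: `dist_server`, `checksums_sha256`, `compiler` and `rustfmt`. As a rough illustration of that shape, the sketch below parses an in-memory document with the same keys. The sample values (date, version, empty checksum map) and the `load_stage0` helper are assumptions made up for this example and are not taken from the real file; `Stage0Toolchain` mirrors the class added in the diff.

```python
import json


class Stage0Toolchain:
    """Mirrors the class added in the diff: a version plus a release date."""

    def __init__(self, stage0_payload):
        self.date = stage0_payload["date"]
        self.version = stage0_payload["version"]

    def channel(self):
        return self.version + "-" + self.date


def load_stage0(text):
    """Parse stage0.json content (hypothetical helper, not part of the PR)."""
    data = json.loads(text)
    compiler = Stage0Toolchain(data["compiler"])
    rustfmt = None
    # A null or missing "rustfmt" entry means no pinned rustfmt.
    if data.get("rustfmt") is not None:
        rustfmt = Stage0Toolchain(data["rustfmt"])
    return data["dist_server"], data["checksums_sha256"], compiler, rustfmt


# Illustrative content only: the values below are invented for this sketch.
SAMPLE = json.dumps({
    "dist_server": "https://static.rust-lang.org",
    "compiler": {"date": "2021-01-01", "version": "beta"},
    "rustfmt": None,
    "checksums_sha256": {},
})

if __name__ == "__main__":
    server, checksums, compiler, rustfmt = load_stage0(SAMPLE)
    print(server, compiler.channel(), rustfmt, len(checksums))
```

Keeping the date and version as separate fields is what lets the diff drop the old `rustfmt_channel.split('-', 1)` parsing.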

src/bootstrap/bootstrap_test.py

+4-25
@@ -13,38 +13,18 @@
 import bootstrap


-class Stage0DataTestCase(unittest.TestCase):
-    """Test Case for stage0_data"""
-    def setUp(self):
-        self.rust_root = tempfile.mkdtemp()
-        os.mkdir(os.path.join(self.rust_root, "src"))
-        with open(os.path.join(self.rust_root, "src",
-                               "stage0.txt"), "w") as stage0:
-            stage0.write("#ignore\n\ndate: 2017-06-15\nrustc: beta\ncargo: beta\nrustfmt: beta")
-
-    def tearDown(self):
-        rmtree(self.rust_root)
-
-    def test_stage0_data(self):
-        """Extract data from stage0.txt"""
-        expected = {"date": "2017-06-15", "rustc": "beta", "cargo": "beta", "rustfmt": "beta"}
-        data = bootstrap.stage0_data(self.rust_root)
-        self.assertDictEqual(data, expected)
-
-
 class VerifyTestCase(unittest.TestCase):
     """Test Case for verify"""
     def setUp(self):
         self.container = tempfile.mkdtemp()
         self.src = os.path.join(self.container, "src.txt")
-        self.sums = os.path.join(self.container, "sums")
         self.bad_src = os.path.join(self.container, "bad.txt")
         content = "Hello world"

+        self.expected = hashlib.sha256(content.encode("utf-8")).hexdigest()
+
         with open(self.src, "w") as src:
             src.write(content)
-        with open(self.sums, "w") as sums:
-            sums.write(hashlib.sha256(content.encode("utf-8")).hexdigest())
         with open(self.bad_src, "w") as bad:
             bad.write("Hello!")

@@ -53,11 +33,11 @@ def tearDown(self):

     def test_valid_file(self):
         """Check if the sha256 sum of the given file is valid"""
-        self.assertTrue(bootstrap.verify(self.src, self.sums, False))
+        self.assertTrue(bootstrap.verify(self.src, self.expected, False))

     def test_invalid_file(self):
         """Should verify that the file is invalid"""
-        self.assertFalse(bootstrap.verify(self.bad_src, self.sums, False))
+        self.assertFalse(bootstrap.verify(self.bad_src, self.expected, False))


 class ProgramOutOfDate(unittest.TestCase):
@@ -99,7 +79,6 @@ def test_same_dates(self):
     TEST_LOADER = unittest.TestLoader()
     SUITE.addTest(doctest.DocTestSuite(bootstrap))
     SUITE.addTests([
-        TEST_LOADER.loadTestsFromTestCase(Stage0DataTestCase),
         TEST_LOADER.loadTestsFromTestCase(VerifyTestCase),
         TEST_LOADER.loadTestsFromTestCase(ProgramOutOfDate)])

src/bootstrap/builder.rs

+1-1
@@ -523,7 +523,7 @@ impl<'a> Builder<'a> {
                 install::Src,
                 install::Rustc
             ),
-            Kind::Run => describe!(run::ExpandYamlAnchors, run::BuildManifest),
+            Kind::Run => describe!(run::ExpandYamlAnchors, run::BuildManifest, run::BumpStage0),
         }
     }

src/bootstrap/lib.rs

+1-1
@@ -31,7 +31,7 @@
 //! When you execute `x.py build`, the steps executed are:
 //!
 //! * First, the python script is run. This will automatically download the
-//! stage0 rustc and cargo according to `src/stage0.txt`, or use the cached
+//! stage0 rustc and cargo according to `src/stage0.json`, or use the cached
 //! versions if they're available. These are then used to compile rustbuild
 //! itself (using Cargo). Finally, control is then transferred to rustbuild.
 //!
