Skip to content

Commit

Permalink
Auto merge of #88362 - pietroalbini:bump-stage0, r=Mark-Simulacrum
Browse files Browse the repository at this point in the history
Pin bootstrap checksums and add a tool to update it automatically

:warning: :warning: This is just a proactive hardening we're performing on the build system, and it's not prompted by any known compromise. If you're aware of security issues being exploited please [check out our responsible disclosure page](https://www.rust-lang.org/policies/security). ⚠️ ⚠️

---

This PR aims to improve Rust's supply chain security by pinning the checksums of the bootstrap compiler downloaded by `x.py`, preventing a compromised `static.rust-lang.org` from affecting building the compiler. The checksums are stored in `src/stage0.json`, which replaces `src/stage0.txt`. This PR also adds a tool to automatically update the bootstrap compiler.

The changes in this PR were originally discussed in [Zulip](https://zulip-archive.rust-lang.org/stream/241545-t-release/topic/pinning.20stage0.20hashes.html).

## Potential attack

Before this PR, an attacker who wanted to compromise the bootstrap compiler would "just" need to:

1. Gain write access to `static.rust-lang.org`, either by compromising DNS or the underlying storage.
2. Upload compromised binaries and corresponding `.sha256` files to `static.rust-lang.org`.

There is no signature verification in `x.py` as we don't want the build system to depend on GPG. Also, since the checksums were not pinned inside the repository, they were downloaded from `static.rust-lang.org` too: this only protected from accidental changes in `static.rust-lang.org` that didn't change the `*.sha256` files. The attack would allow the attacker to compromise past and future invocations of `x.py`.

## Mitigations introduced in this PR

This PR adds pinned checksums for all the bootstrap components in `src/stage0.json` instead of downloading the checksums from `static.rust-lang.org`. This changes the attack scenario to:

1. Gain write access to `static.rust-lang.org`, either by compromising DNS or the underlying storage.
2. Upload compromised binaries to `static.rust-lang.org`.
3. Land a (reviewed) change in the `rust-lang/rust` repository changing the pinned hashes.

Even with a successful attack, existing clones of the Rust repository won't be affected, and once the attack is detected reverting the pinned hashes changes should be enough to be protected from the attack. This also enables further mitigations to be implemented in following PRs, such as verifying signatures when pinning new checksums (removing the trust on first use aspect of this PR) and adding a check in CI making sure a PR updating the checksum has not been tampered with (see the future improvements section).

## Additional changes

There are additional changes implemented in this PR to enable the mitigation:

* The `src/stage0.txt` file has been replaced with `src/stage0.json`. The reasoning for the change is that there is existing tooling to read and manipulate JSON files compared to the custom format we were using before, and the slight challenge of manually editing JSON files (no comments, no trailing commas) are not a problem thanks to the new `bump-stage0`.

* A new tool has been added to the repository, `bump-stage0`. When invoked, the tool automatically calculates which release should be used as the bootstrap compiler given the current version and channel, gathers all the relevant checksums and updates `src/stage0.json`. The tool can be invoked by running:

  ```
  ./x.py run src/tools/bump-stage0
  ```

* Support for downloading releases from `https://dev-static.rust-lang.org` has been removed, as it's not possible to verify checksums there (it's customary to replace existing artifacts there if a rebuild is warranted). This will require a change to the release process to avoid bumping the bootstrap compiler on beta before the stable release.

## Future improvements

* Add signature verification as part of `bump-stage0`, which would require the attacker to also obtain the release signing keys in order to successfully compromise the bootstrap compiler. This would be fine to add now, as the burden of installing the tool to verify signatures would only be placed on whoever updates the bootstrap compiler, instead of everyone compiling Rust.

* Add a check on CI that ensures the checksums in `src/stage0.json` are the expected ones. If a PR changes the stage0 file CI should also run the `bump-stage0` tool and fail if the output in CI doesn't match the committed file. This prevents the PR author from tweaking the output of the tool manually, which would otherwise be close to impossible for a human to detect.

* Automate creating the PRs bumping the bootstrap compiler, by setting up a scheduled job in GitHub Actions that runs the tool and opens a PR.

* Investigate whether a similar mitigation can be done for "download from CI" components like the prebuilt LLVM.

r? `@Mark-Simulacrum`
  • Loading branch information
bors committed Sep 6, 2021
2 parents 13db844 + ea8b1ff commit 8ceea01
Show file tree
Hide file tree
Showing 13 changed files with 677 additions and 149 deletions.
13 changes: 13 additions & 0 deletions Cargo.lock
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,18 @@ dependencies = [
name = "build_helper"
version = "0.1.0"

[[package]]
name = "bump-stage0"
version = "0.1.0"
dependencies = [
"anyhow",
"curl",
"indexmap",
"serde",
"serde_json",
"toml",
]

[[package]]
name = "byte-tools"
version = "0.3.1"
Expand Down Expand Up @@ -1663,6 +1675,7 @@ checksum = "bc633605454125dec4b66843673f01c7df2b89479b32e0ed634e43a91cff62a5"
dependencies = [
"autocfg",
"hashbrown",
"serde",
]

[[package]]
Expand Down
1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ members = [
"src/tools/expand-yaml-anchors",
"src/tools/jsondocck",
"src/tools/html-checker",
"src/tools/bump-stage0",
]

exclude = [
Expand Down
139 changes: 70 additions & 69 deletions src/bootstrap/bootstrap.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
import datetime
import distutils.version
import hashlib
import json
import os
import re
import shutil
Expand All @@ -24,19 +25,17 @@ def support_xz():
except tarfile.CompressionError:
return False

def get(url, path, verbose=False, do_verify=True):
suffix = '.sha256'
sha_url = url + suffix
def get(base, url, path, checksums, verbose=False, do_verify=True):
with tempfile.NamedTemporaryFile(delete=False) as temp_file:
temp_path = temp_file.name
with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as sha_file:
sha_path = sha_file.name

try:
if do_verify:
download(sha_path, sha_url, False, verbose)
if url not in checksums:
raise RuntimeError("src/stage0.json doesn't contain a checksum for {}".format(url))
sha256 = checksums[url]
if os.path.exists(path):
if verify(path, sha_path, False):
if verify(path, sha256, False):
if verbose:
print("using already-download file", path)
return
Expand All @@ -45,23 +44,17 @@ def get(url, path, verbose=False, do_verify=True):
print("ignoring already-download file",
path, "due to failed verification")
os.unlink(path)
download(temp_path, url, True, verbose)
if do_verify and not verify(temp_path, sha_path, verbose):
download(temp_path, "{}/{}".format(base, url), True, verbose)
if do_verify and not verify(temp_path, sha256, verbose):
raise RuntimeError("failed verification")
if verbose:
print("moving {} to {}".format(temp_path, path))
shutil.move(temp_path, path)
finally:
delete_if_present(sha_path, verbose)
delete_if_present(temp_path, verbose)


def delete_if_present(path, verbose):
"""Remove the given file if present"""
if os.path.isfile(path):
if verbose:
print("removing", path)
os.unlink(path)
if os.path.isfile(temp_path):
if verbose:
print("removing", temp_path)
os.unlink(temp_path)


def download(path, url, probably_big, verbose):
Expand Down Expand Up @@ -98,14 +91,12 @@ def _download(path, url, probably_big, verbose, exception):
exception=exception)


def verify(path, sha_path, verbose):
def verify(path, expected, verbose):
"""Check if the sha256 sum of the given path is valid"""
if verbose:
print("verifying", path)
with open(path, "rb") as source:
found = hashlib.sha256(source.read()).hexdigest()
with open(sha_path, "r") as sha256sum:
expected = sha256sum.readline().split()[0]
verified = found == expected
if not verified:
print("invalid checksum:\n"
Expand Down Expand Up @@ -176,15 +167,6 @@ def require(cmd, exit=True):
sys.exit(1)


def stage0_data(rust_root):
"""Build a dictionary from stage0.txt"""
nightlies = os.path.join(rust_root, "src/stage0.txt")
with open(nightlies, 'r') as nightlies:
lines = [line.rstrip() for line in nightlies
if not line.startswith("#")]
return dict([line.split(": ", 1) for line in lines if line])


def format_build_time(duration):
"""Return a nicer format for build time
Expand Down Expand Up @@ -372,13 +354,22 @@ def output(filepath):
os.rename(tmp, filepath)


class Stage0Toolchain:
def __init__(self, stage0_payload):
self.date = stage0_payload["date"]
self.version = stage0_payload["version"]

def channel(self):
return self.version + "-" + self.date


class RustBuild(object):
"""Provide all the methods required to build Rust"""
def __init__(self):
self.date = ''
self.checksums_sha256 = {}
self.stage0_compiler = None
self.stage0_rustfmt = None
self._download_url = ''
self.rustc_channel = ''
self.rustfmt_channel = ''
self.build = ''
self.build_dir = ''
self.clean = False
Expand All @@ -402,11 +393,10 @@ def download_toolchain(self, stage0=True, rustc_channel=None):
will move all the content to the right place.
"""
if rustc_channel is None:
rustc_channel = self.rustc_channel
rustfmt_channel = self.rustfmt_channel
rustc_channel = self.stage0_compiler.version
bin_root = self.bin_root(stage0)

key = self.date
key = self.stage0_compiler.date
if not stage0:
key += str(self.rustc_commit)
if self.rustc(stage0).startswith(bin_root) and \
Expand Down Expand Up @@ -445,19 +435,23 @@ def download_toolchain(self, stage0=True, rustc_channel=None):

if self.rustfmt() and self.rustfmt().startswith(bin_root) and (
not os.path.exists(self.rustfmt())
or self.program_out_of_date(self.rustfmt_stamp(), self.rustfmt_channel)
or self.program_out_of_date(
self.rustfmt_stamp(),
"" if self.stage0_rustfmt is None else self.stage0_rustfmt.channel()
)
):
if rustfmt_channel:
if self.stage0_rustfmt is not None:
tarball_suffix = '.tar.xz' if support_xz() else '.tar.gz'
[channel, date] = rustfmt_channel.split('-', 1)
filename = "rustfmt-{}-{}{}".format(channel, self.build, tarball_suffix)
filename = "rustfmt-{}-{}{}".format(
self.stage0_rustfmt.version, self.build, tarball_suffix,
)
self._download_component_helper(
filename, "rustfmt-preview", tarball_suffix, key=date
filename, "rustfmt-preview", tarball_suffix, key=self.stage0_rustfmt.date
)
self.fix_bin_or_dylib("{}/bin/rustfmt".format(bin_root))
self.fix_bin_or_dylib("{}/bin/cargo-fmt".format(bin_root))
with output(self.rustfmt_stamp()) as rustfmt_stamp:
rustfmt_stamp.write(self.rustfmt_channel)
rustfmt_stamp.write(self.stage0_rustfmt.channel())

# Avoid downloading LLVM twice (once for stage0 and once for the master rustc)
if self.downloading_llvm() and stage0:
Expand Down Expand Up @@ -518,7 +512,7 @@ def _download_component_helper(
):
if key is None:
if stage0:
key = self.date
key = self.stage0_compiler.date
else:
key = self.rustc_commit
cache_dst = os.path.join(self.build_dir, "cache")
Expand All @@ -527,12 +521,21 @@ def _download_component_helper(
os.makedirs(rustc_cache)

if stage0:
url = "{}/dist/{}".format(self._download_url, key)
base = self._download_url
url = "dist/{}".format(key)
else:
url = "https://ci-artifacts.rust-lang.org/rustc-builds/{}".format(self.rustc_commit)
base = "https://ci-artifacts.rust-lang.org"
url = "rustc-builds/{}".format(self.rustc_commit)
tarball = os.path.join(rustc_cache, filename)
if not os.path.exists(tarball):
get("{}/{}".format(url, filename), tarball, verbose=self.verbose, do_verify=stage0)
get(
base,
"{}/{}".format(url, filename),
tarball,
self.checksums_sha256,
verbose=self.verbose,
do_verify=stage0,
)
unpack(tarball, tarball_suffix, self.bin_root(stage0), match=pattern, verbose=self.verbose)

def _download_ci_llvm(self, llvm_sha, llvm_assertions):
Expand All @@ -542,7 +545,8 @@ def _download_ci_llvm(self, llvm_sha, llvm_assertions):
if not os.path.exists(rustc_cache):
os.makedirs(rustc_cache)

url = "https://ci-artifacts.rust-lang.org/rustc-builds/{}".format(llvm_sha)
base = "https://ci-artifacts.rust-lang.org"
url = "rustc-builds/{}".format(llvm_sha)
if llvm_assertions:
url = url.replace('rustc-builds', 'rustc-builds-alt')
# ci-artifacts are only stored as .xz, not .gz
Expand All @@ -554,7 +558,14 @@ def _download_ci_llvm(self, llvm_sha, llvm_assertions):
filename = "rust-dev-nightly-" + self.build + tarball_suffix
tarball = os.path.join(rustc_cache, filename)
if not os.path.exists(tarball):
get("{}/{}".format(url, filename), tarball, verbose=self.verbose, do_verify=False)
get(
base,
"{}/{}".format(url, filename),
tarball,
self.checksums_sha256,
verbose=self.verbose,
do_verify=False,
)
unpack(tarball, tarball_suffix, self.llvm_root(),
match="rust-dev",
verbose=self.verbose)
Expand Down Expand Up @@ -816,7 +827,7 @@ def rustc(self, stage0):

def rustfmt(self):
"""Return config path for rustfmt"""
if not self.rustfmt_channel:
if self.stage0_rustfmt is None:
return None
return self.program_config('rustfmt')

Expand Down Expand Up @@ -1040,19 +1051,12 @@ def update_submodules(self):
self.update_submodule(module[0], module[1], recorded_submodules)
print("Submodules updated in %.2f seconds" % (time() - start_time))

def set_normal_environment(self):
def set_dist_environment(self, url):
"""Set download URL for normal environment"""
if 'RUSTUP_DIST_SERVER' in os.environ:
self._download_url = os.environ['RUSTUP_DIST_SERVER']
else:
self._download_url = 'https://static.rust-lang.org'

def set_dev_environment(self):
"""Set download URL for development environment"""
if 'RUSTUP_DEV_DIST_SERVER' in os.environ:
self._download_url = os.environ['RUSTUP_DEV_DIST_SERVER']
else:
self._download_url = 'https://dev-static.rust-lang.org'
self._download_url = url

def check_vendored_status(self):
"""Check that vendoring is configured properly"""
Expand Down Expand Up @@ -1161,17 +1165,14 @@ def bootstrap(help_triggered):
build_dir = build.get_toml('build-dir', 'build') or 'build'
build.build_dir = os.path.abspath(build_dir.replace("$ROOT", build.rust_root))

data = stage0_data(build.rust_root)
build.date = data['date']
build.rustc_channel = data['rustc']

if "rustfmt" in data:
build.rustfmt_channel = data['rustfmt']
with open(os.path.join(build.rust_root, "src", "stage0.json")) as f:
data = json.load(f)
build.checksums_sha256 = data["checksums_sha256"]
build.stage0_compiler = Stage0Toolchain(data["compiler"])
if data.get("rustfmt") is not None:
build.stage0_rustfmt = Stage0Toolchain(data["rustfmt"])

if 'dev' in data:
build.set_dev_environment()
else:
build.set_normal_environment()
build.set_dist_environment(data["dist_server"])

build.build = args.build or build.build_triple()
build.update_submodules()
Expand Down
29 changes: 4 additions & 25 deletions src/bootstrap/bootstrap_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,38 +13,18 @@
import bootstrap


class Stage0DataTestCase(unittest.TestCase):
"""Test Case for stage0_data"""
def setUp(self):
self.rust_root = tempfile.mkdtemp()
os.mkdir(os.path.join(self.rust_root, "src"))
with open(os.path.join(self.rust_root, "src",
"stage0.txt"), "w") as stage0:
stage0.write("#ignore\n\ndate: 2017-06-15\nrustc: beta\ncargo: beta\nrustfmt: beta")

def tearDown(self):
rmtree(self.rust_root)

def test_stage0_data(self):
"""Extract data from stage0.txt"""
expected = {"date": "2017-06-15", "rustc": "beta", "cargo": "beta", "rustfmt": "beta"}
data = bootstrap.stage0_data(self.rust_root)
self.assertDictEqual(data, expected)


class VerifyTestCase(unittest.TestCase):
"""Test Case for verify"""
def setUp(self):
self.container = tempfile.mkdtemp()
self.src = os.path.join(self.container, "src.txt")
self.sums = os.path.join(self.container, "sums")
self.bad_src = os.path.join(self.container, "bad.txt")
content = "Hello world"

self.expected = hashlib.sha256(content.encode("utf-8")).hexdigest()

with open(self.src, "w") as src:
src.write(content)
with open(self.sums, "w") as sums:
sums.write(hashlib.sha256(content.encode("utf-8")).hexdigest())
with open(self.bad_src, "w") as bad:
bad.write("Hello!")

Expand All @@ -53,11 +33,11 @@ def tearDown(self):

def test_valid_file(self):
"""Check if the sha256 sum of the given file is valid"""
self.assertTrue(bootstrap.verify(self.src, self.sums, False))
self.assertTrue(bootstrap.verify(self.src, self.expected, False))

def test_invalid_file(self):
"""Should verify that the file is invalid"""
self.assertFalse(bootstrap.verify(self.bad_src, self.sums, False))
self.assertFalse(bootstrap.verify(self.bad_src, self.expected, False))


class ProgramOutOfDate(unittest.TestCase):
Expand Down Expand Up @@ -99,7 +79,6 @@ def test_same_dates(self):
TEST_LOADER = unittest.TestLoader()
SUITE.addTest(doctest.DocTestSuite(bootstrap))
SUITE.addTests([
TEST_LOADER.loadTestsFromTestCase(Stage0DataTestCase),
TEST_LOADER.loadTestsFromTestCase(VerifyTestCase),
TEST_LOADER.loadTestsFromTestCase(ProgramOutOfDate)])

Expand Down
2 changes: 1 addition & 1 deletion src/bootstrap/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -523,7 +523,7 @@ impl<'a> Builder<'a> {
install::Src,
install::Rustc
),
Kind::Run => describe!(run::ExpandYamlAnchors, run::BuildManifest),
Kind::Run => describe!(run::ExpandYamlAnchors, run::BuildManifest, run::BumpStage0),
}
}

Expand Down
2 changes: 1 addition & 1 deletion src/bootstrap/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@
//! When you execute `x.py build`, the steps executed are:
//!
//! * First, the python script is run. This will automatically download the
//! stage0 rustc and cargo according to `src/stage0.txt`, or use the cached
//! stage0 rustc and cargo according to `src/stage0.json`, or use the cached
//! versions if they're available. These are then used to compile rustbuild
//! itself (using Cargo). Finally, control is then transferred to rustbuild.
//!
Expand Down
Loading

0 comments on commit 8ceea01

Please sign in to comment.