Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

PVF: add landlock sandboxing #7303

Merged
merged 19 commits into from
Jul 5, 2023
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 27 additions & 5 deletions node/core/pvf/common/src/worker/security.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,18 +35,36 @@ pub mod landlock {

use landlock::{Access, AccessFs, Ruleset, RulesetAttr, RulesetError, ABI};

/// Version of landlock ABI. Use the latest version supported by our reference kernel version.
/// Landlock ABI version. Use the latest version supported by our reference kernel version.
///
/// # Versions (June 2023)
///
/// - Reference kernel version: 5.16+
/// - ABI V1: 5.13
/// - ABI V2: 5.19
///
/// Please update these docs if they are out-of-date, and bump the ABI version to the minimum
/// Please update these values if they are out-of-date, and bump the ABI version to the minimum
/// ABI supported by the reference kernel version.
///
/// # Determinism
///
/// You may wonder whether we could always use the latest ABI instead of the ABI supported by
/// the reference kernel version. It seems plausible, since landlock provides a best-effort
/// approach to enabling sandboxing. For example, if the reference version only supported V1 and
/// we were on V2, then landlock would use V2 if it was supported on the current machine, and
/// just fall back to V1 if not.
///
/// The issue with this is indeterminacy. If half of validators were on V2 and half were on V1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic doesn't parse well for me, if V1 is less restrictive and makes validators vulnerable then I would guess having just a part of them vulnerable is better than all of them isn't it ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question and I wondered the same. The issue is that it opens up a possibility of different behavior on different validators, and this itself can be exploited to attack consensus. Say that in the future we enable ABI V3 and executing some PVF tries to truncate a file (which will be banned on ABI V3)1 - some validators will error and some won't. If the split in voting among validators is less than 33%, there will be a dispute and all the losing validators will get slashed. If the split is more than 33%, it violates our assumptions about consensus and finality will stall.

So indeed there is a very interesting trade-off between security for the validator and security for the chain, and I think we have to prioritize the latter while providing as much validator security as possible. If a small amount of validators are behind in security and vote wrongly then some slashing is okay, and it can be reverted with governance, but I think we really don't want finality stalls.

Although, I guess it would also be really bad if a bunch of validator keys got stolen and an attacker started impersonating them... And anyway there are other sources of indeterminacy to attack the chain with... Fortunately ABI V1 already fully protects against reading unauthorized data, so in this case it is enough to protect validators' keys and it is still the correct decision. (The only other thing I would want to feel safe is to block all network access. Maybe it's possible to set up a firewall per-thread?)

There are similar considerations that made the seccomp sandboxing harder than anticipated. Maybe @eskimor can double-check my analysis.

Footnotes

  1. Using V3 in this example because V2 doesn't actually provide additional restrictions on top of V1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation, makes sense now! I guess V1 is the best we can do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be the default for validators to have these security measures in place, ideally we would have none without them. Anyhow, the risk of disputes should be very low as this is already a second level defense mechanism. I would rather have a dispute than some PVF being able to read the validator's disk. We should make damn sure that there are no legitimate disk accesses of course, but checking that should be independent of PVF or candidate, so also rather easy. At least at the moment, I can't think of a legitimate disk access that would only happen on some PVFs ..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eskimor. Determinism is still a goal, and given that ABI v2 and v3 don't add to the security I would stick to v1 here.1 I will update the docs as the determinacy is still relevant but not the only factor. And if in the future a version is released with meaningful new features like network blocking (which is an eventual goal of landlock) we can enable it immediately. We should keep in mind that attackers can exploit any difference in operation to attack the chain, but the risk is low and there are other indeterminacy vectors anyway.

Footnotes

  1. v2 adds a new config option which we don't use, and v3 additionally blocks file truncation - which might be annoying, but not really critical to security, and validators should have backups, right?

/// they may have different semantics on some PVFs. So a malicious PVF now has a new attack
/// vector: they can exploit this indeterminism between landlock ABIs! But this is exactly the
/// kind of thing we want to prevent! So, we have to stick only to the latest ABI supported by
/// our reference kernel version (in this example ABI V1). And if a validator's machine does not
/// *fully* support it, we can't let them run as a secure validator.
pub const LANDLOCK_ABI: ABI = ABI::V1;

// TODO: <https://github.com/landlock-lsm/rust-landlock/issues/36>
/// Returns to what degree landlock is enabled on the current Linux environment.
/// Returns to what degree landlock is enabled with the given ABI on the current Linux
/// environment.
pub fn get_status() -> Result<RulesetStatus, Box<dyn std::error::Error>> {
match std::thread::spawn(|| try_restrict_thread()).join() {
Ok(Ok(status)) => Ok(status),
Expand All @@ -55,8 +73,12 @@ pub mod landlock {
}
}

/// Returns a single bool indicating whether landlock is fully enabled on the current Linux
/// environment.
/// Returns a single bool indicating whether the given landlock ABI is fully enabled on the
/// current Linux environment.
///
/// NOTE: Secure validators *should* have this *fully* enabled for the reference ABI. Even having
/// this partially enabled (which landlock does in a best-effort capacity) may lead to
/// indeterminism. See "Determinism" under [`LANDLOCK_ABI`].
pub fn is_fully_enabled() -> bool {
matches!(get_status(), Ok(RulesetStatus::FullyEnforced))
}
Expand Down