Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encryption key providers in general and keylocation=exec:// specifically #11731

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nabijaczleweli
Copy link
Contributor

@nabijaczleweli nabijaczleweli commented Mar 11, 2021

Motivation and Context

I previously babbled about this at length in #11707, but: to do anything past "password" or "file", you need a keylocation=prompt+keyformat=raw, a custom toolchain reimplementing parts of the zfs(8) API badly, and to essentially hijack the boot scripts. This Blows Massively™.

Description

See manpage and individual commit messages for specifics, but an abstract machine for reliably providing/generating/loading keys via arbitrary helpers is defined alongside the frame-work for doing so via executables. It could be trivially extended for use with, e.g., unix-domain sockets.

I'm not sure about the many localised error strings (and errnos in the wait status handling) in execute_key_provider(). I assume someone will have better ones.

How Has This Been Tested?

I ran it and non-helper friends into every induced fault I could think of (and a few I couldn't).

Types of changes

  • Bug fix (non-breaking change which fixes an issue) – the cloexec thing
  • New feature (non-breaking change which adds functionality)
  • Code cleanup (non-breaking change which makes code smaller or more readable) – the key_locator thing
  • Documentation (a change to man pages or other documentation)

Checklist:

@nabijaczleweli
Copy link
Contributor Author

nabijaczleweli commented Mar 11, 2021

Not a clue so as to why why tests-sanity fails:

2021-03-11T20:36:41.3263127Z /var/tmp/test_results/20210311T201616/vdev_zaps/vdev_zaps_003_pos/stderr:tail: cannot open '48' for reading: No such file or directory
2021-03-11T20:36:41.3264528Z /var/tmp/test_results/20210311T201616/vdev_zaps/vdev_zaps_003_pos/stderr:sed: -e expression #1, char 5: unknown command: `
2021-03-11T20:36:41.3265507Z /var/tmp/test_results/20210311T201616/vdev_zaps/vdev_zaps_003_pos/stderr:'

appears unrelated, since I didn't touch that bit, similarly zloop (which doesn't fail every time, as a bonus).

buildbot/kernel.org fails because the kernel's too fresh, which I also (obviously) didn't touch.

@nabijaczleweli
Copy link
Contributor Author

nabijaczleweli commented Mar 11, 2021

According to this log, the FreeBSD build fails because it can't link to environ, which is odd since FreeBSD's environ(7) and posix_spawn(3) just say extern char **environ;, and that's what I wrote. I assume a FreeBSD user will know what on earth this means.

Copy link
Contributor

@ericonr ericonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a zfs property that will automatically execute commands seems a bit risky to me. Maybe making it opt in via a zfs-load-key flag would be safer? I'm not sure.

Comment on lines 439 to 542
#!/bin/sh -ue

if [ -z "$2" ]; then
echo "No dataset name (zfs-create?)" >&2
exit 2
fi

first="$(zfs get -Hpo value 'xyz.nabijaczleweli:bad-provider.salt' "$2")"
second="$(zfs get -Hpo value 'xyz.nabijaczleweli:bad-provider.saltmp' "$2")"


hash() {
pass=""
while [ -z "$pass" ]; do
read -rp "Passphrase: " pass
done

if command -v sha256 >/dev/null; then
sha256 -qs "$pass$1"
else
echo -n "$pass$1" | sha256sum | awk '{print $1}'
fi
}


case "$1" in
load)
if [ "$first" = "-" ] && [ "$second" != "-" ]; then
zfs set 'xyz.nabijaczleweli:bad-provider.salt'="$first" "$2"
zfs inherit 'xyz.nabijaczleweli:bad-provider.saltmp' "$2"
first="$second"
second="-"
elif [ "$first" = "-" ] && [ "$second" = "-" ]; then
echo "No state?" >&2
exit 3
elif [ "$first" != "-" ] && [ "$second" != "-" ]; then
which=""
while [ "$which" != "f" ] && [ "$which" != "s" ]; do
read -rp "Both states present! Select which one to use [fs]: " which
done
if [ "$which" = "f" ]; then
echo "If this is the right key, run '$0 shift $2' afterward"
else
echo "If this is the right key, run '$0 cancel $2' afterward"
first="$second"
fi
second="-"
fi

hash "$first" >&3
;;
new)
if [ "$second" != "-" ]; then
echo "Second slot occupied? Run 'zfs load-key [-n] $2' to resolve this." >&2
exit 4
fi

second="$(tr -cd '[:alnum:]' < /dev/urandom | dd bs=128 count=1 status=none)"
zfs set 'xyz.nabijaczleweli:bad-provider.saltmp'="$second" "$2"

hash "$second" >&3
;;
shift)
if [ "$first" != "-" ]; then
zfs inherit 'xyz.nabijaczleweli:bad-provider.salt' "$2"
first="-"
fi

zfs set 'xyz.nabijaczleweli:bad-provider.salt'="$second" "$2"
zfs inherit 'xyz.nabijaczleweli:bad-provider.saltmp' "$2"
first="$second"
second="-"
;;
unshift)
if [ "$second" != "-" ]; then
zfs inherit 'xyz.nabijaczleweli:bad-provider.saltmp' "$2"
second="-"
fi

zfs set 'xyz.nabijaczleweli:bad-provider.saltmp'="$first" "$2"
zfs inherit 'xyz.nabijaczleweli:bad-provider.salt' "$2"
second="$first"
first="-"
;;
cancel)
if [ "$second" != "-" ]; then
zfs inherit 'xyz.nabijaczleweli:bad-provider.saltmp' "$2"
second="-"
fi
;;
*)
echo "Unknown op $1" >&2
exit 5
;;
esac
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure a 100 line code example fits in the man page. (Note however that I'm also not a zfs maintainer or dev)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'm quite sure it doesn't, but I wasn't sure where to put it so it made sense. I hope to be actionably told off by a maintainer to use some specific other useful place.

Comment on lines 1112 to 1132
specified absolute file path. If an exec URI is selected, the key will be
acquired from the executable specified (see EXEC:// in
.Xr zfs-load-key 8 )
for details.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From reading the two man pages, it's not immediately clear to me, unless I also read the code examples, that this command receives actions via arguments passed to it. I think the limitation that this can only take a single executable and not additional flags for the command should be documented.

Copy link
Contributor Author

@nabijaczleweli nabijaczleweli Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Granted, it's at the end of that list, but at the beginning of the sexion, I'm not sure how I'd make it clearer?

The provider is invoked with its full path as its zeroeth argument, the operation (lower-case) as its first argument, and the dataset name, if any, as its third argument.

I mean, it's not like a path can contain flags, is it?

@ericonr
Copy link
Contributor

ericonr commented Mar 11, 2021

About freebsd linking, I found https://bugs.webkit.org/show_bug.cgi?id=138420 and https://lists.freebsd.org/pipermail/freebsd-current/2018-December/072542.html .

The first one points at -Wl,--no-undefined, because environ comes from crt1.o on FreeBSD.

@nabijaczleweli nabijaczleweli force-pushed the keylocation-exec-v1 branch 2 times, most recently from 7641b21 to 38685b7 Compare March 12, 2021 00:21
@nabijaczleweli
Copy link
Contributor Author

Ugh, turns out -z defs amounts to the same thing, which was, uh, non-obvious. I've disabled it on FreeBSD. Thanks for the pointer!

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Mar 12, 2021
@nabijaczleweli nabijaczleweli force-pushed the keylocation-exec-v1 branch 2 times, most recently from de4ce9b to 4580dd6 Compare March 12, 2021 19:50
@adamdmoss
Copy link
Contributor

I'm a bit dubious that this is the right approach, partly for usability reasons (that manpage is a real mouthful IMHO) but overwhelmingly for security reasons. The implication that a tampered or unverified pool can implicitly choose to run arbitrary commands on a host system feels like it's going to be a problem.

@nabijaczleweli
Copy link
Contributor Author

nabijaczleweli commented Mar 29, 2021

If you have a better (either more resilient or simpler+as resilient) design, I'd love to hear it :)

As for "the standard is a real mouthful" – sure, and? I fail to see how that's bad. I don't think I've left a damned thing underspecified. (If you've only read it in the GitHub view, I implore you to use an actual manual pager, it's much less annoying there.)

I mean, I haven't really considered tampered-with pools to be a thing worth worrying about (especially given that all you get is /exe/cutable load pool/data/set). I mean, if you wanna get real paranoid about it, it already reads arbitrary files, usually as root. You can either get a simple oracle (file matches/doesn't) or try to Have Fun by reading /dev/pts/3.
As for a solution to this – how about something like $LIBZFS_UNTRUSTED, when set and nonempty (and when not set and in "untrusted environment", i.e. suid or sgid), to print and error out before invoking any key provider back-end (i.e. at the top of get_key_material_exec() (or, indeed, get_key_material()#case ZFS_KEYLOCATION_URI:) and notify_encryption_backend())?

@adamdmoss
Copy link
Contributor

`zfs list -H -o :myrecipe` tank/foo | zfs load-key tank/foo

if you're sure that having a phrase-generating program specified on the dataset/pool itself is really what you want.

@nabijaczleweli
Copy link
Contributor Author

Updated with security considerations sexion and $ZFS_UNTRUSTED-controlled URI keylocation limiting

@bghira
Copy link

bghira commented Mar 29, 2021

`zfs list -H -o :myrecipe` tank/foo | zfs load-key tank/foo

if you're sure that having a phrase-generating program specified on the dataset/pool itself is really what you want.

zedlets have this handled already, that should be reused.

@nabijaczleweli
Copy link
Contributor Author

Rebased on master.

@nabijaczleweli
Copy link
Contributor Author

Rebased and cleaned up a bit; needs test-suite entries, but should otherwise be good to review

@nabijaczleweli
Copy link
Contributor Author

Updated to only take exec://[^/]+ and run providers from $ZFS_PROVIDER_DIR, then /etc/zfs/providers.d, then /usr/libexec/zfs/providers.d (and use fork()/exec() instead of posix_spawn() as a consequence, so we don't need environ). Should assuage security concerns.

@nabijaczleweli nabijaczleweli force-pushed the keylocation-exec-v1 branch 4 times, most recently from b6eb689 to e5cbd50 Compare May 8, 2021 18:57
@nabijaczleweli nabijaczleweli force-pushed the keylocation-exec-v1 branch 3 times, most recently from 8f51200 to 39e0cfb Compare May 22, 2021 01:08
This acquires the key material by running the file specified by the URI
with [path, op, fsname], and reading back data from the pipe
located at fd 3 after the child exits (this means, that, i.a.,
children that write Too Much get an I/O error instead of hanging,
exec:// providers can be written in any language that can
write(3, [buf]), and they can be as interactive (or non-interactive)
and as verbose (or terse) as they want)

See zfs-change-key(8) for example statesome providers, or the
abomination below for a trivial stateless one
	#!/bin/sh -x
	echo "$0" "$@"

	[ -z "$2" ] && {
		echo "No dataset name (zfs-create?)" >&2
		exit 1
	}

	if command -v sha256 >/dev/null; then
		sha256 -qs "$2"
	else
		echo -n "$2" | sha256sum | awk '{print $1}'
	fi | tee /dev/stderr >&3

See zfs-change-key(8) for a user-level description of key-providers,
or below for state machines

load:
  * [_ _] => error
  * [_ x] => [x _], unseal(x)
  * [o x] =>
    + show error
    + let user choose to try either one or the other state
    + instruct what to invoke in either case
  * [o _] => unseal(o)

new: into staging area
  * fresh       : [_ _] => [_ x]
  * regenerating: [o _] => [o x]
  * dirty:        [? x] =>

shift (on success): mark new state as current, free old state
  * [_ _] => how?
  * [_ x] => [x _]
  * [o x] => [x _], free(o)
  * [o _] => [_ _], free(o)
i.e.
  [a b] => [b _], free(a)

unshift (on deletion): move current state to new
  * [_ _] => how?
  * [_ x] => wrong
  * [o x] => wrong
  * [o _] => [_ o]
i.e.
  [a b] => [_ a] (technically free(b), i guess, but shouldn't happen)

cancel (on error): free new state if present
  * how?                         : [_ _] =>
  * from new                     : [_ x] => [_ _] free(x)
  * from new                     : [o x] => [o _] free(x)
  * from inherit/other executable: [o _] => [o _]
i.e.
  [a, b] => [a, _], free(b)

two stable states:
[_ _] -> new: [_ n]     --ok----> shift : [n _]
[_ _] -> new: [_ n]     --error-> cancel: [n _]

[o _] -> new: [o n]     --ok----> shift : [n _], free(o)
[o _] -> new: [o n]     --error-> cancel: [o _], free(n)

[o _] (inheriting)      --ok----> shift : [_ _], free(o)
[o _] (inheriting)      --error-> cancel: [o _]
inheriting homomorphic to switching to something else

[o _] -> unshift: [_ o] --ok----> cancel: [_ _], free(o)
[o _] -> unshift: [_ o] --error-> shift : [o _]

if shift or cancel wasn't called:
[_ o] -> load -> [o, _], unseal
[ó o] -> load -> pick a resolution. should allow loading either
                 to check with zfs load-key -n and instruct
                 what to do in either case

[_ o] -> new -> error? try to load first, and pick a resolution
[ó o] -> new -> error? try to load first, and pick a resolution

i.e. somehow libzfs failed to do the shift after committing

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
@grahamc
Copy link
Contributor

grahamc commented Jul 6, 2021

I echo the concerns of a dataset having defined a malicious key executable, which to me feels like a confused deputy CVE waiting to happen.

My scenario is a bit different:

  • the location of the program I want to run will change on a regular basis. This is because it will be built with Nix. This means storing the program's path in an property has the wrong data "lifecycle".
  • I can easily manipulate and customize the boot scripts.

To quote @nabijaczleweli:

a custom toolchain reimplementing parts of the zfs(8) API badly, and to essentially hijack the boot scripts. This Blows Massively™.

This rings true to me: needing to iterate over all the datasets of all the pools to find encrypted datasets, collect the unique set of encryptionroots, then prompt for a passphrase for each ... this gets pretty ugly. It may be that a channel program makes this easier, but it is still not something I think users should have to reimplement.

An alternative I'd like is: zfs load-keys [-r] [-e KEYPROGRAM] -a | filesystem ie: a flag to specify an executable which is able to provide passphrases and key material. I'd expect this program's API to be:

  • KEYPROGRAM "$encryptionroot "$keyformat"
  • may be interactive
  • prints the key material or passphrase on stdout
  • exiting non-zero skips the given filesystem
  • is executed multiple times for multiple attempts, if the secret on stdout is incorrect

This avoids the confused deputy by requiring the caller to specify the program, allows for robustly integrating with ZFS's decryption mechanism without reimplementing a significant amount of work, and leaves the implementation up to the user.

I hacked together an example based on this PR: grahamc@a176913

grahamc added a commit to grahamc/zfs that referenced this pull request Jul 6, 2021
Given a `password.sh`:

```sh
promptpass "$1" >&3
```

prompt for each password with:

```
zfs load-key -e ./password.sh -a
```

Most of the complicated parts of this code are lifted from
openzfs#11731

Co-authored-by: <nabijaczleweli@nabijaczleweli.xyz>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants