Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow callers to choose the entropy source for the random endpoints. #15213

Merged
merged 12 commits into from
May 2, 2022

Conversation

sgmiller
Copy link
Collaborator

@sgmiller sgmiller commented Apr 28, 2022

They can choose "platform" to get the status quo behavior, "seal" to get
entropy augmentation, or "all" to get an XORed combination of both.

Addresses VAULT-5850.

@sgmiller sgmiller requested review from a team and lucymhdavies and removed request for taoism4504 April 28, 2022 18:49
@lucymhdavies
Copy link
Contributor

lucymhdavies commented Apr 29, 2022

Interesting. I hadn't considered the possibility that callers would want to specify what entropy source to use.
I cannot think of a use-case where a user would explicitly want to use platform, which would be a lower entropy source of randomness, but I also cannot rule out people needing to do that for some reason. Is it slightly faster perhaps?

What's the thinking behind having platform as the default source? Thinking in terms of secure defaults, if seal or all was the default source, then we would default to higher entropy where available, with a fallback to platform where that's not available.

Also not entirely sure what the use-case would be for all. I'm not a crypto expert, so I don't know whether the final random bytes you'd end up with from platform XOR seal would be higher entropy than just using seal directly.

@cipherboy
Copy link
Contributor

Is it slightly faster perhaps?

While not including Go's overhead, Go reads from the same source /dev/urandom (via Linux getrandom syscall). A very crude test by reading using dd:

$ dd if=/dev/urandom of=/dev/null bs=1024 count=$(( 1024*1024 ))
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.55456 s, 236 MB/s
$ uname -r
5.16.20-200.fc35.x86_64

Any external RNG is going to have a hard time beating that, especially with the upcoming improvements in Linux Kernel 5.17 and 5.18... without perhaps a 10Gb/s link or a PCI-e attached HSM? :-) I believe they're essentially pushing the speed limits of the chosen internal RNG algorithms, so having them on-system aught to make them lower latency (which is favorable for the API requests we're responding to here).

With (Kernel) FIPS certification, my understanding was the performance numbers didn't actually change all that much, and for people who like that, it should have whatever security properties they're concerned about a non-FIPS host not having?

That said though, I'm not actually finding any actual quantitative measurements of HSM RNG speed. :(

Copy link
Contributor

@cipherboy cipherboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two nits otherwise looks good. :-)

I do think it gets a little messy with both "source detection" and "multi-path reading", the option of using GenerateRandomBytesWithReader might help clean that up.

warning = "no seal/entropy augmentation available, using platform entropy source"
}
randBytes = make([]byte, bytes)
_, err = io.ReadFull(b.GetRandomReader(), randBytes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is uuid.GenerateRandomBytesWithReader, which handles this for you if you want.

@@ -664,6 +664,11 @@ This endpoint returns high-quality random bytes of the specified length.
- `format` `(string: "base64")` – Specifies the output encoding. Valid options
are `hex` or `base64`.

- `source` `(string: "platform")` - Specifies the source of the requested entropy.
`platform`, the default, sources entropy from the platform's entropy source.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding, and correct me if I'm wrong here, but there's a subtle distinction here between random data (i.e., output of a RNG construction) and entropy (which is slower, used to seed the RNG, and arguably of higher quality (e.g,. thermal noise measurements) -- but of limited size). While /dev/urandom is frequently used to seed an RNG (chaining), it internally has a RNG and should be considered the former, not the latter.

My 2c., but since we've so far (IMO correctly) called it "random bytes" or wording to that effect, I don't think we should give the impression that we're maintaining an internal RNG construction of our own... (that we're seeding from an external entropy source). Instead, we should clarify we simply ferry it from another RNG

(I guess you could argue all creates its own RNG, albeit with a trivial construction :D).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's quite fair, will change.

@lucymhdavies
Copy link
Contributor

Any external RNG is going to have a hard time beating that

Good point. Still not sure if the quick response from Vault for these requests is enough of a big deal compared to the benefits of sourcing external entropy from an HSM. Quantitative measures of HSM speed would indeed be beneficial on that point.

i.e. I'm still not entirely convinced that platform should remain the default simply because it's maintaining status quo; I can't imagine that switching to seal as the default would be backwards-incompatible in any meaningful way.
But this is probably just personal preference (or my lack of imagination). I'm happy either way.

@cipherboy
Copy link
Contributor

cipherboy commented Apr 29, 2022

Notably, we're not actually sourcing entropy to seed our own PRNG -- in either case (seal vs platform), we're sourcing random values from the specified PRNG. Output from a PRNG should be indistinguishable from that of an entropy source, but, the caveat is that NIST SP 800-90C would still need to apply for chaining your own PRNG off of these values (differentiating entropy from PRNG output). Most FIPS Kernel PRNGs updated for FIPS 140-3 will conform to SP800-90C, hence my suggestion of using a FIPS distro.

Secondly, the vast majority of vault users probably don't have seal wrap enabled, so the warning about falling back to platform randomness would be alarming but expected. I don't think we should be alarming anyone, as there's nothing wrong with platform entropy.

My 2c., but a lot of people (corporations) trust -- and have verified -- the Linux Kernel's RNG (whether the default or a downstream vendors') and my personal belief is that its had a lot more security review than any proprietary HSM vendor's RNG. I don't fundamentally feel HSMs bring anything of value to entropy as a service; modern RNG construction is such that the two should mostly be equivalent, minus bugs. :-) This not to say they don't have value elsewhere. But these are my own opinions on the matter.

@sgmiller
Copy link
Collaborator Author

@cipherboy covered a lot of what I would have, but the naming is the tricky bit. I don't think that many people know that Vault consumes the platform RNG directly as the status quo, which it does for all the reasons stated. But having the default source the HSM could be a pretty significant change for people's existing workflows if for example the request went from taking a millisecond or two to a sizeable time due to network roundtrip to an HSM and the variable capabilities of different types of HSMs.

I do feel a bit that the naming may be less than optimal and/or create doubt to users who don't understand the distinctions. Other possibilities I considered are "internal" (status quo) and "external", but external is pretty vague. We could also drop platform for "default".

I included "all" because the results of XORing two chunks of entropy never reduces and can only increase the amount of entropy, so for the paranoid cryptographer it's a conservative choice which guards against the (albeit unlikely) failure or degradation of either underlying source.

Copy link
Contributor

@lucymhdavies lucymhdavies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood and agree with both your points there :)

And yes, for sure, naming is one of the hard problems of software engineering.

Copy link
Contributor

@cipherboy cipherboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The refactoring looks great, though there are some make fmt issues :-)

I'm not sure I understand the logical_system_test.go:3459 failure though.

@sgmiller
Copy link
Collaborator Author

sgmiller commented May 2, 2022

Somehow I lost a commit from main to introduce that test failure.

@sgmiller sgmiller merged commit ca6e593 into main May 2, 2022
@sgmiller sgmiller deleted the sgm/random-entropy-source-selection branch May 2, 2022 19:42
@lucymhdavies
Copy link
Contributor

🎉

@cipherboy cipherboy added the fips label Jun 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants