Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CP-50518: Add binding for crypt_r to ocaml/auth #5916

Merged
merged 3 commits into from
Aug 8, 2024

Conversation

contificate
Copy link
Contributor

@contificate contificate commented Aug 6, 2024

This change adds a C stub for the crypt_r function and exposes it from the pam library:

val crypt_r : key:string -> setting:string -> string option

The intention of introducing this function is to use it for hashing passwords in a future authentication cache implementation: the fact we already link and use libcrypt.so simplifies export compliance (initial work used PBKDF2).

Some basic testing is also provided as quicktests. On a recently provisioned machine with these changes, all of the tests pass:

image

The testing format benefits from being a quick test because it must be tested on a target machine (as the behaviour of crypt_r varies between implementations and is up to the implementation used on hosts). The algorithms we are probably going to use (SHA-256 and SHA-512) seems to actually be a GNU extension. Other implementations are not guaranteed to provide the same functionality.

There is opportunity for writing more tests so, if anyone has any suggestions, please comment.
There is some commentary within the commit messages.


Longer term, I think we should go about addressing the export compliance issues so we can use PBKDF2 instead (from mirage). For example, Python notably deprecated its crypt module (which binds the same function) because of reasons akin to those implementation-specific issues above (see here).

ocaml/auth/pam.ml Outdated Show resolved Hide resolved
@contificate contificate force-pushed the crypt_r branch 2 times, most recently from c4935fe to e85ea98 Compare August 6, 2024 15:21
@last-genius
Copy link
Contributor

last-genius commented Aug 7, 2024

I understand that crypto stuff is carefully watched over in this project, but would it not be better to provide a "safer" interface on the OCaml side? i.e. crypt_r would wrap the provided salt like: "$6$" ^ salt before calling into the stub, forcing the usage of better algorithms, and allowing easier transition from one to the other (if need be) in the future. Alternatively, it could take some (polymorphic?) variant of SHA_256 | SHA_512 and convert the salt to the required string.

At the moment, given that there is no documentation for crypto_r introduced here (and man pages seem not to talk about setting types at all), it'd be all too easy for some usage to be less safe than other usages of this function.

We are already testing "$6$" is present in the tests, so we could use it here too, concerns of portability are the same.

UPD: This would require providing some alternative facilities of checking if two strings hash the same, to not prefix the given salt with something, but to take it as it is, for the following to work:

  h  = crypt_r(k, s)
  h' = crypt_r(k', h)
  equal = h == h'

@contificate
Copy link
Contributor Author

I understand that crypto stuff is carefully watched over in this project, but would it not be better to provide a "safer" interface on the OCaml side? i.e. crypt_r would wrap the provided salt like: "$6$" ^ salt before calling into the stub, forcing the usage of better algorithms, and allowing easier transition from one to the other (if need be) in the future. Alternatively, it could take some (polymorphic?) variant of SHA_256 | SHA_512 and convert the salt to the required string.

At the moment, given that there is no documentation for crypto_r introduced here (and man pages seem not to talk about setting types at all), it'd be all too easy for some usage to be less safe than other usages of this function.

We are already testing "$6$" is present in the tests, so we could use it here too, concerns of portability are the same.

Yes, a thin wrapper around the stub (and then hiding the stub) that provides a safer API would be a good follow-up. This was discussed already but currently the supposed use case is in one place (authentication cache, as it simplifies export compliance): it is not recommended to use this elsewhere.

That said, I still think the stub itself needs to be tested directly as done here: it exercises expectations that we don't want to change if anyone ever links a different implementation when installing to a host.

@last-genius
Copy link
Contributor

If it's not recommended to use it elsewhere, could we call it unsafe_crypt_r 🦀?

@contificate
Copy link
Contributor Author

UPD: This would require providing some alternative facilities of checking if two strings hash the same, to not prefix the given salt with something, but to take it as it is, for the following to work:

  h  = crypt_r(k, s)
  h' = crypt_r(k', h)
  equal = h == h'

It wouldn't require alternative facilities, but it would be better to have a specific function (as is common elsewhere, e.g. here).

The above pseudocode would still be the valid usage pattern as it's how crypt_r works. So, a safer API could use the same pattern:

let h = Pam.crypt ~algo:SHA256 ~key:k ~salt:s in
let h' = Pam.crypt ~algo:SHA256 ~key:k' ~salt:h in
h = h'

I do agree, though, that a safer API would just expose a simpler function (as linked above, e.g. password_equal) to avoid misuse (like specifying the wrong algo).

@contificate contificate marked this pull request as draft August 7, 2024 10:10
@contificate
Copy link
Contributor Author

Converted to a draft as we reconsider how this should be exposed. I will try to use alerting attributes to dissuade casual usage. I may attempt to simplify the multiple hashing test, as it probably doesn't achieve the important invariant (despite working).

@contificate
Copy link
Contributor Author

For those not following this PR, the latest force updates make the following changes:

  • crypt_r is renamed to unsafe_crypt_r and annotated with an alert to discourage casual usage.
  • The "test many threads" test is changed to compute the expected outcomes sequentially before spawning any of the threads to ensure the "golden" sample is not corrupted. The main thread now busy waits for the test threads to identify themselves as having been spawned, before broadcasting them to start. There is a slight loophole in that Condition.wait (within the test thread) could return (without having been signalled) and test the condition to be true before the main thread broadcasts the same. I expect this is unlikely to happen and that the tick thread's time slice would soon enough yield to the main thread to signal the rest to catch up. Could probably guard the test threads with a separate predicate from the main thread, which is only made true once the main thread's predicate is satisfied.

@contificate
Copy link
Contributor Author

Thanks for the reviews. I will mark this as ready for further review once I have written a short (safer) API that wraps the unsafe stub: along with introducing a test that demonstrates their correspondence.

@contificate
Copy link
Contributor Author

image

Did a clean run on a host with the recent test additions.
There is opportunity to refine the contributions of this PR at a later date, but I'd like to stage it.

@contificate contificate marked this pull request as ready for review August 8, 2024 12:01
This change adds a binding stub for the crypt_r ("re-entrant crypt") function.

The introduced external function is of type:

  unsafe_crypt_r : key:string -> setting:string -> string option

The arguments are labelled to avoid mixup.

In the case of failure, None is returned (no explicit error). Otherwise, the
result is Some h, where h is the string containing the computed hash.

The function is annotated as being unsafe (via [@@Alert ...]). This is because
it is simpler for it to be exposed for testing purposes. A safer API that wraps
this is more preferable.

The usage pattern that most are familiar with is:

  h  = crypt_r(k, s)
  h' = crypt_r(k', s)
  equal = h == h'

However, the following is also a valid way of expressing the same, and doesn't
require extracting the salt (which is embedded in the resultant hash string):

  h  = crypt_r(k, s)
  h' = crypt_r(k', h)
  equal = h == h'

There is potential for more error handling, but the largest potential for error
is in supplying well formed inputs. Different implementations support different
errnos, which complicates how accurate any attempt at error reporting would be.

The difficulty with over-specifying the requirements of this function at the
API level is that different implementations may have different behaviour. To
this end, any unit testis exercising this binding will need to explicitly test
invariants we expect our specific implementation to have. The precise
implementation of the function installed on the host is up to the shared
library we link against (which, in practice, I've found to differ in behaviour
from what is installed on my host Linux machine).

Signed-off-by: Colin James <colin.barr@cloud.com>
The following small tests are provided as a quicktest suite to run on hosts:

- Valid salts: a few valid salts are confirmed to compute a hash.
- Invalid salts: a few erronerous salts are confirmed to fail.
- Implicit salt truncation: the behaviour that any salt longer than a specified
  maximum (at present, 16) does not cause hash computation to fail, but rather
  its value gets implicitly truncated by the algorithm.
- Increasing string length: strings from language 'a'+ are tested in increasing
  length to ensure they compute to a set of hashes that are pairwise distinct.
  This test is really to ensure that the algorithm does not cap the maximum key
  length - which would be implicitly truncated on our behalf and cause lengths
  over a certain threshold to hash to equivalent values. This property is worth
  tracking, currently it would appear there is no limit to worry about.
- C style termination: OCaml strings don't rely on a C-style null terminator
  character ('\0') to determine their length. Consequently, '\0' can appear
  anywhere in OCaml strings without impeding various computations on strings.
  This test exercises the property that the C API expectedly does stop reading
  after seeing '\0'. This property should not be relied upon and its behaviour is
  documented here as a test.
- Multiple threads hashing simultaneously: this test exercises the property
  that crypt_r truly is re-entrant. The test spawns a few threads with a
  precomputed hash (and the inputs used to compute it) and then spends ~200ms
  iteratively attempting to compute the same hash.
  If any hash computed within this ~200ms window differs from the
  initial hash, or any hash fails to be computed, the test fails. This property
  must be tested because the C stub temporarily relinquishes the runtime lock
  when computing the hash, thus allowing the possibility that hashes are
  computed in parallel. It is not committed here but this test was shown to be
  well formed by using the non-reentrant "crypt" function in place of crypt_r,
  where the test reliably fails. Switching this test over crypt_r allows it to
  pass.

It is important that the tests are actually executed on a host - as the
implementation of crypt_r (from libcrypt.so) could differ across development
and target machines, but also across updates to XenServer. Some of the tests are
for regression purposes only, to ensure that certain expectations are retained
from time of writing.

Signed-off-by: Colin James <colin.barr@cloud.com>
The following function is added to the Pam module:

  val crypt : ~algo:crypt_algo ~key ~salt

Simple tests are also provided for this function.

Signed-off-by: Colin James <colin.barr@cloud.com>
@contificate contificate merged commit a41c3fe into xapi-project:master Aug 8, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants