RVPS Improvements Brainstorm #637

Open
fitzthum opened this issue Dec 18, 2024 · 8 comments

Comments

@fitzthum
Member

fitzthum commented Dec 18, 2024

We've discussed changing the RVPS in #238, #350, and #407 (cc @thomas-fossati, @Xynnn007, and @deeglaze), but we've yet to really figure out what we should do. I have a few concrete ideas.

First a little overview of what we have. There are three important parts of the RVPS.

  1. The interface to the attestation service where we use reference values to evaluate the AS policy. This API is pretty simple. It basically looks like this:

    pub async fn get_digests(&self) -> Result<HashMap<String, Vec<String>>>
    

    The RVPS takes no input and returns a map of ids to vectors of reference values (usually hashes). We're assuming that for each id there could be multiple valid values.

  2. The internal representation of reference values.

    pub struct ReferenceValue {
        #[serde(default = "default_version")]
        pub version: String,
        pub name: String,
        #[serde(deserialize_with = "primitive_date_time_from_str")]
        pub expiration: DateTime<Utc>,
        #[serde(rename = "hash-value")]
        pub hash_value: Vec<HashValuePair>,
    } 
    

    Note that the internal representation assumes that the reference values are hashes. This might not be correct. A reference value could be a version number, or the policy could even calculate hashes internally. The HashValuePair struct has info about the hashing algorithm that we never pass to the client. Maybe we don't need it?

  3. Extractors produce the reference value struct above after verifying some input. Currently we only have a sample extractor and an in-toto extractor, which uses Go bindings. We also have a pre-processor, which currently doesn't seem to do much and has a confusingly-named Ware interface.
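
To make the relationship between parts 1 and 2 concrete, here is a minimal sketch of how stored values could be flattened into the map shape that get_digests returns. The trimmed-down structs, the `alg` and `value` field names, and the `to_digest_map` helper are assumptions for illustration, not the actual RVPS code. Note how the algorithm is dropped on the way out.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the RVPS `HashValuePair` (field names assumed).
#[derive(Clone, Debug)]
pub struct HashValuePair {
    pub alg: String,
    pub value: String,
}

/// Simplified stand-in for `ReferenceValue`; version and expiration omitted.
#[derive(Clone, Debug)]
pub struct ReferenceValue {
    pub name: String,
    pub hash_value: Vec<HashValuePair>,
}

/// Hypothetical helper: flatten stored values into the id -> values map
/// shape returned by `get_digests`. The algorithm name is discarded.
pub fn to_digest_map(values: &[ReferenceValue]) -> HashMap<String, Vec<String>> {
    let mut map: HashMap<String, Vec<String>> = HashMap::new();
    for rv in values {
        // Values registered under the same name end up in one list.
        let entry = map.entry(rv.name.clone()).or_default();
        for pair in &rv.hash_value {
            entry.push(pair.value.clone());
        }
    }
    map
}
```

Two entries registered under the same name end up in a single list, which is the "multiple valid values per id" assumption mentioned above.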

Improvements

  • Add evidence groups/context: I'm planning to add an abstraction that allows us to group reference values. In some ways this will be simple. We can change the get_digests API to take a group name. The AS policy won't be aware of the groups; it will continue to operate the same way using the reference value map, but the AS will be able to switch the contents of the map based on some context. The tricky question is how the AS should decide what group of values to use. We can implement this as a second step based on a local config, an init-data field, or something else.
  • Move rvps cli tool into kbs-client: We currently have a separate tool for sending values to the RVPS. I think we should combine this with our kbs-client or a trustee cli tool that we might develop. My impression is that the easiest way to use the RVPS is to use the JSON storage backend and simply modify the file directly, skipping extraction and pre-processing altogether. We need to make it easier to provision reference values, especially if we want people to write meaningful policies.
  • Fixup interfaces described above: Maybe we should change the internal representation and clarify or get rid of the pre-processing step.
  • Add more extractors: Self-explanatory.
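
As a rough illustration of the first bullet, here is a sketch of a group-aware get_digests. The `GroupedStore` type, its `add` method, and the "unknown group yields an empty map" behaviour are assumptions made for this example; the real API is async and returns a Result, which is left out here to keep the sketch dependency-free.

```rust
use std::collections::HashMap;

type DigestMap = HashMap<String, Vec<String>>;

/// Hypothetical in-memory store keyed by group name.
#[derive(Default)]
pub struct GroupedStore {
    groups: HashMap<String, DigestMap>,
}

impl GroupedStore {
    /// Register one value under `group` and `id`.
    pub fn add(&mut self, group: &str, id: &str, value: &str) {
        self.groups
            .entry(group.to_string())
            .or_default()
            .entry(id.to_string())
            .or_default()
            .push(value.to_string());
    }

    /// Group-aware variant of `get_digests`. The returned map has the same
    /// shape the policy already consumes, so the policy needs no changes.
    /// An unknown group yields an empty map (an assumption of this sketch).
    pub fn get_digests(&self, group: &str) -> DigestMap {
        self.groups.get(group).cloned().unwrap_or_default()
    }
}
```

The policy still sees a plain map; only the caller that picks the group name changes.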

Adopting specifications

Personally I'm not very interested in adopting specifications just for the sake of it or in the name of hypothetical interoperability. Of course, there is value to using standards, but we should make sure our work in this area helps us move more quickly towards a usable and insightful project rather than bogging us down.

Probably the simplest way for us to interact with standards is in the extractor, allowing us to take various bundles of reference values and convert them to our internal representation. Not every spec is a good fit for the extractor, though. For instance, my impression of CoRIM is that its scope exceeds our current extractor interface. Something like RIM might be a better fit. My understanding here is still a bit fuzzy. The point is that unless we want to do a somewhat significant rework, we should be looking for specs that simply bundle reference values.

The RVPS is currently pretty simple and I think we should try to keep it that way or make it even simpler. Fundamentally, the job of the RVPS isn't very complicated. LMK if you have any ideas about this stuff. I will probably start working on some of the simpler stuff in January.

@deeglaze

I'm not familiar enough with confidential containers to make a recommendation here since it is such a constrained application. My context has been in bringing up an attestation verification service that can support multiple types of evidence to verify different environments based on a body of endorsements, a data model for expressing what applied to the evidence, and a policy for processing that data model into attestation results.

In my understanding of the confidential containers RVPS, it's more like the SPIRE server where workloads are directly registered, and that's the only place they're registered. If it's there, it's permitted. There's very little in terms of supporting multiple clients, where your attestation service can be operated by a different party.

In my head, I've been thinking more along the lines of constructing an endorsement ecosystem that can be used by attestation services. An RVPS then provides a view of what's an appropriate knowledge base, but can also carry more ephemeral operator-provided endorsements the way there's an ingestion API already for the RVPS. You'll have attested ephemeral state of operating nodes that serve as reference values for other services, and you'll have more persistent reference values such as authenticity of some digest. You'll also have some endorsements in the middle, which carry "security posture" endorsements that result from periodic analysis reports that match software component analyses with CVE databases.

I've proposed this as a talk to the OC3 conference, so I hope to make this clearer in a recorded setting with visuals.

I don't know what an "extractor" is in your context. To me, I wrote an "extractor" to parse out the RIM details from the PCClient event log, since the SP800-155 event was designed for that purpose https://github.com/google/gce-tcb-verifier/blob/main/extract/extract.go

> I'm planning to add an abstraction that allows us to group reference values.

Do you mean for this to segment trust domains or product lines? To use the generic framework of CoRIM, your attester itself gets to name its measured environment, so you can use the environment-map concept to segregate product lines and their endorsements. If your attester is more generic than that, then part of the evidence itself can be a self-identification: what does the node believe itself to be? You can use this to narrow the lookup query for reference values.

Self-identification would name the exact collection of endorsements that are appropriate to check. This is the concept of a CoBOM, a bill of material. You'd need to bake this into an unmeasured disk volume and use it only as a "hint", since policy ultimately decides which reference values are acceptable.

@fitzthum
Member Author

> I'm not familiar enough with confidential containers to make a recommendation here since it is such a constrained application. My context has been in bringing up an attestation verification service that can support multiple types of evidence to verify different environments based on a body of endorsements, a data model for expressing what applied to the evidence, and a policy for processing that data model into attestation results.

Trustee is also intended for use cases outside of CoCo; anything involving confidential attestation.

> In my understanding of the confidential containers RVPS, it's more like the SPIRE server where workloads are directly registered, and that's the only place they're registered. If it's there, it's permitted. There's very little in terms of supporting multiple clients, where your attestation service can be operated by a different party.

One feature of the extractors is that they can check the signatures of reference value bundles, with the idea being that some reference values will be received from a third party. The entire RVPS could also be run remotely by someone else although we only support one and it's not widely tested. So there is some sense of a broader ecosystem at this point, but it's pretty simple. We also have an expiration field in our internal reference value representation.

Note that reference values registered in the RVPS may or may not map directly to a specific workload.

> Do you mean for this to segment trust domains or product lines?

Either way. At the RVPS level the grouping mechanism would be totally generic. The AS has a couple of different options on how to use the feature. I haven't totally figured out what I prefer here, but one option would be to allow the group to be selected via init-data. Init-data is the CoCo spec for stuff that gets put into fields like hostdata by the host. The host/orchestrator could map groups to specific workloads, broader workload types, hardware platforms, or anything else. Ultimately the group id would be reported in the attestation token and the KBS policy would check that it is as expected in the context of resource requests.

@thomas-fossati
Contributor

thomas-fossati commented Dec 19, 2024

On the “Adopting specifications” topic, two considerations (likely to change in the future, but valid on 2024/12/19):

  1. The CoRIM spec has not yet fully stabilised, and
  2. There is no OSS implementation available in Rust.

Given those, at this point, focusing on getting the interfaces right seems like the right investment.
If you modularise the ingest (is this what you call extractors?) and have a sensible internal representation and APIs, then adding CoRIM, CycloneDX, in-toto, or any other suitable format - including proprietary ones - should amount to adding a new ingest plugin and a synthesizer for format-specific IDs.

On the ID topic, one suggestion that I'd like to provide is to look at CoRIM’s environment-map (i.e., roughly, non-empty<{ ? instance-id , ? class-id }>) which has a sensible shape, at least for the use cases I have seen. You can define a serialisation for it (e.g., URI-based, detCBOR, or other) and your pattern-matching rules.

On the “group” topic, I have a clarifying question: by group do you mean a collection of claims that belong to the same environment (i.e., the "semantic squashing" topic I raised back in the day)? Or something else?

Just in case you haven't seen it, a while ago I assembled a few ideas on how to evolve RVPS. Caveat: Those ideas may be entirely broken or obsolete :-)

@fitzthum
Member Author

> On the “group” topic, I have a clarifying question: by group you mean a collection of claims that belong to the same environment (i.e., the "semantic squashing" topic I raised in #238)? Or something else?

Yes, this is meant to help address that concern. The grouping mechanism is meant to be very generic. It could be used to express CoRIM stuff like a target environment or a manifest, but it could also be other divisions. I wonder if we should have two levels of groups. That might bring things a little closer to CoRIM.

Speaking of shape, one place we seem to differ from the shape you mention is that the RVPS returns HashMap<String, Vec<String>>. Basically regardless of any grouping mechanism, if there are any duplicate keys in the group, their values will be put into a list together. This comes from the fact that the policy can't support duplicate keys in the reference values and that we only have one set of reference values per policy. Probably this is fine although it might make it slightly harder/impossible to express multiple specific combinations of reference values.
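
A small worked example of that last point, as a sketch: the `merge` and `accepts` helpers and the kernel/initrd names are made up for illustration. Two manifests are merged into the per-key lists, and a simple "each claim is in its list" check then accepts a mix of the two manifests that neither of them describes.

```rust
use std::collections::HashMap;

/// Merge several manifests (id -> single value) into the
/// `HashMap<String, Vec<String>>` shape, collecting values per key.
pub fn merge(manifests: &[Vec<(&str, &str)>]) -> HashMap<String, Vec<String>> {
    let mut map: HashMap<String, Vec<String>> = HashMap::new();
    for manifest in manifests {
        for (id, value) in manifest {
            map.entry(id.to_string()).or_default().push(value.to_string());
        }
    }
    map
}

/// A policy-style check: every claim must appear in the list for its key.
pub fn accepts(map: &HashMap<String, Vec<String>>, claims: &[(&str, &str)]) -> bool {
    claims.iter().all(|(id, value)| {
        map.get(*id)
            .map_or(false, |vals| vals.iter().any(|v| v.as_str() == *value))
    })
}
```

With manifests {kernel: k1, initrd: i1} and {kernel: k2, initrd: i2}, the merged map accepts (k1, i1) and (k2, i2), but also the mixed pair (k1, i2), because the information about which values belong together is gone after merging.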

@fitzthum
Member Author

fitzthum commented Dec 19, 2024

Actually I wonder if we should introduce a reference uri similar to the resource uri. That would basically represent two groups and a name. We should also consider how to reflect this in the attestation token.

@thomas-fossati
Contributor

thomas-fossati commented Dec 20, 2024

> Actually I wonder if we should introduce a reference uri similar to the resource uri. That would basically represent two groups and a name.

Can you make a concrete example?

> We should also consider how to reflect this in the attestation token.

Warning: Brainstorm material 😄

I think about this in terms of a couple of ID "synthesiser" interfaces. One on the RVPS side, at ingest, and another on the AS side, at verification. The former is computed on the reference value contents and returns the ID for storage, the latter is computed on evidence claims and returns the ID for lookup.

In Golang-ish terms, an interface:

type IDSynthesizer interface {
  FromRefvalue(v RefValue) ID
  FromEvidence(v Evidence) ID
}

that each "attestation scheme" (e.g., TDX, CCA, SEV-SNP) must implement. E.g.,

func (o CCA) FromEvidence(v Evidence) ID {
  return "cca:" + b64(v.ImplementationID) + "/" + b64(v.InstanceID)
}

func (o CCA) FromRefvalue(v RefValue) ID {
  return "cca:" + b64(v.EnvMap.ClassID) + "/" + b64(v.EnvMap.InstanceID)
}
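
Since Trustee is written in Rust, the same idea could be sketched as a trait. Everything here is hypothetical: the struct shapes, the `id_from_*` method names, and the use of hex instead of base64 (swapped in only to avoid an external crate). The property that matters is that the ID computed at ingest and the ID computed at verification agree for matching inputs.

```rust
/// Hypothetical, trimmed-down evidence claims.
pub struct Evidence {
    pub implementation_id: Vec<u8>,
    pub instance_id: Vec<u8>,
}

/// Hypothetical stand-in for a CoRIM-style environment map.
pub struct EnvMap {
    pub class_id: Vec<u8>,
    pub instance_id: Vec<u8>,
}

pub struct RefValue {
    pub env_map: EnvMap,
}

/// Each attestation scheme derives a storage ID (at ingest) and a lookup
/// ID (at verification).
pub trait IdSynthesizer {
    fn id_from_ref_value(&self, v: &RefValue) -> String;
    fn id_from_evidence(&self, v: &Evidence) -> String;
}

/// Lowercase hex encoding, used here instead of base64 to avoid a dependency.
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

pub struct Cca;

impl IdSynthesizer for Cca {
    fn id_from_ref_value(&self, v: &RefValue) -> String {
        format!("cca:{}/{}", hex(&v.env_map.class_id), hex(&v.env_map.instance_id))
    }

    fn id_from_evidence(&self, v: &Evidence) -> String {
        format!("cca:{}/{}", hex(&v.implementation_id), hex(&v.instance_id))
    }
}
```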

@fitzthum
Member Author

fitzthum commented Dec 20, 2024

> Can you make a concrete example?

I am picturing a policy with something like this.

executables := 3 if {
	input.snp.launch_measurement == reference_value("rvps:///snp/launch_measurement/measurement1")
}

Or the URI could be rvps:///snp/manifest1/launch_measurement. In general I want this hierarchy to be user-defined and I don't want to assume there is any unique identification of guests. This URI might help the policy to be less vague about which reference values are being used. That said, it could be a big mistake to bake the whole thing into the policy itself. Maybe the first two groups, or one of them, should be set outside the policy so that they can be changed without changing the policy. There's an interesting tension here between having the policy be explicit about the reference values and having the policy be flexible and not need to be updated.
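
For what it's worth, splitting such a URI into its parts is straightforward. This is only a sketch: the `ReferenceUri` struct, the `parse_reference_uri` function, and the rule that exactly three non-empty path segments are required are assumptions, and the meaning of each segment is deliberately left to the user.

```rust
/// Hypothetical parsed form of a reference URI: an optional RVPS address
/// followed by three user-defined path segments (two groups and a name).
#[derive(Debug, PartialEq)]
pub struct ReferenceUri {
    pub authority: Option<String>,
    pub segments: [String; 3],
}

pub fn parse_reference_uri(uri: &str) -> Result<ReferenceUri, String> {
    let rest = uri
        .strip_prefix("rvps://")
        .ok_or_else(|| format!("not an rvps URI: {}", uri))?;
    // Everything before the first '/' is the authority; empty means local.
    let (authority, path) = rest
        .split_once('/')
        .ok_or_else(|| format!("missing path: {}", uri))?;
    let parts: Vec<&str> = path.split('/').collect();
    match parts.as_slice() {
        [a, b, c] if !a.is_empty() && !b.is_empty() && !c.is_empty() => Ok(ReferenceUri {
            authority: (!authority.is_empty()).then(|| authority.to_string()),
            segments: [a.to_string(), b.to_string(), c.to_string()],
        }),
        _ => Err(format!("expected three path segments: {}", uri)),
    }
}
```

Both orderings mentioned above parse the same way; the parser does not care which segment is the manifest and which is the claim name.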

Regorus, the crate we use for doing OPA in Rust, allows you to register extension functions in Rust that can be called within the policy. I am thinking about using this to provide reference values for reasons that I will describe shortly.

> I think about this in terms of a couple of ID "synthesiser" interfaces. One on the RVPS side, at ingest, and another on the AS side, at verification.

I like the idea of having some mechanism to keep track of what reference values were used to evaluate the policy. What comes to my mind is to have the RVPS generate a report of the reference values that were requested (perhaps as a JWT) and have that stored in an extension in the EAR Appraisal.

By using the Regorus extension reference_value function shown above, we could keep track of exactly which values are used when evaluating a policy (even if the policy also mentions other values for other platforms that are not executed). I'm not totally sure what would go in this report. One option would be to just put the reference values themselves there, but this seems kind of clunky especially since we have all the tcb evidence in the attestation token already. Instead maybe we would just store the RVPS UUID of each reference value. Someone who has received an attestation token could in theory take the report and ask the RVPS to validate it. The RVPS could then call up the reference values and say whether they have been revoked or whatever else.

Anyway I think reporting the reference values isn't super high-priority since they are trusted at the time of verification, but it would be good to have some story about how to do this.

@fitzthum
Member Author

fitzthum commented Jan 3, 2025

I thought about this a bit over the holidays. Here's what I'm thinking.

Interface with Policies

Add two Regorus extension functions to provide reference values. One would be called get_reference_value and would take two arguments: a group and a name. This would return one reference value. The other would be get_reference_values. This would only take a group argument and would return all the reference values in that group.

Policies could look like this

executables := 3 if {
	input.snp.launch_measurement == get_reference_value("snp1","launch_measurement")
}

or like this

executables := 3 if {
	input.snp.launch_measurement in get_reference_values("snp_launch_measurements")
}

I'm not totally sure how the types would work or if we can actually return a list, but assuming no problems this is a pretty flexible interface.
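
On the Rust side, the lookups behind these two functions might look roughly like this. The `Store` type and its layout are assumptions, and the wiring into Regorus (including how a list would be converted into a policy value) is intentionally left out since that part is still untested.

```rust
use std::collections::BTreeMap;

/// Hypothetical store: group -> name -> value.
#[derive(Default)]
pub struct Store {
    groups: BTreeMap<String, BTreeMap<String, String>>,
}

impl Store {
    pub fn insert(&mut self, group: &str, name: &str, value: &str) {
        self.groups
            .entry(group.to_string())
            .or_default()
            .insert(name.to_string(), value.to_string());
    }

    /// Backs `get_reference_value(group, name)`: at most one value.
    pub fn get_reference_value(&self, group: &str, name: &str) -> Option<&str> {
        self.groups.get(group)?.get(name).map(String::as_str)
    }

    /// Backs `get_reference_values(group)`: every value in the group,
    /// in name order; empty if the group is unknown.
    pub fn get_reference_values(&self, group: &str) -> Vec<&str> {
        self.groups
            .get(group)
            .map(|g| g.values().map(String::as_str).collect())
            .unwrap_or_default()
    }
}
```

Returning `Option` for the single-value case leaves it to the caller to decide whether a missing reference value should fail the policy or just evaluate to undefined.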

Implementing Extension Functions

The main reason to use extension functions is to keep track of which reference values are actually used to evaluate the policy. We'll need to test to make sure Regorus only calls the ones we expect.

The interface of the RVPS will need to change a little bit. First we will add get_reference_value and get_reference_values functions. These will each take an extra argument, a primary group, that won't be exposed to the policy, but will allow the RVPS to switch reference value contexts. Since we want to keep track of all the reference values we get, these functions will be part of a new session struct. I'm not sure what to call this, but the flow would look something like this.

// before evaluating policy
let s = ReferenceValueSession::new();

// inside the extension
s.reference_value("default", "snp1", "launch_measurement");

// after evaluating the policy
let reference_value_report = s.report();
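
Fleshing that flow out a little, as a sketch only: the constructor argument, the tuple key, and the choice to record a request only when it resolves to a value are all assumptions. In practice the extension closure would probably need shared ownership of the session, which is not shown here.

```rust
use std::collections::HashMap;

/// Key layout assumed for this sketch: (primary group, group, name).
type Key = (String, String, String);

/// Hypothetical session that wraps a value table and remembers every key
/// that was successfully looked up.
pub struct ReferenceValueSession {
    values: HashMap<Key, String>,
    requested: Vec<Key>,
}

impl ReferenceValueSession {
    pub fn new(values: HashMap<Key, String>) -> Self {
        Self { values, requested: Vec::new() }
    }

    /// Look up one value and record the request, but only on a hit and
    /// only once per key.
    pub fn reference_value(&mut self, primary: &str, group: &str, name: &str) -> Option<String> {
        let key = (primary.to_string(), group.to_string(), name.to_string());
        let value = self.values.get(&key).cloned();
        if value.is_some() && !self.requested.contains(&key) {
            self.requested.push(key);
        }
        value
    }

    /// Keys of the values actually used, in first-use order.
    pub fn report(&self) -> &[Key] {
        &self.requested
    }
}
```

Values that sit in the table but are never requested by the policy do not show up in the report, which is the point of routing lookups through the session.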

Reference Value Report

The attestation token should probably have some record of the reference values that were used. Otherwise, the policy isn't particularly meaningful. Of course we should also have some reference to the policy that was used. Our current approach to this isn't great, but that's a separate problem.

I think the RVPS should be able to generate a reference value report. I don't want to put the reference values themselves in this report. In the first iteration, the body of the report could look like this.

[{"reference_uri":"rvps://<rvps-domain-name-or-ip>/default/snp1/launch_measurement",
  "uuid": "<some-uuid>",
  "reference_value_timestamp": "<time-rv-was-added>"
}]

There are a lot of possibilities for the report. We could add fields to keep track of who provided the reference value to the RVPS. We could sign the report. We could add an interface to the RVPS to allow people to verify their reports and see if any of the values have been revoked. For now, let's just keep track of the UUID of the various reference values.
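
A minimal sketch of producing that body, assuming a hypothetical `ReportEntry` struct with the three fields above. The JSON is rendered by hand only to keep the example free of dependencies; real code would more likely derive it with serde.

```rust
/// Hypothetical report entry mirroring the three fields in the report body.
pub struct ReportEntry {
    pub reference_uri: String,
    pub uuid: String,
    pub reference_value_timestamp: String,
}

/// Escape the two characters that would break a JSON string literal.
/// Control characters are not handled in this sketch.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Render the report body as a JSON array.
pub fn render_report(entries: &[ReportEntry]) -> String {
    let items: Vec<String> = entries
        .iter()
        .map(|e| {
            format!(
                "{{\"reference_uri\":\"{}\",\"uuid\":\"{}\",\"reference_value_timestamp\":\"{}\"}}",
                escape(&e.reference_uri),
                escape(&e.uuid),
                escape(&e.reference_value_timestamp)
            )
        })
        .collect();
    format!("[{}]", items.join(","))
}
```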

We will add this report to the appraisal as an extension (although note that we might need to resolve veraison/rust-ear#32 first).

Changing internal representation

The changes mentioned here will require tweaking our internal ReferenceValue representation a little bit. First, we'll add a UUID to each reference value and a revoked flag. It will require a few iterations to get the whole process in place (especially figuring out how to check that the party revoking an RV is the right one), but the basic idea is that we'll use a UUID to track each value and when they are revoked we won't delete them, we'll just change the flag.
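
The revoke-without-delete idea could look something like the following sketch. The `StoredValue` and `Registry` types are hypothetical, the UUID is kept as a plain string to avoid an extra crate, and the check that the revoking party is authorized is explicitly not modelled.

```rust
/// Hypothetical stored value carrying an identifier and a revocation flag.
#[derive(Debug, Clone)]
pub struct StoredValue {
    pub uuid: String,
    pub name: String,
    pub value: String,
    pub revoked: bool,
}

#[derive(Default)]
pub struct Registry {
    values: Vec<StoredValue>,
}

impl Registry {
    pub fn add(&mut self, uuid: &str, name: &str, value: &str) {
        self.values.push(StoredValue {
            uuid: uuid.to_string(),
            name: name.to_string(),
            value: value.to_string(),
            revoked: false,
        });
    }

    /// Mark a value as revoked instead of deleting it.
    /// Returns false if the UUID is unknown.
    pub fn revoke(&mut self, uuid: &str) -> bool {
        match self.values.iter_mut().find(|v| v.uuid == uuid) {
            Some(v) => {
                v.revoked = true;
                true
            }
            None => false,
        }
    }

    /// Values still usable for policy evaluation.
    pub fn active(&self, name: &str) -> Vec<&str> {
        self.values
            .iter()
            .filter(|v| v.name == name && !v.revoked)
            .map(|v| v.value.as_str())
            .collect()
    }

    /// Status lookup for someone verifying a report: Some(true) if revoked,
    /// None if the UUID was never registered.
    pub fn is_revoked(&self, uuid: &str) -> Option<bool> {
        self.values.iter().find(|v| v.uuid == uuid).map(|v| v.revoked)
    }
}
```

Because revoked entries stay in the registry, a UUID from an old report can still be resolved and reported as revoked rather than as unknown.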

We probably also want to change things to support reference values that are just strings alongside ones that are hashes. In the future we might track who registered the RVs.

Interface with operator and RVP

One of the weakest points of the RVPS is the workflow for providing reference values. I don't have a concrete plan here. We need to take a look at the tooling and interfaces and make it really easy for people to enroll their values. I want to start with simplifying the process for RVPS operators who just want to enroll their RVs directly and then move on to figuring out how RVPs can do this remotely, eventually linking this into the Kata CI perhaps.

Risks

There are a couple implementation pitfalls here, but the biggest risk is that this proposal would be a breaking change for policies. It would remove the reference value map that is provided to policies. We could potentially maintain this interface and just use it for simple tokens, but I would prefer keeping the implementations closer to each other.


3 participants