RFC: Device Attestation Support #650
Comments
A compromised AA could also omit some devices from its generated attestation report collections. The assumption that devices can be attested and verified independently makes sense to me.
Yes. And since those two attestation paths are independent and serve different purposes, they have to run at different times. The initial one is for the guest kernel to verify that the device is acceptable. It could be a local attestation process, or a remote one with a policy that would be orthogonal to the one that you're describing here. It could also be that the guest kernel simply accepts devices if they can be authenticated through PCI. It really is a guest kernel provisioned policy. To go even further, I think your proposal should start from the assumption that all devices have been accepted by the guest kernel, and since it is measured as well, we can trust that it safely extended the guest TCB with all detected devices. So the fact that the guest kernel may have requested a device to be attested at boot time is an implementation detail.
I feel
Is there a reason why we shouldn't link the "cpu" and "gpu" TEE hardware in the attestation in some hierarchical or chained fashion (e.g. mix the attestation report of cpu0 into the report data of gpu0)?
We could include a hash of the cpu attestation report in the SPDM
With the assumptions I'm making about the TDM, I don't think this is necessary, and it makes things simpler if the attesters (and verifiers) can be executed independently.
Btw, for the H100 the NVIDIA kernel module does some level of verification. This could potentially take the place of the TDM in the short term. Also, I've been thinking more about the relationship between Trustee and TDM. The question is really about where the policy is evaluated (or which policy is evaluated by each component), and there are some interesting tradeoffs. It might seem like the TDM is the most important thing because it determines whether a device can join the TCB (including after a secret has been released), but the TDM doesn't really know anything about which secrets have been or will be requested. The components have two different perspectives on the guest.
What about running a second Trustee inside the guest as the TDM? We already have the KBS separated into a crate and API interfaces. We could add another interface that hooks into whatever the kernel uses. We would map approving a device bind onto requesting a resource from Trustee with given evidence. We could either request some random placeholder resource or add a plugin to grant some kind of certificate. Trustee inside the guest and Trustee for remote attestation could then share the same verifiers, policies, and reference values.
This is a proposal for how we can extend the KBS protocol, attestation-agent, and Trustee to support attesting additional devices alongside the CPU.
This proposal does not specify what a device attester or verifier should do. It only concerns how these components can be integrated with our existing architecture. There are still a few open questions.
Introduce DeviceClass Parameter
Currently we identify the CPU in terms of a TeeType. To support devices we should introduce an additional classification, which specifies the type of device, for instance gpu. An additional method will be added to the attester trait to allow the attester to report the device class. We could introduce a device class enum in kbs-types or just allow an attester to specify a string. Either way, later sections will make it clear why this is a useful field.
Trigger All Applicable Attesters
Currently we only trigger one attester per guest. We will change the AA to trigger the attester for every device where the detection heuristic is positive. The attesters will not be run in any particular order; see the final section for some thoughts on this.
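To make the device class and the "trigger everything that is detected" behaviour concrete, here is a rough sketch. Everything in it is an assumption for illustration: the real attester interface is a Rust trait, and the names DeviceClass, Attester, detect, and get_evidence (as well as detect returning a device count) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class DeviceClass(str, Enum):
    # Hypothetical variants; the RFC leaves "enum vs. plain string" open.
    CPU = "cpu"
    GPU = "gpu"


@dataclass
class Attester:
    tee_type: str                      # attester-specific type, e.g. "snp"
    device_class: DeviceClass          # broader class reported by the attester
    detect: Callable[[], int]          # assumed heuristic: number of devices found
    get_evidence: Callable[[int], str] # device index -> placeholder evidence


def collect_evidence(attesters: List[Attester]) -> List[dict]:
    """Run every attester whose detection heuristic finds at least one device."""
    devices = []
    for attester in attesters:
        for index in range(attester.detect()):
            devices.append({
                "class": attester.device_class.value,
                "type": attester.tee_type,
                "evidence": attester.get_evidence(index),
            })
    return devices


# An SNP guest with two GPUs; the TDX attester detects nothing and is skipped.
attesters = [
    Attester("snp", DeviceClass.CPU, lambda: 1, lambda i: f"<snp-report-{i}>"),
    Attester("tdx", DeviceClass.CPU, lambda: 0, lambda i: f"<tdx-quote-{i}>"),
    Attester("nvidia", DeviceClass.GPU, lambda: 2, lambda i: f"<gpu-evidence-{i}>"),
]
devices = collect_evidence(attesters)
```

The loop visits attesters in list order only because a Python list has one; nothing in the sketch depends on that order.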
The evidence from all the attesters will be combined into a field that looks something like this.
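A minimal sketch of that shape, assuming the field names class, type, and evidence, the type strings "snp" and "nvidia", and placeholder evidence values; the exact names and encoding are not fixed by this proposal:

```python
import json

# Hypothetical field names and placeholder evidence strings; only the shape
# (a list of {class, type, evidence} entries) comes from the text.
evidence = [
    {"class": "cpu", "type": "snp", "evidence": "<snp-report>"},
    {"class": "gpu", "type": "nvidia", "evidence": "<gpu-evidence-0>"},
    {"class": "gpu", "type": "nvidia", "evidence": "<gpu-evidence-1>"},
]

print(json.dumps(evidence, indent=2))
```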
To be clear, this is a list of all the devices, where each device is a dictionary containing the class, the type, and the actual evidence. The type field here is what we currently call the TeeType. This is specific to an attester (there should only be one attester per type) whereas a class is broader. Note that one attester could potentially cover different models of a device, which would still be considered to have the same type (like we can have different AmdSnp CPUs). In the example above, we have an SNP guest with two GPUs.
(Also note that we will extend the earlier phases of the kbs-protocol as well so that all the different types and classes are reported on the first request. The KBS will respond by issuing a nonce for each attester to account for verifiers that have different requirements for nonces, e.g. ITA.)
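The per-attester nonce exchange could look roughly like the sketch below. The message layout and the names issue_nonces, devices, and nonces are assumptions for illustration and are not part of the current kbs-protocol. A real KBS would also pick the nonce format each verifier requires rather than using one generator for all of them.

```python
import secrets


def issue_nonces(request: dict) -> dict:
    """Return one fresh nonce per (class, type) pair reported by the guest."""
    return {
        "nonces": [
            {
                "class": device["class"],
                "type": device["type"],
                "nonce": secrets.token_urlsafe(32),
            }
            for device in request["devices"]
        ]
    }


# First request: the guest reports every attester type and class it will use.
request = {
    "devices": [
        {"class": "cpu", "type": "snp"},
        {"class": "gpu", "type": "nvidia"},
    ]
}
response = issue_nonces(request)
```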
Run a policy for each device
Once the evidence gets to Trustee, it's clear that we're going to run the verifier corresponding to each device. The tricky question is what we do with all the TCB claims that come out of these verifiers. Fortunately, our switch to EAR tokens gives us a path forward here. I'm not sure of the best way to handle this with simple tokens. I will leave that to @Xynnn007.
Once we get TCB claims from the verifiers, we will run the policy separately for each set of claims. If we were to use the evidence shown above with a CPU and two GPUs, we would end up with three sets of TCB claims and we would run the policy engine three times.
We will store one AS policy for each device class. Currently we only have one AS policy, which covers all the different CPUs. We will allow additional policies to be set, one for each device class. In our example, we would run the CPU policy once and the GPU policy twice.
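A small sketch of the per-class policy dispatch, under heavy assumptions: plain Python functions stand in for the real policy engine, and the claim names debug and cc_mode are invented for the example.

```python
from typing import Callable, Dict, List

Claims = Dict[str, object]
Policy = Callable[[Claims], bool]

# One hypothetical policy per device class (stand-ins for stored AS policies).
policies: Dict[str, Policy] = {
    "cpu": lambda claims: claims.get("debug") is False,
    "gpu": lambda claims: claims.get("cc_mode") == "on",
}


def appraise(verified: List[dict]) -> List[dict]:
    """Run the policy for each device's class once per set of TCB claims."""
    results = []
    for device in verified:
        policy = policies[device["class"]]
        results.append({
            "class": device["class"],
            "type": device["type"],
            "claims": device["claims"],
            "trusted": policy(device["claims"]),
        })
    return results


# Verifier output for an SNP guest with two GPUs (claims are made up).
verified = [
    {"class": "cpu", "type": "snp", "claims": {"debug": False}},
    {"class": "gpu", "type": "nvidia", "claims": {"cc_mode": "on"}},
    {"class": "gpu", "type": "nvidia", "claims": {"cc_mode": "off"}},
]
results = appraise(verified)
```

Here the cpu policy runs once and the gpu policy runs twice, with the second GPU failing its appraisal independently of the other devices.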
Create an EAR token
Once we've run all the policies, we will use the output claims to generate an EAR appraisal for each device. In EAR, appraisals are entered into a map of submods. Each appraisal is accessed via a string key. Since EAR tokens are supposed to be decoupled from specifics of the hardware, we will use the generic device class as the key. Following our example above, we would end up with a token with three appraisals: cpu0, gpu0, gpu1.
This allows a KBS to easily check that the guest has any number of valid devices of a particular class. The KBS policy can still check the specifics of the device by looking at the endorsed claims inside the appraisal.
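As a rough illustration of the key scheme and the KBS-side check, the sketch below builds a submods map keyed by device class plus a per-class index. The helper names build_submods and count_affirming and the endorsed_claims field are assumptions; ear.status with values such as affirming and contraindicated follows my reading of the EAR draft, and a real token would carry more fields and be signed.

```python
from collections import defaultdict
from typing import Dict, List


def build_submods(appraisals: List[dict]) -> Dict[str, dict]:
    """Key each appraisal by device class plus a running per-class index."""
    counters = defaultdict(int)
    submods = {}
    for appraisal in appraisals:
        device_class = appraisal["class"]
        key = f"{device_class}{counters[device_class]}"
        counters[device_class] += 1
        submods[key] = {
            "ear.status": appraisal["status"],
            "endorsed_claims": appraisal["claims"],  # hypothetical field name
        }
    return submods


def count_affirming(submods: Dict[str, dict], device_class: str) -> int:
    """Count affirming appraisals whose key is device_class plus an index."""
    return sum(
        1
        for key, appraisal in submods.items()
        if key.rstrip("0123456789") == device_class
        and appraisal["ear.status"] == "affirming"
    )


appraisals = [
    {"class": "cpu", "status": "affirming", "claims": {"type": "snp"}},
    {"class": "gpu", "status": "affirming", "claims": {"type": "nvidia"}},
    {"class": "gpu", "status": "contraindicated", "claims": {"type": "nvidia"}},
]
submods = build_submods(appraisals)
```

A KBS policy in this sketch could require, say, count_affirming(submods, "gpu") to be at least 1, and then inspect endorsed_claims for device specifics.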
Resolving Timing Issues
While this proposal does not concern how attesters and verifiers are implemented, it's important to highlight one assumption. I'm assuming that devices can be attested and verified independently of one another. Notably this means that we don't try to explicitly bind the devices to the CPU. Instead, we assume that the AA is measured and we trust it to correctly attest devices (i.e. it won't take evidence from another guest with a valid device and claim that it is from this guest).
With devices, timing is a bit tricky. The KBS protocol runs lazily but devices are typically added when the guest starts. This could endanger the assumption above. If a malicious device is added to the guest before we attest, we might not be able to trust the attestation agent to correctly attest devices.
All of this is to say that this proposal is probably only sound if we have some sort of TDM component that attests the devices as they are first bound to the guest. This attestation doesn't have to be as in-depth as what we will do here. We just want to make sure that the device is valid (and won't subvert the behavior of the AA). I know there have been various concerns about timing, but I think they disappear if we attest once on bind and again via Trustee.
This double attestation does seem a bit redundant, but the only alternative I see involves some big changes to the KBS protocol. The big problem is that the TDM currently doesn't exist. We might think about whether we can make one out of the components that we already have i.e. the AA and the AS.
cc @zvonkok @Xynnn007 @sameo @imlk0