After several discussions with @mpeters, @ansasaki, Lukas Vrabec, @galmasi and Marcus Hesse, we collectively decided that the time has come to make Keylime easy to deploy on Kubernetes/OpenShift. I propose we use this issue to concentrate all the relevant discussion on this topic.
I will start by listing some common relevant points, and I do thank Marcus Hesse for starting the discussion on the keylime-operator on CNCF's Slack. I believe I have addressed most of your questions in this writeup.
The main goal is to end up with an "Attestation Operator", which can not only automatically add nodes (i.e., agents) to specific verifiers but can also properly react to administrative activities such as node reboots or cordoning off.
I am not a Kubernetes/OpenShift expert by any means, so my proposal here is bound to be incomplete or incorrect, and additions/corrections are welcome. That being said, I see the following set of intermediate steps, in increasing order of complexity, as a good way to achieve our goal.
Ensure that all keylime components can be fully executed in a containerized manner. For this, the following requirements should be satisfied.
a. Unmodified public images. I suggest we expand https://quay.io/organization/keylime (under Red Hat's control), which already offers the "latest" verifier, registrar and tenant, to also include the Rust agent image (@ansasaki is pursuing this)
b. Carefully determine the minimum set of (container) privileges required to run the agent
c. Provide some tool to perform containerized keylime deployments (@maugustosilva and @galmasi have a tool, which is about to be released as open source, to perform this task).
Create a simple Kubernetes application for keylime. At this point, we should be able to start by writing progressively more YAML files
a. The idea is to start with a very simple deployment with the following objects:
* A StatefulSet (initially of 1) for the Registrar
* A StatefulSet (initially of 1) for the Verifier
* A DaemonSet for the Agents
* Registrar and Verifier both exposed as a Service (type=NodePort)
* mTLS certificates stored as Secrets
* Given that keylime can be fully configured via environment variables, we shall use environment-dependent variables in our YAML.
b. Initially, I propose we make the following simplifying boundary conditions:
* Given the use of SQLite, we could start without any DB deployment
* mTLS certificates are pre-generated (with keylime_ca commands) and added to the Kubernetes cluster
* Environment variables will also be set and maintained by some external tool
* The tenant will NOT be part of the initial deployment.
* Make use of "Node Feature Discovery" to mark all the nodes with TPM devices (and make that label part of the DaemonSet node selector)
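The agent DaemonSet described above, restricted to TPM-equipped nodes through a node selector, could be generated roughly as follows. This is a minimal sketch, not a tested manifest: the image name, the node label key and the environment variable name are all placeholders (assumptions, not taken from this issue), and the TPM device access and container privileges still to be determined in step 1.b are deliberately left out.

```python
import json

# Hypothetical values: none of these names come from the issue itself.
AGENT_IMAGE = "quay.io/keylime/keylime_agent:latest"  # assumed image name
TPM_NODE_LABEL = "example.com/tpm-present"  # placeholder for the label NFD is configured to emit


def build_agent_daemonset(image, tpm_label, env):
    """Return a DaemonSet manifest (as a dict) that runs one agent per labeled node."""
    labels = {"app": "keylime-agent"}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "keylime-agent", "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Only schedule agents on nodes carrying the TPM label.
                    "nodeSelector": {tpm_label: "true"},
                    "containers": [
                        {
                            "name": "agent",
                            "image": image,
                            # Configuration injected as environment variables.
                            "env": [
                                {"name": key, "value": value}
                                for key, value in sorted(env.items())
                            ],
                        }
                    ],
                },
            },
        },
    }


# The variable name below is an assumption used only for illustration.
manifest = build_agent_daemonset(
    AGENT_IMAGE, TPM_NODE_LABEL, {"KEYLIME_AGENT_REGISTRAR_IP": "keylime-registrar"}
)
print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, the printed output can be piped to it directly once the placeholders are replaced with real values.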
c. From this point we should expand to a "scale-out" deployment.
* Multiple Registrars and Verifiers
* A pre-packaged Helm deployment of some SQL database server will be used.
* A Service (type=LoadBalancer)
d. At this point, the following technical considerations should be made.
* I am hoping we can "get away" with a pre-packaged n-way replicated SQL DB server.
* Verifiers are identified by a "verifier ID", which I assume can be taken from the "persistent identifier within a StatefulSet"
* The load balancing algorithm will have to use the URI (which contains the agent UUID) for the selection of the backend (i.e., we cannot use round-robin or source IP, given that presently a single tenant will add all the agents to the set of verifiers)
* Tenant is still considered as a component outside of the whole deployment
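The two considerations above (a verifier ID derived from the StatefulSet identity, and backend selection keyed on the agent UUID in the URI) can be sketched as follows. This is only an illustration under stated assumptions: the URI path and pod names are hypothetical, the agent identifier is assumed to be formatted as a standard UUID, and the hash-modulo selection is one simple option rather than something keylime or any particular load balancer provides.

```python
import hashlib
import re

# Assumes agent identifiers are formatted as standard UUIDs.
UUID_RE = re.compile(r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}")


def verifier_id_from_pod_name(pod_name):
    """StatefulSet pods are named <statefulset-name>-<ordinal>; return the ordinal."""
    name, sep, ordinal = pod_name.rpartition("-")
    if not sep or not name or not ordinal.isdigit():
        raise ValueError(f"not a StatefulSet pod name: {pod_name!r}")
    return int(ordinal)


def agent_uuid_from_uri(uri):
    """Extract the first UUID found in the request URI."""
    match = UUID_RE.search(uri)
    if match is None:
        raise ValueError(f"no agent UUID in URI: {uri!r}")
    return match.group(0).lower()


def pick_verifier(uri, verifier_pods):
    """Map a request to one verifier pod based only on the agent UUID."""
    ordered = sorted(verifier_pods, key=verifier_id_from_pod_name)
    digest = hashlib.sha256(agent_uuid_from_uri(uri).encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]


# Hypothetical pod names and request path, for illustration only.
pods = ["keylime-verifier-0", "keylime-verifier-1", "keylime-verifier-2"]
uri = "/v2/agents/d432fbb3-d2f1-4a97-9ef7-75bd81c00000"
print(pick_verifier(uri, pods), verifier_id_from_pod_name(pick_verifier(uri, pods)))
```

Because the choice depends only on the UUID, every request for a given agent reaches the same verifier regardless of which client sent it, which is what round-robin and source-IP selection cannot guarantee.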
Create an Operator for keylime. My experience writing operators is fairly limited, but I will point out some of the desirable characteristics:
* Ability to automatically generate all pertinent certificates
* Ability to deal with environment variables
* Ability to automatically add agents to verifiers
* Ability to react to administrative tasks on nodes, such as rebooting, draining and cordoning off.
Make the Operator more "production-ready"
* How to deal with (measured boot and runtime/IMA) policies?
* How to deal with "scale-out" operations (i.e., if the number of verifier pods increases, should we perform "rebalancing")?
* How to integrate "durable attestation" in this scenario?
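To get a feel for the rebalancing question, the sketch below counts how many agents would change verifier when one verifier pod is added. It uses rendezvous (highest-random-weight) hashing, which is a technique chosen here for illustration and not something keylime implements; the pod names and agent UUIDs are made up. Whether agents can actually be moved between verifiers at runtime is a separate question that this example does not address.

```python
import hashlib
import uuid


def score(verifier, agent):
    """Pseudo-random weight of a (verifier, agent) pair."""
    return hashlib.sha256(f"{verifier}/{agent}".encode()).digest()


def assign(agent, verifiers):
    """Rendezvous hashing: the verifier with the highest weight gets the agent."""
    return max(verifiers, key=lambda verifier: score(verifier, agent))


def moved_agents(agents, before, after):
    """Agents whose verifier differs between the two verifier sets."""
    return [a for a in agents if assign(a, before) != assign(a, after)]


# Made-up agent UUIDs and pod names, for illustration only.
agents = [str(uuid.uuid5(uuid.NAMESPACE_DNS, f"node-{i}.example")) for i in range(1000)]
before = ["keylime-verifier-0", "keylime-verifier-1", "keylime-verifier-2"]
after = before + ["keylime-verifier-3"]

moved = moved_agents(agents, before, after)
print(f"{len(moved)} of {len(agents)} agents would move to a new verifier")
```

With this scheme, the only agents that move are those now claimed by the newly added pod, so existing verifiers never exchange agents among themselves. A plain hash-modulo assignment would reshuffle most agents on every scale-out.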
The majority of the aforementioned stakeholders (@maugustosilva, @mpeters, @ansasaki, Lukas Vrabec, @galmasi and Marcus Hesse) voted for having this work developed in a new repository within the keylime project. I will create such a repository.