Make Keylime easily deployable on Kubernetes/Openshift #1
Comments
Listing the goal/purpose of the operator is a great idea. We should place this in the README for everyone immediately to see.
@ansasaki are you actively working on this? if not, this is a good task for me to take on.
@maugustosilva I assume this is for containerized deployments outside of Kubernetes?
I like the idea of the initial boundary conditions, it will make it a lot easier to make progress. Here are some questions/comments I have:
These are the $100 questions :)
@maugustosilva if you don't mind, I would like to start creating issues for at least some of the work that you are proposing here, so that I can get started on them.
Hey @mheese, trying to answer a few of the questions here, but will most definitely start to fold it out into multiple issues:
… dependency in the main keylime chart. Signed-off-by: George Almasi <gheorghe@us.ibm.com>
After several discussions with @mpeters @ansasaki Lukas Vrabec @galmasi and Marcus Hesse, we collectively decided that the time to have Keylime easily deployed on Kubernetes/Openshift has come. I propose we use this issue to concentrate all the relevant discussion on this topic.
I will start by listing some common relevant points, and I do thank Marcus Hesse for starting the discussion on the `keylime-operator` on CNCF's Slack. I believe I have addressed most of your questions in this writeup.

The main goal is to end up with an "Attestation Operator", which can not only automatically add nodes (i.e., `agents`) to specific `verifiers` but can also properly react to administrative activities such as node reboots or cordoning off.

I am not a Kubernetes/Openshift expert by any means, so my proposal here is bound to be incomplete or incorrect, and additions/corrections are welcome. That being said, I see the following set of intermediate steps, in increasing order of complexity, as a good way to achieve our goal.
1. Ensure that all `keylime` components can be fully executed in a containerized manner. For this, the following requirements should be satisfied.
   a. Unmodified public images. I suggest we expand https://quay.io/organization/keylime (under Red Hat's control), which already offers the "latest" `verifier`, `registrar` and `tenant`, to also include the rust `agent` image (@ansasaki is pursuing this).
   b. Carefully determine the least amount of (container) privileges that will be required to run the `agent`.
   c. Provide some tool to perform containerized `keylime` deployments (@maugustosilva and @galmasi have a tool, which is about to be released into open-source, to perform this task).
2. Create a simple Kubernetes application for `keylime`. At this point, we should be able to start by writing progressively more `yaml` files.
   a. The idea is to start with a very simple `Deployment` with the following objects:
      * A `StatefulSet` (initially of 1) for the `Registrar`
      * A `StatefulSet` (initially of 1) for the `Verifier`
      * A `DaemonSet` for the `Agents`
      * Both exposed as `Service` (`type=NodePort`)
      * mTLS certificates stored as `Secrets`
      * Given the fact that `keylime` can be fully configured via environment variables, we shall use environment-dependent variables in our yaml.
   b. Initially, I propose we make the following simplifying boundary conditions:
      * Given the use of `sqlite`, we could start without any DB deployment
      * mTLS certificates are pre-generated (with `keylime_ca` commands) and added to the Kubernetes cluster
      * Environment variables will also be set and maintained by some external tool
      * The `tenant` will NOT be part of the initial deployment.
      * Make use of "Node Feature Discovery" to mark all the nodes with `tpm` devices (and make it part of the `DaemonSet` node selector)
   c. From this point we should expand to a "scale-out" deployment.
      * Multiple `Registrars` and `Verifiers`
      * A pre-packaged `helm` deployment of some SQL database server will be used.
      * A `Service` (`type=LoadBalancer`)
   d. At this point, the following technical considerations should be made.
      * I am hoping we can "get away" with a pre-packaged n-way replicated SQL DB server.
      * `Verifiers` are identified by a "verifier ID", which I assume can be taken from the "persistent identifier within a StatefulSet"
      * The load balancing algorithm will have to use the URI (which contains the `agent` UUID) for the selection of the backend (i.e., we cannot use round-robin or source IP, given that presently a single `tenant` will add all the `agents` to the set of `verifiers`)
      * Tenant is still considered as a component outside of the whole deployment
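The two routing considerations above (a verifier ID taken from the StatefulSet identity, and backend selection keyed on the agent UUID in the URI) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the URI shape `/v<version>/agents/<agent-uuid>`, the pod name `keylime-verifier-<ordinal>`, and all function names are hypothetical and not taken from the keylime code base. The only Kubernetes fact relied upon is that StatefulSet pods are named `<statefulset-name>-<ordinal>`.

```python
import hashlib
import re

# Assumed URI shape: /v<version>/agents/<agent-uuid>[/...]
AGENT_URI = re.compile(r"^/v[0-9.]+/agents/([^/?]+)")


def agent_uuid_from_uri(path):
    """Return the agent UUID embedded in the request path, or None."""
    match = AGENT_URI.match(path)
    return match.group(1) if match else None


def verifier_for_agent(agent_uuid, verifier_count):
    """Map an agent UUID to a verifier index in [0, verifier_count).

    A stable hash is used so that every request for the same agent
    reaches the same verifier, regardless of which client sent it.
    """
    if verifier_count < 1:
        raise ValueError("at least one verifier is required")
    digest = hashlib.sha256(agent_uuid.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % verifier_count


def verifier_id_from_pod_name(pod_name):
    """Derive a verifier ID from a StatefulSet pod name (<name>-<ordinal>)."""
    name, _, ordinal = pod_name.rpartition("-")
    if not name or not ordinal.isdigit():
        raise ValueError(f"not a StatefulSet pod name: {pod_name!r}")
    return int(ordinal)


def backend_pod(path, statefulset_name, verifier_count):
    """Pick the verifier pod that should serve the request, or None."""
    agent_uuid = agent_uuid_from_uri(path)
    if agent_uuid is None:
        return None  # not an agent-scoped request; any backend would do
    index = verifier_for_agent(agent_uuid, verifier_count)
    return f"{statefulset_name}-{index}"
```

Note that plain modulo hashing remaps most agents whenever `verifier_count` changes, so a real load balancer would likely need consistent hashing or an explicit agent-to-verifier table.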
3. Create an `Operator` for `keylime`. My experience writing operators is fairly limited, but I will point out some of the desirable characteristics:
   * `agents` to `verifiers`
4. Make the `Operator` more "production-ready"
   * `measured boot` and `runtime/IMA`) policies?
   * `verifier` pods increase, should we perform "rebalancing")?

The majority of the aforementioned stakeholders (@maugustosilva @mpeters @ansasaki Lukas Vrabec @galmasi and Marcus Hesse) voted for having this work developed in a new repository within the `keylime` project. I will create such a repository.
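As a starting point for the discussion, the minimal single-replica layout from the second step (one `StatefulSet` each for the registrar and verifier, a `NodePort` `Service` for both, a `DaemonSet` for the agents restricted to TPM nodes, and pre-generated mTLS certificates mounted from a `Secret`) could be generated with a small Python script like the one below. Everything specific in it is an assumption made for illustration: the object names, the `example.invalid` image references, the port numbers, the `/certs` mount path, the Secret name, the node label and the environment variable are all hypothetical placeholders, not values defined by the keylime project or by Node Feature Discovery.

```python
import json

# Hypothetical names: a Secret created beforehand from the pre-generated
# certificates, and a label that marks nodes exposing a TPM device.
MTLS_SECRET = "keylime-mtls"
TPM_NODE_LABEL = "example.com/tpm-present"


def pod_template(name, image, env, port=None):
    """Pod template with one container and the mTLS Secret mounted read-only."""
    container = {
        "name": name,
        "image": image,
        "env": [{"name": k, "value": v} for k, v in sorted(env.items())],
        "volumeMounts": [{"name": "certs", "mountPath": "/certs", "readOnly": True}],
    }
    if port is not None:
        container["ports"] = [{"containerPort": port}]
    return {
        "metadata": {"labels": {"app": name}},
        "spec": {
            "containers": [container],
            "volumes": [{"name": "certs", "secret": {"secretName": MTLS_SECRET}}],
        },
    }


def stateful_set(name, image, port, env, replicas=1):
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # assumes a Service with the same name exists
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": pod_template(name, image, env, port),
        },
    }


def node_port_service(name, port):
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "NodePort",
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


def agent_daemon_set(name, image, env):
    template = pod_template(name, image, env)
    # Schedule agents only on nodes carrying the (hypothetical) TPM label.
    template["spec"]["nodeSelector"] = {TPM_NODE_LABEL: "true"}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {"selector": {"matchLabels": {"app": name}}, "template": template},
    }


def minimal_deployment(env):
    """Return the list of objects for the single-replica starting point."""
    objects = []
    # Port numbers are placeholders chosen for this sketch.
    for name, port in (("keylime-registrar", 8891), ("keylime-verifier", 8881)):
        objects.append(stateful_set(name, f"example.invalid/{name}:latest", port, env))
        objects.append(node_port_service(name, port))
    objects.append(
        agent_daemon_set("keylime-agent", "example.invalid/keylime-agent:latest", env)
    )
    return objects


if __name__ == "__main__":
    # JSON is accepted wherever YAML manifests are, so no YAML library is needed.
    print(json.dumps(minimal_deployment({"EXAMPLE_SETTING": "value"}), indent=2))
```

The sketch deliberately leaves out the parts still under discussion: the container privileges and TPM device access the agent needs, the database, and the tenant.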