Support Kubernetes service deployer #239

Closed
mtojek opened this issue Feb 1, 2021 · 19 comments · Fixed by elastic/integrations#717

@mtojek
Contributor

mtojek commented Feb 1, 2021

Follow-up on: #89

This issue might be blocked by the lack of a recommendation for running managed Fleet with a Kubernetes cluster. Not sure if it's a temporary situation, but maybe we need to support unmanaged mode as well.

I suppose that technical details depend on the decision above (managed vs unmanaged).

cc @ChrsMark @ycombinator

@ChrsMark
Member

ChrsMark commented Feb 1, 2021

@mtojek how about supporting both cases (managed == Fleet + unmanaged == standalone)? To my mind, the standalone version will be the way to go for many people out there doing k8s ops based on infra-as-code principles, so I expect us to support both.

@mtojek
Contributor Author

mtojek commented Feb 1, 2021

We may consider supporting both modes, maybe even break it into two iterations.

Speaking about unmanaged - let's build the design together:

  1. Installed kubectl and a configured cluster context.
    The version/format of kubectl/kubeconfig seems to be important; not sure if we can replace it with a Docker image for kubectl. With kubectl you can select the target cluster, either a local one deployed with kind or a remote one.
    This way we can easily use GCP clusters (gcloud container clusters get-credentials cluster-name will reconfigure kubectl).

  2. _dev/deploy/k8s/: place to store deployment.yaml files describing the standalone Elastic Agent deployment (with configuration).

  3. The system test runner uses kubectl to deploy the Elastic Agent (standalone) in the Kubernetes cluster. I'd like to stick to the elastic-package test command and not introduce anything extra like test-on-k8s, test --kubernetes, test --unmanaged, etc. (a rough kubectl sketch follows the list).
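A minimal sketch of that flow, assuming an illustrative manifest path and daemonset name (both are placeholders, not a decided layout):

kubectl config use-context kind-kind                     # or a GKE context obtained via gcloud
kubectl apply -f packages/kubernetes/_dev/deploy/k8s/    # standalone Elastic Agent manifests
kubectl rollout status -n kube-system daemonset/elastic-agent --timeout=120s   # wait until the Agent is running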

Question:

  1. Where should the Elastic stack be placed? A new one on Kubernetes? Reuse of the Docker-based stack? I'm not sure if we can connect the Elastic Agent in the Kubernetes cluster to the Kibana deployed in a Docker container (IP/routing).

@mtojek
Contributor Author

mtojek commented Feb 1, 2021

Speaking about managed:

I would assume that the Elastic Agent is deployed as part of the Elastic stack and we pass all required configuration options/secrets. It's not distributed as a DaemonSet, but it can reach out to the cluster via the HTTP API.

This method might be easier to implement in elastic-package, but it's not the original way of running the Agent.
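For illustration only (not how the Agent itself would be configured): a process running outside the cluster can reach the API server the same way kubectl does, e.g. through a local proxy:

kubectl proxy --port=8001 &                        # proxy the API server to localhost
curl -s http://127.0.0.1:8001/api/v1/nodes | head  # query it through the proxy without further auth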

@ChrsMark
Member

ChrsMark commented Feb 1, 2021

> We may consider supporting both modes, maybe even break it into two iterations.
>
> Speaking about unmanaged - let's build the design together:
>
>   1. Installed kubectl and a configured cluster context.
>     The version/format of kubectl/kubeconfig seems to be important; not sure if we can replace it with a Docker image for kubectl. With kubectl you can select the target cluster, either a local one deployed with kind or a remote one.
>     This way we can easily use GCP clusters (gcloud container clusters get-credentials cluster-name will reconfigure kubectl).
>   2. _dev/deploy/k8s/: place to store deployment.yaml files describing the standalone Elastic Agent deployment (with configuration).
>   3. The system test runner uses kubectl to deploy the Elastic Agent (standalone) in the Kubernetes cluster.

Sounds good to me, more or less. We have something similar in Beats, started in elastic/beats#17538, and maybe we can borrow some ideas from it. cc: @jsoriano

I guess that after deploying the Agent we will check that data actually flows in, right?

> Question:
>
>   1. Where should the Elastic stack be placed? A new one on Kubernetes? Reuse of the Docker-based stack? I'm not sure if we can connect the Elastic Agent in the Kubernetes cluster to the Kibana deployed in a Docker container (IP/routing).

That's true, and this is the reason I actually had to deploy the ES stack on k8s too in the past. From my personal experience developing for k8s, I don't think we have an easy way to make it work with the Docker-based stack, and I'm afraid that mixing k8s networking with Docker networking wouldn't be a good idea (but maybe I'm mistaken here, so no strong opinion).

Also, we have another prerequisite, kube-state-metrics, so that the state_* datasets can collect metrics.
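One possible way to install it from the standard examples in the upstream kube-state-metrics repository (the clone-and-apply steps are only illustrative; any equivalent install works):

git clone --depth 1 https://github.com/kubernetes/kube-state-metrics.git
kubectl apply -f kube-state-metrics/examples/standard/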

@ChrsMark
Member

ChrsMark commented Feb 1, 2021

> Speaking about managed:
>
> I would assume that the Elastic Agent is deployed as part of the Elastic stack and we pass all required configuration options/secrets. It's not distributed as a DaemonSet, but it can reach out to the cluster via the HTTP API.
>
> This method might be easier to implement in elastic-package, but it's not the original way of running the Agent.

Hmm, connecting to the k8s API server should be doable, but I'm not sure if connecting to the Kubelet's APIs from the outside world would be possible. I think we're missing information on what the managed approach would look like, though.

@mtojek
Contributor Author

mtojek commented Feb 1, 2021

> I guess that after deploying the Agent we will check that data actually flows in, right?

That is correct, it's the standard verification flow.

@mtojek
Contributor Author

mtojek commented Feb 3, 2021

OK, let me revisit this topic.

Agreed, we'll achieve a better/more natural testing experience if the Elastic Agent is deployed on a real (or minikube) Kubernetes cluster. Here is another approach, which may also cover managed Fleet (we need to find a compromise):

Prerequisites:

  • minikube: we need to manage not only k8s deployments but also exposed services (I don't think that's doable with kubectl only)

Notice:

  • we're focused on testing and validation of the integration, not the purity of the Agent's deployment

Steps:

  1. elastic-package will manage another stack (minikube-based):
    elastic-package stack up --engine kubernetes

    The command will deploy ES, Kibana, the Package Registry and the Elastic Agent on Kubernetes using an embedded YAML template. It also exposes Elasticsearch and Kibana externally as services (accessible to the host). We need to deploy all parts of the Elastic stack, because the Elastic Agent must reach out to ES and Kibana (a usage sketch follows these steps).

  2. Export variables for elastic-package to access the ES and Kibana services.

  3. Define "null" service provider - we don't have any service (under tests) to be deployed, single test case config would be enough.

In the future we may introduce here the Kubernetes service deployer, which can install additional software in the k8s cluster.

  4. The system test runner will select the test config, prepare a policy, assign it to the Elastic Agent and wait for results.
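A rough command-line sketch of the flow above; --engine is the flag proposed in step 1 (it does not exist yet) and the exported variable names are only illustrative:

elastic-package stack up --engine kubernetes -v    # proposed flag, not implemented yet
export ELASTIC_PACKAGE_ELASTICSEARCH_HOST=https://<minikube-ip>:9200
export ELASTIC_PACKAGE_KIBANA_HOST=https://<minikube-ip>:5601
elastic-package test system -v                     # unchanged command, now talking to the k8s-hosted stack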

cc @ChrsMark @ycombinator I'd appreciate your comments on this idea.

@ChrsMark
Member

ChrsMark commented Feb 3, 2021

Thanks @mtojek, it looks good to me!

Some questions, since I'm missing some of the elastic-package internals:

  1. Will the Agent be enrolled in Kibana when it is deployed?
  2. How will the system tests of the k8s package start? Will elastic-package somehow trigger the package installation using the APIs?

@mtojek
Contributor Author

mtojek commented Feb 3, 2021

> Will the Agent be enrolled in Kibana when it is deployed?

Yes. Once the Agent's Docker image is started, this entrypoint kicks in and performs the enrollment:

https://github.com/elastic/beats/blob/master/dev-tools/packaging/templates/docker/docker-entrypoint.elastic-agent.tmpl
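Roughly, the entrypoint is driven by environment variables passed to the container, along these lines (the exact variable names depend on the Agent version, so treat them as assumptions rather than a reference):

docker run --rm \
  -e FLEET_ENROLL=1 \
  -e KIBANA_HOST=http://kibana:5601 \
  -e FLEET_ENROLLMENT_TOKEN=<token> \
  docker.elastic.co/beats/elastic-agent:7.11.0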

> How will the system tests of the k8s package start? Will elastic-package somehow trigger the package installation using the APIs?

elastic-package test system - nothing changes here. Kibana and Elasticsearch are exposed from minikube and accessible to elastic-package via global env variables. The system test runner will know the Kibana endpoint and install the kubernetes package using the Kibana APIs. Same for assigning the policy.

EDIT:

I have concerns regarding the introduction of minikube; I'm not sure about its stability (maybe kind is a better choice?). I'm OK with any solution that can expose the Elasticsearch and Kibana endpoints externally.

EDIT2:

@ChrsMark Is it possible to run the kind Docker image (node image) and expose all required endpoints (kubelet, API server, etc.)?

@ycombinator
Contributor

@mtojek I think your proposal makes sense. I just have one question: with the introduction of --engine, there's nothing preventing a user from spinning up the stack on k8s and then running system tests on any arbitrary integration, e.g. apache. This should still work, right?

@mtojek
Contributor Author

mtojek commented Feb 3, 2021

As long as Elasticsearch and Kibana are accessible and the Package Registry contains the package, it's expected to work. We're just swapping docker-compose for Kubernetes.

@ChrsMark
Member

ChrsMark commented Feb 3, 2021

@mtojek I've no strong opinion about kind vs minikube; we have used kind in Beats for integration tests. I've used both in my personal env for Beats development; however, while developing locally for k8s I found some inconsistencies with kind (I don't remember what specifically), but maybe it was just my case, so no provable point. One thing we might need to check is the support of different runtimes (docker, containerd), since it would be nice in the future to be able to test against different container runtimes, especially for logs.

Endpoints:

  1. kube-state-metrics: http://kube-state-metrics:8080 (under /metrics). It requires kube-state-metrics to be installed in the same namespace. We can use https://github.com/kubernetes/kube-state-metrics/tree/master/examples/standard to install the service.
  2. apiserver: https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT} (under /metrics)
  3. kubelet: https://${NODE_NAME}:10250 (under /stats/summary)
  4. k8s node proxy: http://localhost:10249 (under /metrics)
  5. scheduler on master node: http://localhost:10251 (under /metrics)
  6. controllermanager on master node: http://localhost:10252 (under /metrics)

KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are env vars exposed in the Pod by default, while NODE_NAME is passed with the downward API:

env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName

See the example at https://github.com/elastic/beats/pull/23679/files#diff-7896a70414721b8d0b3d8b90808b92c750d40c56bdf2ad01bf629c9499cde64eR38
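As a quick check, the endpoints above can be probed from inside a Pod along these lines (assuming the default service account token is mounted and has sufficient RBAC permissions):

TOKEN="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"
curl -sk -H "Authorization: Bearer ${TOKEN}" \
  "https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/metrics" | head
curl -sk -H "Authorization: Bearer ${TOKEN}" \
  "https://${NODE_NAME}:10250/stats/summary" | head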

@mtojek @ycombinator will the apache service be running as a Pod? If that's doable, it brings great opportunities to support k8s-native apps like istio, ingress-controller, coredns, etc.

@ycombinator
Contributor

> As long as Elasticsearch and Kibana are accessible and the Package Registry contains the package, it's expected to work. We're just swapping docker-compose for Kubernetes.

Just trying to think through how the Docker Compose service deployer (which would be used to spin up the Apache service container when elastic-package test system is run) would work with the stack deployed on k8s, especially with the log folder copying we have going on. I'm okay with proceeding with a PR and seeing how this works out, or what adjustments we need to make (e.g. the service deployer may need to become aware of which engine is running the stack to do the right thing).

@mtojek
Contributor Author

mtojek commented Feb 3, 2021

> @mtojek @ycombinator will the apache service be running as a Pod? If that's doable, it brings great opportunities to support k8s-native apps like istio, ingress-controller, coredns, etc.

Yes, in this case it would be possible, but...

> how the Docker Compose service deployer would work with the stack deployed on k8s, especially with the log folder copying we have going on.

I don't have a solution/idea for log collection, but I believe it's solvable too. In fact, we don't need the Docker Compose service deployer, because we don't have a service (talking only about monitoring the Kubernetes API; it's similar to the system integration). What we need here is a "null" service deployer - something that doesn't deploy any service.

EDIT:

In the meantime I tried to reverse-engineer kind a bit and managed to run this (alternative solution):

docker run --hostname kind-control-plane --name kind-control-plane \
  --label io.x-k8s.kind.role=control-plane --label io.x-k8s.kind.cluster=kind \
  --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined \
  --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro \
  --detach --tty --net kind --restart=on-failure:1 \
  --publish=127.0.0.1:61716:6443/TCP \
  kindest/node:v1.20.2

manually write /kind/kubeadm.conf:

apiServer:
  certSANs:
  - localhost
  - 127.0.0.1
  extraArgs:
    runtime-config: ""
apiVersion: kubeadm.k8s.io/v1beta2
clusterName: kind
controlPlaneEndpoint: kind-control-plane:6443
controllerManager:
  extraArgs:
    enable-hostpath-provisioner: "true"
kind: ClusterConfiguration
kubernetesVersion: v1.20.2
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/16
scheduler:
  extraArgs: null
---
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- token: abcdef.0123456789abcdef
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.224.2
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    fail-swap-on: "false"
    node-ip: 192.168.224.2
    provider-id: kind://docker/kind/kind-control-plane
---
apiVersion: kubeadm.k8s.io/v1beta2
controlPlane:
  localAPIEndpoint:
    advertiseAddress: 192.168.224.2
    bindPort: 6443
discovery:
  bootstrapToken:
    apiServerEndpoint: kind-control-plane:6443
    token: abcdef.0123456789abcdef
    unsafeSkipCAVerification: true
kind: JoinConfiguration
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    fail-swap-on: "false"
    node-ip: 192.168.224.2
    provider-id: kind://docker/kind/kind-control-plane
---
apiVersion: kubelet.config.k8s.io/v1beta1
evictionHard:
  imagefs.available: 0%
  nodefs.available: 0%
  nodefs.inodesFree: 0%
imageGCHighThresholdPercent: 100
kind: KubeletConfiguration
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
iptables:
  minSyncPeriod: 1s
kind: KubeProxyConfiguration
mode: iptables

Initialize:

kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6

Try with curl:

curl --cacert /etc/kubernetes/pki/ca.crt --key /etc/kubernetes/pki/apiserver-kubelet-client.key --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt -k https://localhost:10250/pods -s | wc -c
22452

Works :) I bet that the hardest part is to figure out the networking properties.

@ycombinator
Contributor

ycombinator commented Feb 3, 2021

I'm talking about the Docker Compose service deployer that will be used if we are system testing the apache package when the stack has been spun up with --engine kubernetes. I have no doubt that system testing the kubernetes package will work when the stack has been spun up with --engine kubernetes.

[EDIT] I think the solution to this may lie in making the service deployers aware of the engine the stack is running with, so as to make them "smarter" about how/where they deploy the service, e.g. as @ChrsMark alluded to in his comment about the Apache service running as a pod.

@mtojek
Contributor Author

mtojek commented Feb 3, 2021

Honestly, I would bind a service deployer to the engine it supports. I wouldn't like to implement an adapter which can transform Compose files into pod definitions.

EDIT:

Support matrix?

@ycombinator
Contributor

ycombinator commented Feb 3, 2021

> I wouldn't like to implement an adapter which can transform Compose files into pod definitions.

You mean like this: https://kubernetes.io/docs/tasks/configure-pod-container/translate-compose-kubernetes/? 😉
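For reference, the tool behind that link (kompose) does roughly the following; shown only to illustrate what such an adapter looks like, not as a recommendation to adopt it:

kompose convert -f docker-compose.yml   # generates Kubernetes manifests from a Compose file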

> Honestly, I would bind a service deployer to the engine it supports.

++ I'm good with restricting certain service deployers to certain stack deployment engines. At the very least, I think this is a fine starting point, since there isn't a use case for system testing the same integration in multiple stack deployment engines AFAIK. We just need to make sure the unsupported cases are handled well with good error messaging, etc.

@mtojek
Contributor Author

mtojek commented Feb 4, 2021

I spent some time wiring Docker Compose definitions for Kubernetes:
https://github.com/mtojek/integrations/tree/try-without-kind/packages/kubernetes/_dev/deploy/docker

and managed to run the control node as a Docker container. Similar actions are performed by kind. I wonder if it can replace the existing Prometheus mocks.

I know we would need to introduce some adjustments in the elastic-package:

  • add --renew-anon-volumes flag
  • check if seccomp and privileged mode are available in CI
  • share volume with credentials between agent and control node

but maybe we don't need to use kind or minikube at all (these are just wrappers around either Docker Compose or a VM). I didn't go deeper with worker nodes, but I would like to hear your (@ChrsMark @ycombinator) feedback (to be convinced that our decision path is correct). This looks promising if we decide to create an executor similar to the Terraform one: the Kubernetes service deployer would be responsible for deploying the Kubernetes stack (in ca. 30s) and then applying custom user definitions.

EDIT:

I see it's pending on kube-proxy to fully boot up (that's why no coredns pods are present); I believe it's a networking issue.

EDIT2:

I figured this out. Replace the Pod placeholder in /kind/manifests/default-cni.yaml and then:

kubectl create --kubeconfig=/etc/kubernetes/admin.conf -f /kind/manifests/default-cni.yaml

It looks like we don't need kind at all, but can follow the same actions as here: https://github.com/kubernetes-sigs/kind/tree/master/pkg/cluster/internal/create/actions

EDIT3:

I deployed nginx application:

kubectl apply -f https://k8s.io/examples/application/deployment.yaml

kubectl get pods -n default -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE                 NOMINATED NODE   READINESS GATES
nginx-deployment-66b6c48dd5-ldg44   1/1     Running   0          26s   10.244.0.5   kind-control-plane   <none>           <none>
nginx-deployment-66b6c48dd5-tcvmm   1/1     Running   0          26s   10.244.0.4   kind-control-plane   <none>           <none>

ps aux | grep nginx
root        6278  0.0  0.0  32628  5188 ?        Ss   17:31   0:00 nginx: master process nginx -g daemon off;
_rpc        6292  0.0  0.0  33076  2548 ?        S    17:31   0:00 nginx: worker process
root        6311  0.0  0.0  32628  5204 ?        Ss   17:31   0:00 nginx: master process nginx -g daemon off;
_rpc        6324  0.0  0.0  33076  2476 ?        S    17:31   0:00 nginx: worker process
root        6926  0.0  0.0   3276   884 pts/0    S+   17:34   0:00 grep --color=auto nginx

@mtojek
Contributor Author

mtojek commented Feb 5, 2021

I think I've finished researching/prototyping this area and would stick to kind.

The kind tool uses Docker to boot up a node and lets us manipulate networks to connect to the elastic-package-stack network, which means that we don't have to spawn Elasticsearch or Kibana in the k8s cluster:

root@kind-control-plane:/# kubectl get pods
NAME         READY   STATUS    RESTARTS   AGE
shell-demo   1/1     Running   0          26m
root@kind-control-plane:/# kubectl exec --stdin --tty shell-demo -- /bin/bash
# curl elastic-package-stack_elasticsearch_1.elastic-package-stack_default:9200
{"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","ApiKey"]}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","ApiKey"]}},"status":401}root@kind-control-plane:/#
