Merge pull request #239 from amazonlinux/dogswatch
dogswatch: initial kubernetes operator
patraw committed Nov 15, 2019
2 parents 7342e4e + c8c4a0f commit 5d86c74
Showing 51 changed files with 4,230 additions and 0 deletions.
5 changes: 5 additions & 0 deletions extras/dogswatch/.dockerignore
@@ -0,0 +1,5 @@
.direnv/
*.nix
.envrc
*.el
*.tar*
11 changes: 11 additions & 0 deletions extras/dogswatch/Dockerfile
@@ -0,0 +1,11 @@
# syntax=docker/dockerfile:experimental
FROM golang:1.13 as builder
ENV GOPROXY=direct
COPY ./ /go/src/github.com/amazonlinux/thar/dogswatch/
RUN cd /go/src/github.com/amazonlinux/thar/dogswatch && \
CGO_ENABLED=0 GOOS=linux go build -o dogswatch . && mv dogswatch /dogswatch

FROM scratch
COPY --from=builder /dogswatch /etc/ssl /
ENTRYPOINT ["/dogswatch"]
CMD ["-help"]
47 changes: 47 additions & 0 deletions extras/dogswatch/Makefile
@@ -0,0 +1,47 @@
GOPKG = github.com/amazonlinux/thar/dogswatch
GOPKGS = $(GOPKG) $(GOPKG)/pkg/... $(GOPKG)/cmd/...
GOBIN = ./bin/
DOCKER_IMAGE := dogswatch
DOCKER_IMAGE_REF := $(DOCKER_IMAGE):$(shell git describe --always --dirty)

build: $(GOBIN)
	cd $(GOBIN) && \
	go build -v -x $(GOPKG) && \
	go build -v -x $(GOPKG)/cmd/...

$(GOBIN):
	mkdir -p $(GOBIN)

test:
	go test -ldflags '-X $(GOPKG)/pkg/logging.DebugEnable=true' $(GOPKGS)

container: vendor
	docker build --network=host -t $(DOCKER_IMAGE_REF) .

load: container
	kind load docker-image $(DOCKER_IMAGE_REF)

vendor: go.sum go.mod
	CGO_ENABLED=0 GOOS=linux go mod vendor
	touch vendor/

deploy:
	sed 's,@containerRef@,$(DOCKER_IMAGE_REF),g' ./dev/deployment.yaml \
	| kubectl apply -f -

rollout: deploy
	kubectl -n thar rollout restart deployment/dogswatch-controller
	kubectl -n thar rollout restart daemonset/dogswatch-agent

rollout-kind: load rollout

cluster:
	kind create cluster --config ./dev/cluster.yaml

dashboard:
	kubectl apply -f ./dev/dashboard.yaml
	@echo 'Visit dashboard at: http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/'
	kubectl proxy

get-nodes-status:
	kubectl get nodes -o json | jq -C -S '.items | map({(.metadata.name): (.metadata.labels * .metadata.annotations)})'
105 changes: 105 additions & 0 deletions extras/dogswatch/README.md
@@ -0,0 +1,105 @@
# Dogswatch: Update Operator

Dogswatch is a [Kubernetes operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) that coordinates update activities on Thar hosts in a Kubernetes cluster.

## How to Run on Kubernetes

To run the Dogswatch operator in your Kubernetes cluster, the following resources and configuration are required (examples are given in the [./dev/deployment.yaml](./dev/deployment.yaml) template):

- **`dogswatch` Container Image**

  Holds the Dogswatch binary and its supporting environment.

- **Controller Deployment**

  Schedules a stop/restart-tolerant Controller process on available Nodes.

- **Agent DaemonSet**

  Schedules the Agent process on Thar hosts.

- **Thar Namespace**

  Groups Thar-related resources and roles.

- **Service Account for the Agent**

  Authenticates the Agent process to the Kubernetes APIs.

- **Cluster-privileged credentials with read-write access to Nodes for the Agent**

  Applied to the Agent's Service Account, allowing it to update annotations on the Node resource that the Agent runs under.

- **Service Account for the Controller**

  Authenticates the Controller process to the Kubernetes APIs.

- **Cluster-privileged credentials with access to Pods and Nodes for the Controller**

  Applied to the Controller's Service Account for manipulating annotations on Node resources, as well as cordoning and uncordoning Nodes for updates.
  The Controller must also be able to un-schedule (`delete`) Pods running on Nodes that will be updated; a client-go sketch of these operations follows this list.
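
The cordon and drain privileges called out above map onto a small number of Kubernetes API calls. Below is a minimal client-go (v0.18+ call signatures) sketch of those calls; the `cordonAndDrain` helper and its wiring are illustrative, not code from this commit.

```go
package sketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// cordonAndDrain demonstrates the two privileged operations the Controller's
// role must allow: updating the Node (cordon) and deleting Pods (drain).
func cordonAndDrain(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	// Cordon: once Unschedulable is set, no new Pods land on the Node.
	node.Spec.Unschedulable = true
	if _, err := client.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{}); err != nil {
		return err
	}
	// Drain: delete the Pods running on the Node so the update can proceed.
	pods, err := client.CoreV1().Pods("").List(ctx, metav1.ListOptions{
		FieldSelector: "spec.nodeName=" + nodeName,
	})
	if err != nil {
		return err
	}
	for _, pod := range pods.Items {
		if err := client.CoreV1().Pods(pod.Namespace).Delete(ctx, pod.Name, metav1.DeleteOptions{}); err != nil {
			return err
		}
	}
	return nil
}
```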

In the [./dev/deployment.yaml example](./dev/deployment.yaml), the resources specify the conditions under which the Kubernetes scheduler will place them in the cluster.
These conditions include the Node being labeled as having the required level of support for the operator to function on it: the `thar.amazonaws.com/platform-version` label.
With this label present and the workloads scheduled, the Agent and Controller processes will coordinate an update as soon as the Agent annotates its Node (by default, only one update happens at a time).
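
Eligible Nodes can be found by using that label as a selector; `kubectl get nodes -l thar.amazonaws.com/platform-version` does this from the command line, and a minimal client-go sketch (assuming only the label key shown above; everything else here is illustrative) looks like:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig; inside a Pod, rest.InClusterConfig()
	// would be used instead.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// List only Nodes that advertise operator support via the label.
	nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{
		LabelSelector: "thar.amazonaws.com/platform-version",
	})
	if err != nil {
		panic(err)
	}
	for _, node := range nodes.Items {
		fmt.Println(node.Name, node.Labels["thar.amazonaws.com/platform-version"])
	}
}
```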

To use the example [./dev/deployment.yaml](./dev/deployment.yaml) as a base, you must modify the resources to use an appropriate container image that is available to your kubelets (a common image is forthcoming, see #505).
Then, with an appropriately configured deployment YAML, you may call `kubectl apply -f ./my-deployment.yaml` to create the above resources and schedule the Dogswatch Pods in your cluster.

## What Makes Up Dogswatch

Dogswatch is made up of two distinct processes, run from a single binary; the Agent runs on each Thar host.

- `dogswatch -controller`

  The coordinating process responsible for handling updates of Thar nodes
  cooperatively with the cluster's workloads.

- `dogswatch -agent`

  The on-host process responsible for publishing update metadata and executing
  update activities.
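
A hypothetical sketch of how a single binary can dispatch between the two modes with Go's `flag` package follows; the actual wiring in this commit may differ, and `runController`/`runAgent` are placeholders.

```go
package main

import (
	"flag"
	"log"
)

func main() {
	// One binary, two roles: a flag selects which process to run.
	controller := flag.Bool("controller", false, "run the cluster-wide Controller process")
	agent := flag.Bool("agent", false, "run the on-host Agent process")
	flag.Parse()

	switch {
	case *controller:
		log.Println("starting controller")
		// runController() // coordinate updates across the cluster's Nodes
	case *agent:
		log.Println("starting agent")
		// runAgent() // publish update metadata and execute update activities
	default:
		flag.Usage() // consistent with the container image's default CMD of "-help"
	}
}
```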

## How It Coordinates

The Dogswatch processes communicate by applying updates to the Annotations on the Kubernetes Node resources.
The Annotations communicate the Agent activity (called an `intent`) as determined by the Controller process, the current Agent activity in response to that intent, and the Host's update status as known by the Agent process.
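
As an illustration of the mechanism, an intent could be posted to a Node with a merge patch as below; the annotation key `thar.amazonaws.com/intent` is hypothetical, since the operator's actual annotation keys are not shown in this excerpt.

```go
package sketch

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// setIntent posts a hypothetical intent annotation on a Node; the key
// "thar.amazonaws.com/intent" is illustrative only.
func setIntent(ctx context.Context, client kubernetes.Interface, nodeName, intent string) error {
	patch := []byte(fmt.Sprintf(
		`{"metadata":{"annotations":{"thar.amazonaws.com/intent":%q}}}`, intent))
	_, err := client.CoreV1().Nodes().Patch(
		ctx, nodeName, types.MergePatchType, patch, metav1.PatchOptions{})
	return err
}
```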

The Agent and Controller processes listen to an event stream from the Kubernetes cluster in order to handle a communicated `intent` quickly and reliably, along with any updated metadata pertinent to updates and the operator itself.
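
In client-go terms, such an event stream is typically consumed through a shared informer on Node objects; this is a common pattern rather than necessarily what this commit does, and the handler body below is illustrative.

```go
package sketch

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// watchNodes fires a handler on every Node change, which is how annotation
// updates (intents and statuses) can be observed promptly and reliably.
func watchNodes(client kubernetes.Interface, stop <-chan struct{}) {
	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	factory.Core().V1().Nodes().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			node := newObj.(*corev1.Node)
			_ = node.Annotations // inspect intent/status annotations here
		},
	})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
}
```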

### Current Limitations

- Pod replication and healthy counts are not taken into consideration (#502)
- Nodes update without a pause between each (#503)
- A single-Node cluster degrades into an unschedulable state on update (#501)
- Node labels are not automatically applied to allow scheduling (#504)

## How to Contribute and Develop Changes for Dogswatch

Working on Dogswatch requires a fully functioning Kubernetes cluster.
For the sake of development workflow, you may easily run such a cluster within a container or VM with [`kind`](https://github.com/kubernetes-sigs/kind) or [`minikube`](https://github.com/kubernetes/minikube).
The `dev/` directory contains several resources that may be used for development and debugging purposes:

- `dashboard.yaml` - A **development environment** set of Kubernetes resources (these use insecure settings and *are not suitable for use in production*!)
- `deployment.yaml` - A _template_ of Kubernetes resources for Dogswatch that schedule a Controller and set up the Agent DaemonSet
- `kind-cluster.yml` - A `kind` Cluster definition that may be used to stand up a local development cluster

Much of the development workflow can be accommodated by the `Makefile` provided alongside the code.
Each of these targets utilizes your existing environment and tools; for example, your `kubectl`, as configured, will be used.
If you have locally configured access to production, please ensure you've taken steps to reconfigure or otherwise cause `kubectl` to affect only your development cluster.

**General use targets**

- `container` - build a container image used by the Kubernetes resources
- `dashboard` - create or update the Kubernetes Dashboard (*not suitable for use in production*)
- `deploy` - create or update Dogswatch Kubernetes resources
- `rollout` - reload and restart Dogswatch processes in the cluster
- `test` - run `go test` against `dogswatch`

**`kind` development targets**

- `load` - build the container image and load it into a running `kind` cluster
- `cluster` - stand up a local development cluster using `kind`
- `rollout-kind` - run `load` and then `rollout` to restart Dogswatch on the freshly loaded image
9 changes: 9 additions & 0 deletions extras/dogswatch/cmd/dogswatch-platform/main.go
@@ -0,0 +1,9 @@
// Command dogswatch-platform prints the platform version marker compiled
// into the build.
package main

import (
	"github.com/amazonlinux/thar/dogswatch/pkg/marker"
)

func main() {
	println(marker.PlatformVersionBuild)
}
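
The `marker` package itself is not part of this excerpt. A plausible sketch, assuming `PlatformVersionBuild` is a package-level string variable that can be overridden at link time with `-ldflags -X` (the same mechanism the Makefile's `test` target uses for `logging.DebugEnable`):

```go
// Package marker: a hypothetical sketch; the real file is not shown in
// this excerpt.
package marker

// PlatformVersionBuild is assumed to be a var (not a const) so a build can
// override it, e.g.:
//
//	go build -ldflags '-X github.com/amazonlinux/thar/dogswatch/pkg/marker.PlatformVersionBuild=1.0.0'
var PlatformVersionBuild = "1.0.0"
```
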
163 changes: 163 additions & 0 deletions extras/dogswatch/dev/dashboard.yaml
@@ -0,0 +1,163 @@
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# ------------------- Dashboard Secret ------------------- #

apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-certs
namespace: kube-system
type: Opaque

---
# ------------------- Dashboard Service Account ------------------- #

apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system

---
# ------------------- Dashboard Role & Role Binding ------------------- #

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: kubernetes-dashboard-minimal
namespace: kube-system
rules:
# Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
resources: ["secrets"]
verbs: ["create"]
# Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["create"]
# Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
resources: ["secrets"]
resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
verbs: ["get", "update", "delete"]
# Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kubernetes-dashboard-settings"]
verbs: ["get", "update"]
# Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
resources: ["services"]
resourceNames: ["heapster"]
verbs: ["proxy"]
- apiGroups: [""]
resources: ["services/proxy"]
resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
verbs: ["get"]

---
# ------------------- Dashboard Deployment ------------------- #

kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
spec:
containers:
- name: kubernetes-dashboard
image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
- --enable-skip-login
volumeMounts:
- name: kubernetes-dashboard-certs
mountPath: /certs
# Create on-disk volume to store exec logs
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
scheme: HTTPS
path: /
port: 8443
initialDelaySeconds: 30
timeoutSeconds: 30
volumes:
- name: kubernetes-dashboard-certs
secret:
secretName: kubernetes-dashboard-certs
- name: tmp-volume
emptyDir: {}
serviceAccountName: kubernetes-dashboard
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule

---
# ------------------- Dashboard Service ------------------- #

kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
ports:
- port: 443
targetPort: 8443
selector:
k8s-app: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kubernetes-dashboard
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kube-system
