GKE local ssd provisioner for COS #612

Merged (6 commits) on Jul 5, 2019
Changes from all commits
12 changes: 8 additions & 4 deletions docs/operation-guide.md
@@ -73,14 +73,18 @@ On GKE, local SSD volumes by default are limited to 375 GiB size and perform wor

For proper performance, you must:

-* install the Linux guest environment, which can only be done on the Ubuntu image, not the COS image
+* install the Linux guest environment on the Ubuntu image or use a recent COS image
* make sure SSD is mounted with the `nobarrier` option.

We have a [daemonset which does the above performance fixes](../manifests/gke/local-ssd-optimize.yaml).
-We also have a [daemonset that fixes performance and combines all SSD disks together with lvm](../manifests/gke/local-ssd-provision.yaml).
+We also have a [daemonset](../manifests/gke/local-ssd-provision.yaml) that
+* fixes any performance issues
+* remounts local SSD disks with a UUID for safety
+* on Ubuntu, combines all local SSD disks into one large disk with LVM tools.
Contributor commented:
Should we let the user decide whether to combine the disks? In case users need to deploy multiple TiKV instances on one node.

Contributor (author) replied:
In the cloud I don't think it makes sense to deploy multiple TiKV instances to one node. I am more concerned about a cluster that both runs tidb-operator and something else. But it is up to them to take these daemonsets and use them appropriately.

* Run the local-volume-provisioner

The terraform deployment will automatically install that.

-> **Note**: This setup that combines local SSD assumes you are running only one process that needs local SSD per VM.
+> **Note**: On Ubuntu, this setup combines local SSD disks and assumes you are running only one process that needs local SSD per VM.
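The remount fix described above boils down to finding SSD mounts that lack the `nobarrier` option. A minimal sketch of the same `grep`/`awk` pipeline the optimize daemonset uses, run against hypothetical `mount` output rather than a live node:

```shell
# Hypothetical `mount` output; on a real node you would pipe `mount` directly.
sample='/dev/sdb on /mnt/disks/ssd0 type ext4 (rw,relatime,discard)
/dev/sdc on /mnt/disks/ssd1 type ext4 (rw,relatime,discard,nobarrier)'

# Devices holding an ssd mount that still lack the nobarrier option.
needs_remount=$(printf '%s\n' "$sample" | grep -v nobarrier | awk '/ssd/{print $1}')
echo "$needs_remount"    # /dev/sdb
```

On the node, the daemonset then remounts each such device with `mount <dev> -o remount,nobarrier`.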


## Deploy TiDB cluster
53 changes: 50 additions & 3 deletions manifests/gke/local-ssd-optimize.yaml
@@ -1,15 +1,15 @@
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
-  name: local-ssd-startup
+  name: local-ssd-startup-ubuntu
  namespace: kube-system
  labels:
-    app: local-ssd-startup
+    app: local-ssd-startup-ubuntu
spec:
  template:
    metadata:
      labels:
-        app: local-ssd-startup
+        app: local-ssd-startup-ubuntu
    spec:
      hostPID: true
      nodeSelector:
@@ -57,3 +57,50 @@ spec:
      - name: local-ssd
        hostPath:
          path: /mnt/disks
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: local-ssd-startup-cos
  namespace: kube-system
  labels:
    app: local-ssd-startup-cos
spec:
  template:
    metadata:
      labels:
        app: local-ssd-startup-cos
    spec:
      hostPID: true
      nodeSelector:
        cloud.google.com/gke-os-distribution: cos
        cloud.google.com/gke-local-ssd: "true"
      containers:
      - name: local-ssd-startup
        image: gcr.io/google-containers/startup-script:v1
        securityContext:
          privileged: true
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 100m
            memory: 100Mi
        env:
        - name: STARTUP_SCRIPT
          value: |
            #!/usr/bin/env bash
            set -euo pipefail
            mount | grep -v nobarrier | awk '/ssd/{print $1}' | xargs -i mount {} -o remount,nobarrier
        volumeMounts:
        - mountPath: /mnt/disks
          name: local-ssd
          mountPropagation: Bidirectional
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - name: local-ssd
        hostPath:
          path: /mnt/disks
135 changes: 131 additions & 4 deletions manifests/gke/local-ssd-provision/local-ssd-provision.yaml
@@ -18,21 +18,148 @@ data:
      mountDir: /mnt/disks

---
# COS provisioner.
# This will not combine disks, and startup delay is minimal. Recommended if you have 1 SSD.
# Remounts disks with a UUID.
# Ensures the nobarrier option is set.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
-  name: local-volume-provisioner
+  name: local-volume-provisioner-cos
  namespace: kube-system
  labels:
-    app: local-volume-provisioner
+    app: local-volume-provisioner-cos
spec:
  selector:
    matchLabels:
-      app: local-volume-provisioner
+      app: local-volume-provisioner-cos
  template:
    metadata:
      labels:
-        app: local-volume-provisioner
+        app: local-volume-provisioner-cos
    spec:
      hostPID: true
      nodeSelector:
        cloud.google.com/gke-os-distribution: cos
        cloud.google.com/gke-local-ssd: "true"
      serviceAccountName: local-storage-admin
      initContainers:
      - name: local-ssd-startup
        image: alpine
        command: ['/bin/sh', '-c', 'nsenter -t 1 -m -u -i -n -p -- bash -c "${STARTUP_SCRIPT}"']
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /mnt/disks
          name: local-disks
          mountPropagation: Bidirectional
        env:
        - name: STARTUP_SCRIPT
          value: |
            #!/usr/bin/env bash
            set -euo pipefail
            set -x

            # Record the current SSD mounts so they can be re-created by UUID
            # on later runs, after the original mount points are gone.
            if ! findmnt -n -a -l | grep /mnt/disks/ssd ; then
              if test -f /var/ssd_mounts ; then
                ssd_mounts=$(cat /var/ssd_mounts)
              else
                echo "no ssds mounted yet"
                exit 1
              fi
            else
              ssd_mounts=$(findmnt -n -a -l --nofsroot | grep /mnt/disks/ssd)
              echo "$ssd_mounts" > /var/ssd_mounts
            fi

            # Re-mount all disks with a UUID
            if old_mounts=$(findmnt -n -a -l --nofsroot | grep /mnt/disks/ssd) ; then
              echo "$old_mounts" | awk '{print $1}' | while read -r ssd ; do
                umount "$ssd"
              done
            fi
            echo "$ssd_mounts" | awk '{print $1}' | while read -r ssd ; do
              if test -d "$ssd"; then
                rm -r "$ssd"
              fi
            done
            devs=$(echo "$ssd_mounts" | awk '{print $2}')
            echo "$devs" | while read -r dev ; do
              if ! findmnt -n -a -l --nofsroot | grep -q "$dev" ; then
                dev_basename=$(basename "$dev")
                mkdir -p /var/dev_wiped/
                # Wipe each device's first sector only once, so data survives
                # pod restarts.
                if ! test -f "/var/dev_wiped/$dev_basename" ; then
                  dd if=/dev/zero of="$dev" bs=512 count=1 conv=notrunc
                  touch "/var/dev_wiped/$dev_basename"
                fi
                uuid=$(blkid -s UUID -o value "$dev")
                mnt_dir="/mnt/disks/$uuid"
                mkdir -p "$mnt_dir"
                mount -U "$uuid" -t ext4 --target "$mnt_dir" --options 'rw,relatime,discard,nobarrier,data=ordered'
              fi
            done
      containers:
      - image: "quay.io/external_storage/local-volume-provisioner:v2.2.0"
        name: provisioner
        securityContext:
          privileged: true
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 100m
            memory: 100Mi
        env:
        - name: MY_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: MY_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: JOB_CONTAINER_IMAGE
          value: "quay.io/external_storage/local-volume-provisioner:v2.2.0"
        volumeMounts:
        - mountPath: /etc/provisioner/config
          name: provisioner-config
          readOnly: true
        - mountPath: /mnt/disks
          name: local-disks
          mountPropagation: "HostToContainer"
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - name: provisioner-config
        configMap:
          name: local-provisioner-config
      - name: local-disks
        hostPath:
          path: /mnt/disks
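The `/var/dev_wiped` marker files in the startup script above are what make the wipe idempotent across pod restarts. A minimal sketch of that guard, using a temporary directory and a hypothetical device name in place of `/var/dev_wiped` and a real disk:

```shell
marker_dir=$(mktemp -d)             # stands in for /var/dev_wiped
dev="/dev/sdb"                      # hypothetical device
dev_basename=$(basename "$dev")

if ! test -f "$marker_dir/$dev_basename" ; then
  # First run for this device: the real script wipes the first sector
  # with dd here, then records that it has done so.
  touch "$marker_dir/$dev_basename"
  first_run=yes
else
  first_run=no
fi
echo "$first_run"
```

On a second pass the marker file exists, so the device is left untouched and only remounted by UUID.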

---
# Ubuntu provisioner.
# This will combine disks with LVM. Recommended if you have > 1 SSD.
# Note that there is a ~2 minute startup delay to install packages.
# Remounts disks with a UUID.
# Ensures the nobarrier option is set.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: local-volume-provisioner-ubuntu
  namespace: kube-system
  labels:
    app: local-volume-provisioner-ubuntu
spec:
  selector:
    matchLabels:
      app: local-volume-provisioner-ubuntu
  template:
    metadata:
      labels:
        app: local-volume-provisioner-ubuntu
    spec:
      hostPID: true
      nodeSelector:
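The rest of the Ubuntu provisioner's manifest, including its disk-combining startup script, is collapsed in this view. The combining step it describes is the standard LVM sequence; a sketch with hypothetical device and volume-group names, shown as constructed command strings rather than executed, since the real commands need root and real disks:

```shell
devs="/dev/sdb /dev/sdc"            # hypothetical local SSD devices
vg="lvm-disks"                      # hypothetical volume group name

# The usual LVM sequence for combining disks into one logical volume:
pvcreate_cmd="pvcreate $devs"
vgcreate_cmd="vgcreate $vg $devs"
lvcreate_cmd="lvcreate --name combined --extents 100%VG $vg"
printf '%s\n' "$pvcreate_cmd" "$vgcreate_cmd" "$lvcreate_cmd"
```

The resulting logical volume is then formatted, mounted under `/mnt/disks`, and picked up by the local-volume-provisioner as a single large PV.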