Conversation
@@ -0,0 +1,185 @@
# This ConfigMap is used to configure a self-hosted Calico installation.
Can we combine this into a single manifest for both single / multi node installs? I think the only difference should be the etcd_endpoints, correct? I'd really like to see one manifest.
I was thinking the same, but my concern is whether it matters that the file would be in the multi-node install? I'm thinking no.
Pointed the single node to the same file and removed the extra one, spun up the single node to make sure it went OK; looks good.
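For context, a minimal sketch of how a single shared manifest can be parameterized on just the etcd endpoints; the template path, the `__ETCD_ENDPOINTS__` placeholder, and the example endpoint value are hypothetical names for illustration, not necessarily what this PR does:

```sh
# Hypothetical sketch: one shared calico.yaml template whose only per-cluster
# value is the etcd endpoint list, substituted at install time.
ETCD_ENDPOINTS="http://172.17.4.51:2379"   # example value, not from the PR

sed "s|__ETCD_ENDPOINTS__|${ETCD_ENDPOINTS}|g" \
  /srv/kubernetes/manifests/calico.yaml.tmpl \
  > /srv/kubernetes/manifests/calico.yaml
```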
@@ -149,7 +159,7 @@ EOF
cat << EOF > $TEMPLATE
[Unit]
Requires=network-online.target
After=network-online.target
After=network-online.targetETCDENDPOINTS
typo?
Indeed...
# We need to overwrite this for a hosted Calico install
if [ "${USE_CALICO}" = "true" ]; then
export CALICO_OPTS="--volume cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=cni-bin,target=/opt/cni/bin"
kube-flannel uses /etc/cni; I think we should do the same thing here: https://github.com/coreos/flannel/blob/master/Documentation/kube-flannel.yml#L67
That looks to be for the CNI conf files, not the CNI binaries, correct? This is needed so our install-cni container can drop the Calico binaries into hyperkube.
ack, right.
v.gui = false
end
end

config.vm.provider :virtualbox do |vb|
vb.cpus = 1
vb.cpus = 2
Is this necessary / a safe assumption?
I had run into some stability issues with one CPU, but that may have been while I was trying to run the policy controller on the master. I'll retest and revert if possible.
@@ -161,6 +162,9 @@ Vagrant.configure("2") do |config|

controller.vm.provision :file, :source => CONTROLLER_CLOUD_CONFIG_PATH, :destination => "/tmp/vagrantfile-user-data"
controller.vm.provision :shell, :inline => "mv /tmp/vagrantfile-user-data /var/lib/coreos-vagrant/", :privileged => true

controller.vm.provision :file, :source => CALICO_MANIFEST, :destination => "/tmp/calico.yaml"
controller.vm.provision :shell, :inline => "mkdir -p /srv/kubernetes/manifests && mv /tmp/calico.yaml /srv/kubernetes/manifests/", :privileged => true
Ideally we're not relying on Vagrant for anything beyond machine provisioning (this isn't completely true even now, but hopefully we're not adding more external configuration). What do you think about in-lining the Calico manifest in the script (gated on `USE_CALICO`)?
I was initially trying to avoid putting it into the script, but if that would be preferred I don't see a problem with that.
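A rough sketch of what that in-lining could look like, assuming the manifest lands in the `/srv/kubernetes/manifests` path used elsewhere in this thread; the embedded manifest contents are elided:

```sh
# Hypothetical sketch: write the manifest from the install script itself,
# gated on USE_CALICO, instead of copying it in via Vagrant provisioning.
if [ "${USE_CALICO}" = "true" ]; then
  mkdir -p /srv/kubernetes/manifests
  cat << EOF > /srv/kubernetes/manifests/calico.yaml
# ... calico.yaml contents embedded here ...
EOF
fi
```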
curl --silent -H "Content-Type: application/json" -XPOST -d"$(cat /srv/kubernetes/manifests/calico-system.json)" "http://127.0.0.1:8080/api/v1/namespaces/" > /dev/null

# Deploy Calico
docker run --rm --net=host -v /srv/kubernetes/manifests:/host/manifests $HYPERKUBE_IMAGE_REPO:$K8S_VER /hyperkube kubectl apply -f /host/manifests/calico.yaml
This looks like it will get run on every boot - the kubectl create should fail "already exists", which is fine - but if the exit code propagates to the docker run, then this could cause the script to exit (I'm just not sure about the actual behavior -- ignore if this has been tested / is safe).
This appears to fail on reboot long before it gets to this point. Specifically it fails here:
local REQUIRED=('ADVERTISE_IP' 'POD_NETWORK' 'ETCD_ENDPOINTS' 'SERVICE_IP_RANGE' 'K8S_SERVICE_IP' 'DNS_SERVICE_IP' 'K8S_VER' 'HYPERKUBE_IMAGE_REPO' 'USE_CALICO')
As ETCD_ENDPOINTS seems to be unset on reboot.
But on further review, it should not report a failure if it did get to this point.
Opened #775 to track this
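If the script does end up running again on a later boot, one hedged option for keeping a benign failure at this step from aborting it (assuming the script runs under `set -e`, which is an assumption here) is to tolerate the exit status explicitly:

```sh
# Sketch only: `kubectl apply` is close to idempotent, but if the exit code of
# the containerized run propagates, it can be tolerated explicitly.
docker run --rm --net=host \
  -v /srv/kubernetes/manifests:/host/manifests \
  $HYPERKUBE_IMAGE_REPO:$K8S_VER \
  /hyperkube kubectl apply -f /host/manifests/calico.yaml || true
```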
Couple minor comments/questions
I believe I hit all the necessary places. The kube-aws docs aren't pulled from here anymore, correct?
@@ -12,6 +12,7 @@ $update_channel = "alpha"

CLUSTER_IP="10.3.0.1"
NODE_IP = "172.17.4.99"
NODE_VCPUS = 2
Did this turn out to be a requirement? It's probably generally fine to do (it wouldn't work on a single-core machine -- not sure how common that would be). If we are keeping this, can we make it configurable in the multi-node installation too?
echo "Waiting for Kubernetes API..." | ||
until curl --silent "http://127.0.0.1:8080/version" | ||
do | ||
sleep 5 | ||
done | ||
echo | ||
echo "K8S: Calico Policy" | ||
curl --silent -H "Content-Type: application/json" -XPOST -d"$(cat /srv/kubernetes/manifests/calico-system.json)" "http://127.0.0.1:8080/api/v1/namespaces/" > /dev/null | ||
docker run --rm --net=host -v /srv/kubernetes/manifests:/host/manifests $HYPERKUBE_IMAGE_REPO:$K8S_VER /hyperkube kubectl apply -f /host/manifests/calico.yaml |
If possible, can we switch this to rkt? We require rkt already due to running the kubelet via kubelet-wrapper, but if you want to use rkt as your runtime, we do not require the use of docker.
It should be mostly equivalent, but implementing `--rm` takes a little more coordination. When executing `rkt run`, add `--uuid-file-save=/var/run/calico-install.uuid`. Then after execution: `/usr/bin/rkt rm --uuid-file=/var/run/calico-install.uuid`
I'm getting `device or resource busy` when running `rm` against that pod. The full set of commands:
/usr/bin/rkt run --uuid-file-save=/var/run/calico-install.uuid \
--net=host \
--volume manifests,kind=host,source=/etc/kubernetes/manifests \
--mount volume=manifests,target=/host/manifests \
$HYPERKUBE_IMAGE_REPO:$K8S_VER --exec=/hyperkube -- kubectl apply -f /host/manifests/calico.yaml
/usr/bin/rkt rm --uuid-file=/var/run/calico-install.uuid
Any suggestions? Should I just leave the pod in an exited state?
The one other trick that might be worth trying is `rkt status --wait $uuid` between the `run` and the `rm`. If a `wait` doesn't help, I'd be happy to dig a bit deeper.
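Putting that suggestion together with the commands already shown above, the sequence would look roughly like this (a sketch only; as the next comment explains, the underlying rkt issue means the `rm` can still fail):

```sh
/usr/bin/rkt run --uuid-file-save=/var/run/calico-install.uuid \
  --net=host \
  --volume manifests,kind=host,source=/etc/kubernetes/manifests \
  --mount volume=manifests,target=/host/manifests \
  $HYPERKUBE_IMAGE_REPO:$K8S_VER --exec=/hyperkube -- kubectl apply -f /host/manifests/calico.yaml

# Wait for the pod to reach its final state before trying to remove it.
/usr/bin/rkt status --wait "$(cat /var/run/calico-install.uuid)"
/usr/bin/rkt rm --uuid-file=/var/run/calico-install.uuid
```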
Disregard my last comment, I remembered the actual issue here. This is caused by a shared bind mount of `/var/lib/rkt` (created by the rkt fly kubelet) interacting poorly with `rkt rm`: rkt/rkt#3181. There's unfortunately not a good fix yet 😦
Couple more minor comments. /cc @robszumski and @joshix for docs changes
Based on @euank's feedback (#768 (comment)) -- sorry to ask this, but we can probably just switch back to using Docker for now and add a TODO for switching to rkt once the issue is resolved.
No problem, thanks for the fast feedback :)
@heschlie can you squash the cleanup commits where appropriate and we can get this merged?
--volume cni-bin,kind=host,source=/opt/cni/bin \
--mount volume=cni-bin,target=/opt/cni/bin
```
- Add `ExecStartPre=/usr/bin/mkdir -p /opt/cni/bin`
I wish there were a cleaner way to describe all of this, but I'm at a loss.
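For what it's worth, a condensed sketch of how the documented pieces could fit into the kubelet unit the install script writes; the unit is heavily abbreviated and the `RKT_OPTS` variable name is an assumption based on the existing kubelet-wrapper docs, not necessarily this PR's exact wording:

```sh
# Abbreviated, hypothetical kubelet.service written from the install script;
# only the Calico-related additions discussed above are shown in full.
TEMPLATE=/etc/systemd/system/kubelet.service
cat << EOF > $TEMPLATE
[Service]
Environment="RKT_OPTS=--volume cni-bin,kind=host,source=/opt/cni/bin --mount volume=cni-bin,target=/opt/cni/bin"
ExecStartPre=/usr/bin/mkdir -p /opt/cni/bin
ExecStart=/usr/lib/coreos/kubelet-wrapper --network-plugin=cni
Restart=always
[Install]
WantedBy=multi-user.target
EOF
```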
* Connects containers to the flannel overlay network, which enables the "one IP per pod" concept.
* Enforces network policy created through the Kubernetes policy API, ensuring pods talk to authorized resources only.

When creating `/etc/systemd/system/calico-node.service`:
Finally policy agent is the last major piece of the calico.yaml. It monitors the API for changes related to network policy and configures Calico to implement that policy.
I think you can drop "finally" here.
The policy agent is the last major piece of the Calico system.
It should also read "The policy controller"
Commits squashed!
lgtm
@heschlie looks like there are some conflicts now. Can you rebase your changes?
Removing approval until conflicts are resolved.
@aaronlevy and done!
@@ -31,11 +31,19 @@ export K8S_SERVICE_IP=10.3.0.1
export DNS_SERVICE_IP=10.3.0.10

# Whether to use Calico for Kubernetes network policy.
export USE_CALICO=false
export USE_CALICO=true
Can we leave this as default off for now? It also defaults to off in multi-node installations, and I'd like to not also change default behavior in this large of a PR.
Yep, sorry about that; it should have been set to false, must have missed it. Amended the last commit.
Thanks. One last minor comment to address - then squash and we should be good.
Perfect. Can you squash that last commit?

```sh
$ curl -H "Content-Type: application/json" -XPOST -d'{"apiVersion":"v1","kind":"Namespace","metadata":{"name":"kube-system"}}' "http://127.0.0.1:8080/api/v1/namespaces"
>>>>>>> 625c7d7... Updating Calico to hosted install
Some rebase cruft left here.
In order to get the hosted install to work we had to make some changes to the deployment, the most disruptive being that we need to write to hyperkube's `/opt/cni/bin` directory, so when Calico is enabled we mount that directory to the host. Some other minor changes were:

* Added the pod-cidr to the proxy pods to get around some issues with Vagrant networking.
* We use Docker to run hyperkube with kubectl to deploy the calico.yaml manifest; this should be changed to rkt once the noted bug is fixed.
* Removed all systemd Calico configuration.
* Using one CNI conf for all nodes.
* Updated the bare metal docs to the self-hosted Calico install.
* Updated the upgrade doc, noting this new install is not compatible with the old systemd install method.
lgtm
In order to get the hosted install to work we had to make some changes to the deployment, the most disruptive being that we need to write to hyperkube's `/opt/cni/bin` directory, so when Calico is enabled we mount that directory to the host. Our calico/cni:v1.5.2 image deploys the flannel, host-local, and loopback CNI binaries alongside the Calico binaries. Some other minor changes were:

* Added the pod-cidr to the proxy pods to get around some issues with Vagrant networking.
* We use Docker to run hyperkube with kubectl to deploy the calico.yaml manifest; this should be changed to rkt once the noted bug is fixed.
* Removed all systemd Calico configuration.
* Using one CNI conf for all nodes.
* Updated the bare metal docs to the self-hosted Calico install.
* Updated the upgrade doc, noting this new install is not compatible with the old systemd install method.
We will want to update the remaining docs to use the same changes, but I wanted this reviewed before going through that process.