split controller and apiserver start #14775

Merged (1 commit, Jun 27, 2017)

Conversation

@deads2k (Contributor) commented Jun 20, 2017

This builds on @mfojtik's refactoring pull and separates the construction of the controller part of the process from the apiserver part of the process. More will be required, but this gets the informers under control and moves us in the correct direction.

@liggitt
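
As a rough illustration of the split described above, the Go sketch below shows the general pattern of constructing the controller config and the apiserver config separately while sharing one informer setup. It is a hypothetical, simplified example; the type and function names are illustrative and do not correspond to the PR's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// InformerFactory is a hypothetical stand-in for the shared informer machinery.
type InformerFactory struct {
	Resync time.Duration
}

// ControllerConfig holds only what the controller process needs.
type ControllerConfig struct {
	Informers     *InformerFactory
	RecyclerImage string
}

// APIServerConfig holds only what the apiserver process needs.
type APIServerConfig struct {
	Informers   *InformerFactory
	BindAddress string
}

// BuildControllerConfig constructs the controller half independently of the apiserver.
func BuildControllerConfig(informers *InformerFactory) *ControllerConfig {
	return &ControllerConfig{Informers: informers, RecyclerImage: "recycler:latest"}
}

// BuildAPIServerConfig constructs the apiserver half independently of the controllers.
func BuildAPIServerConfig(informers *InformerFactory) *APIServerConfig {
	return &APIServerConfig{Informers: informers, BindAddress: "0.0.0.0:8443"}
}

func main() {
	// The informers are created once and handed to both halves, instead of
	// being owned by a single monolithic server config.
	informers := &InformerFactory{Resync: 10 * time.Minute}

	controllers := BuildControllerConfig(informers)
	apiserver := BuildAPIServerConfig(informers)

	fmt.Println(controllers.RecyclerImage, apiserver.BindAddress)
}
```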

@deads2k (Contributor, Author) commented Jun 20, 2017

oh joy, conflict

@deads2k (Contributor, Author) commented Jun 20, 2017

[test]

@deads2k (Contributor, Author) commented Jun 20, 2017

@stevekuznetsov how hard would it be to have verify failures fail the job, but allow the unit tests to run anyway?

@stevekuznetsov (Contributor):

Right now the job runs two separate shell scripts for the verify and test steps, but if we squashed them together into one stage we could probably Bash our way around it.
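
One hypothetical way to get that behavior, sketched in Go rather than the job's actual Bash (the script paths here are made up for illustration): run the verify step, remember whether it failed, still run the unit tests, and only then decide the job's exit status.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// run executes a script, streaming its output, and reports whether it passed.
func run(script string) bool {
	cmd := exec.Command("bash", script)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run() == nil
}

func main() {
	// Hypothetical script names; the real job's scripts may differ.
	verifyOK := run("hack/verify.sh")
	testOK := run("hack/test-go.sh")

	fmt.Printf("verify passed: %v, unit tests passed: %v\n", verifyOK, testOK)
	if !verifyOK || !testOK {
		// Verify failures still fail the job, but only after the tests have run.
		os.Exit(1)
	}
}
```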

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 20, 2017
@deads2k deads2k force-pushed the start-01-controllers branch 2 times, most recently from a602e60 to db3e978 Compare June 21, 2017 12:06
@deads2k (Contributor, Author) commented Jun 21, 2017

re[test]

// TODO once the cloudProvider moves, move the configs out of here to where they need to be constructed
persistentVolumeController := PersistentVolumeControllerConfig{
RecyclerImage: c.RecyclerImage,
// TODO: In 3.7 this is renamed to 'Cloud' and is part of kubernetes ControllerContext
Contributor:

guess we can remove these TODOs since you already have a TODO in the func godoc (which btw should be in godoc format ;-)

ClientEnvVars: vars,
}
ret.DeploymentConfigControllerConfig = origincontrollers.DeploymentConfigControllerConfig{
Codec: annotationCodec,
Contributor:

did we lose the todo about moving the codec to the controller context? (maybe I never added that todo ;-)

Contributor (Author):

did we lose the todo about moving the codec to the controller context? (maybe I never added that todo ;-)

Having now gone through it, we don't want it generically available. It turns out that it is being used improperly and will cause us versioning pain. We should strive to eliminate it instead.

Contributor:

fine with that

}
ret.ImageImportControllerOptions = origincontrollers.ImageImportControllerOptions{
MaxScheduledImageImportsPerMinute: options.ImagePolicyConfig.MaxScheduledImageImportsPerMinute,
ResyncPeriod: 10 * time.Minute,
Contributor:

wonder if we want to make the resyncPeriods configurable or if we can just hardcode them in the controllers

Contributor (Author):

wonder if we want to make the resyncPeriods configurable or if we can just hardcode them in the controllers

Ultimately, they should be configurable. This just moves the existing hardcoded values.

HasStatefulSetsEnabled: options.DisabledFeatures.Has("triggers.image.openshift.io/statefulsets"),
HasCronJobsEnabled: options.DisabledFeatures.Has("triggers.image.openshift.io/cronjobs"),
}
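
The reply above notes that the resync periods should eventually be configurable. As a hypothetical sketch (not code from this PR), one way to move in that direction is to carry the resync period as a config field with a default instead of hardcoding it at the construction site:

```go
package main

import (
	"fmt"
	"time"
)

// ImageImportControllerConfig is illustrative only; the field set does not
// mirror the PR's actual struct.
type ImageImportControllerConfig struct {
	MaxScheduledImageImportsPerMinute int
	// ResyncPeriod is hardcoded by the caller today; exposing it as a
	// field is one path toward making it configurable later.
	ResyncPeriod time.Duration
}

// applyDefaults fills in the hardcoded value only when nothing was configured.
func (c *ImageImportControllerConfig) applyDefaults() {
	if c.ResyncPeriod == 0 {
		c.ResyncPeriod = 10 * time.Minute
	}
}

func main() {
	cfg := ImageImportControllerConfig{MaxScheduledImageImportsPerMinute: 60}
	cfg.applyDefaults()
	fmt.Println(cfg.ResyncPeriod) // 10m0s unless overridden
}
```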
ret.ImageImportControllerOptions = origincontrollers.ImageImportControllerOptions{
Contributor:
s/ImageImportControllerOptions/ImageImportControllerConfig/ for consistency

}

ret.OriginToRBACSyncControllerConfig = origincontrollers.OriginToRBACSyncControllerConfig{
PrivilegedRBACClient: kubeInternal.Rbac(),
Contributor:

add a comment about why this needs to be privileged?

//c.Options.ProjectConfig.SecurityAllocator
SecurityAllocator *configapi.SecurityAllocator

//c.RESTOptionsGetter
Contributor:

nuke this and the above (I was doing the same when I was moving these ;-)

@mfojtik (Contributor) commented Jun 21, 2017

LGTM (with some nits)

also great stuff!

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 21, 2017
@liggitt liggitt added this to the 3.6.0 milestone Jun 21, 2017
@deads2k deads2k force-pushed the start-01-controllers branch 4 times, most recently from 05b129a to 9e38e36 Compare June 21, 2017 19:40
@deads2k (Contributor, Author) commented Jun 21, 2017

re[test]

2 similar comments
@deads2k (Contributor, Author) commented Jun 21, 2017

re[test]

@deads2k (Contributor, Author) commented Jun 22, 2017

re[test]

@deads2k (Contributor, Author) commented Jun 22, 2017

Comments were minor and addressed. I plan to merge on green.

@deads2k (Contributor, Author) commented Jun 22, 2017

@openshift/networking Can I get some help looking at the networking test here: https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended_networking_minimal/3543 ? I'm assuming my problem is Reason:KubeletNotReady Message:runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized, but I don't know where to go from there.

@deads2k (Contributor, Author) commented Jun 22, 2017

[merge]

@deads2k (Contributor, Author) commented Jun 22, 2017

severity:blocker

removed tag at request of eparis

@dcbw (Contributor) commented Jun 24, 2017

@openshift/networking Can I get some help looking at the networking test here: https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended_networking_minimal/3543 ? I'm assuming my problem is Reason:KubeletNotReady Message:runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized, but I don't know where to go from there.

It means that openshift-sdn hasn't been able to get the ClusterNetwork object from the apiserver, or that it hasn't been able to read its own HostNetwork allocation from the master. In both cases, the plugin cannot initialize; initialization consists of writing a config file to /etc/cni/net.d, which the kubelet looks for in a loop and, when it finds one, magically updates node network readiness.

You should see log messages about the SDN reading the cluster network. But I'm not sure how to get node messages out of AWS. It should be fairly easy to reproduce locally though; with your branch, can you just run:

hack/dind-cluster.sh start

and then if it fails the same way the extended test does, do:

docker exec -it openshift-node-2 bash

and then once that's dropped you into the node:

journalctl -b -u openshift-node > /tmp/node.log
exit
docker cp openshift-node-2:/tmp/node.log /tmp/node.log

and then somehow get node.log to one of us.
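
The readiness handshake described above comes down to a file poll: the SDN plugin drops a config file into /etc/cni/net.d and the kubelet keeps checking for one. The Go sketch below is a simplified illustration of that idea only, not the kubelet's actual code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cniConfigPresent reports whether any CNI config file exists in dir.
func cniConfigPresent(dir string) bool {
	matches, err := filepath.Glob(filepath.Join(dir, "*.conf*"))
	return err == nil && len(matches) > 0
}

func main() {
	const cniDir = "/etc/cni/net.d"
	for {
		if cniConfigPresent(cniDir) {
			// Once the SDN plugin has written its config, the node can be marked ready.
			fmt.Println("NetworkReady=true: found CNI config in", cniDir)
			return
		}
		// Mirrors the symptom seen in the failing test: cni config uninitialized.
		fmt.Fprintln(os.Stderr, "network plugin is not ready: cni config uninitialized")
		time.Sleep(5 * time.Second)
	}
}
```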

@danwinship (Contributor):

But I'm not sure how to get node messages out of AWS

Click on "S3 Artifacts" on the side of the test run, scroll down to near the bottom, click "scripts/networking-minimal/logs/multitenant/nettest-node-1/systemd.log.gz".

The nodes are failing with:

node.go:325] error: SDN node startup failed: failed to get subnet for this host: nettest-node-1, error: timed out waiting for the condition

which means the master isn't correctly running the SDN controller. Actually, the master logs don't seem to show the openshift master running at all... the only "openshift-master"-related thing I see is:

Jun 22 12:53:06 nettest-master systemd-journald[18]: Suppressed 3069 messages from /system.slice/openshift-master.service

The logs go another 2 minutes after that but have nothing from the openshift master. That seems really weird. If it had crashed, systemd should log something, so I guess it didn't crash, but maybe it deadlocked or something?

I'm going to try running this PR locally...

@danwinship (Contributor):

oh, never mind, apparently already fixed

@deads2k (Contributor, Author) commented Jun 26, 2017

re[merge]

@deads2k (Contributor, Author) commented Jun 26, 2017

re[test]

3 similar comments
@deads2k (Contributor, Author) commented Jun 26, 2017

re[test]

@deads2k (Contributor, Author) commented Jun 26, 2017

re[test]

@deads2k (Contributor, Author) commented Jun 26, 2017

re[test]

@smarterclayton (Contributor):

Needs rebase looks like

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 26, 2017
@openshift-bot (Contributor):

Evaluated for origin test up to f80e65c

@openshift-bot (Contributor) commented Jun 27, 2017

continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 12

@openshift-bot (Contributor):

Evaluated for origin merge up to f80e65c

@openshift-bot (Contributor):

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2690/) (Base Commit: bbb9647) (PR Branch Commit: f80e65c)

@smarterclayton smarterclayton merged commit ede15e3 into openshift:master Jun 27, 2017
@smarterclayton (Contributor):

Flake, merged at head

@deads2k deads2k deleted the start-01-controllers branch August 3, 2017 19:27
Labels: needs-rebase (Indicates a PR cannot be merged because it has merge conflicts with HEAD.)
8 participants