Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace kube-core rendering with openshift operators render #420

Merged
merged 7 commits into from
Oct 18, 2018

Conversation

mfojtik
Copy link
Contributor

@mfojtik mfojtik commented Oct 5, 2018

This change moves us away from kube-core renderer and use the kube-apiserver, kube-controller-manager and kube-scheduler operator renderers to gather all necessary bootstrap manifests.

@abhinavdahiya @smarterclayton this is what we want and we should make working.

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 5, 2018
@mfojtik
Copy link
Contributor Author

mfojtik commented Oct 5, 2018

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 5, 2018
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 8, 2018
defaultReleaseImage = "registry.svc.ci.openshift.org/openshift/origin-release:v4.0"
rootDir = "/opt/tectonic"

// TODO: This should be decided by installer, not hard-coded
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton how are we supposed to handle this? :-) Also what is the preferred registry for our images? The RHCOS seems to have RH registry hardcoded as default, so using the :latest tag does not work and we really don't want to pin this into stable tag, otherwise we won't be able to test the changes in our operator images....

I guess we want all images in quay.io at some point but we want a knob to switch that to registry.svc.ci.openshift.org for the CI?

@mfojtik mfojtik force-pushed the remove-kco-render branch 6 times, most recently from 04a64c9 to f2e6925 Compare October 8, 2018 16:03
@abhinavdahiya
Copy link
Contributor

so the kube-core-operator installs these things in addition to api

1. kube-proxy
2. kube-dns
3. pod-checkpointer

What is the plan for these?

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 9, 2018
@derekwaynecarr
Copy link
Member

/cc @sjenning we need to discuss who shall install pod-checkpointer

@openshift-ci-robot
Copy link
Contributor

@derekwaynecarr: GitHub didn't allow me to request PR reviews from the following users: need, install, who, shall, we, to, discuss.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @sjenning we need to discuss who shall install pod-checkpointer

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 10, 2018
@mfojtik
Copy link
Contributor Author

mfojtik commented Oct 10, 2018

  1. kube-proxy
  2. kube-dns

/cc @ironcladlou

@@ -29,7 +29,9 @@ func (a *KubeletCertKey) Generate(dependencies asset.Parents) error {
dependencies.Get(kubeCA)

cfg := &CertCfg{
Subject: pkix.Name{CommonName: "system:serviceaccount:kube-system:default", Organization: []string{"system:serviceaccounts:kube-system"}},
// system:masters is a hack to get the kubelet up without kube-core
// TODO(node): make kubelet bootstrapping secure with minimal permissions eventually switching to system:node:* CommonName
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @sjenning

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 10, 2018
@deads2k
Copy link
Contributor

deads2k commented Oct 17, 2018

/test all

@mfojtik
Copy link
Contributor Author

mfojtik commented Oct 17, 2018

/retest

@mfojtik
Copy link
Contributor Author

mfojtik commented Oct 17, 2018

@abhinavdahiya
Copy link
Contributor

the e2e is failing here

Error: Error applying plan:

6 error(s) occurred:

* module.vpc.aws_eip.nat_eip[5]: 1 error(s) occurred:

* aws_eip.nat_eip.5: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: c512d767-2493-4812-9561-5d1197723cf7
* module.vpc.aws_eip.nat_eip[0]: 1 error(s) occurred:

* aws_eip.nat_eip.0: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: cac50c99-211c-49fc-9204-1c9e99f03ae7
* module.vpc.aws_eip.nat_eip[3]: 1 error(s) occurred:

* aws_eip.nat_eip.3: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: 77877ad8-d126-4850-894c-01e31861aa08
* module.vpc.aws_eip.nat_eip[4]: 1 error(s) occurred:

* aws_eip.nat_eip.4: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: 4ab4d76f-ae87-41d5-b51e-4905882fe710
* module.vpc.aws_eip.nat_eip[1]: 1 error(s) occurred:

* aws_eip.nat_eip.1: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: 952b9a6f-5742-4d1d-87ea-44f5b80d796e
* module.vpc.aws_eip.nat_eip[2]: 1 error(s) occurred:

* aws_eip.nat_eip.2: Error creating EIP: AddressLimitExceeded: The maximum number of addresses has been reached.
	status code: 400, request id: aaaed3f9-64d7-4d82-975c-81eeda85da7e

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

I am going to ask @wking to cleanup the CI account. :(

@abhinavdahiya
Copy link
Contributor

/test e2e-aws

@abhinavdahiya
Copy link
Contributor

/retest

@wking
Copy link
Member

wking commented Oct 17, 2018

The previous e2e-aws error was:

2018/10/17 22:42:09 Executing template e2e-aws
2018/10/17 22:42:09 Creating or restarting template instance
2018/10/17 22:42:09 Waiting for template instance to be ready
2018/10/17 22:42:12 Ran for 2m5s
error: could not run steps: could not wait for template instance to be ready: could not determine if template instance was ready: failed to create objects: object is being deleted: pods "e2e-aws" already exists

That's openshift/release#1955.

@deads2k
Copy link
Contributor

deads2k commented Oct 17, 2018

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/openshift_installer/420/pull-ci-openshift-installer-master-e2e-aws/781

/tmp/openshift/build-rpms/rpm/BUILD/origin-4.0.0/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:684
Expected error:
    <*errors.errorString | 0xc420a0f160>: {
        s: "failed to get logs from pod-secrets-69066412-d263-11e8-bde6-0a58ac1018e2 for secret-volume-test: an error on the server (\"unknown\") has prevented the request from succeeding (get pods pod-secrets-69066412-d263-11e8-bde6-0a58ac1018e2)",
    }
    failed to get logs from pod-secrets-69066412-d263-11e8-bde6-0a58ac1018e2 for secret-volume-test: an error on the server ("unknown") has prevented the request from succeeding (get pods pod-secrets-69066412-d263-11e8-bde6-0a58ac1018e2)
not to have occurred
/tmp/openshift/build-rpms/rpm/BUILD/origin-4.0.0/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/util.go:2351

@sttts @mfojtik maybe we hadn't gotten this far before?

@deads2k
Copy link
Contributor

deads2k commented Oct 17, 2018

https://github.com/openshift/origin/pull/21286/files merged. Missing the operator half maybe?

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

bootkube: override control-plane configs – fix kubectl logs
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Oct 18, 2018
@sttts
Copy link
Contributor

sttts commented Oct 18, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 18, 2018
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, mfojtik, sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

apiVersion: kubecontrolplane.config.openshift.io/v1
kind: KubeAPIServerConfig
kubeletClientInfo:
ca: "" # kubelet uses self-signed serving certs. TODO: fix kubelet pki
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjenning this is for you. We have self-signed kubelet serving certs right now. I.e. the kube-apiserver->kubelet communication is insecure because the kubelet identity is not checked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the old kube-core behaviour btw. So we are just copying that here, but should do better. I read @LiGgit commenting somewhere that up to kube 1.10 we don't support CSRs for properly signed kubelet serving certs. Has this changed upstream in 1.11 or 1.12?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KubeControllerManagerConfigOverridesTemplate = template.Must(template.New("kube-controller-manager-config-overrides.yaml").Parse(`
apiVersion: kubecontrolplane.config.openshift.io/v1
kind: KubeControllerManagerConfig
`))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@squeed this file is also for you. You can override any setting related to networking here, like cluster CIDR for example.

@openshift-merge-robot openshift-merge-robot merged commit ace571b into openshift:master Oct 18, 2018
wking added a commit to wking/openshift-installer that referenced this pull request Oct 20, 2018
Catching up with c9b0e2f (manifests: stop using kube core operator,
2018-10-08, openshift#420).

Generated with:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0
   build date  :
   git hash    : 22125cf
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false
@wking wking mentioned this pull request Oct 20, 2018
wking added a commit to wking/openshift-installer that referenced this pull request Oct 20, 2018
Catching up with c9b0e2f (manifests: stop using kube core operator,
2018-10-08, openshift#420).
wking added a commit to wking/openshift-installer that referenced this pull request Oct 20, 2018
Adds ClusterK8sIO from e2dc955 (pkg/asset: add ClusterK8sIO,
machines.Worker assets, 2018-10-15, openshift#468), Master from 586ad45
(pkg/asset: Add asset for Master machines, 2018-10-18, openshift#491).  Removes
KubeCoreOperator from c9b0e2f (manifests: stop using kube core
operator, 2018-10-08, openshift#420).

Generated with:

  $ openshift-install graph | dot -Tsvg >docs/design/resource_dep.svg

using:

  $ dot -V
  dot - graphviz version 2.30.1 (20170916.1124)
wking added a commit to wking/openshift-installer that referenced this pull request Oct 20, 2018
Adds ClusterK8sIO from e2dc955 (pkg/asset: add ClusterK8sIO,
machines.Worker assets, 2018-10-15, openshift#468) and Master from 586ad45
(pkg/asset: Add asset for Master machines, 2018-10-18, openshift#491).  Removes
KubeCoreOperator from c9b0e2f (manifests: stop using kube core
operator, 2018-10-08, openshift#420).

Generated with:

  $ openshift-install graph | dot -Tsvg >docs/design/resource_dep.svg

using:

  $ dot -V
  dot - graphviz version 2.30.1 (20170916.1124)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.