Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLBC] Support backside re-encryption #519

Merged
merged 7 commits into from
Apr 18, 2017

Conversation

nicksardo
Copy link
Contributor

@nicksardo nicksardo commented Mar 29, 2017

Enables proxying traffic to HTTPS k8s services.

Fixes #494 - Support HTTPS backend
Fixes #475 - RPS per instance instead of instance group

Requires:

Surfacing Service Annotation

Ideally, the service annotation parsing function is moved here: https://github.com/kubernetes/kubernetes/blob/master/pkg/api/service/annotations.go . However, the ingress repo consumes the client-go api and k8s still consumes the local api. @.caesarxuchao mentions that all types will be moved out of k8s and client-go (kubernetes/kubernetes#44065) in a few weeks. He suggests we place the annotation function locally until things change.

Overview of Changes

  • Added ability for controller to create compute.HttpsHealthChecks
  • Abstract HTTP/HTTPSHealthCheck with a struct called HealthCheck
  • Modified healthchecks package and interface to separate the instantiation of a health check (New...) from creating it in the cloud (`Add...).
  • Moved the GCETranslator presence from healthchecks to backends
  • Addition/cleanup of fakes and tests for healthchecks
  • Plumbing of tuple (nodeport int64, encrypted bool) instead of nodeport int64
  • Modified backends to create HTTPS compute.BackendServices if encrypted=true

Test Cases

  • "Perfect setup": Create ingress with default backend pointing to a service (port 443), service has port https=443, service annotation has ["https"]
  • "Update ingress port": Create ingress pointing to service's port 80. Service already has port https=443 and annotation. Update ingress to point to 443
  • "Update service annotation":Create ingress pointing to service's port 443. Service already has port https=443 but NO annotation. Set annotation to ["https"]

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 29, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@nicksardo nicksardo self-assigned this Mar 29, 2017
@coveralls
Copy link

Coverage Status

Changes Unknown when pulling d7eaa26f0cb515d20f8357a61b4076d5611776d4 on nicksardo:backside-reencrypt into ** on kubernetes:master**.

@aledbf
Copy link
Member

aledbf commented Mar 29, 2017

update godeps in ingress repo (compute & k8s)

I am doing this in #463 and also migrating to client-go.

@nicksardo
Copy link
Contributor Author

kubernetes/kubernetes#43644 was merged after 1.6.0 release. After talking with some folks here, it may be possible to cherry pick into 1.6.1.

@nicksardo nicksardo force-pushed the backside-reencrypt branch 2 times, most recently from 994fda9 to 13b2d7b Compare April 5, 2017 22:06
@coveralls
Copy link

Coverage Status

Coverage increased (+0.5%) to 44.696% when pulling 13b2d7b7b1b97a1e8b7436b0fd65c0b68a8bf2ae on nicksardo:backside-reencrypt into ed6987e on kubernetes:master.

@aledbf
Copy link
Member

aledbf commented Apr 5, 2017

@nicksardo why not adding this as a field in Ingress?
(this is already present as an annotation secureupstream in nginx)

@nicksardo
Copy link
Contributor Author

@aledbf Makes more sense as a descriptor for the Service object. Probably not a common case, but there may be users with multiple ingress objs referencing a single service. It also touches the subject of service discovery which was a proposal from long ago. Entities other than ingress controllers may want descriptive information about services. kubernetes/kubernetes#41678

@aledbf
Copy link
Member

aledbf commented Apr 5, 2017

Makes more sense as a descriptor for the Service object

👍

@coveralls
Copy link

Coverage Status

Coverage increased (+0.5%) to 44.649% when pulling 407fe6339a26847421b685ad73a9a8cb628fe3f1 on nicksardo:backside-reencrypt into ed6987e on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.5%) to 44.649% when pulling 407fe6339a26847421b685ad73a9a8cb628fe3f1 on nicksardo:backside-reencrypt into ed6987e on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.5%) to 44.649% when pulling 407fe6339a26847421b685ad73a9a8cb628fe3f1 on nicksardo:backside-reencrypt into ed6987e on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.5%) to 44.7% when pulling 3479c42dbec34e0f641f639913c09c7a8857b739 on nicksardo:backside-reencrypt into ed6987e on kubernetes:master.

@nicksardo nicksardo force-pushed the backside-reencrypt branch 2 times, most recently from b1851de to 5f53f0b Compare April 7, 2017 16:16
@coveralls
Copy link

Coverage Status

Coverage increased (+0.5%) to 44.728% when pulling 5f53f0b on nicksardo:backside-reencrypt into 7c635a8 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.8%) to 45.049% when pulling 5f53f0b on nicksardo:backside-reencrypt into 7c635a8 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.8%) to 45.049% when pulling cba5bc6 on nicksardo:backside-reencrypt into 7c635a8 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 45.075% when pulling cba5bc6 on nicksardo:backside-reencrypt into 7c635a8 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 45.075% when pulling 81a1263 on nicksardo:backside-reencrypt into 7c635a8 on kubernetes:master.

@nicksardo nicksardo changed the title [GLBC] [WIP] Support backside re-encryption [GLBC] Support backside re-encryption Apr 7, 2017
@nicksardo nicksardo assigned bowei and unassigned nicksardo Apr 7, 2017
@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 45.075% when pulling 9bf3a9c on nicksardo:backside-reencrypt into 7c635a8 on kubernetes:master.

@nicksardo nicksardo force-pushed the backside-reencrypt branch 2 times, most recently from 06f03c4 to 94da993 Compare April 12, 2017 05:31
@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.122% when pulling fe3ebc1157b3025a20ace99393ad77e7aad8854f on nicksardo:backside-reencrypt into 03cae88 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.151% when pulling b53061597c4531354e933c432833880e5910cebe on nicksardo:backside-reencrypt into 03cae88 on kubernetes:master.

Copy link
Member

@bowei bowei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs changes LGTM

glog.Infof("Creating backend for %d instance groups, port %v named port %v",
len(igs), port, namedPort)
be, err = b.create(igs, namedPort, b.namer.BeName(port))
glog.Infof("Creating backend for %d instance groups, port %v named port %v", len(igs), p.Port, namedPort)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add V(1 or 2) I think?


pProto := utils.GetHTTPScheme(p.Encrypted)
if be.Protocol != pProto {
glog.Infof("Updating backend protocol %v from %v to %v", pName, be.Protocol, pProto)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glog.V

}
pool := b.snapshotter.Snapshot()
for port := range pool {
p, err := strconv.Atoi(port)
p, err := strconv.ParseInt(port, 10, 64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the port 16-bit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe existing code is following the compute library's use of int64 for Port. https://godoc.org/google.golang.org/api/compute/v1#NamedPort

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I misunderstood - are you suggesting we assert that the port can fit in 16bits? We could but it would be unnecessary as the string is generated from within the same package.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline, probably should use strconv.ParseInt(port, 10, 16) to be consistent with port being uint16_t

healthPath := p.Handler.HTTPGet.Path
// GCE requires a leading "/" for health check urls.
if !strings.HasPrefix(healthPath, "/") {
healthPath = fmt.Sprintf("/%v", healthPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

healthPath = "/" + healthPath, seems heavyweight to use Sprintf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, was just moving, but I'll make the change.

hc.RequestPath = healthPath
hc.Host = host
hc.Description = "kubernetes L7 health check from readiness probe."
// set a low health threshold and a high failure threshold.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment seems like it should be attached to healthchecks.DefaultHealthCheckInterval instead of here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching!

Copy link
Member

@bowei bowei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore LGTM -- thought change was only docs

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.105% when pulling b2a280d0b260906205deaa9886254e90402ae1f1 on nicksardo:backside-reencrypt into 03cae88 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+1.0%) to 47.181% when pulling c42b7c0d9bfca5c6c2e9ec5c85f0487115c46373 on nicksardo:backside-reencrypt into 03cae88 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.196% when pulling 6af52e7 on nicksardo:backside-reencrypt into 4817ddf on kubernetes:master.

}
pool := b.snapshotter.Snapshot()
for port := range pool {
p, err := strconv.Atoi(port)
p, err := strconv.ParseInt(port, 10, 64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline, probably should use strconv.ParseInt(port, 10, 16) to be consistent with port being uint16_t

// borked, service level outages will get detected sooner
// by kube-proxy.
// DefaultHealthCheckInterval defines how frequently a probe runs
DefaultHealthCheckInterval = 60
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to 60 * time.Second, then the duration is more self-documenting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this would be desirable. However, the compute api takes int64 of seconds. If we made this a time.Duration, we would need to int64(DefaultHealthCheckInterval.Seconds()) which isn't ideal.

Should we settle for renaming to DefaultHealthCheckIntervalSeconds?

// DefaultUnhealthyThreshold defines the threshold of failure probes that declare a backend "unhealthy"
DefaultUnhealthyThreshold = 10
// DefaultTimeoutSeconds defines the timeout of each probe
DefaultTimeoutSeconds = 60
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

wantHC.RequestPath = h.defaultPath

if existingHC.Protocol() != hc.Protocol() {
glog.Infof("Updating health check %v because it has protocol %v but need %v", existingHC.Name, existingHC.Type, hc.Type)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.V(2)

// TODO: reconcile health checks, and compare headers interval etc.
// Currently Ingress doesn't expose all the health check params
// natively, so some users prefer to hand modify the check.
glog.Infof("Unexpected request path on health check %v, has %v want %v, NOT reconciling",
name, hc.RequestPath, wantHC.RequestPath)
glog.Infof("Unexpected request path on health check %v, has %v want %v, NOT reconciling", hc.Name, existingHC.RequestPath, hc.RequestPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.V(2)


service = *as.(*api_v1.Service)
for _, sp := range service.Spec.Ports {
svcPort = sp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move down into the if

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me explain my thought process on this styling choice. Ideally, svcPort is a pointer so nil can represent not found. However, service.Spec.Ports[N] is a struct which is why svcPort is a struct. You cannot take the address of sp because it's changing with every loop iteration. The other options are:

  1. Declare a new struct for each iteration and instantiate with value of sp, then take address of that variable if port is matched. Then you can check the pointer for existence, but you're not saving any lines of codes.
  2. Move svcPort=sp into if and check if ServicePort is a zero-value struct. Of course, if svcPort.Port != 0 is a bit of an eyesore.

IMO, the current implementation and first alternative are equally fine and a matter of preference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

return "", err
}
if probe != nil {
glog.V(1).Infof("Applying httpGet settings of readinessProbe to health check on port %+v", sp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V(2)

@@ -231,7 +286,7 @@ func (b *Backends) Add(port int64) error {
// Delete deletes the Backend for the given port.
func (b *Backends) Delete(port int64) (err error) {
name := b.namer.BeName(port)
glog.Infof("Deleting backend %v", name)
glog.Infof("Deleting backend service %v", name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.V(2)

len(igs), port, namedPort)
be, err = b.create(igs, namedPort, b.namer.BeName(port))
glog.V(1).Infof("Creating backend for %d instance groups, port %v named port %v", len(igs), p.Port, namedPort)
be, err = b.create(igs, namedPort, hcLink, p.Protocol, pName)
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for be to be nil when it gets to here? Is this going to be a problem for the defer'd function above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitely possible be could be added to the snapshot as nil. However, this is as expected. I double-checked that all code fetching items from snapshotter are looking at the key and not at the backend-service value.

}

if be.Protocol != string(p.Protocol) || existingHCLink != hcLink {
glog.Infof("Updating backend protocol %v (%v) for change in protocol (%v) or health check", pName, be.Protocol, string(p.Protocol))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.V(2)

@coveralls
Copy link

Coverage Status

Coverage increased (+1.0%) to 47.224% when pulling 04af243 on nicksardo:backside-reencrypt into 4817ddf on kubernetes:master.

Copy link
Member

@bowei bowei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coveralls
Copy link

Coverage Status

Coverage increased (+1.0%) to 47.224% when pulling 5fe199b on nicksardo:backside-reencrypt into 4817ddf on kubernetes:master.

@nicksardo nicksardo merged commit 642cb74 into kubernetes:master Apr 18, 2017
@nicksardo nicksardo deleted the backside-reencrypt branch April 18, 2017 19:44
@ibawt
Copy link
Contributor

ibawt commented Apr 21, 2017

@nicksardo I see this is merged, how long until I see this in GKE? This will be quite relevant in our decision to either use the GCE load balancer or nginx for our soft launch.

@nicksardo
Copy link
Contributor Author

#609 warrants a release of the GLBC and version bump in the manifest file for all release branches. Could land in 1.6.3 or 1.6.4 and GKE usually updates on patches about a week or two after release. I'll ping you from the version bump PR so you can track.

@nicksardo
Copy link
Contributor Author

This feature was released in v0.9.3 which is included in K8s 1.6.3. @ibawt

@ibawt
Copy link
Contributor

ibawt commented May 12, 2017

@nicksardo you are my favourite person in the world :O .

1.6.3 lines up nicely with our schedule

@nicksardo
Copy link
Contributor Author

@ibawt and others,

Have any of you had a chance to use this feature yet? Any comments? Thanks

@ibawt
Copy link
Contributor

ibawt commented Jun 14, 2017

@nicksardo yes, currently running HTTPS backends on two separate( read fairly large) deployments.
Everything worked as advertised, biggest pain was in the HC's updating in 1.6.4, so if we misplaced a readinessProbe and manually fixed, it went and updated it back to /. That's really our problem but did cause some goblins to come out.

I also tried to swap the service port, ie from 80 to 443 to indicate at least at a glance that it will be https, that part didn't work so hot. If I left it to 80 everything went fine, I even changed the targetPort. It looked something was doing an equality check, I was going deep dive into but haven't had a chance.

So tl;dr works great, thanks a bunch. The HC updating shook some goblins out that we should have done correctly anyways.

One thing I didn't find in the docs is what should I be using for the cert pair on the service backend? I ended up using the same cert as the ingress. Can we use different ones? How does the verification work? Can we use self signed if we provide the root?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants