-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replacing hard coded cluster.local with variable #2892
Conversation
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
pkg/utils/utils.go
Outdated
"strings" | ||
) | ||
|
||
// GetClusterDomainName returns cluster's domain name or error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory for getters go suggests omitting Get, so I'd suggest ClusterDomainName (same below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment, having just ClusterDomainName sounds a bit unclear what this func does. I do not have any strong opinion about exact this name, but I would like it to be a bit self explanatory. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So Getter/Setter convention in Go is for X is: SetX() and X(), i.e. Get is redundant.
Considering that here it's not a true getter on a struct, I don't want to be prescriptive, but ClusterDomainName() is shorter, and is readable for Go engineers in general. Up to you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am for consistency, if the rules you have stated were followed in serving repo, then it would have made sense to change func name, but if you check, currently get
is used in many funcs' names in serving:
.//pkg/reconciler/names.go:func GetK8sServiceFullname(name string, namespace string) string {
.//pkg/reconciler/names.go:func GetServingK8SServiceNameForObj(name string) string {
and in many more, as such I prefer to follow this pattern and leave this func name as is.
pkg/utils/utils.go
Outdated
|
||
// GetClusterDomainName returns cluster's domain name or error | ||
func GetClusterDomainName() (string, error) { | ||
_, err := os.Stat("/etc/resolv.conf") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if _, err :=...; err != nil {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for catching it, will change
pkg/utils/utils.go
Outdated
if err != nil { | ||
return "", err | ||
} | ||
f, err := os.Open("/etc/resolv.conf") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const for the file name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do,thanks
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
pkg/reconciler/names.go
Outdated
|
||
func GetK8sServiceFullname(name string, namespace string) string { | ||
return fmt.Sprintf("%s.%s.svc.cluster.local", name, namespace) | ||
clusterDomainName, err := utils.GetClusterDomainName() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we compute this once when the binary starts up so that we won't read a file everytime we want to get the full name of a K8s Service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this operation is infrequent so the should not be any quantifiable performance impact. Also this approach allows different reaction of a failure of getting domain name. In some cases it might be useful. Appreciate other folks to chime in and share their opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's going to be constant, I don't see reasons for it to be recomputed each time.
If we need to test failure cases, then go provides various methods to spoof failures (substitute files, use different file path, mockout internal function).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, does it work for you, if I call it from main, set global var in utils package and then when domain name is needed utils.ClusterDomainName var can be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather, I'd recommend to use this: https://golang.org/pkg/sync/#Once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see new solution based on once.Do()
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
pkg/utils/utils_test.go
Outdated
} | ||
for _, tt := range tests { | ||
dn, err := getClusterDomainName(strings.NewReader(tt.resolvConf)) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if got, want := (err != nil), t.shouldFail; got != want {
t.Errorf("....")
} else if err == nil {
if got, want := domain, t.domainName; got != want {
t.Errorf("... domain: got: %s, want: %s",...)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this sequence, it does not catch directly func returns err == nil but the expectation was to fail. Do you think it is sufficient to catch this condition just on domain name mismatch? Currently err==nil validation covers 2 scenarios, 1 incorrect parsing of domain name when func returns nil and wrong domain name and 2 not detecting corruption of resolv.conf file. Please let me know if this explanation makes sense to you. Appreciate a lot your feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does not catch directly func returns err == nil but the expectation was to fail
-- wouldn't (err != nil) != t.shouldFail
condition catch that?
We want to make sure errors are returned when it's an error condition. But since your tests don't discriminate errors, just making sure error was returned in the erroneous condition will suffice. That's what my suggested change does.
Hope, this makes sense :)
pkg/utils/utils_test.go
Outdated
|
||
func TestGetDomainName(t *testing.T) { | ||
tests := []struct { | ||
descr string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/descr/name/?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/reconciler/names.go
Outdated
|
||
func GetK8sServiceFullname(name string, namespace string) string { | ||
return fmt.Sprintf("%s.%s.svc.cluster.local", name, namespace) | ||
clusterDomainName, err := utils.GetClusterDomainName() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's going to be constant, I don't see reasons for it to be recomputed each time.
If we need to test failure cases, then go provides various methods to spoof failures (substitute files, use different file path, mockout internal function).
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
/test pull-knative-serving-integration-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes!
/lgtm
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
/test pull-knative-serving-integration-tests |
Signed-off-by: Serguei Bezverkhi <sbezverk@cisco.com>
The following is the coverage report on pkg/.
|
With this patch, I can see success in both control plane and data plane, k8s event source generates messages and they are transferred to the user container via the im memory channel dispatcher :) !!! |
@@ -302,7 +303,7 @@ func main() { | |||
}() | |||
|
|||
// Open a websocket connection to the autoscaler | |||
autoscalerEndpoint := fmt.Sprintf("ws://%s.%s:%d", servingAutoscaler, system.Namespace, servingAutoscalerPort) | |||
autoscalerEndpoint := fmt.Sprintf("ws://%s.%s.svc.%s:%d", servingAutoscaler, system.Namespace, utils.GetClusterDomainName(), servingAutoscalerPort) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? (or why part of this pr?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
queue-proxy was constantly failing to establish connectivity with autoscaller, had to fix it to be able to prove that domain name change it working.
{"level":"info","ts":"2019-01-12T05:29:12.179Z","logger":"queueproxy","caller":"queue/main.go:306","msg":"Connecting to autoscaler at ws://autoscaler.knative-serving:8080","commit":"4d198db","knative.dev/key":"default/k8s-event-dumper-00001","knative.dev/pod":"k8s-event-dumper-00001-deployment-59466d48f-x9zxf"}
{"level":"error","ts":"2019-01-12T05:29:13.179Z","logger":"queueproxy","caller":"queue/main.go:114","msg":"Error while sending stat{error 25 0 connection has not yet been established}","commit":"4d198db","knative.dev/key":"default/k8s-event-dumper-00001","knative.dev/pod":"k8s-event-dumper-00001-deployment-59466d48f-x9zxf","stacktrace":"main.statReporter\n\t/usr/local/google/home/mattmoor/go/src/github.com/knative/serving/cmd/queue/main.go:114"}
/lgtm |
/approve |
/assign @mdemirhan for /approve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one comment related to closing a file in case of error
once.Do(func() { | ||
f, err := os.Open(resolverFileName) | ||
if err == nil { | ||
defer f.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to put the defer
in front of If
so we close the file even when the error happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we close the file only if open succeed, otherwise there is nothing to close, f would be nil.
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mdemirhan, sbezverk, tcnghia The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I got the following errors for queue-proxy 0.0.3:
Wondering whether this PR will fix it. |
@jingweno I saw similar error too, but this PR addresses issues when cluster name is not |
@sbezverk The cluster name is
I wonder what went wrong...running out of ideas |
@markusthoemmes There are only a few of the errors. But readiness check for port 8022 (queue-proxy) always failed and caused the pod to be terminated:
Yes, I'm on 1.11+:
|
Signed-off-by: Serguei Bezverkhi sbezverk@cisco.com
PR replaces hardcoded cluster.local with a variable which is populated from /etc/resolv.conf with actual cluster's domain name. In case of a failure it falls back to cluster.local
Fixes #knative/eventing#714