-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1450291 - Improve logs in image pruning #14405
Conversation
[test] |
LGTM |
pkg/image/prune/prune.go
Outdated
return deleteFromRegistry(registryClient, fmt.Sprintf("%s/v2/%s/manifests/%s", registryURL, repoName, manifest)) | ||
} | ||
|
||
func getName(o metav1.ObjectMeta) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this take runtime.Object
instead and access the attributes via o.(metav1.ObjectMetaAccessor).GetObjectMeta()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm.... lemme check this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accessor, err := meta.Accessor(obj)
and then using accessor.GetName()
and accessor.GetNamespace
is probably better. Will got with that.
return fmt.Errorf("error sending request: %v", err) | ||
if proto != "https" && strings.Contains(err.Error(), "malformed HTTP response") { | ||
return fmt.Errorf("%v.\n* Are you trying to connect to a TLS-enabled registry without TLS?", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will conflict with #14114.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update this now that the other one is in place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe I should ask in what way it will conflict? This will give you a hint that you should use https when remote end supports that. #14114 will force you to use that, so this error should be seen very rarely, in some weird cases, I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for the hint. Nevertheless, this should have been captured by the ping()
call.
I must have forgotten to modify this code same the way as the ping()
(see
origin/pkg/image/prune/prune.go
Line 199 in 90592c5
if drp.insecure || netutils.IsPrivateAddress(registry) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, the hint would be useful in the pinger as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can move it as a followup, in that case.
@@ -891,7 +889,7 @@ func (p *pruner) Prune( | |||
glog.V(1).Infof("Using registry: %s", registryURL) | |||
|
|||
if err := p.registryPinger.ping(registryURL); err != nil { | |||
return fmt.Errorf("error communicating with registry: %v", err) | |||
return fmt.Errorf("error communicating with registry %s: %v", registryURL, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the url always present in the err
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some it might, in some it won't. I'd rather having it twice sometimes, than none in those cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find these messages already too long and they are quite common. Also I will need to update documentation here. Looking at the current logging, it seems like this statement will sometimes triple the url. Could you please either remove it from the pinger or from here to make sure it's logged at least once but no more than twice?
Rebased and addressed comments. @miminar ptal once again |
|
||
if pod.Status.Phase != kapi.PodRunning && pod.Status.Phase != kapi.PodPending { | ||
age := metav1.Now().Sub(pod.CreationTimestamp.Time) | ||
if age >= algorithm.keepYoungerThan { | ||
glog.V(4).Infof("Pod %s/%s is not running or pending and age is at least minimum pruning age - skipping", pod.Namespace, pod.Name) | ||
// not pending or running, age is at least minimum pruning age, skip | ||
glog.V(4).Infof("Pod %s is not running nor pending and age exceeds keepYoungerThan (%v) - skipping", getName(pod), age) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/age exceeds/age is within/
The name of the variable is misleading 😄.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha, I was struggling with that log message for a while. Your suggestion is definitely the best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, no, what I said above is wrong. It has to say exceeds, because we're referring to pods that are not added to graph (skipped), which will allow the images to be pruned (because of lack of strong references).
Cross-linking the future documentation PR openshift/openshift-docs#4471. |
Evaluated for origin test up to 7edd2ba |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I'll do the followup for the deleter.
[merge] |
continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/1927/) (Base Commit: 90592c5) |
Flake #13559 |
Evaluated for origin merge up to 7edd2ba |
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/923/) (Base Commit: cf5c12f) (Image: devenv-rhel7_6328) |
Automatic merge from submit-queue [3.6][Backport] Prune images (not)securely Back-porting: - #14114 - #14405 - #14914 - #15899 Resolves [bz#1476779](https://bugzilla.redhat.com/show_bug.cgi?id=1476779)
This partially address problem from https://bugzilla.redhat.com/show_bug.cgi?id=1450291 and cleans up the logging from my own experience going through them several times past few weeks.
@miminar @mfojtik ptal