Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support pod exec terminal logging #9385

Merged
merged 7 commits into from
May 17, 2022
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 11 additions & 7 deletions server/application/terminal.go
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,9 @@ func (s *terminalHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
return
}

fieldLog := log.WithFields(log.Fields{"userName": sessionmgr.Username(ctx), "container": container,
"podName": podName, "namespace": namespace, "cluster": a.Spec.Destination.Name})

appRBACName := apputil.AppRBACName(*a)
if err := s.enf.EnforceErr(ctx.Value("claims"), rbacpolicy.ResourceApplications, rbacpolicy.ActionGet, appRBACName); err != nil {
http.Error(w, err.Error(), http.StatusUnauthorized)
Expand Down Expand Up @@ -150,7 +153,8 @@ func (s *terminalHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {

pod, err := kubeClientset.CoreV1().Pods(namespace).Get(ctx, podName, metav1.GetOptions{})
if err != nil {
http.Error(w, "Cannot find pod: "+podName, http.StatusBadRequest)
fieldLog.Warn("Terminal Pod Not Found")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an error and should be logged like so:

fieldLog.Errorf("error retrieving pod %s: %s", podName, err)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leoluz - thank you for checking again - we are already logging the podName. when fieldLog is called it will output ALL the fields (user, namespace, pod, container) to the log line. since it is using WithFields -

Note that it doesn't log until you call Debug, Print, Info, Warn, Fatal or Panic on the Entry it returns.
I left as "Terminal Pod Not Found" to improve the searchability and to create standardized alerts based on that key - we can assume the format of that structured log will stay the same and build alerts based on that key.

I defer to all the argocd folks but the distinction here which is important is user error vs application error.
It makes sense that this throws an http error to the user.
However it is not actually an application error - the application is performing normally.
From my perspective it would be better to be Warn instead of Error for application log level.
As an operator I would want to know that this is happening (Warn) but I would not consider this an indicator that argocd is unhealthy, having an issue, or doing something broken or unexpected.

Let me know if I should change it or if the above doesn't make sense.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the fact that we don't need the pod name as it is already part of registered fields. However I believe this still needs to be logged as an error. It is ArgoCD internal code that does the call to kube-api to retrieve the pod by its name. If this request fails for some reason it is an internal error and should be logged like so. Retrieving the pod call can fail by several reasons and logging it as terminal pod not found is misleading.
I suggest to change this log to:

fieldLog.Errorf("error retrieving pod: %s", err)

By the way, log messages should be all lowercase sentences.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leoluz what if a Pod is in the resource tree but not synced? In that case an error from the k8s API would be a user error rather than an application error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leoluz - good point - I have updated it and also changed the log messages to all lowercase

Copy link
Collaborator

@leoluz leoluz May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leoluz what if a Pod is in the resource tree but not synced? In that case an error from the k8s API would be a user error rather than an application error.

@crenshaw-dev My understanding is that (contrary to HTTP return codes) from the log perspective, it doesn't matter if the error was caused by the user or by something internal. Our code invoked a method that returned an error and in this case an error should be logged. However there is another "subtle" problem in this case: client-go returns an error if for example the resource isn't found. This isn't an exceptional scenario and if we want to be precise to whether or not log an error (and decide the proper http return code) we need to do an extra check:

if err != nil {
    if apimachineryerrors.IsNotFound(err) {
        // don't log error
        // http 404
    } else {
        // log error
        // http 5xx
    }
}

http.Error(w, "Cannot find pod", http.StatusBadRequest)
return
}

Expand All @@ -159,20 +163,20 @@ func (s *terminalHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
return
}

var foundContainerName string
var findContainer bool
for _, c := range pod.Spec.Containers {
if container == c.Name {
foundContainerName = c.Name
findContainer = true
break
}
}
if foundContainerName == "" {
http.Error(w, "Cannot find container: "+container, http.StatusBadRequest)
if !findContainer {
fieldLog.Warn("Terminal Container Not Found")
http.Error(w, "Cannot find container", http.StatusBadRequest)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For auditing it would be interesting to also have this logged (as a warning) with all info: cluster, namespace, pod name and container name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL might complain about logging the un-sanitized container input. But since we're validating the container name above, I think it would be safe to log (and to override the CodeQL warning).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leoluz - yes this is the issue we were trying to avoid - Log entries created from user input High This log write receives unsanitized user input from here.
It is unsafe to log verbatim whatever was posted to URL params which is why it is not logged.
In theory it should rarely happen as most requests should come directly from the argocd application itself and only post namespace + pod + container as already found in argocd. I agree it would be useful to be able to debug though. If a user really wants to know why it can't be found wondering if there is wireshark/tcpdump some level of capturing the http request instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unsafe to log verbatim whatever was posted to URL params which is why it is not logged.

@smcavallo but you are previously sanitizing isn't it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sanitizing, but validating. Which I think should be enough.

If we're reaching this point of the code, then we've shown the user is authenticated and authorized to get the application and create on the exec resource. I'm not too worried about this user filling up the disk with this log line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leoluz and @crenshaw-dev - totally make sense - we've already validated so it's OK to log these. I have added the additional info to these logs.

return
}

log.WithFields(log.Fields{"userName": sessionmgr.Username(ctx), "container": foundContainerName,
"podName": pod.Name, "namespace": pod.Namespace, "cluster": a.Spec.Destination.Name}).Info("Serving Terminal")
fieldLog.Info("Serving Terminal")

session, err := newTerminalSession(w, r, nil)
if err != nil {
Expand Down