feat: support pod exec terminal logging #9385

smcavallo · 2022-05-12T16:22:57Z

Signed-off-by: smcavallo <smcavallo@hotmail.com>

See #8905.
The exec feature is extremely powerful but lacks auditability.
Many orgs will require auditing and history tracking of who is exec'ing into pods and containers.

This feature logs access to the Terminal/Exec feature.
When a terminal session is opened, it generates a log line.
This seemed to be the best place to add the logging, and it should carry enough info for auditing purposes.
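
To illustrate what such an audit line might carry, here is a minimal sketch. The field names (`userName`, `namespace`, `pod`, `container`), the sample values, and the message text are assumptions for illustration and may not match the PR's actual output. The hypothetical `auditLine` helper stands in for a structured logger so the example stays dependency-free.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// auditLine renders a message plus structured fields as one key=value line.
// It is a hypothetical stand-in for a structured logger; keys are sorted so
// the output is deterministic and easy to search or alert on.
func auditLine(level, msg string, fields map[string]string) string {
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	fmt.Fprintf(&b, "level=%s msg=%q", level, msg)
	for _, k := range keys {
		fmt.Fprintf(&b, " %s=%q", k, fields[k])
	}
	return b.String()
}

func main() {
	// Field names and values below are illustrative assumptions.
	fields := map[string]string{
		"userName":  "alice",
		"namespace": "default",
		"pod":       "web-0",
		"container": "app",
	}
	fmt.Println(auditLine("info", "terminal session starting", fields))
}
```

Keeping every session on a single line with stable keys is what makes it practical to answer "who exec'ed into which container" from the logs alone.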

Note on DCO:

If the DCO action in the integration test fails, one or more of your commits are not signed off. Please click on the Details link next to the DCO action for instructions on how to resolve this.

Checklist:

[x] Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
[x] The title of the PR states what changed and the related issues number (used for the release note).
[ ] I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
[ ] I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
[ ] Does this PR require documentation updates?
[ ] I've updated documentation as required by this PR.
[ ] Optional. My organization is added to USERS.md.
[x] I have signed off all my commits as required by DCO
[ ] I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
[x] My build is green (troubleshooting builds).

codecov · 2022-05-12T16:35:03Z

Codecov Report

Merging #9385 (09642e7) into master (8cd7d47) will decrease coverage by 0.01%.
The diff coverage is 23.07%.

@@            Coverage Diff             @@
##           master    #9385      +/-   ##
==========================================
- Coverage   45.78%   45.76%   -0.02%     
==========================================
  Files         220      220              
  Lines       26165    26186      +21     
==========================================
+ Hits        11979    11985       +6     
- Misses      12529    12544      +15     
  Partials     1657     1657

Impacted Files	Coverage Δ
applicationset/generators/cluster.go	`76.56% <ø> (ø)`
server/application/application.go	`31.83% <ø> (ø)`
server/application/terminal.go	`7.57% <23.07%> (+3.97%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8cd7d47...09642e7. Read the comment docs.

crenshaw-dev

I think this is a really good idea! Added a couple thoughts.

server/application/terminal.go

smcavallo · 2022-05-12T19:22:08Z

@crenshaw-dev - The security scan flagged the new log lines with:

Log entries created from user input High
This log write receives unsanitized user input from here.

Since we send the URL parameters verbatim to both the logs and the Kubernetes API, it is unsafe to write them to logs.
It is probably unsafe to send them directly to the Kubernetes API as well. They are passed along to k8sClient.CoreV1().RESTClient().Post(). I'm not sure whether that client has any security built in, but it would be safer to add some sanitizers before sending them along.

The isValidKubernetesResourceName check adds some of that, but I'm calling it out in this PR since it was flagged by the scanner.
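
As a rough illustration of why a name check helps here, the sketch below validates input against the DNS-1123 subdomain pattern that Kubernetes documents for object names. The function `isValidResourceName` is a hypothetical approximation; the real `isValidKubernetesResourceName` in this PR may be implemented differently, and some resource kinds follow stricter rules (for example a 63-character label limit).

```go
package main

import (
	"fmt"
	"regexp"
)

// dns1123Subdomain follows the pattern Kubernetes documents for DNS subdomain
// names: lowercase alphanumerics, '-' and '.', starting and ending with an
// alphanumeric character.
var dns1123Subdomain = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// maxNameLen is the documented length limit for DNS subdomain names.
const maxNameLen = 253

// isValidResourceName is a hypothetical approximation of the check discussed
// above; it is not the PR's actual implementation.
func isValidResourceName(name string) bool {
	return len(name) > 0 && len(name) <= maxNameLen && dns1123Subdomain.MatchString(name)
}

func main() {
	names := []string{"web-0", "my.pod", "Web-0", "pod\nlevel=error msg=forged", ""}
	for _, name := range names {
		fmt.Printf("%q valid=%v\n", name, isValidResourceName(name))
	}
}
```

A value containing a newline never passes this check, so a forged second log line cannot reach the logger through a validated name.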

leoluz

Great PR. Thanks for improving the logs. Added one suggestion.

leoluz · 2022-05-16T13:44:37Z

server/application/terminal.go

 			break
 		}
 	}
-	if !findContainer {
+	if foundContainerName == "" {
 		http.Error(w, "Cannot find container", http.StatusBadRequest)


For auditing it would be interesting to also have this logged (as a warning) with all info: cluster, namespace, pod name and container name.

CodeQL might complain about logging the un-sanitized container input. But since we're validating the container name above, I think it would be safe to log (and to override the CodeQL warning).

@leoluz - yes, this is the issue we were trying to avoid: "Log entries created from user input" (High), "This log write receives unsanitized user input from here."
It is unsafe to log verbatim whatever was posted in the URL params, which is why it is not logged.
In theory this should rarely happen, as most requests should come directly from the argocd application itself and only post the namespace + pod + container already found in argocd. I agree it would be useful to be able to debug, though. If a user really wants to know why the container can't be found, I wonder whether capturing the HTTP request with wireshark/tcpdump would be an option instead?

It is unsafe to log verbatim whatever was posted to URL params which is why it is not logged.

@smcavallo but you are already sanitizing before this point, aren't you?

Not sanitizing, but validating. Which I think should be enough.

If we're reaching this point of the code, then we've shown the user is authenticated and authorized to get the application and create on the exec resource. I'm not too worried about this user filling up the disk with this log line.

@leoluz and @crenshaw-dev - that totally makes sense. We've already validated the input, so it's OK to log these. I have added the additional info to these logs.
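
The diff in this thread replaces a boolean flag with `foundContainerName`. A minimal sketch of that lookup follows, assuming a pared-down `Container` type and illustrative message texts (neither is taken from the PR). Returning the name from the pod spec means the success path can log a value that came from the cluster rather than from the request, while the not-found path logs the already-validated request value.

```go
package main

import "fmt"

// Container is a pared-down, hypothetical stand-in for a pod spec container.
type Container struct {
	Name string
}

// findContainerName returns the matching container name taken from the pod
// spec, or "" when the requested container does not exist.
func findContainerName(containers []Container, requested string) string {
	for _, c := range containers {
		if c.Name == requested {
			return c.Name
		}
	}
	return ""
}

func main() {
	containers := []Container{{Name: "app"}, {Name: "sidecar"}}

	if name := findContainerName(containers, "app"); name != "" {
		fmt.Printf("level=info msg=\"terminal session starting\" container=%q\n", name)
	}
	if name := findContainerName(containers, "missing"); name == "" {
		// %q escapes control characters, so the entry stays on a single line.
		fmt.Printf("level=warning msg=\"terminal container not found\" container=%q\n", "missing")
	}
}
```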

leoluz · 2022-05-16T20:18:37Z

server/application/terminal.go

@@ -150,7 +153,8 @@ func (s *terminalHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {

 	pod, err := kubeClientset.CoreV1().Pods(namespace).Get(ctx, podName, metav1.GetOptions{})
 	if err != nil {
-		http.Error(w, "Cannot find pod: "+podName, http.StatusBadRequest)
+		fieldLog.Warn("Terminal Pod Not Found")


This is an error and should be logged like so:

fieldLog.Errorf("error retrieving pod %s: %s", podName, err)

@leoluz - thank you for checking again. We are already logging the podName: since fieldLog is created with WithFields, calling it outputs ALL the fields (user, namespace, pod, container) on the log line.

Note that it doesn't log until you call Debug, Print, Info, Warn, Fatal or Panic on the Entry it returns.
I left the message as "Terminal Pod Not Found" to improve searchability and to allow standardized alerts based on that key. We can assume the format of that structured log will stay the same and build alerts on it.

I defer to all the argocd folks, but the important distinction here is user error vs. application error.
It makes sense that this throws an http error to the user.
However it is not actually an application error - the application is performing normally.
From my perspective it would be better to be Warn instead of Error for application log level.
As an operator I would want to know that this is happening (Warn) but I would not consider this an indicator that argocd is unhealthy, having an issue, or doing something broken or unexpected.

Let me know if I should change it or if the above doesn't make sense.

I agree that we don't need the pod name, as it is already part of the registered fields. However, I believe this still needs to be logged as an error. It is ArgoCD internal code that calls kube-api to retrieve the pod by its name. If this request fails for some reason, it is an internal error and should be logged as such. The pod retrieval call can fail for several reasons, and logging it as "terminal pod not found" is misleading.
I suggest changing this log to:

fieldLog.Errorf("error retrieving pod: %s", err)

By the way, log messages should be all lowercase sentences.

@leoluz what if a Pod is in the resource tree but not synced? In that case an error from the k8s API would be a user error rather than an application error.

@leoluz - good point - I have updated it and also changed the log messages to all lowercase

@leoluz what if a Pod is in the resource tree but not synced? In that case an error from the k8s API would be a user error rather than an application error.

@crenshaw-dev My understanding is that (contrary to HTTP return codes), from the logging perspective it doesn't matter whether the error was caused by the user or by something internal. Our code invoked a method that returned an error, and in that case an error should be logged. However, there is another, more subtle problem here: client-go returns an error if, for example, the resource isn't found. This isn't an exceptional scenario, and if we want to be precise about whether to log an error (and to decide the proper HTTP return code), we need to do an extra check:

if err != nil {
	if apimachineryerrors.IsNotFound(err) {
		// don't log error
		// http 404
	} else {
		// log error
		// http 5xx
	}
}

leoluz

LGTM

* feat: support pod exec terminal logging
  Signed-off-by: smcavallo <smcavallo@hotmail.com>
* enhanced validation and logging when resource not found
  Signed-off-by: smcavallo <smcavallo@hotmail.com>
* fix lint
  Signed-off-by: smcavallo <smcavallo@hotmail.com>
* log warning when pod or container not found
  Signed-off-by: smcavallo <smcavallo@hotmail.com>
* go/log-injection fixes
  Signed-off-by: smcavallo <smcavallo@hotmail.com>
* log levels and lowercase message
  Signed-off-by: smcavallo <smcavallo@hotmail.com>

crenshaw-dev · 2022-05-31T16:17:10Z

Cherry-picked to 2.4.

crenshaw-dev requested changes May 12, 2022

crenshaw-dev added the cherry-pick/2.4 Candidate for cherry picking into the 2.4 release branch label May 12, 2022

smcavallo force-pushed the exec_logging branch from cd17d20 to 2a1dc08 Compare May 12, 2022 19:17

crenshaw-dev reviewed May 12, 2022

smcavallo force-pushed the exec_logging branch from 2b08492 to a09a051 Compare May 13, 2022 19:02

feat: support pod exec terminal logging

74d5150

Signed-off-by: smcavallo <smcavallo@hotmail.com>

smcavallo force-pushed the exec_logging branch from a09a051 to 74d5150 Compare May 13, 2022 20:46

crenshaw-dev requested a review from leoluz May 13, 2022 20:57

leoluz requested changes May 16, 2022

crenshaw-dev requested changes May 16, 2022

smcavallo added 2 commits May 16, 2022 10:56

enhanced validation and logging when resource not found

576831f

Signed-off-by: smcavallo <smcavallo@hotmail.com>

fix lint

1cbef91

Signed-off-by: smcavallo <smcavallo@hotmail.com>

crenshaw-dev reviewed May 16, 2022

crenshaw-dev reviewed May 16, 2022

crenshaw-dev reviewed May 16, 2022

smcavallo and others added 2 commits May 16, 2022 14:31

Merge branch 'argoproj:master' into exec_logging

a72e178

log warning when pod or container not found

9d9b01a

Signed-off-by: smcavallo <smcavallo@hotmail.com>

leoluz reviewed May 16, 2022

smcavallo added 2 commits May 16, 2022 16:45

go/log-injection fixes

530867e

Signed-off-by: smcavallo <smcavallo@hotmail.com>

log levels and lowercase message

09642e7

Signed-off-by: smcavallo <smcavallo@hotmail.com>

leoluz approved these changes May 17, 2022

crenshaw-dev approved these changes May 17, 2022

crenshaw-dev merged commit 23d9cf2 into argoproj:master May 17, 2022
