Logging for failed GetSecrets in Workflows #20556
| getSecretsDuration := time.Since(start).Milliseconds() | ||
| if err != nil { | ||
| // Log errors when secrets fetching fails, for troubleshooting and debugging | ||
| s.lggr.Infow("Secrets fetching failed for request", "request", request, "error", err, "requestLatency", getSecretsDuration) |
Should this be Errorw?
Actually do we need this -- we will already emit an error here that should cover this: https://github.com/smartcontractkit/chainlink/pull/20556/files#diff-db37590ea3c9bdb89e68e879d48c0e51c03a6a0ba2c8a50f34b35e7f6eea0f7aL210
Hmmm there might be a small gap actually
With Errorw, user errors would litter the logs with error-level entries.
Also, a failed request doesn't always mean something is wrong with our server.
Yes, the line you mentioned captures a lot, but there are more error paths above it, so I thought we should capture all failures at the top level too.
We could use Warn then -- I'm not a fan of having this at Info, because it would be hard to filter by status and find it.
Pull request overview
This PR adds error logging to the GetSecrets method in the workflow secrets fetcher to improve troubleshooting and debugging capabilities when secret fetching operations fail. The change also optimizes the duration calculation by computing it once and reusing it for both logging and metrics recording.
- Adds error logging for failed secret fetch operations with request details and latency
- Refactors duration calculation to avoid redundant time.Since() calls
| getSecretsDuration := time.Since(start).Milliseconds() | ||
| if err != nil { | ||
| // Log errors when secrets fetching fails, for troubleshooting and debugging | ||
| s.lggr.Infow("Secrets fetching failed for request", "request", request, "error", err, "requestLatency", getSecretsDuration) |
Logging the entire request object may expose sensitive information. The GetSecretsRequest contains secret identifiers (Id, Namespace, Owner) which could be considered sensitive metadata about what secrets are being accessed. Consider logging only non-sensitive fields or a sanitized representation of the request instead.
Suggested fix:
s.lggr.Errorw("Secrets fetching failed for request", "callbackId", request.CallbackId, "numRequests", len(request.Requests), "error", err, "requestLatency", getSecretsDuration)
| s.lggr.Infow("Secrets fetching failed for request", "request", request, "error", err, "requestLatency", getSecretsDuration) | |
| s.lggr.Infow("Secrets fetching failed for request", "callbackId", request.CallbackId, "numRequests", len(request.Requests), "error", err, "requestLatency", getSecretsDuration) |
| getSecretsDuration := time.Since(start).Milliseconds() | ||
| if err != nil { | ||
| // Log errors when secrets fetching fails, for troubleshooting and debugging | ||
| s.lggr.Infow("Secrets fetching failed for request", "request", request, "error", err, "requestLatency", getSecretsDuration) |
Error logging should use lggr.Errorw instead of lggr.Infow. Throughout the codebase (e.g., lines 214 and 223 in this same file, and in workflow_registry.go), errors are consistently logged at the Error level using lggr.Errorw. Using Info level for error conditions can lead to these failures being missed during monitoring and troubleshooting.
Suggested fix:
s.lggr.Errorw("Secrets fetching failed for request", "request", request, "error", err, "requestLatency", getSecretsDuration)
| s.lggr.Infow("Secrets fetching failed for request", "request", request, "error", err, "requestLatency", getSecretsDuration) | |
| s.lggr.Errorw("Secrets fetching failed for request", "request", request, "error", err, "requestLatency", getSecretsDuration) |
… into secrets_logging
3477953
* Logging for failed GetSecrets in Workflows
* nit
* nit
(cherry picked from commit 2eb4054)



