-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Alerting & Actions ] More debug logging #84670
Conversation
Pinging @elastic/kibana-alerting-services (Team:Alerting Services) |
@elasticmachine merge upstream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM! This will be useful for debugging purposes!
const recoveredAlertInstances = pickBy( | ||
alertInstances, | ||
(alertInstance: AlertInstance) => !alertInstance.hasScheduledActions() | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: 👏 we could also leverage this logic within generateNewAndRecoveredInstanceEvents
.
const throttled = alertInstance.isThrottled(throttle); | ||
const muted = mutedInstanceIdsSet.has(alertInstanceName); | ||
const shouldExecuteAction = !throttled && !muted; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 awesome, much more descriptive!
@@ -332,12 +354,15 @@ export class TaskRunner { | |||
schedule: taskSchedule, | |||
} = this.taskInstance; | |||
|
|||
const runDate = new Date().toISOString(); | |||
this.logger.debug(`executing alert ${this.alertType.id}:${alertId} at ${runDate}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the runDate
added to match the Kibana logs with the event log data timestamp when debugging?
@@ -120,6 +120,8 @@ export class ActionExecutor { | |||
} | |||
|
|||
const actionLabel = `${actionTypeId}:${actionId}: ${name}`; | |||
logger.debug(`executing action ${actionLabel} at ${new Date().toISOString()}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the logger prefixing all the logs with a timestamp, is there a reason for the extra at ${new Date().toISOString()}
here?
I was also wondering if we should log (as debug) the raw errors the connectors receive when they encounter an error. I know some of them will do some processing on the error to ensure we don't return sensitive information (ex: credentials) but maybe using debug is ok? I'm not sure.. cc @kobelb @pmuellr I'm thinking the scenario of the webhook connector. Today if you put an invalid URL, it will give a generic message and no indicator that it can't communicate with the external service (see below). It then becomes hard to debug this for a customer. Other scenarios this could happen is if the customer forgets to set proxy settings or didn't configure their network properly w/ external communications. |
As long as we're reasonably confident that the error messages won't include credentials, this seems reasonable to me. |
I haven't seen cases where credentials are leaked by any services, directly. But, in some cases like Slack, the secret is also the primary URL (contains a token in the URL path), so we'd just need to be sensitive in any Slack action implementation code not to print the URL out in a message - since it might not be clear from the context that it's not just a URL, but also a secret. Webhook is probably the worst case, since there might be passwords/tokens/etc in non-secret config or params - eg, if you used webhook to post slack messages, the url for the webhook will contain a slack api token. Our error messages there are also not great, because, how do we really know what to pull from the webhook response as an "error message"? Or even know it's an error in the first place if the webhook returns errors as 200 responses (it happens). It might be a nice enhancement to have a mode to run actions, where we provided more of the response info back to the user. Eg, if you're using the web ui and the "test connector" functionality. |
In this situation, are the end-users that are authorized to view the webhook connector able to view the password/tokens/etc in non-secret params? If so, we're already "leaking" this information and aren't arguably making it any worse... |
True, we aren't making that specific case worse. We could be making it worse by writing that URL into the Kibana log though. Of all the connectors, webhook is the hardest to "lock down" in terms of providing useful feedback, since it is (by definition) open ended. |
Yikes! Bad merge. Sorry for the notifications. |
💔 Build Failed
Failed CI StepsTest FailuresX-Pack Jest Tests.x-pack/plugins/actions/server/lib.successfully executesStandard Out
Stack Trace
Chrome X-Pack UI Functional Tests.x-pack/test/functional/apps/uptime/locations·ts.Uptime app with generated data Observer location displays less monitor availabilityStandard Out
Stack Trace
Metrics [docs]Module Count
Async chunks
Distributable file count
Page load bundle
Saved Objects .kibana field count
Unknown metric groups@kbn/ui-shared-deps asset size
async chunk count
History
To update your PR or re-run it, just comment with: |
Resolves #49401
Summary
Added debug logging for:
muteAll
mute
orthrottle
Verified that error logging for alert and action execution errors log any detailed messages that bubble up