[Alerting] allow a human-readable string to be associated with an instanceId #64268

pmuellr · 2020-04-22T23:04:07Z

Currently the alerting framework uses strings as instanceIds, which an alert executor uses when scheduling action groups for execution. These instanceIds are then displayed on the alert details page to show the state of the instances the executor has processed.

Sometimes, that works out nicely, if your instanceIds are human-readable:

But if they aren't, say they're UUIDs, it's a dog's breakfast:

We should figure out a way that an executor could supply a name when referencing an instanceId. And then make that available wherever instance information is available. That name should be the string displayed in the UI. If the name isn't supplied, then the instanceId would be displayed, as it is today.

elasticmachine · 2020-04-22T23:04:09Z

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

pmuellr · 2020-04-22T23:05:43Z

My first guess at an implementation would be to change the instance factory to take an optional name as a new parameter, in addition to id:

kibana/x-pack/plugins/alerting/server/alert_instance/create_alert_instance_factory.ts

Lines 9 to 17 in 4fc1c5f

    export function createAlertInstanceFactory(alertInstances: Record<string, AlertInstance>) {
      return (id: string): AlertInstance => {
        if (!alertInstances[id]) {
          alertInstances[id] = new AlertInstance();
        }
        return alertInstances[id];
      };
    }

If we think we might need more things later, we could make the second param an optional option "bag" with a name property instead.
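A minimal sketch of the option-bag variant, assuming a simplified stand-in for `AlertInstance`. The `AlertInstanceOptions` interface and the `setName` method are hypothetical names used only for illustration, and the choice to let a later call overwrite an earlier name is an assumption, not a decision made in this issue:

```typescript
// Simplified stand-in for the real AlertInstance class (assumption: only
// the name handling matters for this sketch).
class AlertInstance {
  private name?: string;

  // Returns the human-readable name, or undefined if none was supplied.
  getName(): string | undefined {
    return this.name;
  }

  // Hypothetical setter used by the factory below.
  setName(name: string): void {
    this.name = name;
  }
}

// Hypothetical option bag; `name` is the only property proposed so far.
interface AlertInstanceOptions {
  name?: string;
}

function createAlertInstanceFactory(alertInstances: Record<string, AlertInstance>) {
  return (id: string, options: AlertInstanceOptions = {}): AlertInstance => {
    let instance = alertInstances[id];
    if (!instance) {
      instance = new AlertInstance();
      alertInstances[id] = instance;
    }
    // Assumption: a name passed on a later call replaces the earlier one;
    // omitting the name leaves any existing name untouched.
    if (options.name !== undefined) {
      instance.setName(options.name);
    }
    return instance;
  };
}
```

Existing callers that pass only an id keep working unchanged, which is the main appeal of an optional second parameter over a new required one.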

Presumably this would be stored as a new instance field in the AlertInstance:

kibana/x-pack/plugins/alerting/common/alert_instance.ts

Lines 20 to 24 in 4fc1c5f

    export const rawAlertInstance = t.partial({
      state: stateSchema,
      meta: metaSchema,
    });
    export type RawAlertInstance = t.TypeOf<typeof rawAlertInstance>;

Not sure what else that will affect - saved object shapes? Certainly some external type used in the endpoint the UI calls to populate the alert details page.

pmuellr · 2020-07-06T23:42:16Z

Starting to look into this issue, here's a survey of usages of instance ids in the product:

kibana/x-pack/plugins/alerting_builtins/server/alert_types/index_threshold/alert_type.ts

Lines 127 to 144 in 5992424

      for (const groupResult of groupResults) {
        const instanceId = groupResult.group;
        const value = groupResult.metrics[0][1];
        const met = compareFn(value, params.threshold);
        if (!met) continue;
        const baseContext: BaseActionContext = {
          date,
          group: instanceId,
          value,
        };
        const actionContext = addMessages(options, baseContext, params);
        const alertInstance = options.services.alertInstanceFactory(instanceId);
        alertInstance.scheduleActions(ActionGroupId, actionContext);
        logger.debug(`scheduled actionGroup: ${JSON.stringify(actionContext)}`);
      }
    }

The instanceId is the value used for grouping, if grouping was requested; otherwise it is the string 'all documents'.

E.g., for a query where you group based on a field used to store host names, the instance id will be a host name.

kibana/x-pack/plugins/apm/server/lib/alerts/register_error_rate_alert_type.ts

Lines 122 to 124 in 5992424

    const alertInstance = services.alertInstanceFactory(
      AlertType.ErrorRate
    );

instanceId: apm.error_rate

kibana/x-pack/plugins/apm/server/lib/alerts/register_transaction_duration_alert_type.ts

Lines 163 to 165 in 5992424

    const alertInstance = services.alertInstanceFactory(
      AlertType.TransactionDuration
    );

instanceId: apm.transaction_duration

kibana/x-pack/plugins/infra/server/lib/alerting/inventory_metric_threshold/inventory_metric_threshold_executor.ts

Line 57 in 5992424

const alertInstance = services.alertInstanceFactory(`${item}::${alertId}`);

Not clear what item will end up resolving to. The ::${alertId} suffix isn't needed.

kibana/x-pack/plugins/infra/server/lib/alerting/log_threshold/log_threshold_executor.ts

Line 91 in 5992424

 const alertInstance = alertInstanceFactory(`${alertId}-${UNGROUPED_FACTORY_KEY}`); 

kibana/x-pack/plugins/infra/server/lib/alerting/log_threshold/log_threshold_executor.ts

Line 131 in 5992424

const alertInstance = alertInstanceFactory(`${alertId}-${group.name}`);

Not clear what group.name will end up resolving to. UNGROUPED_FACTORY_KEY is *. The ${alertId}- prefix isn't needed.

kibana/x-pack/plugins/infra/server/lib/alerting/metric_threshold/metric_threshold_executor.ts

Line 39 in 5992424

const alertInstance = services.alertInstanceFactory(`${group}::${alertId}`);

Not clear what group will end up resolving to. The ::${alertId} suffix isn't needed.

kibana/x-pack/plugins/monitoring/server/alerts/cluster_state.ts

Line 100 in 5992424

const instance = services.alertInstanceFactory(ALERT_TYPE_CLUSTER_STATE);

instanceId: monitoring_alert_type_cluster_state

kibana/x-pack/plugins/monitoring/server/alerts/license_expiration.ts

Line 116 in 5992424

const instance = services.alertInstanceFactory(ALERT_TYPE_LICENSE_EXPIRATION);

instanceId: monitoring_alert_type_license_expiration

kibana/x-pack/plugins/security_solution/server/lib/detection_engine/notifications/rules_notification_alert_type.ts

Line 76 in 5992424

const alertInstance = services.alertInstanceFactory(alertId);

It appears this is the id of the alert, so a different fixed string should probably be used.

kibana/x-pack/plugins/security_solution/server/lib/detection_engine/signals/signal_rule_alert_type.ts

Line 291 in 5992424

const alertInstance = services.alertInstanceFactory(alertId);

It appears this is the id of the alert, so a different fixed string should probably be used.

kibana/x-pack/plugins/uptime/server/lib/alerts/status_check.ts

Line 286 in 5992424

const alertInstance = options.services.alertInstanceFactory(MONITOR_STATUS.id);

instanceId: xpack.uptime.alerts.actionGroups.monitorStatus

kibana/x-pack/plugins/uptime/server/lib/alerts/tls.ts

Line 143 in 5992424

const alertInstance = alertInstanceFactory(TLS.id);

instanceId: xpack.uptime.alerts.actionGroups.tls

resolves elastic#64268

Instances can now be retrieved/created via `alertInstanceFactory('instance-id', { name: 'a nice name for this instance' });`

Alert instances can use `getName()` to return the name, if set.

Adds new context variables for all alerts with names `alertInstanceName` and `alertInstanceNameOrId`. The former may be `undefined`. The latter is `alertInstance.name || alertInstanceId`.

----

However, from existing usage as documented in that issue, I don't think we need this quite **yet**. I think a lot of the usages of alertIds in those instance ids aren't required or desired.
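The two context variables described in that PR text could be derived roughly as follows. This is a sketch only: the `InstanceNameContext` type and the `buildInstanceNameContext` helper are hypothetical names, and only the `name || alertInstanceId` fallback comes from the description itself.

```typescript
// Hypothetical shape for the two new context variables.
interface InstanceNameContext {
  alertInstanceName: string | undefined;
  alertInstanceNameOrId: string;
}

// Hypothetical helper: builds the variables from an instance id and an
// optional name supplied by the executor.
function buildInstanceNameContext(alertInstanceId: string, name?: string): InstanceNameContext {
  return {
    // May be undefined when the executor supplied no name.
    alertInstanceName: name,
    // Falls back to the id; `||` also treats an empty-string name as missing.
    alertInstanceNameOrId: name || alertInstanceId,
  };
}
```

Action templates that always need something printable would use `alertInstanceNameOrId`, while `alertInstanceName` lets a template tell whether a name was actually set.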

pmuellr · 2020-07-07T19:46:35Z

It seems like many of these instance id strings should be changed to something more useful, however, there IS a "migration" issue. Say you changed the instance id string in 7.9, any alerts from 7.8 will switch from using the old instance id to the new one, once Kibana is running. This means:

- the alert will consider this a new instance
- you'll be starting with fresh instance state, since this instance id didn't exist before - the old instance state will be available for that first run in 7.9, then after that it'll be gone
- if you muted that instance in 7.8, it will no longer be muted in 7.9

Some downsides, for sure, and I guess we need to factor these "instance id migrations" into our dev docs somehow. It's probably worth changing though, and it's possible you could recover the old state from 7.8 if you want (just access an instance with the old id). Muting will be hard, as the alert executor does not yet have access to the alert client (until I convince @mikecote of this! :-) ).
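Recovering the old state by accessing an instance with the old id might look like the sketch below. It assumes a simplified stand-in for `AlertInstance` with `getState()` and `replaceState()`; the `getInstanceWithMigratedState` helper and the `AlertInstanceFactory` type alias are hypothetical. It also assumes that looking up the old id during the run is harmless, i.e. that an instance which never schedules actions is not persisted afterwards.

```typescript
type State = Record<string, unknown>;

// Simplified stand-in for the real AlertInstance class (assumption).
class AlertInstance {
  private state: State = {};

  getState(): State {
    return this.state;
  }

  replaceState(state: State): this {
    this.state = state;
    return this;
  }
}

type AlertInstanceFactory = (id: string) => AlertInstance;

// Hypothetical helper: returns the instance for the new id, seeding it with
// the state stored under the old id when the new instance has none yet.
function getInstanceWithMigratedState(
  factory: AlertInstanceFactory,
  oldId: string,
  newId: string
): AlertInstance {
  const instance = factory(newId);
  if (Object.keys(instance.getState()).length === 0) {
    const oldState = factory(oldId).getState();
    if (Object.keys(oldState).length > 0) {
      // Copy rather than share the object, so the two instances stay independent.
      instance.replaceState({ ...oldState });
    }
  }
  return instance;
}
```

This only carries over instance state. As noted above, muting cannot be migrated this way, because the executor has no access to the alert client.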

mikecote · 2020-07-09T12:14:16Z

I'm seeing this as another case that would be easier if the alert state lived within the alert object. Saved object migrations would handle this no problem, and we could allow alert types to migrate their ids with #50216. We would also stop losing state whenever changing the enabled status of an alert. I'm sure there would be further complications with an approach like this, with the event log for example.

Zacqary · 2020-07-09T22:22:28Z

#71335 and #71340 will resolve the ${alertId} issues mentioned above and make the alert instances for Logs and Metrics much more human-readable (at least to the degree that something like a host.hostname is human-readable, sometimes they're not). But for other alert types I definitely think we could use this functionality.

pmuellr · 2020-08-14T14:06:28Z

I made another pass through the callers of alertInstanceFactory(instanceId) to look for obvious cases of "id"s being used as instance ids. The alert type implementations seem to be getting more elaborate, so it's harder to trace the parameter used in that call back to its source.

The only obvious ones are with security solution, here:

kibana/x-pack/plugins/security_solution/server/lib/detection_engine/notifications/rules_notification_alert_type.ts

Lines 75 to 78 in 187a130

    if (signalsCount !== 0) {
      const alertInstance = services.alertInstanceFactory(alertId);
      scheduleNotificationActions({ alertInstance, signalsCount, resultsLink, ruleParams });
    }

kibana/x-pack/plugins/security_solution/server/lib/detection_engine/signals/signal_rule_alert_type.ts

Lines 366 to 374 in 187a130

    if (result.createdSignalsCount) {
      const alertInstance = services.alertInstanceFactory(alertId);
      scheduleNotificationActions({
        alertInstance,
        signalsCount: result.createdSignalsCount,
        resultsLink,
        ruleParams: notificationRuleParams,
      });
    }

I'll contact the team off-line, in case they want to fix this - but I suspect they may not, since their usage of alerts is more of an implementation detail than exposing customer-facing alerts.

pmuellr · 2020-08-14T14:15:07Z

I'm going to close this for now, as I've been treating this issue as a kind of meta issue to see if we can get human readable instance ids into the system directly for the current set of alertTypes in Kibana, instead of having both an instance id and name.

We may want to revisit having an explicit name, as mentioned in #64268 (comment), but I think it's probably best to wait for some explicit solution or customer feedback to get more requirements.

pmuellr added Feature:Alerting, Team:ResponseOps labels Apr 22, 2020

pmuellr mentioned this issue May 14, 2020

Nested grouping-over #66052

Closed

mikecote assigned pmuellr May 26, 2020

pmuellr closed this as completed Aug 14, 2020

pmuellr mentioned this issue Nov 5, 2020

[Trigger Actions UI] Improve alert instances view by having instance name, tags #82707

Closed

pmuellr mentioned this issue Jul 27, 2021

[Alerting][Event Log] Consider adding uuid to active alert spans #101749

Closed

kobelb added the needs-team label Jan 31, 2022

botelastic bot removed the needs-team label Jan 31, 2022
