
[Response Ops][Alerting] Should framework alerts-as-data write out data with flattened keys? #166946

Closed
ymao1 opened this issue Sep 21, 2023 · 3 comments · Fixed by #167439 or #167691
Assignees
Labels
Feature:Alerting/Alerts-as-Data Issues related to Alerts-as-data and RuleRegistry Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@ymao1
Contributor

ymao1 commented Sep 21, 2023

Currently, the rule registry writes out documents with flattened keys like:

```
{
  "kibana.alert.rule.name": "test"
}
```

The alerting framework writes out documents as an object:

```
{
  "kibana": {
    "alert": {
      "rule": {
        "name": "test"
      }
    }
  }
}
```

When considering this, we looked at what might happen if we had mixed formats in the same index, where doc1 was flattened and doc2 was unflattened, and it seemed to make no difference when reading. Reading and writing sources as objects also allows us to apply TypeScript typings to the ES search requests.
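
As a minimal sketch of that last point (assuming the `@elastic/elasticsearch` client and a hypothetical `Alert` interface, not the framework's actual types), object sources let us type search hits directly:

```ts
import { Client } from '@elastic/elasticsearch';

// Hypothetical shape for an unflattened (object) alert source.
interface Alert {
  kibana: {
    alert: {
      rule: { name: string };
      status: string;
    };
  };
}

const client = new Client({ node: 'http://localhost:9200' });

async function getActiveAlertRuleNames(): Promise<string[]> {
  // The generic on search() types each hit's _source as Alert.
  const result = await client.search<Alert>({
    index: '.alerts-*',
    query: { term: { 'kibana.alert.status': 'active' } },
  });
  return result.hits.hits
    .map((hit) => hit._source?.kibana.alert.rule.name)
    .filter((name): name is string => name !== undefined);
}
```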

However, when performing an update-by-query on these documents, we could potentially get into a mixed state within the same document. For example, updating an object source with flattened keys:

```
{
  "kibana.alert.status": "untracked",
  "kibana": {
    "alert": {
      "status": "active"
    }
  }
}
```

We would have to account for this scenario in the Painless scripts we use to update alerts, which is possible, but is it necessary? Currently, only the stack rules are onboarded onto framework alerts-as-data, so the only manual update that would be performed on them is the one in this new PR #164788. Should we switch to writing flattened alerts before onboarding more rule types?
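
For concreteness, a sketch (hypothetical, not the actual Kibana code) of what such a defensive Painless update might look like, checking for both key styles before writing:

```ts
import type { Client } from '@elastic/elasticsearch';

// Hypothetical helper: set kibana.alert.status on a set of alerts while
// tolerating both flattened and unflattened document shapes.
async function setAlertStatus(client: Client, alertUuids: string[], status: string) {
  await client.updateByQuery({
    index: '.alerts-*',
    query: { terms: { 'kibana.alert.uuid': alertUuids } },
    script: {
      lang: 'painless',
      source: `
        if (ctx._source.containsKey('kibana.alert.status')) {
          // Flattened doc: the dotted key exists at the top level.
          ctx._source['kibana.alert.status'] = params.status;
        } else if (ctx._source.containsKey('kibana')) {
          // Unflattened doc: walk the nested object instead.
          ctx._source.kibana.alert.status = params.status;
        }
      `,
      params: { status },
    },
  });
}
```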

@botelastic botelastic bot added the needs-team Issues missing a team label label Sep 21, 2023
@ymao1 ymao1 added Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) Feature:Alerting/Alerts-as-Data Issues related to Alerts-as-data and RuleRegistry labels Sep 21, 2023
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@ymao1
Contributor Author

ymao1 commented Sep 26, 2023

Currently, AAD docs are updated in a few places:

  • kibana.alert.case_ids - this field is not populated when an alert doc is first created, so updating it may result in a doc with mixed flattened & unflattened keys (in the case of an ES query alert visible in the Observability alerts table), but no conflicting information within the doc
  • kibana.alert.workflow_status - this field is updated in two places:
    • as part of a cases update - this update-by-query checks whether ctx._source['${ALERT_WORKFLOW_STATUS}'] is null, so it will not update alert documents that are unflattened (see the sketch after this list); this means we will not have conflicting information within a doc, but ES query AAD docs will not have their workflow status correctly updated when their associated case is updated
    • as part of the security solution open/close signals route - this only updates detection rule AAD docs, which currently all use flattened keys, so there is no risk of conflicting information within a doc
  • kibana.alert.workflow_tags - this is updated as part of the security solution set alert tags route, which only updates detection rule AAD docs, which currently all use flattened keys, so there is no risk of conflicting information within a doc
  • kibana.alert.status - this is updated in this PR and checks for flattened vs. unflattened keys, so there is no risk of conflicting information within the doc
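
To illustrate the cases-update guard mentioned above (a sketch of the pattern, not the exact Kibana script): an unflattened doc has no top-level dotted key, so the null check fails and the doc is silently skipped:

```ts
// ALERT_WORKFLOW_STATUS resolves to 'kibana.alert.workflow_status'.
const script = {
  lang: 'painless',
  source: `
    if (ctx._source['kibana.alert.workflow_status'] != null) {
      ctx._source['kibana.alert.workflow_status'] = params.status;
    } else {
      // Unflattened 8.10 ES query docs land here and are left unchanged.
      ctx.op = 'noop';
    }
  `,
  params: { status: 'acknowledged' },
};
```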

@ymao1 ymao1 self-assigned this Sep 26, 2023
@ymao1 ymao1 moved this from Todo to In Progress in AppEx: ResponseOps - Execution & Connectors Sep 26, 2023
@ymao1
Contributor Author

ymao1 commented Sep 27, 2023

Going to split this into 2 PRs:

  1. Change the framework alerts client to write and expect flattened alert data - [Response Ops][Alerting] Update framework alerts client to write flattened alerts docs #167439
  2. Follow-up to handle the edge case/bug we'll have for the ES query rule type, where there may be unflattened docs written in 8.10 that need to be updated (for ongoing active alerts and recovered alerts). These currently do not show up anywhere in the UI (only ES query rules created in 8.11 with a metrics or logs consumer show up in the O11y alerts table), and there is no stack alerts table yet to display the alerts created in 8.10.

ymao1 added a commit that referenced this issue Sep 29, 2023
[Response Ops][Alerting] Update framework alerts client to write flattened alerts docs (#167439)

Resolves #166946

## Summary

The rule registry has traditionally written out AAD docs with flattened
keys, like

```
{
  "kibana.alert.rule.name": "test"
}
```

The framework alerts client has been writing out AAD docs as objects,
like

```
{
  "kibana": {
    "alert": {
      "rule": {
        "name": "test"
      }
    }
  }
}
```

We've identified a few places where we update these docs in which this divergence makes things more difficult, so this PR switches the framework to writing flattened alert docs before onboarding more rule types.

This PR is targeted for 8.11, which is also when we onboarded the index threshold rule type to FAAD. The only other rule type using FAAD to write docs is ES query, which landed in 8.10, so there will be a follow-up issue to handle updating unflattened ES query AAD docs from 8.10.

## To Verify

### ES Query and Index Threshold AAD

Create these rules so that they trigger alerts, and verify that their AAD docs are written out flattened. For the ES Query rule type, select a Metrics/Logs consumer and verify that the alerts appear in the O11y alerts table.
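
One way to spot-check the doc shape beyond the UI (a sketch; the index alias and client setup are assumptions):

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Fetch one stack alert doc and check whether its keys are dotted (flattened)
// rather than nested under a top-level `kibana` object.
async function isFlattened(): Promise<boolean> {
  const result = await client.search<Record<string, unknown>>({
    index: '.alerts-stack.alerts-default',
    size: 1,
  });
  const source = result.hits.hits[0]?._source ?? {};
  return 'kibana.alert.rule.name' in source && !('kibana' in source);
}
```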

### ML alerts

ML alerts added in #166349 looked
like:

<details>
  <summary>Unflattened</summary>

```
{
	"kibana": {
		"alert": {
			"url": "/app/ml/explorer/?_g=(ml%3A(jobIds%3A!(rt-anomaly-mean-value))%2Ctime%3A(from%3A'2023-09-28T14%3A57%3A00.000Z'%2Cmode%3Aabsolute%2Cto%3A'2023-09-28T15%3A17%3A00.000Z'))&_a=(explorer%3A(mlExplorerFilter%3A(filterActive%3A!t%2CfilteredFields%3A!(key%2Cthird-key)%2CinfluencersFilterQuery%3A(bool%3A(minimum_should_match%3A1%2Cshould%3A!((match_phrase%3A(key%3Athird-key)))))%2CqueryString%3A'key%3A%22third-key%22')%2CmlExplorerSwimlane%3A()))",
			"reason": "Alerts are raised based on real-time scores. Remember that scores may be adjusted over time as data continues to be analyzed.",
			"job_id": "rt-anomaly-mean-value",
			"anomaly_score": 73.63508175828011,
			"is_interim": false,
			"anomaly_timestamp": 1695913620000,
			"top_records": [{
				"job_id": "rt-anomaly-mean-value",
				"record_score": 73.63516446528412,
				"initial_record_score": 73.63516446528412,
				"detector_index": 0,
				"is_interim": false,
				"timestamp": 1695913620000,
				"partition_field_name": "key",
				"partition_field_value": "third-key",
				"function": "mean",
				"actual": [
					3
				],
				"typical": [
					4.187715468532429
				]
			}],
			"top_influencers": [{
				"job_id": "rt-anomaly-mean-value",
				"influencer_field_name": "key",
				"influencer_field_value": "third-key",
				"influencer_score": 73.63508175828011,
				"initial_influencer_score": 73.63508175828011,
				"is_interim": false,
				"timestamp": 1695913620000
			}],
			"action_group": "anomaly_score_match",
			"flapping": false,
			"flapping_history": [
				true,
				false,
				false,
				false
			],
			"instance": {
				"id": "rt-anomaly-mean-value"
			},
			"maintenance_window_ids": [],
			"rule": {
				"category": "Anomaly detection alert",
				"consumer": "alerts",
				"execution": {
					"uuid": "e9e681d4-c8e4-43eb-82e5-a58bdf7ffe12"
				},
				"name": "rt-ad-alert-influencer",
				"parameters": {
					"severity": 5,
					"resultType": "influencer",
					"includeInterim": false,
					"jobSelection": {
						"jobIds": [
							"rt-anomaly-mean-value"
						],
						"groupIds": []
					},
					"lookbackInterval": null,
					"topNBuckets": null
				},
				"producer": "ml",
				"revision": 0,
				"rule_type_id": "xpack.ml.anomaly_detection_alert",
				"tags": [],
				"uuid": "9e1d6bc0-5e10-11ee-8416-3bf48cca0922"
			},
			"status": "active",
			"uuid": "c9c1f075-9985-4c55-8ff8-22349cb30269",
			"workflow_status": "open",
			"duration": {
				"us": "99021000000"
			},
			"start": "2023-09-28T15:07:12.868Z",
			"time_range": {
				"gte": "2023-09-28T15:07:12.868Z"
			}
		},
		"space_ids": [
			"default"
		],
		"version": "8.11.0"
	},
	"@timestamp": "2023-09-28T15:08:51.889Z",
	"event": {
		"action": "active",
		"kind": "signal"
	},
	"tags": []
}
```
</details>

Now they look like:

<details>
  <summary>Flattened</summary>

```
{
	"kibana.alert.url": "/app/ml/explorer/?_g=(ml%3A(jobIds%3A!(rt-anomaly-mean-value))%2Ctime%3A(from%3A'2023-09-28T15%3A03%3A00.000Z'%2Cmode%3Aabsolute%2Cto%3A'2023-09-28T15%3A23%3A00.000Z'))&_a=(explorer%3A(mlExplorerFilter%3A(filterActive%3A!t%2CfilteredFields%3A!(key%2Cthird-key)%2CinfluencersFilterQuery%3A(bool%3A(minimum_should_match%3A1%2Cshould%3A!((match_phrase%3A(key%3Athird-key)))))%2CqueryString%3A'key%3A%22third-key%22')%2CmlExplorerSwimlane%3A()))",
	"kibana.alert.reason": "Alerts are raised based on real-time scores. Remember that scores may be adjusted over time as data continues to be analyzed.",
	"kibana.alert.job_id": "rt-anomaly-mean-value",
	"kibana.alert.anomaly_score": 72.75515452061356,
	"kibana.alert.is_interim": false,
	"kibana.alert.anomaly_timestamp": 1695913980000,
	"kibana.alert.top_records": [{
		"job_id": "rt-anomaly-mean-value",
		"record_score": 72.75515452061356,
		"initial_record_score": 72.75515452061356,
		"detector_index": 0,
		"is_interim": false,
		"timestamp": 1695913980000,
		"partition_field_name": "key",
		"partition_field_value": "third-key",
		"function": "mean",
		"actual": [
			0.5
		],
		"typical": [
			4.138745343296527
		]
	}],
	"kibana.alert.top_influencers": [{
		"job_id": "rt-anomaly-mean-value",
		"influencer_field_name": "key",
		"influencer_field_value": "third-key",
		"influencer_score": 72.75515452061356,
		"initial_influencer_score": 72.75515452061356,
		"is_interim": false,
		"timestamp": 1695913980000
	}],
	"kibana.alert.rule.category": "Anomaly detection alert",
	"kibana.alert.rule.consumer": "alerts",
	"kibana.alert.rule.execution.uuid": "17fef3d3-d595-4362-837e-b2a73650169e",
	"kibana.alert.rule.name": "rt-ad-alert-influencer",
	"kibana.alert.rule.parameters": {
		"severity": 5,
		"resultType": "influencer",
		"includeInterim": false,
		"jobSelection": {
			"jobIds": [
				"rt-anomaly-mean-value"
			],
			"groupIds": []
		},
		"lookbackInterval": null,
		"topNBuckets": null
	},
	"kibana.alert.rule.producer": "ml",
	"kibana.alert.rule.revision": 0,
	"kibana.alert.rule.rule_type_id": "xpack.ml.anomaly_detection_alert",
	"kibana.alert.rule.tags": [],
	"kibana.alert.rule.uuid": "757c7610-5e11-11ee-8bc6-a95c3ced4757",
	"kibana.space_ids": [
		"default"
	],
	"@timestamp": "2023-09-28T15:14:52.057Z",
	"event.action": "active",
	"event.kind": "signal",
	"kibana.alert.action_group": "anomaly_score_match",
	"kibana.alert.flapping": false,
	"kibana.alert.flapping_history": [
		true,
		false,
		false,
		false
	],
	"kibana.alert.instance.id": "rt-anomaly-mean-value",
	"kibana.alert.maintenance_window_ids": [],
	"kibana.alert.status": "active",
	"kibana.alert.uuid": "ac1f0d7c-461b-4fc6-b4c3-04416ac876d3",
	"kibana.alert.workflow_status": "open",
	"kibana.alert.duration.us": "99028000000",
	"kibana.alert.start": "2023-09-28T15:13:13.028Z",
	"kibana.alert.time_range": {
		"gte": "2023-09-28T15:13:13.028Z"
	},
	"kibana.version": "8.11.0",
	"tags": []
}
```
</details>
@ymao1 ymao1 moved this from In Progress to In Review in AppEx: ResponseOps - Execution & Connectors Oct 2, 2023
ymao1 added a commit that referenced this issue Oct 2, 2023
[Response Ops][Alerting] Update framework alerts client to write flattened alerts docs (#167691)

Resolves #166946

## PRs to this feature branch
* #167439
* #167583

## Summary

Same background as #167439 above: the rule registry has traditionally written AAD docs with flattened keys, while the framework alerts client has been writing them out as nested objects, and this divergence makes updating the docs more difficult, so we are switching the framework to writing flattened alert docs before onboarding more rule types.

This PR is targeted for 8.11, which is also when we onboarded the index threshold rule type and the ML anomaly detection rule type to FAAD. For the ES query rule type, which started writing unflattened AAD docs in 8.10, this PR adds special handling to ensure that those unflattened docs are correctly updated with flattened fields.
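
A sketch of what that special handling could look like (hypothetical helper, not the actual PR code): copy the nested values out to dotted keys, then drop the nested object:

```ts
import type { Client } from '@elastic/elasticsearch';

// Hypothetical migration for an 8.10 unflattened ES query alert doc: rewrite
// its source with dotted keys as part of the update.
async function flattenLegacyAlertDoc(client: Client, index: string, id: string) {
  await client.update({
    index,
    id,
    script: {
      lang: 'painless',
      source: `
        if (ctx._source.containsKey('kibana')) {
          ctx._source['kibana.alert.status'] = ctx._source.kibana.alert.status;
          ctx._source['kibana.alert.rule.name'] = ctx._source.kibana.alert.rule.name;
          // ...one assignment per mapped field in the real implementation...
          ctx._source.remove('kibana');
        }
      `,
    },
  });
}
```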

## To Verify

Same verification steps as #167439 above: create ES Query and Index Threshold rules and verify that their AAD docs are written out flattened, and confirm that the ML alert docs from #166349 now use flattened keys (see the unflattened/flattened examples in the #167439 commit message above).

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>