Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Onboard Uptime rule types with FAAD #179493

Merged

Conversation

ersin-erdal
Copy link
Contributor

@ersin-erdal ersin-erdal commented Mar 26, 2024

Towards: #169867

This PR onboards Uptime rule types (Tls, Duration Anolamy and Monitor status) with FAAD.

We are deprecating the rule-registry plugin and onboard the rule types with the new alertsClient to manage alert-as-data.
There is no new future, all the rule types should work as they were, and save alerts with all the existing fields.

To verify:

  • Switch to Kibana 8.9.0 in your local repo. (In this version Uptime rules are not deprecated)
  • Run your ES with: yarn es snapshot -E path.data=../local-es-data
  • Run your Kibana
  • Create Uptime rules with an active and a recovered action (You can run Heartbeat locally if needed, follow the instructions)
  • Stop your ES and Kibana
  • Switch to this branch and run your ES with yarn es snapshot -E path.data=../local-es-data again.
  • Run your Kibana
  • Modify Uptime rulesType codes to force them to create an alert.
    Example:
    Mock availabilityResults in status_check with below data
availabilityResults = [
      {
        monitorId: '1',
        up: 1,
        down: 0,
        location: 'location',
        availabilityRatio: 0.5,
        monitorInfo: {
          timestamp: '',
          monitor: {
            id: '1',
            status: 'down',
            type: 'type',
            check_group: 'default',
          },
          docId: 'docid',
        },
      },
    ];

It should create an alert. The alert should be saved under .alerts-observability.uptime.alerts index and be visible under observability alerts page.

Then remove the mock, the alert should be recovered.

@ersin-erdal ersin-erdal added release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.14.0 labels Mar 26, 2024
@ersin-erdal ersin-erdal self-assigned this Mar 26, 2024
@ersin-erdal
Copy link
Contributor Author

/ci

state: {
...updateState(state, foundCerts),
...summary,
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alertInstance.replaceState is moved to alertsClient.report({state})

),
...summary,
},
payload: {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fields becomes payload


alertsClient.setAlertData({
id: alertId,
context: {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alertInstance.scheduleActions is moved to alertsClient.report({context})

# Conflicts:
#	x-pack/plugins/observability_solution/uptime/server/legacy_uptime/uptime_server.ts
# Conflicts:
#	x-pack/plugins/observability_solution/uptime/server/legacy_uptime/lib/alerts/status_check.test.ts
@ersin-erdal ersin-erdal marked this pull request as ready for review April 22, 2024 14:44
@ersin-erdal ersin-erdal requested a review from a team as a code owner April 22, 2024 14:44
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

Copy link
Contributor

@ymao1 ymao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review only. Overall looks good but left a few questions

});
}
};

export const uptimeRuleTypeFieldMap = { ...uptimeRuleFieldMap, ...legacyExperimentalFieldMap };

export const UptimeRuleTypeAlertDefinition: IRuleTypeAlerts = {
export const UptimeRuleTypeAlertDefinition: IRuleTypeAlerts<ObservabilityUptimeAlert> = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's another definition for this in x-pack/plugins/observability_solution/synthetics/server/alert_rules/common.ts. Should we remove one definition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they have different usages.
This one is for the legacy, duration, status check and tls.
The one you shared is for synthetics monitor_status and tls.
I can remove one of them when the synthetic rules are onboarded as well. Can't use shouldWrite: true on them until then.

'monitor.id': params.monitorId,
'url.full': summary.monitorUrl,
'observer.geo.name': summary.observerLocation,
'anomaly.start': summary.anomalyStartTimestamp,
'anomaly.bucket_span.minutes': summary.bucketSpan,
'anomaly.bucket_span.minutes': `${summary.bucketSpan}`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change from number to string going to cause any issues?

Copy link
Contributor Author

@ersin-erdal ersin-erdal Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get any error when i was testing but I think we better ask this to @elastic/obs-ux-infra_services-team.
uptimeRuleTypeFieldMap forces it to be string.

'anomaly.bucket_span.minutes': {
    type: 'keyword',
    required: false,
  }

I can convert it to long but not sure about their usage.

@crespocarlos can you help us here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, i changed this part and just used type casting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am torn on this. anomaly.bucket_span.minutes is mapped as a keyword which is why the expected type is string. I'm wondering if it's a mistake that it's mapped as keyword when the value is actually a number? I think changing it into a string via ${summary.bucketSpan} is probably the correct thing to do to match the mapping as long as it doesn't cause any errors when updating existing alerts, which it sounds like it doesn't.

Copy link
Contributor Author

@ersin-erdal ersin-erdal May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the mapping is wrong.
bucket_span is definitely number. And it has been being passed as number even though defined as keyword in the mapping. It just works because JS is not type safe at run time.
I'd rather leave it as it is since the rule type is deprecated.

const indexedStartedAt = getAlertStartedDate(recoveredAlertId) ?? defaultStartedAt;

const state = alert.getState();
for await (const recoveredAlert of alertsClient?.getRecoveredAlerts() ?? []) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this await necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didn't realize it was there.
I don't think so, for-loop is already a blocking op. Removing it.

@botelastic botelastic bot added the ci:project-deploy-observability Create an Observability project label Apr 29, 2024
@kibana-ci
Copy link
Collaborator

kibana-ci commented May 2, 2024

💚 Build Succeeded

Metrics [docs]

Canvas Sharable Runtime

The Canvas "shareable runtime" is an bundle produced to enable running Canvas workpads outside of Kibana. This bundle is included in third-party webpages that embed canvas and therefor should be as slim as possible.

id before after diff
module count - 5407 +5407
total size - 9.1MB +9.1MB
Unknown metric groups

References to deprecated APIs

id before after diff
uptime 5 1 -4

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ersin-erdal

@ersin-erdal ersin-erdal merged commit d228f48 into elastic:main May 2, 2024
18 checks passed
@ersin-erdal ersin-erdal deleted the 169867-onboard-uptime-rule-types branch May 2, 2024 15:54
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request May 2, 2024
Towards: elastic#169867

This PR onboards Uptime rule types (Tls, Duration Anolamy and Monitor
status) with FAAD.

We are deprecating the rule-registry plugin and onboard the rule types
with the new alertsClient to manage alert-as-data.
There is no new future, all the rule types should work as they were, and
save alerts with all the existing fields.

## To verify:

- Switch to Kibana 8.9.0 in your local repo. (In this version Uptime
rules are not deprecated)
- Run your ES with: `yarn es snapshot -E path.data=../local-es-data`
- Run your Kibana
- Create Uptime rules with an active and a recovered action (You can run
Heartbeat locally if needed, [follow the
instructions](https://www.elastic.co/guide/en/beats/heartbeat/current/heartbeat-installation-configuration.html))
- Stop your ES and Kibana
- Switch to this branch and run your ES with `yarn es snapshot -E
path.data=../local-es-data` again.
- Run your Kibana
- Modify Uptime rulesType codes to force them to create an alert.
Example:
Mock [availabilityResults in
status_check](https://github.com/elastic/kibana/blob/main/x-pack/plugins/observability_solution/uptime/server/legacy_uptime/lib/alerts/status_check.ts#L491)
with below data
```
availabilityResults = [
      {
        monitorId: '1',
        up: 1,
        down: 0,
        location: 'location',
        availabilityRatio: 0.5,
        monitorInfo: {
          timestamp: '',
          monitor: {
            id: '1',
            status: 'down',
            type: 'type',
            check_group: 'default',
          },
          docId: 'docid',
        },
      },
    ];
```

It should create an alert. The alert should be saved under
`.alerts-observability.uptime.alerts` index and be visible under
observability alerts page.

Then remove the mock, the alert should be recovered.

(cherry picked from commit d228f48)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.14

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request May 2, 2024
# Backport

This will backport the following commits from `main` to `8.14`:
- [Onboard Uptime rule types with FAAD
(#179493)](#179493)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Ersin
Erdal","email":"92688503+ersin-erdal@users.noreply.github.com"},"sourceCommit":{"committedDate":"2024-05-02T15:53:56Z","message":"Onboard
Uptime rule types with FAAD (#179493)\n\nTowards:
https://github.com/elastic/kibana/issues/169867\r\n\r\nThis PR onboards
Uptime rule types (Tls, Duration Anolamy and Monitor\r\nstatus) with
FAAD.\r\n\r\nWe are deprecating the rule-registry plugin and onboard the
rule types\r\nwith the new alertsClient to manage
alert-as-data.\r\nThere is no new future, all the rule types should work
as they were, and\r\nsave alerts with all the existing fields.\r\n\r\n##
To verify:\r\n\r\n- Switch to Kibana 8.9.0 in your local repo. (In this
version Uptime\r\nrules are not deprecated)\r\n- Run your ES with: `yarn
es snapshot -E path.data=../local-es-data`\r\n- Run your Kibana\r\n-
Create Uptime rules with an active and a recovered action (You can
run\r\nHeartbeat locally if needed, [follow
the\r\ninstructions](https://www.elastic.co/guide/en/beats/heartbeat/current/heartbeat-installation-configuration.html))\r\n-
Stop your ES and Kibana\r\n- Switch to this branch and run your ES with
`yarn es snapshot -E\r\npath.data=../local-es-data` again.\r\n- Run your
Kibana\r\n- Modify Uptime rulesType codes to force them to create an
alert.\r\nExample:\r\nMock [availabilityResults
in\r\nstatus_check](https://github.com/elastic/kibana/blob/main/x-pack/plugins/observability_solution/uptime/server/legacy_uptime/lib/alerts/status_check.ts#L491)\r\nwith
below data\r\n```\r\navailabilityResults = [\r\n {\r\n monitorId:
'1',\r\n up: 1,\r\n down: 0,\r\n location: 'location',\r\n
availabilityRatio: 0.5,\r\n monitorInfo: {\r\n timestamp: '',\r\n
monitor: {\r\n id: '1',\r\n status: 'down',\r\n type: 'type',\r\n
check_group: 'default',\r\n },\r\n docId: 'docid',\r\n },\r\n },\r\n
];\r\n```\r\n\r\nIt should create an alert. The alert should be saved
under\r\n`.alerts-observability.uptime.alerts` index and be visible
under\r\nobservability alerts page.\r\n\r\nThen remove the mock, the
alert should be
recovered.","sha":"d228f488ec0456c96b3e06aee57a0d28af851eb4","branchLabelMapping":{"^v8.15.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Team:ResponseOps","ci:project-deploy-observability","apm:review","v8.14.0","v8.15.0"],"title":"Onboard
Uptime rule types with
FAAD","number":179493,"url":"https://github.com/elastic/kibana/pull/179493","mergeCommit":{"message":"Onboard
Uptime rule types with FAAD (#179493)\n\nTowards:
https://github.com/elastic/kibana/issues/169867\r\n\r\nThis PR onboards
Uptime rule types (Tls, Duration Anolamy and Monitor\r\nstatus) with
FAAD.\r\n\r\nWe are deprecating the rule-registry plugin and onboard the
rule types\r\nwith the new alertsClient to manage
alert-as-data.\r\nThere is no new future, all the rule types should work
as they were, and\r\nsave alerts with all the existing fields.\r\n\r\n##
To verify:\r\n\r\n- Switch to Kibana 8.9.0 in your local repo. (In this
version Uptime\r\nrules are not deprecated)\r\n- Run your ES with: `yarn
es snapshot -E path.data=../local-es-data`\r\n- Run your Kibana\r\n-
Create Uptime rules with an active and a recovered action (You can
run\r\nHeartbeat locally if needed, [follow
the\r\ninstructions](https://www.elastic.co/guide/en/beats/heartbeat/current/heartbeat-installation-configuration.html))\r\n-
Stop your ES and Kibana\r\n- Switch to this branch and run your ES with
`yarn es snapshot -E\r\npath.data=../local-es-data` again.\r\n- Run your
Kibana\r\n- Modify Uptime rulesType codes to force them to create an
alert.\r\nExample:\r\nMock [availabilityResults
in\r\nstatus_check](https://github.com/elastic/kibana/blob/main/x-pack/plugins/observability_solution/uptime/server/legacy_uptime/lib/alerts/status_check.ts#L491)\r\nwith
below data\r\n```\r\navailabilityResults = [\r\n {\r\n monitorId:
'1',\r\n up: 1,\r\n down: 0,\r\n location: 'location',\r\n
availabilityRatio: 0.5,\r\n monitorInfo: {\r\n timestamp: '',\r\n
monitor: {\r\n id: '1',\r\n status: 'down',\r\n type: 'type',\r\n
check_group: 'default',\r\n },\r\n docId: 'docid',\r\n },\r\n },\r\n
];\r\n```\r\n\r\nIt should create an alert. The alert should be saved
under\r\n`.alerts-observability.uptime.alerts` index and be visible
under\r\nobservability alerts page.\r\n\r\nThen remove the mock, the
alert should be
recovered.","sha":"d228f488ec0456c96b3e06aee57a0d28af851eb4"}},"sourceBranch":"main","suggestedTargetBranches":["8.14"],"targetPullRequestStates":[{"branch":"8.14","label":"v8.14.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.15.0","branchLabelMappingKey":"^v8.15.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/179493","number":179493,"mergeCommit":{"message":"Onboard
Uptime rule types with FAAD (#179493)\n\nTowards:
https://github.com/elastic/kibana/issues/169867\r\n\r\nThis PR onboards
Uptime rule types (Tls, Duration Anolamy and Monitor\r\nstatus) with
FAAD.\r\n\r\nWe are deprecating the rule-registry plugin and onboard the
rule types\r\nwith the new alertsClient to manage
alert-as-data.\r\nThere is no new future, all the rule types should work
as they were, and\r\nsave alerts with all the existing fields.\r\n\r\n##
To verify:\r\n\r\n- Switch to Kibana 8.9.0 in your local repo. (In this
version Uptime\r\nrules are not deprecated)\r\n- Run your ES with: `yarn
es snapshot -E path.data=../local-es-data`\r\n- Run your Kibana\r\n-
Create Uptime rules with an active and a recovered action (You can
run\r\nHeartbeat locally if needed, [follow
the\r\ninstructions](https://www.elastic.co/guide/en/beats/heartbeat/current/heartbeat-installation-configuration.html))\r\n-
Stop your ES and Kibana\r\n- Switch to this branch and run your ES with
`yarn es snapshot -E\r\npath.data=../local-es-data` again.\r\n- Run your
Kibana\r\n- Modify Uptime rulesType codes to force them to create an
alert.\r\nExample:\r\nMock [availabilityResults
in\r\nstatus_check](https://github.com/elastic/kibana/blob/main/x-pack/plugins/observability_solution/uptime/server/legacy_uptime/lib/alerts/status_check.ts#L491)\r\nwith
below data\r\n```\r\navailabilityResults = [\r\n {\r\n monitorId:
'1',\r\n up: 1,\r\n down: 0,\r\n location: 'location',\r\n
availabilityRatio: 0.5,\r\n monitorInfo: {\r\n timestamp: '',\r\n
monitor: {\r\n id: '1',\r\n status: 'down',\r\n type: 'type',\r\n
check_group: 'default',\r\n },\r\n docId: 'docid',\r\n },\r\n },\r\n
];\r\n```\r\n\r\nIt should create an alert. The alert should be saved
under\r\n`.alerts-observability.uptime.alerts` index and be visible
under\r\nobservability alerts page.\r\n\r\nThen remove the mock, the
alert should be
recovered.","sha":"d228f488ec0456c96b3e06aee57a0d28af851eb4"}}]}]
BACKPORT-->

Co-authored-by: Ersin Erdal <92688503+ersin-erdal@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
apm:review ci:project-deploy-observability Create an Observability project release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.14.0 v8.15.0
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

7 participants