[Cases] Case action #168369
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
## Summary

This PR is a continuation of the work for the Case action. This PR implements the basic logic of the case connector. Specifically:

1. Group the alerts based on the grouping provided by the user
2. Create the Oracle's SO IDs to fetch the records. If they do not exist they will get created and the counter will be set to 1.
3. Create the cases' SO IDs to fetch the Cases. If they do not exist they will get created.
4. Attach the alerts to the corresponding cases.

Not in this PR:

- Handle errors
- Retries on errors
- Reopen cases
- Time window
- Race conditions
- Circuit breakers

Depends on: #168370, #169484

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

### For maintainers

- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
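Steps 1 to 3 above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the connector's actual code: the `Alert` shape, `groupAlerts`, and `recordKey` are hypothetical names, and a plain JSON string stands in for whatever saved object ID scheme the connector really uses.

```typescript
// Hypothetical alert shape; real alerts carry many more fields.
interface Alert {
  _id: string;
  _index: string;
  [field: string]: unknown;
}

interface GroupedAlerts {
  grouping: Record<string, unknown>;
  alerts: Alert[];
}

// Step 1: bucket alerts by the values of the user-selected grouping fields.
function groupAlerts(alerts: Alert[], groupingBy: string[]): GroupedAlerts[] {
  const groups = new Map<string, GroupedAlerts>();
  for (const alert of alerts) {
    const grouping: Record<string, unknown> = {};
    for (const field of groupingBy) {
      grouping[field] = alert[field];
    }
    // Alerts with identical grouping values share the same serialized key.
    const key = JSON.stringify(grouping);
    const group = groups.get(key) ?? { grouping, alerts: [] };
    group.alerts.push(alert);
    groups.set(key, group);
  }
  return Array.from(groups.values());
}

// Steps 2 and 3: derive a deterministic key from the rule and the grouping
// (assumption: the real IDs are derived differently). Passing the counter
// yields a distinct key per counter value, as a case ID would need.
function recordKey(
  ruleId: string,
  grouping: Record<string, unknown>,
  counter?: number
): string {
  return JSON.stringify({ ruleId, grouping, counter });
}
```

Because the key depends only on the rule and the grouping values, repeated executions for the same group resolve to the same record, which is what makes "fetch, or create if missing" possible.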
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
## Summary

Depends on: #171754

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or

### For maintainers

- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
## Summary

This PR:

1. Creates the `CasesConnectorError` error
2. Separates the execution logic by moving the current logic to a new class called `CasesConnectorExecutor`
3. Lets the `CasesConnector` class handle only the retry logic of the connector
4. Implements the [Full jitter backoff algorithm](https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/), which is used as the retry strategy of the connector

Depends on: #172709

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

### For maintainers

- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
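For readers unfamiliar with the retry strategy mentioned above, here is a minimal sketch of full jitter backoff as described in the linked AWS article: each delay is drawn uniformly from zero up to the capped exponential value. The function names, parameters, and default values are assumptions for illustration and are not taken from the connector's code.

```typescript
// Full jitter: delay = random_between(0, min(cap, base * 2^attempt)).
// `random` is injectable so the calculation can be checked deterministically.
function fullJitterDelay(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical retry wrapper: runs `fn`, and on failure waits a jittered
// delay before trying again, up to `maxRetries` additional attempts.
async function retryWithFullJitter<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseMs = 100,
  capMs = 10_000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt >= maxRetries) {
        throw error;
      }
      await sleep(fullJitterDelay(attempt, baseMs, capMs));
    }
  }
}
```

Randomizing over the whole interval, rather than adding a small jitter to a fixed exponential delay, spreads retries from concurrent executions so they are less likely to collide again.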
## Summary

This PR adds logging to the case action

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

### For maintainers

- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
LGTM
Found a weird bug: I created a log threshold rule in o11y, and the alert overview shows the created case in o11y. But the case was created in stack management, because the consumer for the log threshold rule type is `alerts`.
obs-stack.case.action.conflict.mov
Interesting bug. @XavierM recommended fallback to the |
}

this.logger.debug(
`[CasesConnector][CasesConnectorExecutor][attachAlertsToCases] Attaching alerts to ${casesUnderAlertLimit.length} cases that do not have reach the alert limit per case`,
nit:
- `[CasesConnector][CasesConnectorExecutor][attachAlertsToCases] Attaching alerts to ${casesUnderAlertLimit.length} cases that do not have reach the alert limit per case`,
+ `[CasesConnector][CasesConnectorExecutor][attachAlertsToCases] Attaching alerts to ${casesUnderAlertLimit.length} cases that have not reached the alert limit per case`,
: params.rule.name;

const groupingDescription = this.getGroupingDescription(grouping);
const description = `This case is auto-created by ${ruleName}. \n\n Grouping: ${groupingDescription}`;
nit:
- const description = `This case is auto-created by ${ruleName}. \n\n Grouping: ${groupingDescription}`;
+ const description = `This case is created automatically by ${ruleName}. \n\n Grouping: ${groupingDescription}`;
Shani suggested some changes to the description too. I was thinking of doing it in another PR. Do you mind if we also address this in the next PR?
Some updates:
💚 Build Succeeded
Metrics [docs]: Module Count, Public APIs missing comments, Async chunks, Page load bundle
Unknown metric groups: API count, async chunk count, ESLint disabled line counts, Total ESLint disabled count
History
To update your PR or re-run it, just comment with:
cc @cnasikas
What amazing work!! 🤯 🎉 👏
Finally the case action is here!! 📢 😃
Thank you for the detailed scenarios, they made testing easier ❤️
Create a rule with a case action in observability and the stack. The security solution is not supported. You should not be able to assign a case action in a security solution rule. ✅
Test the "Reopen closed cases" configuration. ✅
Test the "Grouping by" configuration. Only one field is allowed. Not all fields are persisted in alerts. If you select a field that is not part of the alert, the case action will create a case where the grouping value is set to unknown.
Only a single field is allowed, and the grouping value is unknown for non-alert fields ✅ - we still need to prevent selecting a new field that is not in the list, which @adcoelho mentioned; this could be done in another PR.
Test the "Time window" feature. You can comment out the validation to test for shorter times. ✅
Verify that the case action is experimental. ✅
Verify that based on the rule type the case is created in the correct solution.
worked as expected, except the bug with log threshold rule with alerts as consumer
Verify that you cannot create a rule with the case action on the basic license. ✅
Verify that the execution of the case action fails if you do not have permission for cases. Pending work on the system actions framework level to not allow users to create rules with system actions where they do not have permission.
Verified with none or read permission; in both scenarios the case action is disabled ✅
Stress test the case action by creating multiple rules. ✅
x-pack/plugins/cases/public/components/system_actions/cases/translations.ts
Thanks for that! I totally forgot it.
🚀
## Summary

Creates a system connector that can call the observability ai assistant to execute actions on behalf of user. The connector is tagged as tech preview. The connector can be triggered when an alert fires. Connector can be configured with an initial message to the assistant which generates an answer and triggers potential actions on the assistant side. The current experimental scenario is to ask the assistant to generate a report of the alert that fired (by initially providing some context in the first message), recalling any information/potential resolutions of previous occurrences stored in the knowledge base and also including other active alerts that may be related. One last step that can be asked to the assistant is to trigger an action, currently only sending the report (or any other message) to a preconfigured slack webhook is supported.

## Testing

_Note: when asked to send a message to another connector (in our case slack), we'll try to include a link to the generated conversation. It is only possible to generate this link if [server.publicBaseUrl](https://www.elastic.co/guide/en/kibana/current/settings.html#server-publicBaseUrl) is correctly set in kibana settings._

- Create a slack webhook connector
  - Get slack webhook. I can share one and invite you to the workspace, or if you want to create one:
    - create personal workspace at https://slack.com/signin#workspaces
    - create an app for that workspace at https://api.slack.com/apps
    - under Features > OAuth & Permissions > Scopes > Bot Token Scopes, add `incoming-webhook` permission
    - install the app
    - webhook url is available under Features > Incoming Webhooks
- Create a rule that can be triggered with available documents and attach observability AI assistant connector. (I use `Error Count Threshold` and generate errors via `node scripts/synthtrace many_errors.ts --live`)
- configure the connector with one genai connector and a message with instructions. Example:

```
High error count alert has triggered. Execute the following steps:
- create a graph of the error count for the service impacted by the alert for the last 24h
- to help troubleshoot recall past occurrences of this alarm, also any other active alerts.
Generate a report with all the found informations and send it to slack connector as a single message. Also include the link to this conversation in the report
```

- Track alert status and verify connector was executed. You should get a slack notification sent by the assistant, and a new conversation will be stored

TODO

- unit/integration tests - see #168369 for reference implementation
- documentation

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Dario Gieselaar <dario.gieselaar@elastic.co>
## Summary

In this PR:

- Address @adcoelho comments regarding documentation.
- Fix @js-jankisalvi bug about unsupported consumers (#168369 (review)).
- Address @shanisagiv1 feedback regarding the title and the description. Specifically:
  - The title changed to "<rule_name> - Grouping by <grouping_by_value> (Auto-created)".
  - The description changed to "This case was created by the Case action in <rule_name_link>. The assigned alerts are grouped by <grouping_by_key>:<grouping_by_value>".
  - Add the grouping key as a tag.

<img width="2289" alt="Screenshot 2024-04-13 at 4 41 36 PM" src="https://github.com/elastic/kibana/assets/7871006/63e17947-5f39-4437-820b-7c69f42bfbe3">

The issue about the "Unknown" user will be fixed in another PR.

About @adcoelho bug:

https://github.com/elastic/kibana/assets/7871006/c46aa7c4-9d1a-475b-9d07-6bdff3ef00c8

I think it is fine to leave it as it is because a) the value will not be saved even if they are added b) an error is being shown c) the only way to do it properly is to validate while the user is typing which is going to lead to bad UX. If you feel otherwise let me know.

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

### For maintainers

- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
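The title, description, and tag format described in this commit message could be assembled along these lines. This is only a sketch: `RuleInfo`, `buildCaseFields`, and the markdown link used for `<rule_name_link>` are assumptions made for the example, and the connector's real implementation may differ.

```typescript
// Hypothetical rule shape for illustration.
interface RuleInfo {
  name: string;
  url?: string;
}

interface CaseFields {
  title: string;
  description: string;
  tags: string[];
}

// Builds the auto-created case fields for a single grouping key/value pair.
function buildCaseFields(
  rule: RuleInfo,
  groupingKey: string,
  groupingValue: string
): CaseFields {
  // Assumption: the rule link is rendered as a markdown link when a URL is known.
  const ruleLink = rule.url ? `[${rule.name}](${rule.url})` : rule.name;
  return {
    title: `${rule.name} - Grouping by ${groupingValue} (Auto-created)`,
    description:
      `This case was created by the Case action in ${ruleLink}. ` +
      `The assigned alerts are grouped by ${groupingKey}:${groupingValue}`,
    tags: [groupingKey],
  };
}
```

For example, a rule named "High CPU" grouped by `host.name` with the value `web-01` would yield the title "High CPU - Grouping by web-01 (Auto-created)".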
Summary
Depends on: #166267, #170326, #169484, #173740, #173763, #178068, #178307, #178600, #180437
PRs:
Fixes: #153837
Testing
Run Kibana with `--run-examples` if you want to use the "Always firing" rule.
Create a rule with a case action in observability and the stack. The security solution is not supported. You should not be able to assign a case action in a security solution rule.
Checklist
Delete any items that are not applicable to this PR.
For maintainers
Release notes
Automatically create cases when an alert is triggered.