Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EventLog] Populate alert instances view with event log data #68437

Merged
merged 4 commits into from
Aug 14, 2020

Conversation

pmuellr
Copy link
Member

@pmuellr pmuellr commented Jun 5, 2020

resolves #57446

Summary

Adds a new API (AlertClient and HTTP endpoint) getAlertStatus() which returns
data calculated from the event log.

The data returned in this PR is fairly minimal - just enough to replace the
current instance details view data. In the future, we can add the following
sorts of things:

  • alert execution errors
  • counts of alert execution
  • sequences of active instance time spans
  • if we can get the alert SO into the action execution, we could also
    provide action execution errors

Checklist

Delete any items that are not applicable to this PR.

@pmuellr pmuellr force-pushed the alerting/details-via-el-2 branch from 2ceae3f to 46257d9 Compare June 9, 2020 13:24
@pmuellr pmuellr changed the title [Alerting] enhance alert instances view with event log data [EventLog] Populate alert instances view with event log data Jun 9, 2020
@pmuellr pmuellr force-pushed the alerting/details-via-el-2 branch 3 times, most recently from ab786db to bac3a8a Compare June 16, 2020 16:24
@pmuellr pmuellr force-pushed the alerting/details-via-el-2 branch 3 times, most recently from 0a8bd81 to 70570ef Compare June 25, 2020 21:04
@pmuellr pmuellr force-pushed the alerting/details-via-el-2 branch 2 times, most recently from df65c14 to c2dd065 Compare June 30, 2020 14:12
@pmuellr pmuellr force-pushed the alerting/details-via-el-2 branch 4 times, most recently from a32dda4 to acb9f42 Compare July 23, 2020 00:55
@pmuellr pmuellr force-pushed the alerting/details-via-el-2 branch 2 times, most recently from caa38d0 to 2e04a44 Compare August 5, 2020 00:00
resolves elastic#57446

Adds a new API (AlertClient and HTTP endpoint) `getAlertStatus()` which returns
data calculated from the event log.

The data returned in this PR is fairly minimal - just enough to replace the
current instance details view data.  In the future, we can add the following
sorts of things:

- alert execution errors
- counts of alert execution
- sequences of active instance time spans
- if we can get the alert SO into the action execution, we could also
  provide action execution errors
@pmuellr pmuellr marked this pull request as ready for review August 10, 2020 15:34
@pmuellr pmuellr requested review from a team as code owners August 10, 2020 15:34
@mikecote mikecote self-requested a review August 10, 2020 17:50
Copy link
Contributor

@mikecote mikecote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! This is great, the activity information is much better and precise. I added a question and a bunch of nits that can be skipped.


|Property|Description|Type|
|---|---|---|
|id|The id of the alert whose state you're trying to get.|string|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

status or state?

Suggested change
|id|The id of the alert whose state you're trying to get.|string|
|id|The id of the alert whose status you're trying to get.|string|

@@ -269,6 +284,72 @@ export class AlertsClient {
}
}

public async getAlertStatus({ id, dateStart }: GetAlertStatusParams): Promise<AlertStatus> {
const eventLogClient = await this.getEventLogClient();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: This could be move below authorization checks to avoid loading when authorization fails.

public async getAlertStatus({ id, dateStart }: GetAlertStatusParams): Promise<AlertStatus> {
const eventLogClient = await this.getEventLogClient();

this.logger.debug('getAlertStatus(): getting the alert');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: more context

Suggested change
this.logger.debug('getAlertStatus(): getting the alert');
this.logger.debug(`getAlertStatus(): getting alert ${id}`);

const defaultDateStart = new Date(dateNow.valueOf() - durationMillis);
const parsedDateStart = parseDate(dateStart, 'dateStart', defaultDateStart);

this.logger.debug('getAlertStatus(): search the event log');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: more context

Suggested change
this.logger.debug('getAlertStatus(): search the event log');
this.logger.debug(`getAlertStatus(): search the event log for alert ${id}`);

});
events = queryResults.data;
} catch (err) {
this.logger.debug(`alertsClient.getAlertStatus(): error searching event log: ${err.message}`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: more context

Suggested change
this.logger.debug(`alertsClient.getAlertStatus(): error searching event log: ${err.message}`);
this.logger.debug(`alertsClient.getAlertStatus(): error searching event log: ${err.message} for alert ${id}`);

dateEnd: dateNow.toISOString(),
});

function parseDate(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: Function can be moved out of getAlertStatus to avoid being recreated on every call.

Comment on lines +292 to +296
await this.authorization.ensureAuthorized(
alert.alertTypeId,
alert.consumer,
ReadOperations.GetAlertStatus
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will defer to @gmmorris to review the access control bits.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied the bits from getAlertState for getAlertStatus, creating a new operation, etc. Hopefully I got everything - took a bit to find all the places this got hit, including something in security for this new operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks right, but that's just from a cursory look at the code.
I'll try running it soon, just trying to get something done locally before I can pull it down.

Comment on lines 358 to 359
const { eventLogger, alertId, namespace } = params;
const { currentAlertInstanceIds, originalAlertInstanceIds } = params;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these two lines can merge

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect I did this on purpose - a split like this seems silly, but ends up resulting in 2 lines of code - the merged version requires the props to be split one-per-line ala prettier, and so turns in to 7 lines. :-)

But if it's odd enough for someone to comment on, worth paying the extra 5 lines for it!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, didn't think of that! :-)

try {
const queryResults = await eventLogClient.findEventsBySavedObject('alert', id, {
page: 1,
per_page: 10000,
Copy link
Contributor

@mikecote mikecote Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should open a tech debt issue to expand this limitation?

I'm sure you've put more thought than I have. I'm wondering how likely this number would be reached on alerts that run frequently or have a massive number of instances? Ex: 100 always firing instances * 100 executions would limit the events returned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a funky API, since you can only provide a start date and not an end date, since the end date is always NOW because the data returned also takes into current data (eg, muted instance ids).

I'm not opposed to creating a tech debt issue, especially since it's possible that we could "miss" some data. I guess the idea would be that we should page through the events till we get to the start date. I just opened issue #74860 for this.

Copy link
Contributor

@jportner jportner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review only -- Security changes in x-pack/plugins/security/server/authorization/privileges/feature_privilege_builder/alerting.ts and x-pack/plugins/security/server/authorization/privileges/feature_privilege_builder/alerting.test.ts LGTM!

Copy link
Contributor

@YulNaumenko YulNaumenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pmuellr pmuellr added the Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) label Aug 12, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

@kbn/optimizer bundle module count

id value diff baseline
alerts 26 +1 25

async chunks size

id value diff baseline
triggers_actions_ui 736.6KB +882.0B 735.7KB

page load bundle size

id value diff baseline
alerts 89.1KB +602.0B 88.5KB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@pmuellr pmuellr merged commit 67e28ac into elastic:master Aug 14, 2020
pmuellr added a commit to pmuellr/kibana that referenced this pull request Aug 14, 2020
…#68437)

resolves elastic#57446

Adds a new API (AlertClient and HTTP endpoint) `getAlertStatus()` which returns
alert data calculated from the event log.
gmmorris added a commit to gmmorris/kibana that referenced this pull request Aug 14, 2020
* master:
  Fix bug on TopN weird behavior with zero values (elastic#74942)
  [Lens] Fix table sorting bug (elastic#74902)
  [SECURITY_SOLUTION] Retry on ingest setup (elastic#75000)
  [file upload] lazy load to reduce page load size (elastic#74967)
  Drilldowns for TSVB / Vega / Timelion (elastic#74848)
  [EventLog] Populate alert instances view with event log data (elastic#68437)
  [UiActions] pass trigger into action execution context (elastic#74363)
mikecote pushed a commit that referenced this pull request Aug 14, 2020
…#75036)

resolves #57446

Adds a new API (AlertClient and HTTP endpoint) `getAlertStatus()` which returns
alert data calculated from the event log.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backported Feature:Alerting release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v7.10.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Alert instances view to display data from event log instead of current instances
7 participants