-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RAC] Rule monitoring: Event Log for Rule Registry #98353
[RAC] Rule monitoring: Event Log for Rule Registry #98353
Conversation
e42dc9d
to
1bd2b91
Compare
f97fcd3
to
227f73f
Compare
Trying to understand a couple of things, including regarding the PR #98935 - RAC -Decouple registry from alerts-as-data client.
|
6269a30
to
093e238
Compare
@pmuellr @tsg Feels like this is worth adding as a discussion item to the next Alerts at Data sync.... I'm afraid of this getting lost in issue comments :) |
@gmmorris thanks for the ping. I added it to the agenda for May 11th. If we think it's blocking work, we could also meet ad-hoc earlier.
I think this is a fair question and I am also concerned about potentially diverging here (when/what should I write to eventlog vs execlog). The argument for the execlog that I heard is that by following the same naming convention as the alert indices, it is easier for the admin to give access to them to the users of the system, so they can monitor the state of their Rules directly. @spong @dgieselaar @pmuellr Other thoughts on pros/cons here? |
@tsg:
I agree here. I don't really have a strong opinion here, but my preference would be to not have three things (rule registry, execlog, event log). I'd prefer two, max. Not all of these things diverge enough in requirements to warrant maintaining three different implementations, IMHO. |
This is a curious one for me @tsg . This doesn't feel to me like an index that users need to access directly... do we see much value in them building visualisation over this? 🤔 |
YES - customers should be able to do all kinds of slicing/dicing of the event log data in viz's and discover. So we need to make that happen, and there are a number of technical issues with the event log shape we need to change to improve this (hopefully for 7.14). Longer-term, I'm hoping the "saved objects as data" effort will eventually just turn into "allow me to access an arbitrary index with Kibana RBAC if I set things up just right, probably as a KIP". Which we could then use as a way to provide viz's / discover support for the event log directly, for all customers, with RBAC support. May not end up providing direct ES query support with RBAC, but that's fine because I think viz's and discover get us most of the use cases. In the meantime, we could look at providing a new built-in role that would allow read-only access to the event log, for admin-type folks, to work around having to make the customer do that themselves. And some ease-of-use to add a KIP for it, etc. I'm just not sure, if the main difference between the execlog and event log is that the execlog is per-space and event log is global, that we need to deal with that. We still need to handle the case, presumably, of spaces with multiple solutions that you want to have silo'd from users, which puts us more in a execlog per namespace/consumer vs just namespace, and I'm not happy about the potential large number of all these indices ... |
Pinging @elastic/apm-ui (Team:apm) |
+100. I really think we need a single source of truth, that is a living document, for the architecture that is being proposed and how these Elasticsearch indices will be used. |
e5b55fe
to
f9ee1cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting this error when starting Kibana in dev:
server log [12:34:33.909] [error][IndexManagementGateway][eventLog][plugins][ruleRegistry] ResponseError: security_exception
at onBody (/Users/smith/Code/kibana-review/node_modules/@elastic/elasticsearch/lib/Transport.js:337:23)
at IncomingMessage.onEnd (/Users/smith/Code/kibana-review/node_modules/@elastic/elasticsearch/lib/Transport.js:264:11)
at IncomingMessage.emit (events.js:388:22)
at endReadableNT (internal/streams/readable.js:1336:12)
at processTicksAndRejections (internal/process/task_queues.js:82:21) {
meta: {
body: { error: [Object], status: 403 },
statusCode: 403,
headers: {
'content-length': '577',
'content-type': 'application/json;charset=utf-8',
'x-cloud-request-id': 'QuceUv6KRh6-zCcLjQn7ng',
'x-found-handling-cluster': 'b5caf8c576704714a9bb2559bddab987',
'x-found-handling-instance': 'instance-0000000028',
date: 'Tue, 25 May 2021 17:34:33 GMT'
},
meta: {
context: null,
request: [Object],
name: 'elasticsearch-js',
connection: [Object],
attempts: 0,
aborted: false
}
}
}
server log [12:34:33.914] [error][IndexManagementGateway][eventLog][plugins][ruleRegistry] {
"body": {
"error": {
"root_cause": [
{
"type": "security_exception",
"reason": "action [indices:admin/create] is unauthorized for user [kibana_system_user] with roles [kibana_system] on indices [.alerts-security.alerts-default-000001], this action is granted by the index privileges [create_index,manage,all]"
}
],
"type": "security_exception",
"reason": "action [indices:admin/create] is unauthorized for user [kibana_system_user] with roles [kibana_system] on indices [.alerts-security.alerts-default-000001], this action is granted by the index privileges [create_index,manage,all]"
},
"status": 403
},
"statusCode": 403,
"headers": {
"content-length": "577",
"content-type": "application/json;charset=utf-8",
"x-cloud-request-id": "QuceUv6KRh6-zCcLjQn7ng",
"x-found-handling-cluster": "b5caf8c576704714a9bb2559bddab987",
"x-found-handling-instance": "instance-0000000028",
"date": "Tue, 25 May 2021 17:34:33 GMT"
},
"meta": {
"context": null,
"request": {
"params": {
"method": "PUT",
"path": "/.alerts-security.alerts-default-000001",
"body": "{\"aliases\":{\".alerts-security.alerts-default\":{\"is_write_index\":true}}}",
"querystring": "",
"headers": {
"user-agent": "elasticsearch-js/8.0.0-canary.4 (darwin 20.3.0-x64; Node.js v14.17.0)",
"x-elastic-product-origin": "kibana",
"x-elastic-client-meta": "es=8.0.0p,js=14.17.0,t=8.0.0p,hc=14.17.0",
"content-type": "application/json",
"content-length": "71"
},
"timeout": 30000
},
"options": {},
"id": 66
},
"name": "elasticsearch-js",
"connection": {
"url": "https://b5caf8c576704714a9bb2559bddab987.us-east-1.aws.staging.foundit.no:9243/",
"id": "https://b5caf8c576704714a9bb2559bddab987.us-east-1.aws.staging.foundit.no:9243/",
"headers": {},
"deadCount": 0,
"resurrectTimeout": 0,
"_openRequests": 0,
"status": "alive",
"roles": {
"master": true,
"data": true,
"ingest": true,
"ml": false
}
},
"attempts": 0,
"aborted": false
}
}
server log [12:34:33.914] [error][IndexBootstrapper][eventLog][plugins][ruleRegistry] error bootstrapping elasticsearch resources: Error creating initial index: security_exception
server log [12:34:33.914] [error][plugins][ruleRegistry] Error: Event log bootstrapping failed, logName="security.alerts"
at EventLogProvider.getLog (/Users/smith/Code/kibana-review/x-pack/plugins/rule_registry/server/event_log/log/event_log_provider.ts:68:15)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (internal/process/task_queues.js:95:5)
at testImplementation (/Users/smith/Code/kibana-review/x-pack/plugins/rule_registry/server/event_log/test_implementation.ts:72:3)
at /Users/smith/Code/kibana-review/x-pack/plugins/rule_registry/server/event_log/test_implementation.ts:42:7
I'm connected to the edge observability cluster so it might have something to do with pre-existing configuration there.
Setting xpack.ruleRegistry.index
in kibana.dev.yml works, but I'm not sure what's going on here and if this is something that needs to be fixed in this PR so let me know.
this.pagination = { page: 1, perPage: 20 }; | ||
} | ||
|
||
public filterByLogger(loggerName: string): IEventQueryBuilder<TEvent> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I enjoy using these kind of builder pattern things, but it seems implementing it as part of this PR in this particular plugin adds a lot more than is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just decoupled this builder from .search()
, so now, if you don't need it or would prefer not using it, you can call .search()
directly without messing up with the builder:
export interface IEventLog<TEvent> extends IEventLoggerTemplate<TEvent> {
getNames(): IndexNames;
getQueryBuilder(): IEventQueryBuilder<TEvent>;
search<TDocument = TEvent>(
request: estypes.SearchRequest
): Promise<estypes.SearchResponse<TDocument>>;
}
}); | ||
|
||
// TODO: remove before merge | ||
testEventLogImplementation(this.eventLogService, this.logger); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what the intent of this is. Could it be written instead as an integration test that we can keep around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent is to remove this code before merging the PR, just like the comment says :)
We will definitely need to cover this plugin with tests but this is out of scope of this PR. I think we could do it after converging the 2 implementations.
f9ee1cf
to
d574302
Compare
It seems you've caught an error during execution of
This pre-built elasticsearch:
username: 'kibana_system'
password: 'changeme'
hosts: 'http://localhost:9200' Otherwise you'd need to give your user in the remote cluster the required privileges manually: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thank you for the changes!
I left a super minor comment regarding a leftover module, but it is certainly not a blocker. Giving my approval.
signal(value: T): void; | ||
} | ||
|
||
export function createReadySignal<T>(): ReadySignal<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is not needed anymore after the latest changes?
💚 Build Succeeded
Metrics [docs]Public APIs missing comments
Public APIs missing exports
Unknown metric groupsAPI count
References to deprecated APIs
History
To update your PR or re-run it, just comment with: cc @banderror |
**Needed for:** rule execution log for Security elastic#94143 **Related to:** - alerts-as-data: elastic#93728, elastic#93729, elastic#93730 - RFC for index naming elastic#98912 ## Summary This PR adds a mechanism for writing to / reading from / bootstrapping indices for RAC project into the `rule_registry` plugin. Particularly, indices for alerts-as-data and rule execution events. This implementation is similar to existing implementations like `event_log` plugin (see elastic#98353 (comment) for historical perspective), but we're going to converge all of them into 1 or 2 implementations. At least we should have a single one in `rule_registry` itself. In this PR I tried to incorporate most of the feedback received in the RFC (elastic#98912), but if you notice I missed/forgot something, please let me know in the comments. Done in this PR: - [x] Schema-agnostic APIs for working with Elasticsearch. - [x] Schema-aware log definition and bootstrapping API (creating hierarchical logs). - [x] Schema-aware write API (logging events). - [x] Schema-aware read API (searching logs, filtering, sorting, pagination, aggregation). - [x] Support for Kibana spaces, space-aware index bootstrapping (either at rule creation or rule execution time). As for reviewing this PR, perhaps it might be easier to start with: - checking description of elastic#98912 - checking usage examples https://github.com/elastic/kibana/pull/98353/files#diff-c049ff2198cc69bd50a69e92d29e88da7e10b9a152bdaceaf3d41826e712c12b - checking public api https://github.com/elastic/kibana/pull/98353/files#diff-8e9ef0dbcbc60b1861d492a03865b2ae76a56ec38ada61898c991d3a74bd6268 ## Next steps Next steps towards rule execution log in Security (elastic#94143): - define actual schema for rule execution events - inject instance of rule execution log into Security rule executors and route handlers - implement actual execution logging in rule executors - update route handlers to start fetching execution events and metrics from the log instead of custom saved objects Next steps in the context of RAC and unified implementation: - converge this implementation with `RuleDataService` implementation - implement robust index bootstrapping - reconsider using FieldMap as a generic type parameter - implement validation for documents being indexed - cover the final implementation with tests - write comprehensive docs: update plugin README, add JSDoc comments to all public interfaces
**Needed for:** rule execution log for Security #94143 **Related to:** - alerts-as-data: #93728, #93729, #93730 - RFC for index naming #98912 ## Summary This PR adds a mechanism for writing to / reading from / bootstrapping indices for RAC project into the `rule_registry` plugin. Particularly, indices for alerts-as-data and rule execution events. This implementation is similar to existing implementations like `event_log` plugin (see #98353 (comment) for historical perspective), but we're going to converge all of them into 1 or 2 implementations. At least we should have a single one in `rule_registry` itself. In this PR I tried to incorporate most of the feedback received in the RFC (#98912), but if you notice I missed/forgot something, please let me know in the comments. Done in this PR: - [x] Schema-agnostic APIs for working with Elasticsearch. - [x] Schema-aware log definition and bootstrapping API (creating hierarchical logs). - [x] Schema-aware write API (logging events). - [x] Schema-aware read API (searching logs, filtering, sorting, pagination, aggregation). - [x] Support for Kibana spaces, space-aware index bootstrapping (either at rule creation or rule execution time). As for reviewing this PR, perhaps it might be easier to start with: - checking description of #98912 - checking usage examples https://github.com/elastic/kibana/pull/98353/files#diff-c049ff2198cc69bd50a69e92d29e88da7e10b9a152bdaceaf3d41826e712c12b - checking public api https://github.com/elastic/kibana/pull/98353/files#diff-8e9ef0dbcbc60b1861d492a03865b2ae76a56ec38ada61898c991d3a74bd6268 ## Next steps Next steps towards rule execution log in Security (#94143): - define actual schema for rule execution events - inject instance of rule execution log into Security rule executors and route handlers - implement actual execution logging in rule executors - update route handlers to start fetching execution events and metrics from the log instead of custom saved objects Next steps in the context of RAC and unified implementation: - converge this implementation with `RuleDataService` implementation - implement robust index bootstrapping - reconsider using FieldMap as a generic type parameter - implement validation for documents being indexed - cover the final implementation with tests - write comprehensive docs: update plugin README, add JSDoc comments to all public interfaces
Needed for: rule execution log for Security #94143
Related to:
Summary
This PR adds a mechanism for writing to / reading from / bootstrapping indices for RAC project into the
rule_registry
plugin. Particularly, indices for alerts-as-data and rule execution events. This implementation is similar to existing implementations likeevent_log
plugin (see #98353 (comment) for historical perspective), but we're going to converge all of them into 1 or 2 implementations. At least we should have a single one inrule_registry
itself.In this PR I tried to incorporate most of the feedback received in the RFC (#98912), but if you notice I missed/forgot something, please let me know in the comments.
Done in this PR:
As for reviewing this PR, perhaps it might be easier to start with:
Next steps
Next steps towards rule execution log in Security (#94143):
Next steps in the context of RAC and unified implementation:
RuleDataService
implementationChecklist
Delete any items that are not applicable to this PR.