[Discuss] [Security Solution] [Alerting] HTTP route RFC for unified rule management #95060

dhurley14 · 2021-03-22T14:14:10Z

Introduction

RFC to define the needs of the security solution for a unified rule management API. Not sure if something like this exists elsewhere but figured I'd throw this together.

Security routes to be added to alerting framework

We make use of bulk actions #53144 such as bulk create, bulk update (bulk enable essentially), bulk delete.. so we need some new routes to handle these operations. These routes also require creating additional apis on the alerts client to enable bulk operations on the saved objects which currently (as of 7.12 I believe) have all of these operations available except for bulk delete.

We also need import export ~~but it looks like @ymao1 has been working away at that which is great :) #94151 and #94152~~

Privileges and index bootstrap routes are used for establishing the .siem-signals index which will be the unified alerting index. I think Garrett is handling this though but if anyone has additional comments those would be welcome.

Prepackaged rules route - Not sure how this will look in 7.13 after this lands

Find rules status route is undocumented but maybe should be documented? Also do we want this in the security solution routes or in alerting? My feeling is it belongs in the security solution but I could see utility in having it be available for other solutions too. This could also exist in the get framework health route from alerting.

We also need a route for updating tags but I'm not sure if tags are security-specific so this route could remain in the security solution.

Lastly, new routes for adding actions and enabling / disabling our pre-built rules (tagged with __immutable__) Also add validation to prevent updating immutable rules too.

Route validation

Currently alerting is utilizing kbn config for route validation. If possible, it would be great if the rule management api and the solutions utilized the same validation library. My suggestion would be to introduce io-ts in the newer apis like import / export, bulk etc. and then gradually update the other routes to utilize it as well.

Open questions

We currently merge the rule status saved object with the rule data in our "find_rules" route. Hopefully this status saved object will be removed in 7.13 and be replaced with events in the event log plugin. Regardless, the _find route returns the rule with some current status information merged into the response. Either we copy this functionality over into the _find route of alerting or we tell the UI to query the rule health / rule status api. The easier option is to have the same functionality as currently exists in the security solution route.

Related issues from alerting: #90375, #90381, #90379

The text was updated successfully, but these errors were encountered:

elasticmachine · 2021-03-22T14:14:43Z

Pinging @elastic/security-detections-response (Team:Detections and Resp)

ymao1 · 2021-03-22T14:34:29Z

The initial implementation of import/export for the alerting framework will be done via the saved object management UI, so I'm not sure it will exactly meet your needs in this iteration. We are addressing the need for importing/exporting rules and connectors in order to support the deprecation of legacy multi-tenancy, so the expected workflow will be that a user will use the saved object management UI or the saved object import/export routes/APIs. In the next step, we would expect to provide routes/APIs in the alerting framework for import/export that provide an RBAC layer on top of the saved objects API but there is no issue open for that right now.

elasticmachine · 2021-03-22T17:03:38Z

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

banderror · 2021-03-22T18:45:34Z

Just for reference, these are our current endpoints and route handlers:

I think one thing to note is that our routes are not simple wrappers around AlertsClient and data specific to only alerting. They contain pieces of code specific to the Detections domain, for example:

ML

kibana/x-pack/plugins/security_solution/server/lib/detection_engine/routes/rules/create_rules_route.ts

Line 72 in 041566d

const mlAuthz = buildMlAuthz({
Checks for existence of indices specified in rule params

kibana/x-pack/plugins/security_solution/server/lib/detection_engine/routes/rules/create_rules_bulk_route.ts

Line 95 in 041566d

const indexExists = await getIndexExists(esClient.asCurrentUser, finalIndex);
Updating our custom saved objects representing "throttled" rule actions

kibana/x-pack/plugins/security_solution/server/lib/detection_engine/routes/rules/create_rules_bulk_route.ts

Line 111 in 041566d

const ruleActions = await updateRulesNotifications({
Creating exceptions list for Elastic Endpoint rule

kibana/x-pack/plugins/security_solution/server/lib/detection_engine/routes/rules/create_rules_route.ts

Line 92 in 041566d

await context.lists?.getExceptionListClient().createEndpointList();
etc etc

So the big question imho is how are we going to separate responsibilities and design the final framework (whatever it is). Is it going to be only Alerting, or Alerting + ETL on top, or Alerting + unified Detection Engine, or... ? That said, if/when this is more clarified, would be definitely good to consolidate HTTP APIs for rules, but at this point it seems like too early to make any decisions on routes.

Maybe it would be worth going through all our routes and figure out all the generic (alerting) and specific (detections) parts.

Another idea is let's start with adding some missing APIs to alerting (methods to AlertsClient etc) that we've been needing for a long time. This is a meta ticket.

I think bulk CRUD methods would be a good thing to add because of 2 reasons:

There's a lot of bulk CRUD code in our API routes, and sometimes it's implemented like await Promise.all() or something like that which doesn't scale with the number of items to process, and we already had issues with that. We would benefit from extracting this mechanics to Alerting and implementing it properly.
We will need bulk CRUD stuff after my efforts with Event Log are done -- in order to optimize our rules/_find and rules/_find_statuses endpoints.

Lastly, I was thinking about leveraging data plugin for creating an RPC infrastructure -- just like they do it in Threat Hunting. They can add a server-side function and call it from the client app -- without introducing new HTTP endpoints. HTTP endpoints as they are now are public (I'd say even "published"), meaning that when we add a new endpoint we agree on maintaining it and avoiding breaking changes untill the next major. This is hard, and would be good to have an option to bypass this in those situations where an app needs a new app-specific "endpoint", but it's not generic/useful enough to publish it for all users of Kibana. This is not super related to this RFC, but I'd still like to mention this idea here.

ymao1 · 2021-03-29T14:08:41Z

Addressing the immediate ask of bulk rule actions:

Bulk create/delete - Currently single create and delete functions will throw an error if any step of the create/delete process fails. Assume the bulk version of these functions should not throw an error if any one operation fails, rather it should return an array of responses, either success or errors with reason?
Bulk get - This was not mentioned in the issue description - is that because it's not needed or because it's a lower priority?
Bulk update - This was only mentioned in the context of bulk enabling/disabling in the issue description. Is this because we are considering rules to be immutable and therefore will never be updating them? Is the ask then that there's a bulkEnable and bulkDisable function on the alerts client? Note that currently, enabling/disabling a rule updates the updatedBy and updatedAt fields for a rule. Does that violate rule immutability?

Agree with @banderror that there is a lot of security specific functionality in the security routes outside of purely calling the alerts client so providing these bulk actions will not immediately remove the need for the security routes since there are additional. Out of the additional actions that are performed before and after calling the alerts client, do we know which ones might be needed by observability as well? That might help guide the alerting framework as to what might belong in the alert client.

ymao1 · 2021-03-29T14:24:35Z

Find rules status route is undocumented but maybe should be documented? Also do we want this in the security solution routes or in alerting? My feeling is it belongs in the security solution but I could see utility in having it be available for other solutions too. This could also exist in the get framework health route from alerting.

Are you suggesting that the rules with error status should be returned as part of the /api/alerting/_health endpoint in addition to whether there are any rules in an error state?

Lastly, new routes for adding actions

Is this route requested because security creates a separate rule for notifications instead of including the actions as part of rule creation? What would be the requirements of this new route?

gmmorris · 2021-03-31T11:43:11Z

Find rules status route is undocumented but maybe should be documented?

I believe this omission is intentional - this is an API we consider internal, open to change and hence not on the public docs.
While we do want other teams in Elastic to use it, as we can work closely with them when a change is needed, we don't want this on external docs.
@mikecote can confirm whether I'm right about that. 😬

The Platform leads are in the process of formalising clearer guidance around the distinction between internal/external apis and what manner of documentation we should provide for each. Hopefully this'll be clearer once we finish that work 😄 . (cc @kobelb )

ymao1 · 2021-03-31T11:50:26Z

Find rules status route is undocumented but maybe should be documented?

I believe this omission is intentional - this is an API we consider internal, open to change and hence not on the public docs.

I thought this was in reference to the find rule status route inside security solutions and whether this should be lifted up into the alerting framework?

gmmorris · 2021-03-31T12:02:34Z

Find rules status route is undocumented but maybe should be documented?

I believe this omission is intentional - this is an API we consider internal, open to change and hence not on the public docs.

I thought this was in reference to the find rule status route inside security solutions and whether this should be lifted up into the alerting framework?

Oh, sorry, probably my bad. 😬
Ignore me 😅

dontcallmesherryli · 2021-04-14T17:39:47Z

@oatkiller assigning this to you to start grooming/planning for this ticket.

dhurley14 added the Team:Detections and Resp Security Detection Response Team label Mar 22, 2021

dhurley14 added the discuss label Mar 22, 2021

pmuellr added the Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) label Mar 22, 2021

pmuellr added the Feature:Alerting label Mar 22, 2021

YulNaumenko assigned YulNaumenko and unassigned YulNaumenko Mar 25, 2021

ymao1 self-assigned this Mar 25, 2021

dhurley14 added the v8.0.0 label Mar 29, 2021

ymao1 removed their assignment Apr 1, 2021

dontcallmesherryli added the dependencies Pull requests that update a dependency file label Apr 1, 2021

dontcallmesherryli added Theme: rac label obsolete Feature:Detection Rules Anything related to Security Solution's Detection Rules labels Apr 14, 2021

dontcallmesherryli assigned oatkiller Apr 14, 2021

peluja1012 added the Team:Detection Rule Management Security Detection Rule Management Team label Sep 15, 2021

peluja1012 added the technical debt Improvement of the software architecture and operational architecture label Oct 26, 2021

kobelb added the needs-team Issues missing a team label label Jan 31, 2022

botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Discuss] [Security Solution] [Alerting] HTTP route RFC for unified rule management #95060

[Discuss] [Security Solution] [Alerting] HTTP route RFC for unified rule management #95060

dhurley14 commented Mar 22, 2021 •

edited

Loading

elasticmachine commented Mar 22, 2021

ymao1 commented Mar 22, 2021

elasticmachine commented Mar 22, 2021

banderror commented Mar 22, 2021

ymao1 commented Mar 29, 2021

ymao1 commented Mar 29, 2021

gmmorris commented Mar 31, 2021

ymao1 commented Mar 31, 2021

gmmorris commented Mar 31, 2021

dontcallmesherryli commented Apr 14, 2021

[Discuss] [Security Solution] [Alerting] HTTP route RFC for unified rule management #95060

[Discuss] [Security Solution] [Alerting] HTTP route RFC for unified rule management #95060

Comments

dhurley14 commented Mar 22, 2021 • edited Loading

Introduction

Security routes to be added to alerting framework

Route validation

Open questions

elasticmachine commented Mar 22, 2021

ymao1 commented Mar 22, 2021

elasticmachine commented Mar 22, 2021

banderror commented Mar 22, 2021

ymao1 commented Mar 29, 2021

ymao1 commented Mar 29, 2021

gmmorris commented Mar 31, 2021

ymao1 commented Mar 31, 2021

gmmorris commented Mar 31, 2021

dontcallmesherryli commented Apr 14, 2021

dhurley14 commented Mar 22, 2021 •

edited

Loading