Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Research approach for pre-defined rule IDs when making rules share-capable #100667

Closed
mikecote opened this issue May 26, 2021 · 17 comments
Closed
Assignees
Labels
estimate:needs-research Estimated as too large and requires research to break down into workable issues Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework Feature:Alerting research Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@mikecote
Copy link
Contributor

Research issue for #100067 regarding pre-defined rule IDs.

In 8.0, we plan to run a migration that removes the namespace from the saved-object serialized ID. To do so, new rule and connector IDs will be generated. With this issue, we should research the impacts of rules that rely on pre-defined IDs, how the migration will work when there are conflicts, and how de-duplication will be handled with pre-packaged rules. If necessary, work with the @elastic/kibana-security team if we need anything further from them to support this.

@mikecote mikecote added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) research labels May 26, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@mikecote
Copy link
Contributor Author

cc @spong

@gmmorris gmmorris added the Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework label Jul 1, 2021
@gmmorris gmmorris added the loe:needs-research This issue requires some research before it can be worked on or estimated label Jul 6, 2021
@mikecote
Copy link
Contributor Author

Marking as blocked until the research in #100067 confirms this is still necessary and with what requirements.

@ymao1
Copy link
Contributor

ymao1 commented Jul 26, 2021

Detection rules assign a params.ruleId value which defaults to the _id value if defined but is auto-generated if _id is not defined.

Example security solution rule SO:
{
	"_index": ".kibana_8.0.0_001",
	"_id": "alert:202af230-e96b-11eb-a858-c5aee1479d55",
	"_score": 0.0,
	"_source": {
		"alert": {
			"name": "test",
			"tags": [
				"__internal_rule_id:44569889-d223-49c3-b33d-2bd0527f5ee6",
				"__internal_immutable:false"
			],
			"alertTypeId": "siem.signals",
			"consumer": "siem",
			"params": {
				"author": [],
				"description": "test",
				"falsePositives": [],
				"from": "now-75s",
				"license": "",
				"outputIndex": ".siem-signals-default",
				"meta": {
					"from": "1m",
					"kibana_siem_app_url": "https://localhost:5601/app/security"
				},
				"maxSignals": 100,
				"riskScore": 21,
				"riskScoreMapping": [],
				"severity": "low",
				"severityMapping": [],
				"threat": [],
				"to": "now",
				"references": [],
				"version": 2,
				"exceptionsList": [{
					"id": "3e0ec560-e96b-11eb-a858-c5aee1479d55",
					"list_id": "4fc01463-730f-4623-9ab4-09b762aa8469",
					"type": "detection",
					"namespace_type": "single"
				}],
				"ruleId": "44569889-d223-49c3-b33d-2bd0527f5ee6",
				"immutable": false,
				"query": "system.cpu.total.norm.pct > 0.2",
				"language": "kuery",
				"filters": [],
				"index": [
					"apm-*-transaction*",
					"traces-apm*",
					"auditbeat-*",
					"endgame-*",
					"filebeat-*",
					"logs-*",
					"packetbeat-*",
					"winlogbeat-*",
					"es-apm*"
				],
				"type": "query"
			},
			"schedule": {
				"interval": "15s"
			},
			"enabled": true,
			"actions": [],
			"throttle": null,
			"notifyWhen": "onActiveAlert",
			"apiKeyOwner": "elastic",
			"apiKey": "hV5CssnHuvmu+BPdK1nd33n68tD1ZJBJy/ipdxLpJxKho9Tmx3Foc8BRquyxd55tjLRdd9hG6b060v86IJGwW17p8yUf9QFnSIQqF4gsIv1IymUjFHc+eP/t5wBho3nJyqSLTQpX2zhiAHp+s/y5rT3wH2EeHmpAdeHHK/+BuaHunptwuisaypSXW6eEie3ww0LNANeoH1Jn/w==",
			"createdBy": "elastic",
			"updatedBy": "elastic",
			"createdAt": "2021-07-20T14:59:48.591Z",
			"updatedAt": "2021-07-20T15:00:39.717Z",
			"muteAll": false,
			"mutedInstanceIds": [],
			"executionStatus": {
				"status": "ok",
				"lastExecutionDate": "2021-07-20T15:41:17.191Z",
				"error": null
			},
			"meta": {
				"versionApiKeyLastmodified": "8.0.0"
			},
			"scheduledTaskId": "2160b310-e96b-11eb-a858-c5aee1479d55"
		},
		"type": "alert",
		"references": [],
		"migrationVersion": {
			"alert": "7.13.0"
		},
		"coreMigrationVersion": "8.0.0",
		"updated_at": "2021-07-20T15:41:18.965Z"
	}
}

This is also stored as a tag with the format __internal_rule_id:${params.ruleId}

Currently, it looks like rule_id is not pointing to any SO objects directly but is used internally in security for de-duplication. However, from @FrankHassanabad: the notion and idea of having a "stable id" that never ever ever changes over time is incredibly important and this id can never be re-autogenerated. It is forever the same UUID until the end of time.

There is an open security issue for investigating updating _id to be the static rule_id. This would allow security rules to be imported/exported from the Saved Objects Management UI instead of having custom export logic.

We need to investigate the implications of making alerting rules share capable with and without #99741. Should we even be doing #99741 if making rules share-capable will re-generate all the ids in 8.0?

@ymao1 ymao1 removed the blocked label Jul 28, 2021
@ymao1 ymao1 self-assigned this Jul 28, 2021
@ymao1
Copy link
Contributor

ymao1 commented Jul 29, 2021

Rules with pre-defined IDs can be created one of two ways:

  1. In a Kibana plugin via direct access to the rules client create function by specifing an id parameter in the options
  2. Using the Alerting API by specifying an id parameter in the request

Reviewing the internal usages of both, there are currently no plugins that are creating rules with a pre-determined rule id. This leave potential users that are using the API to create rules with a pre-determined id. Testing with the POC for making rules share-capable shows that no additional changes will be required for these rules on top of the changes already needed to get alerting and actions working with share-capable SOs

Steps to Test

  1. Ran older version of Kibana and created a rule in a non-default space with a predefined ID using the alerting API (used ID 2e580225-2a58-48ef-938b-572933be06fe
  2. Ran a branch off of master that updated the alert SO to multiple-isolated with no other changes
  3. Ran POC branch that updated the alert SO to multiple-isolated and updated alerting code to use .resolve() function to get saved objects.

Also tested this when you have a rule with predefined id in two different spaces. The rules are each migrated to a new rule with a re-generated ID but using the previously predefined id resolves correctly.

@ymao1
Copy link
Contributor

ymao1 commented Jul 29, 2021

With #99741, security solutions is investigating the possibility of using a migration to use the static rule id as the rule saved object id. Assuming they are successful with this migration, what would be the implications of changing the alert SO to share-capable?

Current, security rules assign a separate rule_id field to each detection rule. This is generated here

export const convertCreateAPIToInternalSchema = (
input: CreateRulesSchema,
siemClient: AppClient
): InternalRuleCreate => {
const typeSpecificParams = typeSpecificSnakeToCamel(input);
const newRuleId = input.rule_id ?? uuid.v4();
when rules are created. Prepackaged detection rules have a pre-populated rule_id field.

Example original pre-packaged security rule as a saved object (relevant fields only)
Note that the auto-generated id does not match the static id

{
    "_id" : "yings-other-space:alert:c0e889b0-f07f-11eb-af8d-c12b8253c148", <-- auto-generated SO ID
    "_source" : {
        "alert": {
            "name" : "Access of Stored Browser Credentials",
            "tags" : [
                "Elastic",
                "Host",
                "macOS",
                "Threat Detection",
                "Credential Access",
                "__internal_rule_id:20457e4f-d1de-4b92-ae69-142e27a4342a", <-- static rule id
                "__internal_immutable:true"
            ],
            "alertTypeId" : "siem.signals",
            "params" : {
                "ruleId" : "20457e4f-d1de-4b92-ae69-142e27a4342a", <-- static rule id
            }
        },
        "type" : "alert",
        "references" : [ ],
        "namespace" : "yings-other-space",
        "migrationVersion" : {
                "alert" : "7.13.0"
        },
        "coreMigrationVersion" : "7.14.0",
        "updated_at" : "2021-07-29T15:15:05.403Z"
}

If #99741 is successfully implemented, the id of the rule will be updated to match the static rule_id

{
    "_id" : "yings-other-space:alert:20457e4f-d1de-4b92-ae69-142e27a4342a", <-- migrated SO ID
    "_source" : {
        "alert": {
            "name" : "Access of Stored Browser Credentials",
            "tags" : [
                "Elastic",
                "Host",
                "macOS",
                "Threat Detection",
                "Credential Access",
                "__internal_rule_id:20457e4f-d1de-4b92-ae69-142e27a4342a", <-- static rule id
                "__internal_immutable:true"
            ],
            "alertTypeId" : "siem.signals",
            "params" : {
                "ruleId" : "20457e4f-d1de-4b92-ae69-142e27a4342a", <-- static rule id
            }
        },
        "type" : "alert",
        "references" : [ ],
        "namespace" : "yings-other-space",
        "migrationVersion" : {
                "alert" : "7.13.0"
        },
        "coreMigrationVersion" : "7.14.0",
        "updated_at" : "2021-07-29T15:15:05.403Z"
}

This would be the point where we could include security rules in SO rule export and security could remove their custom import/export

But then, when alert SOs are made share-capable, the rule id will be regenerated and no longer match the static rule_id, which would undo the conditions required for security rules to be exportable 😱

{
    "_id" : "alert:99eb200e-a545-50c6-9930-c4a67e6bec3a",, <-- regenerated SO ID
    "_source" : {
        "alert": {
            "name" : "Access of Stored Browser Credentials",
            "tags" : [
                "Elastic",
                "Host",
                "macOS",
                "Threat Detection",
                "Credential Access",
                "__internal_rule_id:20457e4f-d1de-4b92-ae69-142e27a4342a", <-- static rule id
                "__internal_immutable:true"
            ],
            "alertTypeId" : "siem.signals",
            "params" : {
                "ruleId" : "20457e4f-d1de-4b92-ae69-142e27a4342a", <-- static rule id
            }
        },
        "type" : "alert",
        "references" : [ ],
        "namespaces" : [
                "yings-other-space"
        ],
        "migrationVersion" : {
                "alert" : "8.0.0"
        },
        "coreMigrationVersion" : "8.0.0",
        "updated_at" : "2021-07-29T15:15:05.403Z"
}

@ymao1
Copy link
Contributor

ymao1 commented Jul 29, 2021

In the previous comment, I assumed that the proposed security solutions migration and the migration to share-capable alert would be done in sequence:
7.15/7.15 - migrate detection rules to use static rule id as SO id
8.0 - migrate all alert SOs to share-capable

I then tried combining the migrations, so that they would be done all as part of the 8.0.0 migration. This means converting the alert SO to share capable and adding this to the alerting migrations:

const migrationPrebuiltSecurityRules = createEsoMigration(
    encryptedSavedObjects,
    (doc): doc is SavedObjectUnsanitizedDoc<RawAlert> => isSecuritySolutionRule(doc),
    pipeMigrations(updateIdForSecurityRules)
);

'8.0.0': executeMigrationWithErrorHandling(migrationPrebuiltSecurityRules, '8.0.0'),
function updateIdForSecurityRules(
  doc: SavedObjectUnsanitizedDoc<RawAlert>
): SavedObjectUnsanitizedDoc<RawAlert> {
  const {
    id,
    attributes: { tags },
  } = doc;

  let newRuleId = id;
  const ruleTags = tags ?? [];
  const ruleStaticId = ruleTags.find((tag: string) => tag.startsWith('__internal_rule_id:'));
  if (ruleStaticId) {
    newRuleId = ruleStaticId.replace('__internal_rule_id:', '');
  }

  return {
    ...doc,
    id: newRuleId,
    attributes: {
      ...doc.attributes,
    },
  };
}

The output of this was that the alert SO was made share-capable AND the id was updated to the static id. In addition, prior to the migration, I had loaded prebuilt rules in 2 different spaces (default and a custom space). This caused duplicate rules to be created for each space. After this combined 8.0 migration, only the rules in the default space remain. From the alerting perspective, this seems like desirable behavior because we're reducing the load on Task Manager by not running duplicate prebuilt rules but I would be interested in knowing if this is desirable from security's perspective.

@ymao1
Copy link
Contributor

ymao1 commented Jul 30, 2021

The output of this was that the alert SO was made share-capable AND the id was updated to the static id. In addition, prior to the migration, I had loaded prebuilt rules in 2 different spaces (default and a custom space). This caused duplicate rules to be created for each space. After this combined 8.0 migration, only the rules in the default space remain. From the alerting perspective, this seems like desirable behavior because we're reducing the load on Task Manager by not running duplicate prebuilt rules but I would be interested in knowing if this is desirable from security's perspective.

I also tried loading pre-built rules into 2 different non-default spaces and migrating as described and post-migration, only one set of rules remained. However, only one of the name spaces was preserved with the rule, meaning where previously I had rule with ids yings-space:alert:9f47c8de-f133-11eb-8d51-57dbc7220f98 and my-space-1:alert:486b74d5-f133-11eb-8d51-57dbc7220f98, both with static_id 9a1a2dae-0b5f-4c3d-8305-a268d404c306, post-migration, I now have rule with id alert:9a1a2dae-0b5f-4c3d-8305-a268d404c306 but with only namespaces: ['my-space-1']

@jportner
Copy link
Contributor

jportner commented Aug 2, 2021

I then tried combining the migrations, so that they would be done all as part of the 8.0.0 migration. This means converting the alert SO to share capable and adding this to the alerting migrations:

@ymao1 the migration in your comment attempts to change the ID of the alert itself. Doing that in consumer-defined migration functions is not supported, and it will break any inbound references and otherwise result in unwanted collisions like the one you described.

Instead have you considered changing the "static rule ID" to match the object ID? E.g.,

function updateIdForSecurityRules(
  doc: SavedObjectUnsanitizedDoc<RawAlert>
): SavedObjectUnsanitizedDoc<RawAlert> {
  const { id, attributes } = doc;
  const { tags = [], params } = attributes;

  const ruleStaticIdIndex = tags.findIndex((tag: string) => tag.startsWith('__internal_rule_id:'));
  const newTags = [...tags];
  const newParams = { ...params };
  if (ruleStaticIdIndex > -1) {
    // In case this object's ID has changed, synchronize the static rule ID fields with the object ID
    newTags[ruleStaticIdIndex] = `__internal_rule_id:${id}`;
    newParams.ruleId = id;
  }

  return {
    ...doc,
    attributes: {
      ...doc.attributes,
      tags: newTags,
      params: newParams,
    },
  };
}

Edit: note that this would mean all objects that existed before the migration will also exist after the migration. So if you have 2 rules before the migration, you will have 2 rules after the migration.

@ymao1
Copy link
Contributor

ymao1 commented Aug 2, 2021

Thanks for weighing in @jportner! Good to know that this would not be officially supported. For security rules, the static id for prebuilt detection rules are the same across all Kibana instances anywhere. Migrating so that the static ID becomes the object ID would break this contract, so I don't think this is an option. It sounds like they should wait until 8.x (after alert SOs become share-capable) before doing any migration for prebuilt detection rules.

@jportner
Copy link
Contributor

jportner commented Aug 2, 2021

Ah, thanks for clarifying. I read the linked issue now. Yes, they also mentioned preventing exports for those rules instead, which may be a better option.

@jportner
Copy link
Contributor

jportner commented Aug 3, 2021

Okay after some offline discussions, this seems to be the situation:

  • The Security Solution needed to be able to create rules with predefined object IDs, but the Alerting API did not support that at the time, so they resorted to these type-level "internal rule IDs" to keep things straight
  • Now the Alerting API supports creating rules with predefined object IDs
  • Now the Security Solution would like to synchronize these "internal rule IDs" with the object IDs (and eventually get rid of the internal rule IDs altogether?)

The only guaranteed way of creating "at most one copy of an object" in a space, both pre- and post-8.0, is to use the import() API. If the createNewCopies option is false (default), it will allow at most one copy of a given object in a given space. If an object has already been created in that space once, importing it a subsequent time will attempt to overwrite the old object.

We also refer to this as "pseudo-copy mode", as you aren't creating a new copy of an object each time you import, you are effectively ensuring that exactly one copy exists in the destination space. Under the hood, this relies on a special originId field for the saved object.

Perhaps instead of relying on SavedObjectsClient.create() the "Create Rule" API can rely on import() with pseudo-copy mode under the hood to support this use case. This would work in 7.16 and 8.0+.

However, this doesn't change the fact that, at the moment, rules' object IDs are completely divorced from their type-level "internal rule IDs".
One potential workaround: you could write a migration in 8.0 to check if the internal rule ID field exists, then overwrite the origin ID field with it. The origin ID is only used for import/copy so this is probably acceptable. It effectively defines where the object originated, so this seems like a valid use case. Something like this:

function updateOriginIdForSecurityRules(
  doc: SavedObjectUnsanitizedDoc<RawAlert>
): SavedObjectUnsanitizedDoc<RawAlert> {
  return {
    ...doc,
    originId: doc.attributes.params.ruleId ?? doc.originId
  };
}

Just to reiterate: the originId is ignored when using SavedObjectsClient.create(). It is only used when using import().

@mikecote
Copy link
Contributor Author

mikecote commented Aug 4, 2021

Okay after some offline discussions, this seems to be the situation:

  • The Security Solution needed to be able to create rules with predefined object IDs, but the Alerting API did not support that at the time, so they resorted to these type-level "internal rule IDs" to keep things straight
  • Now the Alerting API supports creating rules with predefined object IDs
  • Now the Security Solution would like to synchronize these "internal rule IDs" with the object IDs (and eventually get rid of the internal rule IDs altogether?)

That's correct.

For further context; the rule create API allows users to provide a custom ID to support capabilities like singleton rules (per space). An approach like this is preferred over lookup / create for situations we encounter race conditions (ex: two Kibanas creating a rule of ID “1” on startup and only wanting one to be created). This brings a few problems when making rule SOs share-capable:

  • In 8.0, saved objects don’t include the namespace in the ID, there may be interference between spaces when attempting to create a rule with the same ID in each space.
  • This creates a problem after migrating to 8.0 because the create API would not find existing rules with the same ID (because the previous rule IDs got re-generated).

We are leaning on maybe deprecating this capability and removing it altogether in 8.0. The use cases for pre-defined IDs sound more like needing a programmatical way to configure rules (ex: via kibana.yml) and pre-defined IDs doesn't seem to have much adoption as of today.

@ymao1
Copy link
Contributor

ymao1 commented Aug 4, 2021

Thanks @mikecote and @jportner. It looks like we would be able to remove the ability to specify pre-defined rule ID on create while still allowing a path forward for security rules to align with SO import/export using originId.

If security were able to add a migration to set originId to the static rule id for prebuilt detection rules, they could enable import/export and the import fn would detect any conflicts and rules would be overwritten (avoiding duplicates). What we would need to provide would be a way for rules to specify originId on create, which seems like a straightforward change to the rules client.

@jportner I looked at using import and it looks like saved objects import is expecting a read stream to read the imported objects from file? I did not see any client functions for import that looked usable by other plugins. Did I miss something?

@jportner
Copy link
Contributor

jportner commented Aug 5, 2021

@jportner I looked at using import and it looks like saved objects import is expecting a read stream to read the imported objects from file? I did not see any client functions for import that looked usable by other plugins. Did I miss something?

That's correct, it's not really intended to be exposed for other consumers, but the Spaces plugin actually uses it under the hood for copy-to-space functionality. The Spaces plugin simply creates a readable stream from the object array, then passes the stream to import(). Check it out for an example:

const copySavedObjectsToSpaces = async (
sourceSpaceId: string,
destinationSpaceIds: string[],
options: CopyOptions
): Promise<CopyResponse> => {
const response: CopyResponse = {};
const exportedSavedObjects = await exportRequestedObjects(sourceSpaceId, options);
const ineligibleTypes = getIneligibleTypes(getTypeRegistry());
const filteredObjects = exportedSavedObjects.filter(
({ type, namespaces }) =>
// Don't attempt to copy ineligible types or objects that already exist in all spaces
!ineligibleTypes.includes(type) && !namespaces?.includes(ALL_SPACES_ID)
);
for (const spaceId of destinationSpaceIds) {
const objectsToImport: SavedObject[] = [];
for (const { namespaces, ...object } of filteredObjects) {
if (!namespaces?.includes(spaceId)) {
// We check to ensure that each object doesn't already exist in the destination. If we don't do this, the consumer will see a
// conflict and have the option to skip or overwrite the object, both of which are effectively a no-op.
objectsToImport.push(object);
}
}
response[spaceId] = await importObjectsToSpace(
spaceId,
createReadableStreamFromArray(objectsToImport),
options
);
}
return response;
};

@ymao1
Copy link
Contributor

ymao1 commented Aug 9, 2021

After some research, we have decided to deprecate support for pre-defined rule ids via the HTTP API in 8.0. Previously, with single namespace rule saved object types, it was possible to create a rule with the same ID in multiple spaces. With the upcoming share-capable rules, it no longer makes sense to allow specification of pre-defined rule IDs via the API, since trying to create a rule in multiple spaces with the same ID will either (1) return an error or (2) create a rule with a different ID, depending on how we choose to handle it, neither of which actually achieves the goal of creating a rule in a space with a specific ID. In addition, since we are currently only planning to deprecate in the HTTP API, we still allow a path forward for internally allowing specification of predefined (or static) rule IDs for alerting plugin consumers like security.

Issues:

Finally, we've created an issue to discuss the best way to continue supporting pre-defined IDs in the rules client:

@ymao1
Copy link
Contributor

ymao1 commented Aug 9, 2021

Closing this research issue in favor of the followup issues linked in the previous comment.

@ymao1 ymao1 closed this as completed Aug 9, 2021
@gmmorris gmmorris added the estimate:needs-research Estimated as too large and requires research to break down into workable issues label Aug 18, 2021
@gmmorris gmmorris removed the loe:needs-research This issue requires some research before it can be worked on or estimated label Sep 2, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
estimate:needs-research Estimated as too large and requires research to break down into workable issues Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework Feature:Alerting research Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)
Projects
None yet
Development

No branches or pull requests

6 participants