Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution] Siem signals -> alerts as data field and index aliases #106049

Merged
merged 39 commits into from
Aug 5, 2021
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
010f093
Add aliases mapping signal fields to alerts as data fields
marshallmain Jul 16, 2021
c4edc7d
Add aliases mapping alerts as data fields to signal fields
marshallmain Jul 16, 2021
f943313
Replace siem signals templates per space and add AAD index aliases to…
marshallmain Jul 17, 2021
5f11d31
Remove first version of new mapping json file
marshallmain Jul 19, 2021
979ad9a
Merge branch 'master' into signal-aad-aliases
marshallmain Jul 19, 2021
3148569
Convert existing legacy siem-signals templates to new ES templates
marshallmain Jul 26, 2021
3e08e73
Catch 404 if siem signals templates were already updated
marshallmain Jul 26, 2021
58f7464
Enhance error message when index exists but is not write index for alias
marshallmain Jul 26, 2021
1788952
Merge branch 'master' into signal-aad-aliases
marshallmain Jul 26, 2021
9377e0c
Check if alias write index exists before creating new write index
marshallmain Jul 26, 2021
df9b010
More robust write target creation logic
marshallmain Jul 26, 2021
bad2321
Add RBAC required fields for AAD to siem signals indices
marshallmain Jul 27, 2021
b837f27
Fix index name in index mapping update
marshallmain Jul 28, 2021
94f9dee
Merge branch 'master' into signal-aad-aliases
marshallmain Jul 28, 2021
28722b1
Throw errors if bulk retry fails or existing indices are not writeable
marshallmain Jul 28, 2021
185f4ba
Add new template to routes even without experimental rule registry fl…
marshallmain Jul 30, 2021
bf89a05
Merge branch 'master' into signal-aad-aliases
marshallmain Jul 30, 2021
30ac118
Check template version before updating template
marshallmain Jul 30, 2021
3c6f7b7
First pass at modifying routes to handle inserting field aliases
marshallmain Jul 30, 2021
3c26be4
Always insert field aliases when create_index_route is called
marshallmain Jul 30, 2021
f1f5ddc
Update snapshot test
marshallmain Jul 30, 2021
58d0e01
Remove template update logic from plugin setup
marshallmain Jul 30, 2021
e8f464c
Use aliases_version field to detect if aliases need update
marshallmain Aug 2, 2021
3054481
Fix bugs
marshallmain Aug 2, 2021
d7bc0da
Merge branch 'master' into signal-aad-aliases
kibanamachine Aug 2, 2021
0910131
oops update snapshot
marshallmain Aug 2, 2021
3827188
Use internal user for PUT alias to fix perms issue
marshallmain Aug 2, 2021
0cd9b83
Update comment
marshallmain Aug 2, 2021
5973dde
Disable new resource creation if ruleRegistryEnabled
marshallmain Aug 3, 2021
ded440e
Only attempt to add aliases if siem-signals index already exists
marshallmain Aug 3, 2021
99d83ee
Merge branch 'master' into signal-aad-aliases
marshallmain Aug 3, 2021
8e7e00e
Fix types, add aliases to aad indices, use package field names
marshallmain Aug 3, 2021
c033517
Undo adding aliases to AAD indices
marshallmain Aug 3, 2021
fc475fd
Remove unused import
marshallmain Aug 3, 2021
5828090
Update test and snapshot oops
marshallmain Aug 3, 2021
4275da7
Filter out kibana.* fields from generated signals
marshallmain Aug 4, 2021
0214b61
Update cypress test to account for new fields in table
marshallmain Aug 4, 2021
498f9c4
Properly handle space ids with dashes in them
marshallmain Aug 4, 2021
6bcb6ae
Merge branch 'master' into signal-aad-aliases
kibanamachine Aug 5, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 15 additions & 21 deletions x-pack/plugins/rule_registry/server/rule_data_client/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,8 @@ export class RuleDataClient implements IRuleDataClient {
if (response.body.errors) {
if (
response.body.items.length > 0 &&
response.body.items?.[0]?.index?.error?.type === 'index_not_found_exception'
(response.body.items?.[0]?.index?.error?.type === 'index_not_found_exception' ||
response.body.items?.[0]?.index?.error?.type === 'illegal_argument_exception')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

illegal_argument_exception is returned if the alias exists but has no write_index.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately illegal_argument_exception can be returned for many reasons other than the lack of a write index. I can imagine two ways to reduce the risk of misinterpretation:

  1. Parse the error message (has other risks).
  2. Query the alias for is_write_alias.

Do you think it's safe enough to just assume the reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, I'll add logic similar to the old "alias exists" check in createWriteTargetIfNeeded but instead checking to ensure the alias has no write target before attempting to create a write target for it. You're right that otherwise we could end up with illegal_argument_exceptions that cause us to try creating a new write target and that would cause problems if the alias already has a write target.

Copy link
Contributor Author

@marshallmain marshallmain Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced the aliasExists check with a more general "index exists" check on the AAD indices. This makes the createWriteTargetIfNeeded function safer to call optimistically when we encounter index_not_found or illegal_argument exceptions while still allowing the security solution to add AAD aliases to our .siem-signals indices. Now, if we do get an illegal_argument_exception for a reason other than "no concrete AAD indices exist", then createWriteTargetIfNeeded will return without making new indices and potentially making things worse.

) {
return this.createWriteTargetIfNeeded({ namespace }).then(() => {
return clusterClient.bulk(requestWithDefaultParameters);
Expand All @@ -116,29 +117,22 @@ export class RuleDataClient implements IRuleDataClient {

const clusterClient = await this.getClusterClient();

const { body: aliasExists } = await clusterClient.indices.existsAlias({
name: alias,
});

const concreteIndexName = `${alias}-000001`;

if (!aliasExists) {
Copy link
Contributor Author

@marshallmain marshallmain Jul 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this check so that we can add the .alerts-security.alerts alias to the .siem-signals indices before the .alerts-security.alerts concrete indices are actually created. Otherwise, we would have to wait until the first new alert was written to .alerts before we could add the alias to .siem-signals. This way we can always treat all existing alerts as though they are in .alerts and don't have to worry about handling old and new alerts separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check has now been replaced with a more specific check - querying to find concrete indices rather than the alias.

try {
await clusterClient.indices.create({
index: concreteIndexName,
body: {
aliases: {
[alias]: {
is_write_index: true,
},
try {
await clusterClient.indices.create({
index: concreteIndexName,
body: {
aliases: {
[alias]: {
is_write_index: true,
},
},
});
} catch (err) {
// something might have created the index already, that sounds OK
if (err?.meta?.body?.error?.type !== 'resource_already_exists_exception') {
throw err;
}
},
});
} catch (err) {
// something might have created the index already, that sounds OK
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the index already exists, but isn't set as the write index for the alias? Wouldn't we have to add the alias here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to get into a scenario where the index exists but the alias has no write target, other than by an admin manually creating the index but creating it incorrectly? My assumption was that this function is the only place where the .alerts index can be created so we can rely on it being correctly associated with the alias as the write index if it exists. I was under the impression that the error suppression here was to avoid throwing an error if multiple rule executors attempt to create the index at the same time.

Perhaps if we get a resource_already_exists_exception we could fetch the index settings to ensure that whatever created the index created it correctly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a way this code would produce inconsistent alias settings, but I've seen many situations in which backup scripts have not correctly restored all aliases. We might not be able to automatically fix all such situations, but we should try to to make the error message as useful as possible IMHO.

What if we split setting the write index for an existing index into a separate function and call it or createWriteTargetIfNeeded depending on the original error message (index_not_found_exception or illegal_argument_exceptions + alias check)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For both index_not_found_exception and illegal_argument_exception we will want to be able to create a new index to be the write target. Can we punt on adding the alias to existing indices for now and just log the errors? For the security solution use case, we need to be able to create a write index when the alias already exists and has no write index, but we don't need to support adding the alias to existing AAD indices.

Going back to the original question

What if the index already exists, but isn't set as the write index for the alias? Wouldn't we have to add the alias here?

I think not - if the index exists but isn't set as the write index, something has gone wrong but I don't think we have enough information to safely attempt to automatically fix it. If this happens due to a scenario like backup scripts not restoring everything correctly, then (with the latest changes I pushed up) the initial bulk index would fail, we'd call createWriteTargetIfNeeded, and it would find the existing AAD indices even if they aren't aliased correctly so it would not modify the alias or make any indices. The writer would then attempt to write again, fail again, and throw the error up to the rule executor to be logged and maybe displayed in the UI. At that point the error should be available to bubble up to users and tell them that their alias has no write index. Does that sound reasonable?

Re: automatically fixing things, I'm imagining a case where a user restores .alerts-000002 and .alerts-000003 from backups but accidentally sets neither as the write index. If we optimistically create .alerts-000001 as the new write index and set it up with ILM, then everything works fine for a while but eventually when ES attempts to rollover to .alerts-000002 the rollover fails and would require manual intervention to get back to normal. To me, it seems worse to have the system seem to work normally for a while and then fail much later since it would be harder to figure out the root cause (it's easier to see the correlation between restoring backups and failing to write alerts if there's not a long time between the restore and when alert writes started failing).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good points 👍 showing a relatively accurate error to the user should be enough for now

if (err?.meta?.body?.error?.type !== 'resource_already_exists_exception') {
throw err;
}
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,8 +26,10 @@ import signalsPolicy from './signals_policy.json';
import { templateNeedsUpdate } from './check_template_version';
import { getIndexVersion } from './get_index_version';
import { isOutdated } from '../../migrations/helpers';
import { parseExperimentalConfigValue } from '../../../../../common/experimental_features';
import { ConfigType } from '../../../../config';

export const createIndexRoute = (router: SecuritySolutionPluginRouter) => {
export const createIndexRoute = (router: SecuritySolutionPluginRouter, config: ConfigType) => {
router.post(
{
path: DETECTION_ENGINE_INDEX_URL,
Expand All @@ -37,6 +39,10 @@ export const createIndexRoute = (router: SecuritySolutionPluginRouter) => {
},
},
async (context, request, response) => {
const { ruleRegistryEnabled } = parseExperimentalConfigValue(config.enableExperimental);
if (ruleRegistryEnabled) {
return response.ok({ body: { acknowledged: true } });
}
const siemResponse = buildSiemResponse(response);

try {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
{
"signal.ancestors.depth": "kibana.alert.ancestors.depth",
"signal.ancestors.id": "kibana.alert.ancestors.id",
"signal.ancestors.index": "kibana.alert.ancestors.index",
"signal.ancestors.type": "kibana.alert.ancestors.type",
"signal.depth": "kibana.alert.depth",
"signal.original_event.action": "kibana.alert.original_event.action",
"signal.original_event.category": "kibana.alert.original_event.category",
"signal.original_event.code": "kibana.alert.original_event.code",
"signal.original_event.created": "kibana.alert.original_event.created",
"signal.original_event.dataset": "kibana.alert.original_event.dataset",
"signal.original_event.duration": "kibana.alert.original_event.duration",
"signal.original_event.end": "kibana.alert.original_event.end",
"signal.original_event.hash": "kibana.alert.original_event.hash",
"signal.original_event.id": "kibana.alert.original_event.id",
"signal.original_event.kind": "kibana.alert.original_event.kind",
"signal.original_event.module": "kibana.alert.original_event.module",
"signal.original_event.outcome": "kibana.alert.original_event.outcome",
"signal.original_event.provider": "kibana.alert.original_event.provider",
"signal.original_event.risk_score": "kibana.alert.original_event.risk_score",
"signal.original_event.risk_score_norm": "kibana.alert.original_event.risk_score_norm",
"signal.original_event.sequence": "kibana.alert.original_event.sequence",
"signal.original_event.severity": "kibana.alert.original_event.severity",
"signal.original_event.start": "kibana.alert.original_event.start",
"signal.original_event.timezone": "kibana.alert.original_event.timezone",
"signal.original_event.type": "kibana.alert.original_event.type",
"signal.original_time": "kibana.alert.original_time",
"signal.rule.author": "kibana.alert.rule.author",
"signal.rule.building_block_type": "kibana.alert.rule.building_block_type",
"signal.rule.created_at": "kibana.alert.rule.created_at",
"signal.rule.created_by": "kibana.alert.rule.created_by",
"signal.rule.description": "kibana.alert.rule.description",
"signal.rule.enabled": "kibana.alert.rule.enabled",
"signal.rule.false_positives": "kibana.alert.rule.false_positives",
"signal.rule.from": "kibana.alert.rule.from",
"signal.rule.id": "kibana.alert.rule.id",
"signal.rule.immutable": "kibana.alert.rule.immutable",
"signal.rule.index": "kibana.alert.rule.index",
"signal.rule.interval": "kibana.alert.rule.interval",
"signal.rule.language": "kibana.alert.rule.language",
"signal.rule.license": "kibana.alert.rule.license",
"signal.rule.max_signals": "kibana.alert.rule.max_signals",
"signal.rule.name": "kibana.alert.rule.name",
"signal.rule.note": "kibana.alert.rule.note",
"signal.rule.query": "kibana.alert.rule.query",
"signal.rule.references": "kibana.alert.rule.references",
"signal.rule.risk_score": "kibana.alert.risk_score",
"signal.rule.risk_score_mapping.field": "kibana.alert.rule.risk_score_mapping.field",
"signal.rule.risk_score_mapping.operator": "kibana.alert.rule.risk_score_mapping.operator",
"signal.rule.risk_score_mapping.value": "kibana.alert.rule.risk_score_mapping.value",
"signal.rule.rule_id": "kibana.alert.rule.rule_id",
"signal.rule.rule_name_override": "kibana.alert.rule.rule_name_override",
"signal.rule.saved_id": "kibana.alert.rule.saved_id",
"signal.rule.severity": "kibana.alert.severity",
"signal.rule.severity_mapping.field": "kibana.alert.rule.severity_mapping.field",
"signal.rule.severity_mapping.operator": "kibana.alert.rule.severity_mapping.operator",
"signal.rule.severity_mapping.value": "kibana.alert.rule.severity_mapping.value",
"signal.rule.severity_mapping.severity": "kibana.alert.rule.severity_mapping.severity",
"signal.rule.tags": "kibana.alert.rule.tags",
"signal.rule.threat.framework": "kibana.alert.rule.threat.framework",
"signal.rule.threat.tactic.id": "kibana.alert.rule.threat.tactic.id",
"signal.rule.threat.tactic.name": "kibana.alert.rule.threat.tactic.name",
"signal.rule.threat.tactic.reference": "kibana.alert.rule.threat.tactic.reference",
"signal.rule.threat.technique.id": "kibana.alert.rule.threat.technique.id",
"signal.rule.threat.technique.name": "kibana.alert.rule.threat.technique.name",
"signal.rule.threat.technique.reference": "kibana.alert.rule.threat.technique.reference",
"signal.rule.threat.technique.subtechnique.id": "kibana.alert.rule.threat.technique.subtechnique.id",
"signal.rule.threat.technique.subtechnique.name": "kibana.alert.rule.threat.technique.subtechnique.name",
"signal.rule.threat.technique.subtechnique.reference": "kibana.alert.rule.threat.technique.subtechnique.reference",
"signal.rule.threat_index": "kibana.alert.rule.threat_index",
"signal.rule.threat_indicator_path": "kibana.alert.rule.threat_indicator_path",
"signal.rule.threat_language": "kibana.alert.rule.threat_language",
"signal.rule.threat_mapping.entries.field": "kibana.alert.rule.threat_mapping.entries.field",
"signal.rule.threat_mapping.entries.value": "kibana.alert.rule.threat_mapping.entries.value",
"signal.rule.threat_mapping.entries.type": "kibana.alert.rule.threat_mapping.entries.type",
"signal.rule.threat_query": "kibana.alert.rule.threat_query",
"signal.rule.threshold.field": "kibana.alert.rule.threshold.field",
"signal.rule.threshold.value": "kibana.alert.rule.threshold.value",
"signal.rule.timeline_id": "kibana.alert.rule.timeline_id",
"signal.rule.timeline_title": "kibana.alert.rule.timeline_title",
"signal.rule.to": "kibana.alert.rule.to",
"signal.rule.type": "kibana.alert.rule.type",
"signal.rule.updated_at": "kibana.alert.rule.updated_at",
"signal.rule.updated_by": "kibana.alert.rule.updated_by",
"signal.rule.version": "kibana.alert.rule.version",
"signal.status": "kibana.alert.workflow_status",
"signal.threshold_result.from": "kibana.alert.threshold_result.from",
"signal.threshold_result.terms.field": "kibana.alert.threshold_result.terms.field",
"signal.threshold_result.terms.value": "kibana.alert.threshold_result.terms.value",
"signal.threshold_result.cardinality.field": "kibana.alert.threshold_result.cardinality.field",
"signal.threshold_result.cardinality.value": "kibana.alert.threshold_result.cardinality.value",
"signal.threshold_result.count": "kibana.alert.threshold_result.count"
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,195 @@
{
"signal": {
"type": "object",
"properties": {
"_meta": {
"type": "object",
"properties": {
"version": {
"type": "long"
}
}
},
"ancestors": {
"properties": {
"rule": {
"type": "keyword"
},
"index": {
"type": "keyword"
},
"id": {
"type": "keyword"
},
"type": {
"type": "keyword"
},
"depth": {
"type": "long"
}
}
},
"depth": {
"type": "integer"
},
"group": {
"type": "object",
"properties": {
"id": {
"type": "keyword"
},
"index": {
"type": "integer"
}
}
},
"rule": {
"type": "object",
"properties": {
"author": {
"type": "keyword"
},
"building_block_type": {
"type": "keyword"
},
"license": {
"type": "keyword"
},
"note": {
"type": "text"
},
"risk_score_mapping": {
"type": "object",
"properties": {
"field": {
"type": "keyword"
},
"operator": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
},
"rule_name_override": {
"type": "keyword"
},
"severity_mapping": {
"type": "object",
"properties": {
"field": {
"type": "keyword"
},
"operator": {
"type": "keyword"
},
"value": {
"type": "keyword"
},
"severity": {
"type": "keyword"
}
}
},
"threat": {
"type": "object",
"properties": {
"technique": {
"type": "object",
"properties": {
"subtechnique": {
"type": "object",
"properties": {
"id": {
"type": "keyword"
},
"name": {
"type": "keyword"
},
"reference": {
"type": "keyword"
}
}
}
}
}
}
},
"threat_index": {
"type": "keyword"
},
"threat_indicator_path": {
"type": "keyword"
},
"threat_language": {
"type": "keyword"
},
"threat_mapping": {
"type": "object",
"properties": {
"entries": {
"type": "object",
"properties": {
"field": {
"type": "keyword"
},
"value": {
"type": "keyword"
},
"type": {
"type": "keyword"
}
}
}
}
},
"threat_query": {
"type": "keyword"
},
"threshold": {
"type": "object",
"properties": {
"field": {
"type": "keyword"
},
"value": {
"type": "float"
}
}
}
}
},
"threshold_result": {
"properties": {
"from": {
"type": "date"
},
"terms": {
"properties": {
"field": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
},
"cardinality": {
"properties": {
"field": {
"type": "keyword"
},
"value": {
"type": "long"
}
}
},
"count": {
"type": "long"
}
}
}
}
}
}
Loading