-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RAC] Add mapping update logic to RuleDataClient #102586
Changes from 4 commits
4b78e40
33461c1
19b9eaf
3a1fe7c
bbd8335
4ccf63a
c9c7a87
cd184c3
a45a70a
be0f777
c8e2642
7a4bc11
efcf928
852a8b0
bae03df
5e99679
0be60e1
a87dd50
88a4d9e
c1aaa5e
d3f099a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,9 +5,9 @@ | |
* 2.0. | ||
*/ | ||
|
||
import { isEmpty } from 'lodash'; | ||
import type { estypes } from '@elastic/elasticsearch'; | ||
import { IndicesRolloverResponse } from '@elastic/elasticsearch/api/types'; | ||
import { ResponseError } from '@elastic/elasticsearch/lib/errors'; | ||
import { get } from 'lodash'; | ||
import { IndexPatternsFetcher } from '../../../../../src/plugins/data/server'; | ||
import { | ||
IRuleDataClient, | ||
|
@@ -70,9 +70,10 @@ export class RuleDataClient implements IRuleDataClient { | |
}; | ||
} | ||
|
||
getWriter(options: { namespace?: string } = {}): RuleDataWriter { | ||
async getWriter(options: { namespace?: string } = {}): Promise<RuleDataWriter> { | ||
const { namespace } = options; | ||
const alias = getNamespacedAlias({ alias: this.options.alias, namespace }); | ||
await this.rolloverAliasIfNeeded({ namespace }); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
const alias = getNamespacedAlias({ alias: this.options.alias, namespace });
await this.rolloverAliasIfNeeded(alias); There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why Can/should we combine the two into a single There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah we can just pass in the alias directly, good point. The call to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
👍 yeah, understood that part, agree As for deferring Both options work for me, I think this is not super important, just thought that maybe combining them into a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Having Let's consider solution-side code that performs many writes during a request or rule execution. A single log write operation could look like that: await ruleDataClientInstance.getWriter({ namespace }).bulk(logsToWrite); And I can imagine a rule that performs tens of writes like that during the execution. From the solution developer's perspective, nothing bad happens; they write logs N times. But under the hood, every write request is accompanied by at least three additional requests (total requests = N*4): // Expected request
await clusterClient.bulk()
// Additional "hidden" requests
await clusterClient.indices.rollover({ dry_run: true })
await clusterClient.indices.get()
await clusterClient.indices.simulateIndexTemplate() To avoid making these extra requests on every write, the developer should instantiate the writer and reuse it. But I think it's (a) not always convenient to do, and (b) the developer should know implementation details of We can envisage use cases as above and protect API users from harming performance by having a map of instantiated writers ( What do you think? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That's I'd say how it should be done. If you're in an executor function, you will get a Same if you need to log stuff from route handlers.
Hehe, that's similar to what I had in my draft :) I think this could be handy in general if someone would always use the API like that: I'd say let's consider that in a follow-up PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah this is an area that is a little bit tricky. I had in mind adding some dev docs to explain the expected usage pattern (create a writer inside the alert executor and reuse it for the duration of the executor). Typescript does do a little bit more to guide devs towards that pattern naturally - since const writer = await ruleDataClientInstance.getWriter({ namespace });
await writer.bulk(logsToWrite); which might help clue devs in that writers are meant to be reused. On the flip side, the potential issue with caching the writers and therefore only running the rollover logic once at instantiation time is that if the mappings change while Kibana is running then we won't detect it unless we add a way to manually re-check the mapping versions for an index. Our built-in mappings won't change this way, but we've had multiple cases of advanced users modifying the mappings to add their own custom fields. If we do add the user-defined templates (#101016 (comment)) then those would be changing while Kibana is running - so we'd want to check the template versions frequently to make sure changes to those templates get propagated to the indices. So in my mind the RuleDataWriter is designed to live for a few seconds, e.g. for the duration of a rule execution, after which it should be discarded and a new one instantiated. My hope was that with docs we can provide simple guidance about creating and using a single writer within the rule executor and devs that way won't have to worry too much about the internals - they just know that you make 1 writer in the executor. Maybe we could also add an optional parameter to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah, it should be done that way. But my point is that the current API design doesn't enforce that. And in fact, it is much easier to use the API in the wrong way. Currently, all classes/functions depend on I think a well-designed system should make it easy to do the right things and hard to do the wrong things. That's why I'm highlighting the current API as it potentially could be misused. But, as we have plans to improve RuleDataClient in a follow PR, let's discuss the API design separately. It is not a blocker for current changes to be merged. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will try to address this concern in the next PR within #101016 (will be a parallel PR to this one). The idea is to bind
|
||
return { | ||
bulk: async (request) => { | ||
const clusterClient = await this.getClusterClient(); | ||
|
@@ -89,7 +90,7 @@ export class RuleDataClient implements IRuleDataClient { | |
response.body.items.length > 0 && | ||
response.body.items?.[0]?.index?.error?.type === 'index_not_found_exception' | ||
) { | ||
return this.createOrUpdateWriteTarget({ namespace }).then(() => { | ||
return this.createWriteTargetIfNeeded({ namespace }).then(() => { | ||
return clusterClient.bulk(requestWithDefaultParameters); | ||
}); | ||
} | ||
|
@@ -102,15 +103,78 @@ export class RuleDataClient implements IRuleDataClient { | |
}; | ||
} | ||
|
||
async createOrUpdateWriteTarget({ namespace }: { namespace?: string }) { | ||
async rolloverAliasIfNeeded({ namespace }: { namespace?: string }) { | ||
const clusterClient = await this.getClusterClient(); | ||
const alias = getNamespacedAlias({ alias: this.options.alias, namespace }); | ||
let simulatedRollover: IndicesRolloverResponse; | ||
try { | ||
({ body: simulatedRollover } = await clusterClient.indices.rollover({ | ||
alias, | ||
dry_run: true, | ||
})); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A thought on race conditions between different parts of index bootstrapping in this implementation. If I noticed that correctly, in this implementation there are 3 places where certain parts of index bootstrapping logic are implemented:
We need to guarantee that While I think this seems to be implemented correctly, it took a while for me to trace all those parts of bootstrapping, dependencies between them, how/where different Wondering if we could make it at least slightly more explicit. Off the top of my head:
Ideally would be nice to encapsulate the whole bootstrapping (specific to a log) in a single class. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I definitely agree that there's room for improvement on the handling of ready signals here. To me, the biggest source of confusion/possible future bugs is that the solution plugins must define a ready signal as part of step 2 (Bootstrapping specific to a security solution log) and that ready signal is then passed in to the RuleDataClient. So steps 1 and 3 are handled by the rule registry infrastructure, but solution devs still have to have fairly detailed knowledge of how the bootstrap process works to understand the requirements around "ready signal" they must implement as part of step 2. To solve this, I envision adding a function to the I think with this approach, we'd still need the ready signal internal to the RuleDataPluginService to ensure that calls to Thoughts? Do you think this would sufficiently encapsulate and simplify the bootstrap process? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also I'm hoping we can defer the actual implementation of this encapsulation to a follow up PR There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah, this could be an option (if you really don't like the declarative way of resolving log definitions :P ). Injecting a hook. Btw, One thing I'd keep in mind when implementing this is #102586 (comment) - how do we make sure that inside the hook a developer doesn't have to reference common templates manually.
Yeah, of course, let's merge this PR sooner! 🚢 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was thinking of |
||
} catch (err) { | ||
if (err?.meta?.body?.error?.type !== 'index_not_found_exception') { | ||
throw err; | ||
} | ||
return; | ||
} | ||
|
||
const [writeIndexMapping, simulatedIndexMapping] = await Promise.all([ | ||
clusterClient.indices.get({ | ||
index: simulatedRollover.old_index, | ||
}), | ||
clusterClient.indices.simulateIndexTemplate({ | ||
name: simulatedRollover.new_index, | ||
}), | ||
]); | ||
const currentVersions = get(writeIndexMapping, [ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I didn't see much difference since There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can add (minimal) types for |
||
'body', | ||
simulatedRollover.old_index, | ||
'mappings', | ||
'_meta', | ||
'versions', | ||
]); | ||
const targetVersions = get(simulatedIndexMapping, [ | ||
'body', | ||
'template', | ||
'mappings', | ||
'_meta', | ||
'versions', | ||
]); | ||
const componentTemplateRemoved = | ||
Object.keys(currentVersions).length > Object.keys(targetVersions).length; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This line fails with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Interesting, how's that possible. Both current and simulated indices should have this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe it happened when switching from another branch to this one while using the same ES cluster? I'm extracting this logic into a function and making it a little more robust with guards around the cases where currentVersions or targetVersions are undefined. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In my case, there was no |
||
const componentTemplateUpdated = Object.entries(targetVersions).reduce( | ||
(acc, [templateName, targetTemplateVersion]) => { | ||
const currentTemplateVersion = get(currentVersions, templateName); | ||
const templateUpdated = | ||
currentTemplateVersion == null || currentTemplateVersion < Number(targetTemplateVersion); | ||
return acc || templateUpdated; | ||
}, | ||
false | ||
); | ||
marshallmain marked this conversation as resolved.
Show resolved
Hide resolved
|
||
const needRollover = componentTemplateRemoved || componentTemplateUpdated; | ||
if (needRollover) { | ||
try { | ||
await clusterClient.indices.rollover({ | ||
alias, | ||
new_index: simulatedRollover.new_index, | ||
}); | ||
} catch (err) { | ||
if (err?.meta?.body?.error?.type !== 'resource_already_exists_exception') { | ||
throw err; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just curious, is there any doc describing exceptions from different methods of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not aware of any...it would be amazing if we had one though. |
||
} | ||
} | ||
} | ||
|
||
async createWriteTargetIfNeeded({ namespace }: { namespace?: string }) { | ||
const alias = getNamespacedAlias({ alias: this.options.alias, namespace }); | ||
|
||
const clusterClient = await this.getClusterClient(); | ||
|
||
const { body: aliasExists } = await clusterClient.indices.existsAlias({ | ||
name: alias, | ||
}); | ||
|
||
const concreteIndexName = `${alias}-000001`; | ||
|
||
if (!aliasExists) { | ||
|
@@ -132,20 +196,5 @@ export class RuleDataClient implements IRuleDataClient { | |
} | ||
} | ||
} | ||
|
||
const { body: simulateResponse } = await clusterClient.transport.request({ | ||
method: 'POST', | ||
path: `/_index_template/_simulate_index/${concreteIndexName}`, | ||
}); | ||
|
||
const mappings: estypes.MappingTypeMapping = simulateResponse.template.mappings; | ||
|
||
if (isEmpty(mappings)) { | ||
throw new Error( | ||
'No mappings would be generated for this index, possibly due to failed/misconfigured bootstrapping' | ||
); | ||
} | ||
Comment on lines
-155
to
-158
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure if this should be removed. I don't see the same check in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This sort of check seems like it belongs with the template bootstrapping logic IMO. At this point we've already created the index, so simulating it seems redundant. Can we move this check so it happens sometime before the index is created, maybe at index template creation time? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm fine with moving it, but I do would like to keep it, somewhere. But let's figure out first where this all needs to happen. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There should be always some common mappings specified (currently they are specified in templates installed in I'd say this is a matter of having an API that doesn't provide 100% flexibility in index bootstrapping and makes sure that index template is always created and it always references the common templates - doesn't matter how the client of the library uses it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Moved this check to index template creation time in 4ccf63a |
||
|
||
await clusterClient.indices.putMapping({ index: `${alias}*`, body: mappings }); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why we use parameter object + destructuring in private functions when there's only 2 parameters? Wouldn't it be cleaner to
getNamespacedAlias(this.options.alias, namespace)
?I understand when some public API accepts options via an object (e.g. the
async getWriter(options: { namespace?: string } = {}): Promise<RuleDataWriter> {
above). You probably want extensibility etc. But for internal or private functions this is unnecessary if the number of parameters is <= 3 imho.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion either way. Sometimes it's nice to see the parameter names at the function call site as a reminder of what each one is. I just left it the way it is to follow the existing pattern.