-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Refactor ML mappings and templates into JSON resources #51765
[ML] Refactor ML mappings and templates into JSON resources #51765
Conversation
Pinging @elastic/ml-core (:ml) |
try { | ||
BytesReference source = load(resource); | ||
source = replaceVariables(source, version, versionProperty, variables); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gwbrown Here are the changes I made to allow variable replacement on template loading. Could you take a look please when you have some time?
@@ -887,101 +881,6 @@ protected Clock getClock() { | |||
public UnaryOperator<Map<String, IndexTemplateMetaData>> getIndexTemplateMetaDataUpgrader() { | |||
return templates -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not replaced this with returning an operator that does nothing yet because it is unclear whether IndexTemplateRegistry
fully addresses upgrading issues. I'm waiting to hear on this, until then the PR is WIP.
private static final String CLUSTER_NAME = "myCluster"; | ||
|
||
@SuppressWarnings("unchecked") | ||
public void testCreateJobResultsIndex() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests are unfortunately the result of extreme mocking, something we tried in the past but doesn't seem to bring benefits. I have removed them and replaced them with integration tests in JobResultsProviderIT
.
@@ -1031,6 +855,31 @@ public void testDatafeedTimingStats_NotFound() throws IOException { | |||
verifyNoMoreInteractions(client); | |||
} | |||
|
|||
@SuppressWarnings("unchecked") | |||
// public void testTermFieldMapping() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed this one, I forgot to address it. I'll take care of it before removing the WIP label.
@@ -77,14 +56,11 @@ | |||
private ClusterAdminClient clusterAdminClient; | |||
private IndicesAdminClient indicesAdminClient; | |||
|
|||
private IndicesAliasesRequestBuilder aliasesRequestBuilder; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In hindsight I'm not sure this mock client class was useful. I remove all unused methods.
|
||
private static final IndexTemplateConfig CONFIG_TEMPLATE = configTemplate(); | ||
|
||
private static final IndexTemplateConfig INFERENCE_TEMPLATE = new IndexTemplateConfig(InferenceIndexConstants.LATEST_INDEX_NAME, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This template's name is formed by the inference prefix plus the latest version. This would mean that once we change version we'll create a new template. I think that's probably not a good idea. I left it as is for now but we can address in this PR. In fact, our template naming is a bit inconsistent but I'm not sure we can iron it easily. I'll make sure we discuss this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple comments about escaping in the changes to TemplateUtils
/IndexTemplateConfig
, but I think once those are addressed the additional variable support will be valuable.
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/template/IndexTemplateConfig.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/template/TemplateUtils.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw a few nits plus there's a question of what we should do when the plugin is disabled.
Let me know if the test failures are showing up any unforeseen edge cases that need talking through.
createIndexRequest.settings(Settings.builder() | ||
.put(IndexMetaData.SETTING_AUTO_EXPAND_REPLICAS, "0-1") | ||
.put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, "1") | ||
.put(UnassignedInfo.INDEX_DELAYED_NODE_LEFT_TIMEOUT_SETTING.getKey(), delayedNodeTimeOutSetting)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the templates we decided to leave this out in the future. Can it be omitted here too now we're explicitly setting it in the affected tests? (Plus the code above that looks for whether we're using the black hole autodetect process.)
@@ -0,0 +1,47 @@ | |||
{ | |||
"order" : 0, | |||
"version" : 8000099, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a variable?
|
||
private static final IndexTemplateConfig ANOMALY_DETECTION_RESULTS_TEMPLATE = anomalyDetectionResultsTemplate(); | ||
|
||
private static final IndexTemplateConfig ANOMALY_DETECTION_STATE_TEMPLATE = new IndexTemplateConfig( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using the same state index for data frame analytics state too, so maybe drop the ANOMALY_DETECTION_
bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to move all state-related logic in AnomalyDetectorsIndex
into a MlStateIndex
class. But I'd rather do that in a follow-up PR.
@Override | ||
protected List<IndexTemplateConfig> getTemplateConfigs() { | ||
if (mlEnabled == false) { | ||
return Collections.emptyList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a difference compared to what we did with the old getIndexTemplateMetaDataUpgrader()
functionality. In that we used to return our templates even when ML was disabled on the node. Returning nothing when ML is disabled isn't necessarily wrong, but it's a difference that needs thinking about.
There may be some subtle effect in Cloud for example. I think at one time there was a proposal that when a Cloud cluster gets switched to different hardware and a full cluster snapshot from the old hardware is restored into brand new empty cluster running on different hardware that all the X-Pack plugins are disabled during that restore. I am not sure if this is still being proposed or not. But it's a scenario where the ML plugin would be disabled in a cluster that contained ML indices.
Let's consult more widely on this offline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was wrong to return an empty list. A number of tests was failing because of that. As long as the plugin is loaded, the templates should be there as some nodes may become ML nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, added 2 comments.
validate(source); | ||
|
||
return filter(source, version, versionProperty); | ||
return source.utf8ToString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a lot of conversions and deep copies: utf8->utf16->utf8->utf16. It does not look to hard to implement replaceVariables
and validate
with strings instead of utf8 byte buffers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/template/TemplateUtils.java
Show resolved
Hide resolved
The changes in b765370 LGTM. Thank you! |
I believe the failing tests in |
ML mappings and index templates have so far been created programmatically. While this had its merits due to static typing, there is consensus it would be clear to maintain those in json files. In addition, we are going to adding ILM policies to these indices and the component for a plugin to register ILM policies is `IndexTemplateRegistry`. It expects the templates to be in resource json files. For the above reasons this commit refactors ML mappings and index templates into json resource files that are registered via `MlIndexTemplateRegistry`.
3b553d8
to
7ace93c
Compare
This has now been rebased to pick #51456. Let's see if it passes all the tests. |
Changes in 0aba76e also LGTM. Thanks again! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if (source == null) { | ||
throw new ElasticsearchParseException("Template must not be null"); | ||
} | ||
if (Strings.isEmpty(source)) { | ||
throw new ElasticsearchParseException("Template must not be empty"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I wonder about that: this exception type is used if a REST request is malformed. In this context this makes no sense and we consider the input immutable, right? IMHO this is more of a RTE kind of thing.
Anyway, this is out of scope for this PR, you just add another exception and follow the existing style.
@@ -0,0 +1,36 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we put comments in such files? I think it would be good to leave some explanation to the reader of this. I would also add a placeholder for a changelog, in case we change the schema.
^ applies to all json files introduced
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no support for comments in JSON files as it stands. We could in theory support them by removing them comment lines when such files are loaded. As for a changelog, I think one of the benefits for those being files, is that we treat them similarly to code files. Thus, git serves as a change log. We can git-blame them and see the commit where each line was introduced/modified. Not sure we need more. But let's see how it goes and we add them if it appears to be needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Since the CI is green and this change is blocking two other pieces of work from starting I would be happy to merge this as-is and do any additional minor tweaks in followup PRs.
…stic#51765) ML mappings and index templates have so far been created programmatically. While this had its merits due to static typing, there is consensus it would be clear to maintain those in json files. In addition, we are going to adding ILM policies to these indices and the component for a plugin to register ILM policies is `IndexTemplateRegistry`. It expects the templates to be in resource json files. For the above reasons this commit refactors ML mappings and index templates into json resource files that are registered via `MlIndexTemplateRegistry`. Backport of elastic#51765
#52353) ML mappings and index templates have so far been created programmatically. While this had its merits due to static typing, there is consensus it would be clear to maintain those in json files. In addition, we are going to adding ILM policies to these indices and the component for a plugin to register ILM policies is `IndexTemplateRegistry`. It expects the templates to be in resource json files. For the above reasons this commit refactors ML mappings and index templates into json resource files that are registered via `MlIndexTemplateRegistry`. Backport of #51765
ML mappings and index templates have so far been created
programmatically. While this had its merits due to static typing,
there is consensus it would be clear to maintain those in json files.
In addition, we are going to adding ILM policies to these indices
and the component for a plugin to register ILM policies is
IndexTemplateRegistry
. It expects the templates to be in resourcejson files.
For the above reasons this commit refactors ML mappings and index
templates into json resource files that are registered via
MlIndexTemplateRegistry
.