-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Update .ml-config mappings before indexing job, datafeed or df analytics config #44216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update .ml-config mappings before indexing job, datafeed or df analytics config #44216
Conversation
|
Pinging @elastic/ml-core |
droberts195
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this so quickly.
My only code comments are minor (and 3 are the same thing).
I think the really important thing is to validate this using an index created in a 6.6 cluster. You won't be able to do this on the master branch.
One way to do it would be:
- Create the 7.3 backport branch now and cherry-pick the commits from your master PR branch to that branch hanging off of 7.3.
- Download a 6.6 snapshot build, for example from https://prelert-artifacts.s3-eu-west-1.amazonaws.com/maven/org/elasticsearch/ml/ml/6.6.3-SNAPSHOT/elasticsearch-6.6.3-SNAPSHOT.zip
- Extract and run the 6.6 snapshot, and create an ML job (anything will do)
- Shut down the ES 6.6 process
- With the branch from step 1 checked out, run
./gradlew :distribution:archives:darwin-tar:buildto build a snapshot distribution of that branch - Extract the 7.3 snapshot you just built and move the
datadirectory from the 6.6 snapshot install into the 7.3 install - Start up the 7.3 install and create an ML data frame analytics job (anything will do)
- Check the mappings on the
.ml-configindex - have they been updated with the correct mappings for data frame analytics jobs?
In particular it's critical that n_neighbors is an integer and feature_influence_threshold is a double. If the fix didn't work as expected they'll be keyword.
Once you've seen manually that the fix works it should be possible to automate it by adding a new test in x-pack/qa/full-cluster-restart/src/test/java/org/elasticsearch/xpack/restart. We already have MlMigrationFullClusterRestartIT but you could add a new one that:
- Creates an anomaly detector job in the old cluster
- Creates a data frame analytics job in the new cluster
- Asserts on the new cluster mappings after the data frame analytics job is successfully created
However, this test will only validate the 6.6 to 7.3 case when run on the 7.3 branch. When run on the master branch it will validate a 7.x to 8.0 upgrade, which will not currently prove that the mappings updater code is working. This is why I think it's so crucial to do a manual test with 6.6 and 7.3 before merging this PR.
...ck/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportPutDatafeedAction.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/job/JobManager.java
Outdated
Show resolved
Hide resolved
...ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportPutDataFrameAnalyticsAction.java
Outdated
Show resolved
Hide resolved
...ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportPutDataFrameAnalyticsAction.java
Outdated
Show resolved
Hide resolved
przemekwitek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I applied the suggested fixes and started manual testing process.
...ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportPutDataFrameAnalyticsAction.java
Outdated
Show resolved
Hide resolved
...ck/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportPutDatafeedAction.java
Outdated
Show resolved
Hide resolved
...ml/src/main/java/org/elasticsearch/xpack/ml/action/TransportPutDataFrameAnalyticsAction.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/job/JobManager.java
Outdated
Show resolved
Hide resolved
droberts195
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM assuming the 6.6->7.3 test works
|
run elasticsearch-ci/default-distro |
|
run elasticsearch-ci/1 |
83538c3 to
fcf4e1a
Compare
fcf4e1a to
f2411ec
Compare
Log warning if it was impossible to obtain cluster state before updating mappings
…MappingsListener This fixes testCreateJob_WithClashingFieldMappingsFail tests
davidkyle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I've verified manually that the .ml-config has correct mappings after the upgrade. |
This PR updates .ml-config index mappings before indexing job, datafeed or df analytics config.
Without it, if the .ml-config index was created in the old version, and the first config with a new field is indexed, the default mapping (i.e. type "keyword") is applied for this new field as implemented in ElasticsearchMappings::addDefaultMapping. This default behavior is usually not desirable.
Related to #44263