[ML] Validate classification dependent_variable cardinality is at lea… #51232
Pinging @elastic/ml-core (:ml)
LGTM
Just a few minor comments
@@ -37,9 +37,9 @@
     List<RequiredField> getRequiredFields();

     /**
-     * @return {@link Map} containing cardinality limits for the selected (analysis-specific) fields
+     * @return {@link List} containing cardinality limits for the selected (analysis-specific) fields
s/limits/constraints?
      */
-    Map<String, Long> getFieldCardinalityLimits();
+    List<FieldCardinalityConstraint> getFieldCardinalityLimits();
s/Limits/Constraints?
@@ -89,7 +89,7 @@ public void testRequiredFieldsIsEmpty() {
     }

     public void testFieldCardinalityLimitsIsEmpty() {
-        assertThat(createTestInstance().getFieldCardinalityLimits(), is(anEmptyMap()));
+        assertThat(createTestInstance().getFieldCardinalityLimits().isEmpty(), is(true));
There is a "Matchers.empty()" matcher that could be used here.
@@ -106,7 +106,7 @@ public void testRequiredFieldsIsNonEmpty() {
     }

     public void testFieldCardinalityLimitsIsEmpty() {
-        assertThat(createTestInstance().getFieldCardinalityLimits(), is(anEmptyMap()));
+        assertThat(createTestInstance().getFieldCardinalityLimits().isEmpty(), is(true));
There is a "Matchers.empty()" matcher that could be used here.
@@ -177,6 +177,7 @@ private void reindexDataframeAndStartAnalysis(DataFrameAnalyticsTask task, DataF
     ActionListener<CreateIndexResponse> copyIndexCreatedListener = ActionListener.wrap(
         createIndexResponse -> {
             ReindexRequest reindexRequest = new ReindexRequest();
+            reindexRequest.setRefresh(true);
Note that this is now necessary because we check the field cardinality before calling startAnalytics, which is what refreshes the dest index.
@przemekwitek I have addressed all your points and also fixed a bug with refreshing the dest index, which the tests caught.
@@ -492,7 +492,7 @@ private void initialize(String jobId) {
     this.jobId = jobId;
     this.sourceIndex = jobId + "_source_index";
     this.destIndex = sourceIndex + "_results";
-    this.analysisUsesExistingDestIndex = randomBoolean();
+    this.analysisUsesExistingDestIndex = false;
Should this be reverted?
Ah, yes of course.
Please see my comment in |
…st two Data frame analytics classification currently only supports 2 classes for the dependent variable. We were checking that the field's cardinality is not higher than 2 but we should also check it is not less than that as otherwise the process fails.
e51393e to eab85d5
…t lea… (elastic#51232) Data frame analytics classification currently only supports 2 classes for the dependent variable. We were checking that the field's cardinality is not higher than 2 but we should also check it is not less than that as otherwise the process fails. Backport of elastic#51232
…st two
Data frame analytics classification currently only supports 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2 but we should also check it is not less than that as otherwise the process
fails.
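
For readers following along, here is a rough, self-contained sketch of the check described above: a constraint with both a lower and an upper bound, so that a dependent variable with fewer than 2 classes is rejected just like one with more than 2. The class body, the `verify` and `verifyAll` names, and the error messages are assumptions made for illustration; only the type name `FieldCardinalityConstraint` comes from the diff, and this is not the actual Elasticsearch implementation.

```java
import java.util.List;
import java.util.Map;

class CardinalityCheckSketch {

    /** Hypothetical stand-in for the FieldCardinalityConstraint type named in the diff. */
    static final class FieldCardinalityConstraint {
        final String field;
        final long lowerBound;
        final long upperBound;

        FieldCardinalityConstraint(String field, long lowerBound, long upperBound) {
            this.field = field;
            this.lowerBound = lowerBound;
            this.upperBound = upperBound;
        }

        /** Throws if the observed cardinality is outside [lowerBound, upperBound]. */
        void verify(long cardinality) {
            if (cardinality < lowerBound) {
                throw new IllegalArgumentException("Field [" + field + "] must have at least ["
                    + lowerBound + "] distinct values but there were [" + cardinality + "]");
            }
            if (cardinality > upperBound) {
                throw new IllegalArgumentException("Field [" + field + "] must have at most ["
                    + upperBound + "] distinct values but there were [" + cardinality + "]");
            }
        }
    }

    /**
     * Checks every constraint against the observed per-field cardinalities.
     * How the cardinalities are obtained (e.g. from an aggregation) is out of scope here.
     */
    static void verifyAll(List<FieldCardinalityConstraint> constraints, Map<String, Long> observed) {
        for (FieldCardinalityConstraint constraint : constraints) {
            Long cardinality = observed.get(constraint.field);
            if (cardinality == null) {
                throw new IllegalArgumentException("No cardinality available for field [" + constraint.field + "]");
            }
            constraint.verify(cardinality);
        }
    }
}
```

With a constraint such as `new FieldCardinalityConstraint("label", 2, 2)` (the field name is made up), an observed cardinality of 1 or 3 fails fast with a descriptive message before the analytics process is started, while 2 passes.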