Skip to content

Conversation

@dimitris-athanasiou
Copy link
Contributor

…… (#44944)

As data frame rows with missing values for analyzed fields are skipped,
we can be more efficient by including a query that only picks documents
that have values for all analyzed fields. Besides improving the number
of documents we go through, we also provide a more accurate measurement
of how many rows we need which reduces the memory requirements.

This also adds an integration test that runs outlier detection on data
with missing fields.

elastic#44944)

As data frame rows with missing values for analyzed fields are skipped,
we can be more efficient by including a query that only picks documents
that have values for all analyzed fields. Besides improving the number
of documents we go through, we also provide a more accurate measurement
of how many rows we need which reduces the memory requirements.

This also adds an integration test that runs outlier detection on data
with missing fields.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

@dimitris-athanasiou
Copy link
Contributor Author

@elasticmachine update branch

@dimitris-athanasiou dimitris-athanasiou merged commit cb25cf8 into elastic:7.3 Aug 1, 2019
@dimitris-athanasiou dimitris-athanasiou deleted the outlier-detection-should-query-docs-that-have-all-analyzed-fields-7_3 branch August 1, 2019 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants