
[ML][Inference] Fixing pre-processor value handling and size estimate #49270

Conversation

benwtrent (Member)

There is currently a bug in handling numeric field values that are treated as categorical. This is a valid use case: encoding semi-ordinal numbers such as 10, 50, 5000 categorically can improve model performance compared with simply treating them as numerics.
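A minimal sketch of the lookup problem, not the actual Elasticsearch code: `asCategoricalKey` is a hypothetical helper that normalizes a numeric document value to the string form the encoding map is keyed by, so that a numeric 50 still hits the "50" entry.

```java
import java.util.Map;

public class CategoricalLookup {
    // Hypothetical helper: treat any field value (String or numeric) as a
    // categorical key by normalizing it to its string representation.
    static String asCategoricalKey(Object fieldValue) {
        return fieldValue.toString();
    }

    public static void main(String[] args) {
        // Encoding map keyed by the string form of the category.
        Map<String, Integer> encoding = Map.of("10", 0, "50", 1, "5000", 2);
        // A numeric document value of 50 must resolve to the "50" entry.
        System.out.println(encoding.get(asCategoricalKey(50))); // 1
    }
}
```

Without this normalization, a lookup with the raw `Integer` key misses the `String`-keyed entry entirely, which is the class of bug fixed here.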

Additionally, when estimating the size of the pre-processor HashMaps, Double entries were estimated with the default unknown-object size of 256 bytes when a simple shallowSizeOf would do. Passing defSize = 0 to sizeOfMap corrects this.
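A simplified stand-in for the `sizeOfMap(map, defSize)` accounting described above (the byte figures are illustrative, not Lucene's exact numbers): with a non-zero defSize every value is charged the flat default, whereas defSize = 0 lets known leaf types like Double fall through to a shallow estimate.

```java
import java.util.Map;

public class SizeEstimate {
    // Illustrative sizes only; real accounting lives in Lucene's RamUsageEstimator.
    static final long UNKNOWN_DEFAULT = 256; // flat fallback when defSize > 0
    static final long SHALLOW_DOUBLE = 16;   // approximate shallow size of a boxed Double

    // Simplified model of per-value estimation: defSize == 0 means
    // "use the shallow size of known types instead of the 256-byte default".
    static long sizeOfValue(Object value, long defSize) {
        if (defSize > 0) {
            return defSize;
        }
        if (value instanceof Double) {
            return SHALLOW_DOUBLE;
        }
        return 0;
    }

    static long sizeOfMap(Map<?, ?> map, long defSize) {
        long total = 0;
        for (Object v : map.values()) {
            total += sizeOfValue(v, defSize);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> means = Map.of("field_a", 1.5, "field_b", 2.5);
        // defSize = 256 overestimates each Double entry...
        System.out.println(sizeOfMap(means, UNKNOWN_DEFAULT)); // 512
        // ...while defSize = 0 falls back to the shallow size.
        System.out.println(sizeOfMap(means, 0));               // 32
    }
}
```

For a map of a few hundred Double-valued entries, the difference between a 256-byte default and a ~16-byte shallow size is an overestimate of more than an order of magnitude.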

@elasticmachine (Collaborator)

Pinging @elastic/ml-core (:ml)

@przemekwitek (Contributor) left a comment


LGTM

@benwtrent benwtrent merged commit 9360dc9 into elastic:master Nov 22, 2019
@benwtrent benwtrent deleted the feature/ml-inference-pre-processor-bug-fixes branch November 22, 2019 12:31
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Nov 22, 2019
…elastic#49270)

* [ML][Inference] Fixing pre-processor value handling and size estimate

* fixing npe
benwtrent added a commit that referenced this pull request Nov 22, 2019
…#49270) (#49489)

* [ML][Inference] Fixing pre-processor value handling and size estimate

* fixing npe

4 participants