[ML] Categorization wizard #53009

Merged
merged 29 commits into from
Jan 9, 2020

@jgowdyelastic jgowdyelastic commented Dec 13, 2019

Adds a wizard for creating categorization jobs.

The user can select a Count or Rare detector, and the categorization field from a list of text and keyword fields.
When a field is selected, a list of example field values is shown with tokens highlighted based on the default analyzer.
The user can edit the analyzer, which reloads these examples.
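
As a rough illustration of the token highlighting described above, here is a minimal sketch and not the code in this PR. It assumes each token comes with `start_offset` and `end_offset` character positions, as in an Elasticsearch analyze response; `splitByTokens`, `Token` and `Segment` are hypothetical names used only for this example.

```typescript
// Hypothetical shapes for illustration; not taken from the PR.
interface Token {
  start_offset: number;
  end_offset: number;
}

interface Segment {
  text: string;
  isToken: boolean; // true when the segment should be highlighted
}

// Split an example field value into highlighted (token) and plain segments.
function splitByTokens(value: string, tokens: Token[]): Segment[] {
  const segments: Segment[] = [];
  // Work on a sorted copy so the caller's array is left untouched.
  const sorted = tokens.slice().sort((a, b) => a.start_offset - b.start_offset);
  let cursor = 0;

  for (const token of sorted) {
    const start = token.start_offset;
    const end = token.end_offset;
    // Assumption: skip empty, overlapping or out-of-range tokens.
    if (end <= start || start < cursor || end > value.length) {
      continue;
    }
    if (start > cursor) {
      segments.push({ text: value.slice(cursor, start), isToken: false });
    }
    segments.push({ text: value.slice(start, end), isToken: true });
    cursor = end;
  }

  // Any trailing text after the last token is not highlighted.
  if (cursor < value.length) {
    segments.push({ text: value.slice(cursor), isToken: false });
  }
  return segments;
}
```

Each segment could then be rendered as plain or highlighted text, so editing the analyzer only requires re-running the split on the new token list.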

A validation message is also displayed, summarizing the token counts per field value.
100 values are checked, and a percentage is shown based on how many have more than one token.
If this percentage is less than 75%, the user is warned that the field might not be suitable for categorization.

If the percentage is less than 2%, the use of that field is not allowed.
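
The threshold rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in this PR: `validateTokenCounts`, the status names and the constant names are hypothetical, and treating an empty sample as invalid is an assumption.

```typescript
// Hypothetical names for illustration; only the 75% and 2% thresholds come from the description.
type ValidationStatus = 'valid' | 'partially_valid' | 'invalid';

interface ValidationResult {
  percentage: number; // share of sampled values with more than one token
  status: ValidationStatus;
}

const VALID_THRESHOLD = 75; // below this, warn the user
const INVALID_THRESHOLD = 2; // below this, the field cannot be used

// tokenCounts holds the number of tokens found in each sampled field value
// (the description samples 100 values).
function validateTokenCounts(tokenCounts: number[]): ValidationResult {
  if (tokenCounts.length === 0) {
    // Assumption: with no values to check, the field is treated as unusable.
    return { percentage: 0, status: 'invalid' };
  }

  const multiTokenValues = tokenCounts.filter((count) => count > 1).length;
  const percentage = (multiTokenValues / tokenCounts.length) * 100;

  let status: ValidationStatus = 'valid';
  if (percentage < INVALID_THRESHOLD) {
    status = 'invalid';
  } else if (percentage < VALID_THRESHOLD) {
    status = 'partially_valid';
  }
  return { percentage, status };
}
```

For example, a sample where half of the values have more than one token gives 50%, which falls in the warning range rather than blocking the field.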

When running the job, an event rate chart is shown with anomalies.

A list of example categories is shown as well. If model plot is enabled, this list also includes a count per category and is sorted with the most popular category first.

@jgowdyelastic jgowdyelastic force-pushed the categorization-wizard branch 3 times, most recently from 651d7db to 4753401 Compare December 20, 2019 13:23
@elasticmachine
💔 Build Failed

History

  • 💔 Build #16928 failed 29eb049a79a9b9e9a65edf5258b62b797f6c0662
  • 💔 Build #16898 failed 47534015614e34134dcbf1ff7ee695113b394340
  • 💔 Build #16422 failed 2f91dbde24909a9fe11605e231e25585a7df39ea
  • 💔 Build #15680 failed 4f13dedb24008dcbfff6420567003aef692e4ac2
  • 💔 Build #15466 failed 248836e12e810cc7817116f9015bb45da3aaff66

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@jgowdyelastic jgowdyelastic force-pushed the categorization-wizard branch 4 times, most recently from 076162f to f8a975e Compare January 7, 2020 13:57
@jgowdyelastic jgowdyelastic marked this pull request as ready for review January 7, 2020 16:16
@jgowdyelastic jgowdyelastic requested a review from a team as a code owner January 7, 2020 16:16
@jgowdyelastic jgowdyelastic self-assigned this Jan 7, 2020
@elasticmachine
Pinging @elastic/ml-ui (:ml)

@jgowdyelastic jgowdyelastic requested review from walterra and removed request for Jeremy-Walters January 8, 2020 12:25
@jgowdyelastic jgowdyelastic force-pushed the categorization-wizard branch from 01a4b21 to f6fbc8f Compare January 8, 2020 13:52
</h3>
</EuiTitle>
<EuiSpacer size="s" />
<EuiFlexGroup gutterSize="l" style={{ maxWidth: '824px' }}>
can we replace this inline style?

@peteharverson peteharverson left a comment

The text for the categorization field should be edited to remove the first part about it being optional. Maybe something like 'Specifies which field will be categorized. Using text data types is recommended.'

@jgowdyelastic jgowdyelastic force-pushed the categorization-wizard branch from 08054ef to 80b802f Compare January 9, 2020 13:52
@peteharverson peteharverson left a comment

Latest edits LGTM

@kibanamachine
💚 Build Succeeded

History

  • 💚 Build #18943 succeeded 08054ef53e4d7ace84d178ce0f7a091239bc3925
  • 💚 Build #18924 succeeded 218447375720a369acd2969ddffa661400c89fef
  • 💚 Build #18785 succeeded ba099a54a40235e5f10c9eaa49bb5bf16ceaec19
  • 💔 Build #18759 failed e0ba6efbb07612236bafbb3b05a106e78eb0cf34
  • 💔 Build #18743 failed b7ad6cfb2afbc69e4e49846c2c1221f067c206b3

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@jgowdyelastic jgowdyelastic merged commit 36abed3 into elastic:master Jan 9, 2020
@jgowdyelastic jgowdyelastic deleted the categorization-wizard branch January 9, 2020 15:21
jgowdyelastic added a commit to jgowdyelastic/kibana that referenced this pull request Jan 9, 2020
* [ML] Categorization wizard

* fixing js prettier issues

* adding basic category field validation

* adding rare or count selection

* fixing types

* category examples changes

* improving results search

* adding analyzer editing

* improving callout

* updating callout text

* fixing import path

* resetting cat analyser json on flyout open

* disabling model plot by default

* minor refactoring

* fixing types

* hide estimate bucket span

* setting default bucket span

* removing ml_classic workaround

* changing style of detector selection

* fixing convert to advanced issue

* removing sparse data checkbox

* changes based on review

* use default mml

* fixing job cloning

* changes based on review

* removing categorization_analyzer from job if it is same as default

* fixing translations

* disabling model plot for rare jobs

* removing console.error in useResolver
jgowdyelastic added a commit that referenced this pull request Jan 9, 2020
6 participants