[ML] Transforms/Data Frame Analytics: Fix freezing wizard for indices with massive amounts of fields. #98259

Merged
10 commits merged into elastic:master on Apr 28, 2021

Contributor

@walterra walterra commented Apr 26, 2021

Summary

Fixes #78590.

Inspired by the approach taken by Kibana Discover + Lens.

The transform wizard can become very slow when used with indices that have a large number of fields, e.g. 1000+.

This PR fixes that by prefetching 500 random documents to build a list of populated fields and passing only those to the data grid component, instead of all available fields derived from the Kibana index pattern.

For example, for an out-of-the-box metricbeat index, this reduces the list of fields passed to the data grid from 3000+ to ~120. Previously, the page would freeze on load for tens of seconds and freeze again on every rerender. With this update, the page loads almost instantly and remains responsive.

Note that this reduction of available fields is applied only to the data grid preview component. All fields remain available in the UI for configuring groups and aggregations. Those UI components, e.g. the virtualized dropdowns, can handle large lists of fields.
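
To illustrate the approach described above, here is a minimal sketch and not the code from this PR. All function names (`buildRandomSampleQuery`, `collectPopulatedFields`, `filterGridColumns`) are hypothetical, and the search body is an assumption about how 500 random documents could be requested from Elasticsearch using `function_score` with `random_score`.

```typescript
// Hypothetical helpers for illustration only; names are not from the Kibana code base.
type Doc = Record<string, unknown>;

// Assumed search body: ask Elasticsearch for up to `size` randomly scored documents.
function buildRandomSampleQuery(size = 500) {
  return {
    size,
    query: {
      function_score: {
        query: { match_all: {} },
        random_score: {},
      },
    },
  };
}

function isPlainObject(value: unknown): value is Doc {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Collect the dotted names of all fields that hold a value in at least one document.
// Nested objects are flattened; arrays are treated as leaf values.
function collectPopulatedFields(docs: Doc[]): string[] {
  const populated = new Set<string>();

  const visit = (obj: Doc, prefix: string): void => {
    for (const key of Object.keys(obj)) {
      const value = obj[key];
      const name = prefix ? `${prefix}.${key}` : key;
      if (value === null || value === undefined) continue;
      if (isPlainObject(value)) {
        visit(value, name);
      } else {
        populated.add(name);
      }
    }
  };

  docs.forEach((doc) => visit(doc, ''));
  return Array.from(populated).sort();
}

// Keep only those index pattern fields that are populated in the sampled documents.
function filterGridColumns(allFields: string[], sampledDocs: Doc[]): string[] {
  const populated = new Set(collectPopulatedFields(sampledDocs));
  return allFields.filter((field) => populated.has(field));
}
```

In this sketch, only the result of `filterGridColumns` would be handed to the preview grid, while the full field list stays available to the rest of the wizard. Because the sample is random, a sparsely populated field can be missing from the preview columns.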


@walterra walterra added the bug, release_note:fix, :ml, v8.0.0, Feature:Transforms, and v7.13.0 labels Apr 26, 2021
@walterra walterra self-assigned this Apr 26, 2021
@walterra walterra requested a review from a team as a code owner April 26, 2021 09:04
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@peteharverson
Contributor

Tested this on a filebeat index with around 1300 fields, and there is a big performance improvement when opening the Transform wizard.

Could the same fix be applied to the data frame analytics wizard, which suffers from the same issue with the source index preview grid?

Also, is there anything we can do for the Transform preview grid when using a latest transform? I see some delay in rendering the grid and the field picker, although it isn't as noticeable as the original problem with the source index preview.

@walterra
Contributor Author

@peteharverson

  • Improved DFA wizard in 499f5cf.
  • Improved Transform wizard preview in 21e1c2c.

Contributor

@peteharverson peteharverson left a comment


Tested latest edits, and LGTM

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id         before   after    diff
ml         5.9MB    5.9MB    +931.0B
transform  908.6KB  910.1KB  +1.4KB
total                        +2.4KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id         before  after   diff
transform  19.5KB  19.5KB  +66.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @walterra

Contributor

@alvarezmelissa87 alvarezmelissa87 left a comment


Tested and LGTM ⚡

@walterra walterra changed the title from "[ML] Transforms: Fix freezing transform wizard for indices with massive amounts of fields." to "[ML] Transforms/Data Frame Analytics: Fix freezing wizard for indices with massive amounts of fields." Apr 28, 2021
@walterra walterra merged commit bfb363f into elastic:master Apr 28, 2021
@walterra walterra deleted the ml-transform-fix-slow-data-grid branch April 28, 2021 06:23
walterra added a commit to walterra/kibana that referenced this pull request Apr 28, 2021
… with massive amounts of fields. (elastic#98259)

walterra added a commit to walterra/kibana that referenced this pull request Apr 28, 2021
… with massive amounts of fields. (elastic#98259)

walterra added a commit that referenced this pull request Apr 28, 2021
… with massive amounts of fields. (#98259) (#98571)

walterra added a commit that referenced this pull request Apr 28, 2021
… with massive amounts of fields. (#98259) (#98572)

walterra added a commit to walterra/kibana that referenced this pull request Apr 28, 2021
… with massive amounts of fields. (elastic#98259)

walterra added a commit that referenced this pull request Apr 28, 2021
… with massive amounts of fields. (#98259) (#98573)

@walterra walterra added the Feature:Data Frame Analytics label Apr 29, 2021
Labels
bug, Feature:Data Frame Analytics, Feature:Transforms, :ml, release_note:fix, v7.12.2, v7.13.0, v7.14.0, v8.0.0
Development

Successfully merging this pull request may close these issues.

[ML] Transforms/DFA: Wizard performance degrades with indices with a lot of fields
6 participants