The current workflow for memory limits is:
1. The user specifies a memory limit up front, with no guidance as to what is sensible or realistic.
2. They start the analysis.
3. The analysis chugs away for a while, reindexing the source index into the destination index.
4. The C++ process is started and checks whether the memory limit is sufficient to do the analysis.
5. If it is not, the process exits, reporting in the error message how much memory was required.
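The up-front check that the C++ process performs can be sketched as follows. This is a toy placeholder: the real memory requirement is computed inside ml-cpp by far more complex logic, and the formula below is invented purely for illustration.

```python
def run_analysis(model_memory_limit_bytes, rows, cols):
    # Toy stand-in for the C++ process's memory requirement calculation;
    # the real calculation in ml-cpp is much more complex than this.
    required = rows * cols * 8 * 4  # illustrative only, not the real formula

    # If the configured limit is insufficient, the process exits,
    # reporting the required amount in the error message.
    if model_memory_limit_bytes < required:
        raise RuntimeError(
            f"Memory limit [{model_memory_limit_bytes}] is too low: "
            f"analysis requires at least [{required}] bytes"
        )
    # ... the analysis would proceed here ...
    return required
```

The point of the sketch is that the check happens only after reindexing has already run, which is what the issue sets out to fix.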
Two possible ways to solve this are:
1. Duplicate the logic that calculates the memory requirement from the C++ code into the Java and UI code.
2. Add a mode of operation to the C++ process where you just supply the spec and, instead of actually doing the analysis, it tells you:
   i. what you would have to set the memory limit to in order to do the analysis entirely in RAM;
   ii. what the minimum memory limit is that will enable the analysis to run at all (using disk).
Although option 2 is a lot of work, the memory calculations done by the C++ code are now so complex that it is impractical to duplicate them. Therefore we should implement option 2. The work will be something along the lines of:
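Under option 2, the estimation mode would report the two figures described above. A minimal sketch of consuming such a response follows; the field names (`expected_memory_without_disk`, `expected_memory_with_disk`) and the byte-size string format are assumptions for illustration, not a confirmed response schema.

```python
def parse_byte_size(value):
    # Parse Elasticsearch-style byte-size strings such as "450mb" or "2gb"
    # into a number of bytes.
    units = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}
    value = value.strip().lower()
    for suffix in sorted(units, key=len, reverse=True):
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)]) * units[suffix])
    return int(value)  # bare number: already bytes

# Hypothetical estimation response (field names are illustrative):
estimate = {
    "expected_memory_without_disk": "2gb",  # (i) run entirely in RAM
    "expected_memory_with_disk": "450mb",   # (ii) bare minimum, spilling to disk
}
```

With both figures in hand, a caller can compare the user's configured `model_memory_limit` against the with-disk minimum and the RAM-only figure.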
- Call the endpoint as part of the configuration process in the UI so that the UI can present a sensible default model_memory_limit, advise on sensible upper and lower bounds, and validate any value the user chooses themselves ([ML] Improve UX regarding df analytics model memory limit, kibana#43740)
- Introduce memory usage estimation mode in data_frame_analyzer (ml-cpp#584)
- Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188)
- Call the new _estimate_memory_usage API endpoint on df analytics _start, and fail the _start if the model_memory_limit is so low that the C++ process will fail immediately when run (#45536)
- HLRC for memory usage estimation API (#45531)
- Documentation: Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint #45188 (API reference); Add docs for HLRC for Estimate memory usage API #45538 (HLRC)
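The _start-time and UI-side checks in the work items above could use the two estimates roughly like this. The function and parameter names are invented for illustration and do not reflect the actual Java or Kibana implementation:

```python
def validate_model_memory_limit(user_limit, min_with_disk, ram_only):
    # Sketch of validation against the two estimated figures:
    #   min_with_disk - minimum limit at which the analysis can run at all
    #   ram_only      - limit needed to run the analysis entirely in RAM
    # All values are in bytes. Names are illustrative, not a real API.
    if user_limit < min_with_disk:
        # _start should fail fast instead of letting the C++ process die later.
        return (False, f"model_memory_limit must be at least {min_with_disk} bytes")
    if user_limit < ram_only:
        return (True, "analysis will run but may spill to disk")
    return (True, "analysis can run entirely in RAM")
```

The UI could additionally use `ram_only` as the suggested default and `min_with_disk` as the lower bound when the user edits the value.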