Commit
[DOCS] Adds item about fields containing arrays to anomaly detection limitations (elastic#1651) (elastic#1683)

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
szabosteve and lcawl authored Jun 2, 2021
1 parent efcbae2 commit 33d083c
Showing 1 changed file with 33 additions and 4 deletions.
37 changes: 33 additions & 4 deletions docs/en/stack/ml/anomaly-detection/ml-limitations.asciidoc
@@ -70,6 +70,31 @@ You cannot use the following field names in the `by_field_name` or
`over_field_name` properties in a job: `by`; `count`; `over`. This limitation
also applies to those properties when you create advanced jobs in {kib}.


[discrete]
[[ml-arrays-limitations]]
=== Arrays in analyzed fields are turned into comma-separated strings

If an {anomaly-job} is configured to analyze an aggregatable field (a field that
is part of the index mapping definition), and this field contains an array, then
the array is turned into a comma-separated concatenated string. The items in the
array are sorted alphabetically and the duplicated items are removed. For
example, the array `["zebra", "dog", "cat", "alligator", "cat"]` becomes
`alligator,cat,dog,zebra`. The Anomaly Explorer charts don't display any results
for the job as the string does not exist in the source data. The Single Metric
Viewer displays results if the model plot is enabled.

If an array field is not aggregatable and is retrieved from `_source`, the array
is also turned into a comma-separated, concatenated list. However, the list
items are not sorted alphabetically, nor are they deduplicated. Taking the
example above, the comma-separated list, in this case, would be
`zebra,dog,cat,alligator,cat`.
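
The two conversions described above can be illustrated with a minimal Python sketch. This is not the {ml} implementation; the function names are hypothetical, and plain code-point ordering via `sorted` is assumed as a stand-in for "sorted alphabetically".

```python
def aggregatable_array_to_string(values):
    """Mimic the described result for an aggregatable field:
    duplicates removed, items sorted, then joined with commas."""
    return ",".join(sorted(set(values)))


def source_array_to_string(values):
    """Mimic the described result for a field retrieved from _source:
    original order and duplicates kept, joined with commas."""
    return ",".join(values)


# The example array from the text above.
pets = ["zebra", "dog", "cat", "alligator", "cat"]

print(aggregatable_array_to_string(pets))  # alligator,cat,dog,zebra
print(source_array_to_string(pets))        # zebra,dog,cat,alligator,cat
```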

Analyzing large arrays results in long strings which may require more system
resources. Consider using a query in the {dfeed} that filters on the relevant
items of the array.
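
As a rough sketch of the suggestion above, the following Python snippet builds a {dfeed} request body whose query keeps only documents containing the relevant array items. The job ID, index name, field name, and item values are hypothetical placeholders, and a `terms` filter is only one possible choice of query.

```python
import json


def build_datafeed_body(job_id, index, array_field, relevant_items):
    """Build a datafeed body with a query that matches documents whose
    array field contains at least one of the relevant items."""
    return {
        "job_id": job_id,
        "indices": [index],
        "query": {
            "bool": {
                "filter": [
                    # Matches if any element of the array equals any listed item.
                    {"terms": {array_field: relevant_items}}
                ]
            }
        },
    }


# Hypothetical names, for illustration only.
body = build_datafeed_body("pets-job", "pets-index", "animal", ["cat", "dog"])
print(json.dumps(body, indent=2))
```

A query like this narrows which documents reach the job. It does not remove items from the arrays in the matching documents.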


[discrete]
[[ml-frozen-limitations]]
=== Frozen indices are not supported
@@ -109,10 +134,11 @@ For more information about any of these functions, see <<ml-functions>>.
[[ml-limitations-runtime]]
=== {anomaly-detect-cap} performs better on indexed fields

{anomaly-jobs-cap} sort all data by a user-defined time field, which is frequently
accessed. If the time field is a {ref}/runtime.html[runtime field], the
performance impact of calculating field values at query time can significantly slow
the job. Use an indexed field as a time field when running {anomaly-jobs}.
{anomaly-jobs-cap} sort all data by a user-defined time field, which is
frequently accessed. If the time field is a {ref}/runtime.html[runtime field],
the performance impact of calculating field values at query time can
significantly slow the job. Use an indexed field as a time field when running
{anomaly-jobs}.


[discrete]
@@ -144,6 +170,7 @@ you send to the job must use the JSON format.
For more information about this API, see
{ref}/ml-post-data.html[Post Data to Jobs].


[discrete]
=== Misleading high missing field counts
//See x-pack-elasticsearch/#684
@@ -288,6 +315,7 @@ To avoid this behavior, make sure that the aggregation interval in the {dfeed}
configuration and the bucket span in the {anomaly-job} configuration have the
same values.


[discrete]
[[ml-space-limitations]]
=== Calendars and filters are visible in all {kib} spaces
@@ -298,6 +326,7 @@ that belong to your space. However, this limited scope does not apply to
<<ml-calendars,calendars>> and <<ml-rules,filters>>; they are visible in all
spaces.


[discrete]
[[ml-rollup-limitations]]
=== Rollup indices and index patterns are not supported in {kib}