Can't create anomaly detector that uses "_index" as "partition_field_name" #39406

@tsouza

Elasticsearch version (bin/elasticsearch --version):

Version: 6.5.0, Build: default/tar/816e6f6/2018-11-09T18:58:36.352602Z, JVM: 10.0.1

Plugins installed: []

JVM version (java -version):

java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

OS version (uname -a if on a Unix-like system):

Darwin ElasticMBP 18.2.0 Darwin Kernel Version 18.2.0: Thu Dec 20 20:46:53 PST 2018; root:xnu-4903.241.1~1/RELEASE_X86_64 x86_64

Description of the problem including expected versus actual behavior:
It is not possible to create an anomaly detector that uses _index as partition_field_name; the create request fails with a 400 illegal_argument_exception. This is a valid use case, for instance when one wants to find anomalies in the document count per index.

The workaround is to use another field to carry the index name. That field can be created either by reindexing or by creating a field alias.
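
The reindexing variant of this workaround could look roughly like the sketch below. It only builds the two request bodies (one for POST _reindex, one for the job) and does not talk to a cluster. The field name index_name, the helper names, and the index names in the usage section are hypothetical and not part of the original report. The Painless script assumes that ctx._index holds the index the document was read from when the reindex script runs, which should be checked against the Elasticsearch version in use.

```python
import copy
import json

# Hypothetical name for the field that carries the index name (not from the report).
INDEX_NAME_FIELD = "index_name"


def reindex_body(source_index, dest_index, field=INDEX_NAME_FIELD):
    """Build a body for POST _reindex that copies each document's index name into `field`."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {
            "lang": "painless",
            # Assumption: in a reindex script, ctx._index starts out as the
            # index of the source document.
            "source": "ctx._source[params.field] = ctx._index",
            "params": {"field": field},
        },
    }


def use_index_name_field(job_config, field=INDEX_NAME_FIELD):
    """Return a copy of job_config in which detectors partitioned on _index use `field`."""
    patched = copy.deepcopy(job_config)
    for detector in patched["analysis_config"]["detectors"]:
        if detector.get("partition_field_name") == "_index":
            detector["partition_field_name"] = field
    return patched


if __name__ == "__main__":
    # Abbreviated version of the job from this report.
    job = {
        "analysis_config": {
            "bucket_span": "15m",
            "detectors": [
                {"function": "low_count", "partition_field_name": "_index"}
            ],
        },
        "data_description": {"time_field": "@timestamp"},
    }
    # Hypothetical source pattern and destination index.
    print(json.dumps(reindex_body("logs-*", "logs-with-index-name"), indent=2))
    print(json.dumps(use_index_name_field(job), indent=2))
```

The first printed body would be sent to POST _reindex, and the second to the PUT _xpack/ml/anomaly_detectors/<job_id> endpoint used in the steps below. The datafeed would then need to read from the reindexed data so that the new field is present.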

Steps to reproduce:

  1. Try to create an anomaly detector with:
PUT _xpack/ml/anomaly_detectors/test
{
  "description": "",
  "established_model_memory": 218234,
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "detector_description": "low count",
        "function": "low_count",
        "partition_field_name": "_index",
        "detector_index": 0
      }
    ]
  },
  "analysis_limits": {
    "model_memory_limit": "100mb",
    "categorization_examples_limit": 4
  },
  "data_description": {
    "time_field": "@timestamp",
    "time_format": "epoch_ms"
  },
  "results_index_name": "shared"
}
  2. Results in:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Field [_index] is defined twice in [doc]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Field [_index] is defined twice in [doc]"
  },
  "status": 400
}

Provide logs (if relevant):

[2019-02-26T13:44:27,696][DEBUG][o.e.c.m.MetaDataCreateIndexService] [gzsj5Rv] [.ml-anomalies-shared] failed to create
java.lang.IllegalArgumentException: Field [_index] is defined twice in [doc]
	at org.elasticsearch.index.mapper.MapperMergeValidator.lambda$checkFieldUniqueness$0(MapperMergeValidator.java:86) ~[elasticsearch-6.5.0.jar:6.5.0]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1492) ~[?:?]
	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:734) ~[?:?]
	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658) ~[?:?]
	at org.elasticsearch.index.mapper.MapperMergeValidator.checkFieldUniqueness(MapperMergeValidator.java:81) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.index.mapper.MapperMergeValidator.validateMapperStructure(MapperMergeValidator.java:58) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.index.mapper.MapperService.internalMerge(MapperService.java:479) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.index.mapper.MapperService.internalMerge(MapperService.java:399) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:323) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$IndexCreationTask.execute(MetaDataCreateIndexService.java:451) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:45) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:639) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:268) ~[elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:198) [elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:133) [elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) [elasticsearch-6.5.0.jar:6.5.0]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) [elasticsearch-6.5.0.jar:6.5.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:844) [?:?]
