More details on exception when having a nested field with date_range and include_in_root set #89164

Open
ibotello opened this issue Aug 8, 2022 · 9 comments
Labels
>bug priority:normal A label for assessing bug priority to be used by ES engineers :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@ibotello
Contributor

ibotello commented Aug 8, 2022

Elasticsearch Version

7.14.2

Installed Plugins

No response

Java Version

bundled

OS Version

ESS

Problem Description

We get an exception when multiple nested documents each try to add a binary doc values field to the root document.
The exception occurs because Lucene only allows a single instance of a binary doc values field per document.
It would be nice to have a more detailed exception message that explains what is actually happening.

Steps to Reproduce

Create a sample index with a nested field that contains a date_range field:

PUT my-index-00001
{
  "mappings": {
    "properties": {
      "NestedField": {
        "type": "nested",
        "include_in_root": true,
        "properties": {
          "DateField": {
            "type": "date_range"
          }
        }
      }
    }
  }
}

Then try to ingest one document:

PUT my-index-00001/_doc/1
{
  "NestedField": [
    {
      "DateField": {
        "gte": "2015-10-31",
        "lte": "2015-10-31"
      }
    },
    {
      "DateField": {
        "gte": "2016-10-31",
        "lte": "2016-10-31"
      }
    }
  ]
}

We get the following exception:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "DocValuesField \"NestedField.DateField\" appears more than once in this document (only one value is allowed per field)"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "DocValuesField \"NestedField.DateField\" appears more than once in this document (only one value is allowed per field)"
  },
  "status" : 400
}

Logs (if relevant)

No response

@ibotello ibotello added >bug needs:triage Requires assignment of a team area label labels Aug 8, 2022
@iverase iverase added :Search/Search Search-related issues that do not fall into other categories and removed needs:triage Requires assignment of a team area label labels Aug 9, 2022
@josefschiefer27

I am running into the same issue with other data types (in my case wildcard; interestingly, keyword works, see below).

For instance, if I create an index with

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested",
        "include_in_root": true,
        "properties": {
          "first": {
            "type": "keyword"
          },
          "last": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

With this index the following ingestion will work:

PUT my-index-000001/_doc/1
{
  "user": [
      { "first" : "Bob", "last" : "Kesler" },
      { "first" : "Robert", "last" : "Maxim" }
    ]
}

However, if I change the data type to wildcard

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested",
        "include_in_root": true,
        "properties": {
          "first": {
            "type": "wildcard"
          },
          "last": {
            "type": "wildcard"
          }
        }
      }
    }
  }
}

... the same ingestion command will fail with the error "DocValuesField "user.first" appears more than once in this document (only one value is allowed per field)".

PUT my-index-000001/_doc/1
{
  "user": [
      { "first" : "Bob", "last" : "Kesler" },
      { "first" : "Robert", "last" : "Maxim" }
    ]
}

The interesting thing is that the following command works:

PUT my-index-000001/_doc/1
{
  "user": [
      { "first" : "Bob", "last" : "Kesler" }
    ]
}

So it looks like there is only an issue when there is an array of nested objects using include_in_root or include_in_parent.

@josefschiefer27

Any updates about this issue? It would be nice to understand the root cause of this issue. The described behavior is a bug that only surfaces with certain data types when using nested with include_in_root or include_in_parent.

@patrick-radius

I'm having this issue too. It seems both include_in_root and include_in_parent behave the same.

@lmignon

lmignon commented Dec 14, 2022

Any update about this issue? I'm facing the same one and it would be nice to know why and how we can fix it...

@wchaparro wchaparro added the Team:Search Meta label for search team label Dec 14, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@romseygeek
Contributor

This is the same issue as #70261. Field types that store information in a Lucene binary doc values field don't work with include_in_root or include_in_parent because each child will try to store its information on the parent document separately, and Lucene only allows a single binary doc values field instance per doc.
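
For readers who want to see the mechanism in isolation, here is a toy model in Python. It is only a sketch: RootDocument and index_nested are hypothetical names, not Lucene or Elasticsearch APIs, and it assumes that keyword fields are backed by a multi-valued doc values type (which would explain why the keyword mapping reported above works) while wildcard and date_range are backed by a single-valued binary one.

```python
class RootDocument:
    """Toy stand-in for the parent document; not the real Lucene API."""

    def __init__(self):
        self.binary_doc_values = {}      # field name -> exactly one bytes value
        self.sorted_set_doc_values = {}  # field name -> set of values

    def add_binary_doc_value(self, field, value):
        # Mirrors the constraint described above: one binary value per field per doc.
        if field in self.binary_doc_values:
            raise ValueError(
                f'DocValuesField "{field}" appears more than once in this document '
                "(only one value is allowed per field)"
            )
        self.binary_doc_values[field] = value

    def add_sorted_set_doc_value(self, field, value):
        # Assumption: multi-valued doc values simply accumulate values.
        self.sorted_set_doc_values.setdefault(field, set()).add(value)


def index_nested(root, path, children, multi_valued):
    """Copy every child's fields onto the root, as include_in_root would."""
    for child in children:
        for name, value in child.items():
            field = f"{path}.{name}"
            if multi_valued:
                root.add_sorted_set_doc_value(field, value)
            else:
                root.add_binary_doc_value(field, value.encode())


users = [
    {"first": "Bob", "last": "Kesler"},
    {"first": "Robert", "last": "Maxim"},
]

# keyword-like behaviour: both children land on the root without conflict
keyword_root = RootDocument()
index_nested(keyword_root, "user", users, multi_valued=True)

# wildcard-like behaviour: the second child trips the single-value constraint
wildcard_root = RootDocument()
failure = None
try:
    index_nested(wildcard_root, "user", users, multi_valued=False)
except ValueError as err:
    failure = str(err)

# a single nested child is fine even for the binary case
single_root = RootDocument()
index_nested(single_root, "user", users[:1], multi_valued=False)
```

In this model the failure only appears once a second child is copied up, which matches the observation earlier in the thread that a one-element array indexes without error.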

We have a couple of options I think:

  • detect this up-front in the mappings and throw an error at index creation / template validation
  • update our indexing code to merge multiple binary fields together into a single value on the parent doc; this would obviously be preferable for end-users as it would mean everything would Just Work, but may not be supportable for all field types
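
As a rough illustration of the first option, the check could look something like the Python sketch below. It is not Elasticsearch code: find_conflicts and BINARY_DOC_VALUES_TYPES are hypothetical names, the type set contains only the two types reported in this thread (the real list is probably longer), and the handling of nested fields inside other nested fields is a simplification.

```python
# Assumption: only the field types reported in this thread; likely incomplete.
BINARY_DOC_VALUES_TYPES = {"date_range", "wildcard"}


def find_conflicts(properties, prefix="", copied_up=False):
    """Return paths of binary-doc-values fields that would be copied to a parent doc.

    `properties` is the "properties" object of a mapping. `copied_up` is True
    while walking below a nested field that sets include_in_root or
    include_in_parent.
    """
    conflicts = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        field_type = spec.get("type", "object")

        if copied_up and field_type in BINARY_DOC_VALUES_TYPES:
            conflicts.append(path)

        if field_type == "nested":
            # A nested field decides for its own children whether they are copied.
            child_copied = bool(
                spec.get("include_in_root") or spec.get("include_in_parent")
            )
        else:
            # Plain object fields inherit the state of the enclosing nested field.
            child_copied = copied_up

        child_properties = spec.get("properties")
        if child_properties:
            conflicts.extend(
                find_conflicts(child_properties, path + ".", child_copied)
            )
    return conflicts


# The mapping from the original report.
mapping = {
    "mappings": {
        "properties": {
            "NestedField": {
                "type": "nested",
                "include_in_root": True,
                "properties": {"DateField": {"type": "date_range"}},
            }
        }
    }
}

conflicts = find_conflicts(mapping["mappings"]["properties"])
# conflicts == ["NestedField.DateField"]
```

A check like this would reject the mapping when the index or template is created, instead of failing later when the first multi-element array is indexed.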

@josefschiefer27

josefschiefer27 commented Dec 16, 2022

Would Lucene issue #11702 fix this problem?

@josefschiefer27

josefschiefer27 commented Feb 18, 2023

@romseygeek any plans to fix this issue? As it stands right now this is broken functionality for include_in_root and include_in_parent for certain data types.

@javanna javanna added the priority:normal A label for assessing bug priority to be used by ES engineers label Jun 13, 2024
@benwtrent benwtrent added :Search Foundations/Mapping Index mappings, including merging and defining field types and removed :Search/Search Search-related issues that do not fall into other categories labels Jul 12, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 12, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine elasticsearchmachine removed the Team:Search Meta label for search team label Jul 12, 2024