Object properties order not preserved in _source for fields with "type": "object", "enabled": false #119347

Open
nemphys opened this issue Dec 30, 2024 · 6 comments
Labels
needs:triage Requires assignment of a team area label

Comments


nemphys commented Dec 30, 2024

Elasticsearch Version

8.15.2

Installed Plugins

analysis_icu

Java Version

bundled

OS Version

MacOS

Problem Description

According to the documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/enabled.html), fields mapped as "type": "object", "enabled": false are not parsed by Elasticsearch. One would therefore expect any object stored in such a field to appear in the document _source exactly as it was sent in the indexing request.

This does not seem to hold for the order of their properties: we have cases where such fields, read from the document _source with a plain GET document request (no searching involved), do not retain the original order.

E.g., a document indexed like this:

{
   "fieldA": {
      "propA": "valueA",
      "propB": "valueB",
      "propC": "valueC"
   }
}

is returned like this right after it is indexed:

{
   "_source": {
      "fieldA": {
         "propC": "valueC",
         "propA": "valueA",
         "propB": "valueB"
      }
   }
}

Is this expected (i.e. object serialization in _source is not guaranteed to preserve property order), or is it a bug?

Steps to Reproduce

PUT /test
{
  "mappings": {
    "_source": {
      "enabled": true,
      "excludes": [
        "*.test"
      ]
    },
    "properties": {
      "content": {
        "type": "object",
        "enabled": false
      }
    }
  }
}

PUT /test/_doc/12345678
{
  "content": {
    "propA": [
      "valueA"
    ],
    "propB": [
      "valueB"
    ],
    "propC": [
      "valueC"
    ],
    "propD": [
      "valueD"
    ],
    "propE": [
      "valueE"
    ],
    "propF": [
      "valueF"
    ],
    "propG": [
      "valueG"
    ],
    "propH": [
      "valueH"
    ],
    "propI": [
      "valueI"
    ]
  }
}

GET /test/_doc/12345678
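
To check the returned _source for reordering without eyeballing it, a small helper along these lines could be used. This is a minimal sketch and not part of the original report: the function name key_order_diffs is hypothetical, and the reordered document below is illustrative rather than captured server output. It assumes Python's json module, which keeps object keys in document order when parsing.

```python
import json


def key_order_diffs(sent, received, path=""):
    """Return the paths of objects whose shared keys appear in a different order.

    Only keys present in both objects are compared, so fields dropped by
    _source excludes are not reported as a reordering.
    """
    diffs = []
    if isinstance(sent, dict) and isinstance(received, dict):
        common = set(sent) & set(received)
        sent_keys = [k for k in sent if k in common]
        received_keys = [k for k in received if k in common]
        if sent_keys != received_keys:
            diffs.append(path or "<root>")
        # Recurse into nested values to catch reordering at any depth.
        for key in sent_keys:
            child = f"{path}.{key}" if path else key
            diffs.extend(key_order_diffs(sent[key], received[key], child))
    elif isinstance(sent, list) and isinstance(received, list):
        for i, (a, b) in enumerate(zip(sent, received)):
            diffs.extend(key_order_diffs(a, b, f"{path}[{i}]"))
    return diffs


# Request body as sent (abbreviated from the reproduction steps).
sent = json.loads(
    '{"content": {"propA": ["valueA"], "propB": ["valueB"], "propC": ["valueC"]}}'
)
# Illustrative reordered _source, following the shape described in the report.
received = json.loads(
    '{"content": {"propC": ["valueC"], "propA": ["valueA"], "propB": ["valueB"]}}'
)

print(key_order_diffs(sent, received))  # prints ['content']
```

Passing the parsed request body and the "_source" object from the GET response would list every object path whose key order changed; an empty list means the order was preserved.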

Logs (if relevant)

No response

nemphys added the >bug and needs:triage labels Dec 30, 2024
Contributor

astefan commented Dec 30, 2024

Let's first establish that this is actually a bug in theory and then I can provide reproduction steps 😃

Just testing this as you described it, it doesn't reproduce. Unless you provide a reproducible scenario (complete mapping and settings, the exact document indexed, and the commands showing the mangled _source), I cannot confirm this as a bug. Our documentation describes behavior similar to what you reported for synthetic source; otherwise, _source should be as it was when the document was indexed.
Also, next time please provide a reproducible scenario with the described bug. GitHub is reserved for actual issues; all other types of questions should be posted on our forum. Consider reopening this issue with a full list of steps that reproduce the described behavior.

astefan closed this as completed Dec 30, 2024
astefan removed the >bug label Dec 30, 2024
Author

nemphys commented Dec 30, 2024

@astefan I just wanted to make sure that this is not considered normal. If you reopen this I can provide a working example.

astefan reopened this Dec 30, 2024
Author

nemphys commented Dec 30, 2024

@astefan thank you, will update the issue later tonight.

Author

nemphys commented Dec 30, 2024

@astefan after a little digging, I have discovered the culprit: this only happens when the index mapping for _source has an "excludes" property.

I have updated the issue with a reproducible example. After the last GET request, you will see that the properties of the "content" field object in the document source are in a seemingly random order.

Please inform me if you want me to change the title to something more descriptive, now that I have narrowed down the cause of the issue.

Author

nemphys commented Dec 30, 2024

After further testing, it seems to happen only when the excludes property contains a wildcard 😃
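
The narrowing-down described in these comments amounts to comparing three variants of the mapping. A rough sketch of how those request bodies might be generated is shown here; build_mapping and the non-wildcard pattern "content.test" are assumptions made for illustration, and only the "*.test" pattern comes from the report.

```python
import json


def build_mapping(excludes=None):
    """Build the index mapping from the report, varying only _source.excludes."""
    source = {"enabled": True}
    if excludes is not None:
        source["excludes"] = list(excludes)
    return {
        "mappings": {
            "_source": source,
            "properties": {"content": {"type": "object", "enabled": False}},
        }
    }


variants = {
    # No excludes at all.
    "no_excludes": build_mapping(),
    # Hypothetical exact-path pattern, chosen only to contrast with the wildcard.
    "literal_excludes": build_mapping(["content.test"]),
    # Wildcard pattern taken from the reproduction steps.
    "wildcard_excludes": build_mapping(["*.test"]),
}

for name, body in variants.items():
    print(name, json.dumps(body["mappings"]["_source"]))
```

Each body can be sent as the PUT /test request before repeating the index and GET steps; according to the report, only the wildcard variant showed the reordering.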

Author

nemphys commented Dec 30, 2024

One final finding: this affects the property order not only of nested objects (like the one in the example), but of the whole source object and all of its nested objects.

Therefore I think that we should rewrite the issue title and description from scratch, since it seems that it has nothing to do with enabled: false.
