Allow arrays and concrete values in flat object mapping types #8250

Open
jgough opened this issue Jun 26, 2023 · 3 comments
Labels
enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc

Comments

@jgough
jgough commented Jun 26, 2023

When I don't have full control over the values I'm adding to OpenSearch, I might want to use the new flat_object type to accept whatever we receive. This works for nested JSON objects such as:
{"desc":true}

However it fails with other valid JSON values such as:
["field1","field2"]
"desc"
-1
true

Indexing a field with any of these values fails, even though they are all valid JSON values. With an array, the error is:

    "type": "mapper_parsing_exception",
    "reason": "failed to parse field [sort] of type [flat_object] in document with id '1'. Preview of field's value: 'field1'",
    "caused_by": { 
      "type": "null_pointer_exception",
      "reason": "Cannot invoke \"java.lang.CharSequence.toString()\" because \"s\" is null"
    }

Or in the case of a concrete value:
object mapping for [_doc] tried to parse field [sort] as object, but got EOF, has a concrete value been provided to it?

Ideally, I'd like flat_object mappings to support all valid JSON types including arrays, strings, integers and booleans. If there are no plans to support these then the documentation may benefit from highlighting these limitations.
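
For anyone who wants to reproduce this, here is a rough sketch of the requests involved. It only builds and prints the request bodies and sends nothing over the network. The index name is made up for illustration; the field name `sort` comes from the error message above, and the pass/fail flags simply restate what is reported in this issue.

```python
import json

# Hypothetical index name (an assumption, not from the report).
INDEX = "flat-object-demo"

# Mapping with a single flat_object field named "sort", as in the error above.
mapping = {"mappings": {"properties": {"sort": {"type": "flat_object"}}}}

# Values from this report, paired with whether the report says they index.
candidates = [
    ({"desc": True}, True),
    (["field1", "field2"], False),
    ("desc", False),
    (-1, False),
    (True, False),
]


def build_requests():
    """Return (method, path, body) tuples; nothing is sent anywhere."""
    requests = [("PUT", f"/{INDEX}", json.dumps(mapping))]
    for doc_id, (value, _reported_ok) in enumerate(candidates, start=1):
        body = json.dumps({"sort": value})
        requests.append(("PUT", f"/{INDEX}/_doc/{doc_id}", body))
    return requests


for method, path, body in build_requests():
    print(method, path, body)
```

The printed lines can be pasted into a REST client or Dev Tools console to try each document against a test cluster.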

@jgough jgough added enhancement Enhancement or improvement to existing feature or request untriaged labels Jun 26, 2023
@anasalkouz anasalkouz added the Search Search query, autocomplete ...etc label Jun 27, 2023
@lukas-vlcek
Contributor

Hi,

If I am not mistaken, JSON (as a "lightweight data-interchange format") is represented in two basic forms (excerpt from https://www.json.org/json-en.html):

JSON is built on two structures:

  • A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
  • An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

In other words it is either a hash-map like object {...} or an array [...].

This rules out all the other cases like: "desc", -1, true

Currently, flat_object supports hash-map-like objects. I do not think array-like structures are supported (I might be wrong here). Either way, the documentation should make this clear if it does not already.

As for the second part of your question: Would it be possible to make it more general?

Technically, it should be possible to wrap any value that is not a hash-map-like object (yet is still a valid JSON value) in a minimal hash-map-like object, for example:

true -> { "root": true }
-1 -> { "root": -1 }
"desc" -> { "root": "desc" }

But I am not sure if flat_object should do this conversion automatically.
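
A minimal client-side sketch of that wrapping, assuming the `root` key from the examples above. The function name is hypothetical. Whether flat_object accepts an array nested under the key has not been confirmed in this thread.

```python
from typing import Any

ROOT_KEY = "root"  # key name taken from the examples above


def wrap_for_flat_object(value: Any) -> dict:
    """Return JSON objects unchanged; wrap any other JSON value under ROOT_KEY."""
    if isinstance(value, dict):
        return value
    return {ROOT_KEY: value}


for sample in (True, -1, "desc", {"desc": True}):
    print(repr(sample), "->", wrap_for_flat_object(sample))
```

Doing this before sending the document keeps the mapping unchanged, at the cost of queries having to know about the extra `root` level.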

IMO such data transformation should happen in an Ingest pipeline (see Ingest API) which precedes the indexing operation. There are many Ingest processors that can help with this task (unfortunately OpenSearch docs about Ingest processors seem to be missing some processors, so I would point you to Elasticsearch 7.10 documentation about Ingest processors for now). In any case, you will need to get a bit creative when setting up such ingest job but I believe it can get you quite far.
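
As a starting point, one possible pipeline uses a single `script` processor. This is an untested sketch: the field name `sort` comes from this issue, the `root` key is the one suggested above, and the Painless snippet is my assumption about how the check could be written. The code only builds the pipeline definition as JSON, which could then be sent to `PUT /_ingest/pipeline/<name>`.

```python
import json

FIELD = "sort"  # field name from the error message in this issue

# Painless source (an assumption, untested): if the field exists and is not
# a Map, replace it with a one-entry map keyed by "root".
painless = (
    f"if (ctx.containsKey('{FIELD}') && !(ctx['{FIELD}'] instanceof Map)) {{ "
    f"ctx['{FIELD}'] = ['root': ctx['{FIELD}']]; }}"
)

pipeline = {
    "description": "Wrap non-object values before they reach a flat_object field",
    "processors": [{"script": {"lang": "painless", "source": painless}}],
}

print(json.dumps(pipeline, indent=2))
```

Documents would then need to be indexed with the `pipeline` request parameter, or the index would need a default pipeline configured, for the processor to run.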

@ryn9
ryn9 commented Jan 22, 2024

Wanted to note that I mentioned this issue in another issue:
#7137 (comment)

My suggestion is to at least support the "ignore_malformed" option for flat_object, so that the document is still indexed but the field is ignored.
I.e., the field would not be indexed, and would be added to the "_ignored" array.

@jgough
Author

jgough commented Jan 22, 2024

@lukas-vlcek The JSON spec page you link very specifically defines the JSON grammar as:

json: consists of element
element: consists of ws + value + ws (value surrounded by whitespace)
value: any of object, array, string, number, true, false, null

So even though it is built on objects and arrays, JSON clearly permits simple values such as my examples above.
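
A quick way to check this with a standard parser is shown below, using Python's built-in `json` module on the same values as in my original report. This only demonstrates what a general-purpose JSON parser accepts at the top level; it says nothing about how flat_object parses its input.

```python
import json

# The values from the original report, as raw JSON text.
samples = ['{"desc":true}', '["field1","field2"]', '"desc"', '-1', 'true']

# Each one parses on its own as a complete JSON document.
parsed = [json.loads(text) for text in samples]

for text, value in zip(samples, parsed):
    print(f"{text} -> {type(value).__name__}")
```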

@getsaurabh02 getsaurabh02 moved this from 🆕 New to Later (6 months plus) in Search Project Board Aug 15, 2024
Projects
Status: Later (6 months plus)
Development

No branches or pull requests

6 participants