Skip to content

Conversation

@salvatore-campagna
Copy link
Contributor

@salvatore-campagna salvatore-campagna commented Sep 16, 2024

Flattened fields do no actually use ignore_above. When retrieving source, ignore_above behaves
as expected. Anyway, it is ignored when fetching field values through the fields API. In that case
values are returned no matter the ignore_above setting. Here we implement the filtering logic
in the ValueFecther implementation of RootFlattenedFieldType.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)


if (entryValue instanceof List<?> valueAsList) {
for (Object value : valueAsList) {
if (value instanceof String valueAsString && valueAsString.length() < ignoreAbove) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What can these be, other than String? Shall we assert?

Copy link
Contributor Author

@salvatore-campagna salvatore-campagna Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just to be extremely defensive...my expectation is that here we have String or List<String> but actually I see Object and List<Object>. Strings are objects, of course, but I need to cast them to count the number of characters and filter. There are also cases, anyway, where non-strings are expected. See for instance testFlattenedField in FieldFetcherTests. So, here we apply ignoreAbove filtering only on strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is that you want to assert? I still need to do the cast anyway...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that example, I see that we have a number so it's fine to ignore. I wonder if there are cases where we can have mixed singletons (e.g. strings) with maps pointing to more strings that can be ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added more tests covering cases I could imagine...I don't see anything missing.


assertEquals(List.of(), fetchSourceValue(createDefaultFieldType(10), sourceValue));
assertEquals(List.of(sourceValue), fetchSourceValue(createDefaultFieldType(20), sourceValue));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add some more elaborate unit tests, these are easier to debug.

"key3",
"hi",
"key4",
List.of("the quick brown fox", "jumps over the lazy dog")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add "key5", Map.of("nested", List.of("foo", "foobar")?

We can have subobjects, iiuc.

Copy link
Contributor Author

@salvatore-campagna salvatore-campagna Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, I would expect a case like the one you described to be handled like

"key5.nested" -> ["foo", "foobar"] 

because that is how flattening works, right?

I don't see any way for a user to put a map into a field.

key5.field1: foo
key5.field2: bar
key5.nested: ["foo", "foobar"]

which in json is

key5: {
   field1: foo,
   field2: bar,
   nested: ["foo", "foobar"]
}

which is flattened as described above.
In a flattened field all values for each field are leaves...so non-nested stuff, that is either a single-value or a multi-value.

I mean we can do that in a test...but a user would never be able to do so...am I wrong?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense.. Let's add this to yaml for peace of mind, to verify the expected behavior.

lkts pushed a commit to lkts/elasticsearch that referenced this pull request Sep 20, 2024
Flattened fields do no actually use `ignore_above`. When retrieving source, `ignore_above` behaves
as expected. Anyway, it is ignored when fetching field values through the fields API. In that case
values are returned no matter the `ignore_above` setting. Here we implement the filtering logic
in the `ValueFecther` implementation of `RootFlattenedFieldType`.

(cherry picked from commit 91674f8)
@lkts
Copy link
Contributor

lkts commented Sep 20, 2024

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

elasticsearchmachine pushed a commit that referenced this pull request Sep 20, 2024
* Fix `ignore_above` for flattened fields (#112944)

Flattened fields do no actually use `ignore_above`. When retrieving source, `ignore_above` behaves
as expected. Anyway, it is ignored when fetching field values through the fields API. In that case
values are returned no matter the `ignore_above` setting. Here we implement the filtering logic
in the `ValueFecther` implementation of `RootFlattenedFieldType`.

(cherry picked from commit 91674f8)

* Fix

---------

Co-authored-by: Salvatore Campagna <93581129+salvatore-campagna@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants