Fix ignore_malformed behaviour for unsigned long fields #110045
Conversation
Pinging @elastic/es-storage-engine (Team:StorageEngine)
Hi @salvatore-campagna, I've created a changelog YAML for you.
@elasticsearchmachine test this please
refresh: true
index: test-stored
id: "5"
body: { "ul_ignored": [1, { "key": "foo", "value": "bar" }, 3], "ul_not_ignored": 4000 }
This is the case where I see different behaviour between stored and synthetic source.
This is an operation against test-stored, which does not exist at this time, right? So this is going into the dynamic fields code path.
Yes, I fixed it yesterday but didn't push since I was doing some more tests... that is the reason why I saw dynamic mapping kick in.
Note that the yaml test, including the error messages and exceptions, has the same behaviour we have in
docs/changelog/110045.yaml
@@ -0,0 +1,6 @@
pr: 110045
summary: Parsing objects for unsigned long fields
Let's align this with the PR title.
Will change this before I merge.
context.addIgnoredField(mappedFieldType.name());
if (isSourceSynthetic) {
    context.doc().add(IgnoreMalformedStoredValues.storedField(fullPath(), context.parser()));
}
Are we missing a return here? Right now it will still throw an exception with ignore_malformed enabled.
I took a closer look and there is a problem: when the source is not synthetic we won't advance the parser to the end of the object, and we fail later with a parsing exception (found extra data after parsing: END_OBJECT). I think this is actually the reason we have the parser.currentToken().isValue() check when handling ignore_malformed.
It looks like my expectations in #109705 are incorrect and this works as designed: "value fields" like numbers don't take objects as inputs even with ignore_malformed. If we wanted to change that, we would probably need a wider group decision. What do you think?
I do not expect non-object fields (like keyword or integer) to accept object-like values... but then we need to agree on how to handle them when it comes to ignore_malformed.
I see most of our code does not parse objects for anything other than object-like types. It looks like ignore_malformed is more for things like numbers which are not numbers, or maybe out-of-range values, and so on... I don't think we can catch all kinds of parsing issues.
On the other hand, I think the purpose of ignore_malformed is to avoid documents being rejected because of a malformed field value. So the right behaviour would probably be to parse the object until the end, to avoid the assertion failure later, and then add the field to the ignored fields and store its value.
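A minimal, self-contained sketch of that idea, using plain Jackson rather than the Elasticsearch XContent wrapper (the class name and the document here are illustrative assumptions, not the actual mapper code): when a non-scalar value shows up, record the field as ignored and consume the whole object with skipChildren() so the outer parser does not trip over leftover tokens.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

public class SkipMalformedValue {
    public static void main(String[] args) throws Exception {
        String doc = "{\"ul_ignored\": {\"key\": \"foo\", \"value\": \"bar\"}, \"ul_not_ignored\": 4000}";
        try (JsonParser parser = new JsonFactory().createParser(doc)) {
            parser.nextToken(); // START_OBJECT of the document
            while (parser.nextToken() != JsonToken.END_OBJECT) {
                String field = parser.getCurrentName();
                JsonToken value = parser.nextToken();
                if (value.isScalarValue()) {
                    System.out.println(field + " -> " + parser.getValueAsString());
                } else {
                    // ignore_malformed-style handling: record the field as ignored and
                    // consume the whole object/array so later parsing doesn't fail on
                    // leftover tokens (the END_OBJECT failure mentioned above).
                    System.out.println(field + " -> ignored (non-scalar value)");
                    parser.skipChildren();
                }
            }
        }
    }
}
```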
@martijnvg @lkts with the last commit I have done it how it should be, in my opinion. If a value that is not an
lgtm
The approach LGTM too. There may be some overlap here with #12366.
When an object is supplied as a value for a field whose type is unsigned_long, we need to trigger handling of the field value using our ignore_malformed handling strategy. The IllegalStateException happens because the parser does not handle an object value being supplied, and the try/catch only deals with IllegalArgumentException when adding the malformed value.

Resolves #109705
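For illustration, a small self-contained sketch of the failure mode described above (the parseUnsignedLongValue helper and the sample values are made up for this example, not the actual mapper code): the catch block only handles IllegalArgumentException, so an object value, which surfaces as an IllegalStateException, escapes and rejects the whole document even with ignore_malformed enabled.

```java
import java.util.Map;

public class IgnoreMalformedSketch {

    // Hypothetical stand-in for the unsigned_long value parser.
    static long parseUnsignedLongValue(Object raw) {
        if (raw instanceof Number) {
            return ((Number) raw).longValue();
        }
        if (raw instanceof String) {
            // NumberFormatException is an IllegalArgumentException, so malformed
            // scalars are caught by the ignore_malformed catch block below.
            return Long.parseUnsignedLong((String) raw);
        }
        // Object (or array) values surface as a different exception type.
        throw new IllegalStateException("Can't get text on a START_OBJECT");
    }

    public static void main(String[] args) {
        Object[] values = { 4000, "123", "foo", Map.of("key", "foo") };
        for (Object raw : values) {
            try {
                System.out.println(raw + " -> " + parseUnsignedLongValue(raw));
            } catch (IllegalArgumentException e) {
                // ignore_malformed path: the field is skipped, the document is kept.
                System.out.println(raw + " -> ignored (malformed value)");
            }
            // The Map value is NOT caught here: the IllegalStateException propagates
            // and ends the program, mirroring how the document gets rejected despite
            // ignore_malformed: true.
        }
    }
}
```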