[Discuss] Enable date detection for data stream naming scheme by default #109381
Comments
Pinging @elastic/es-data-management (Team:Data Management)
My expectation was that date detection is implicitly disabled anyway when using With #112444 we extended our test coverage and our dynamic mappings coverage to verify that all current and future ECS fields are supported when date detection is explicitly disabled. As you say, there are multiple options for field name patterns to choose from and if nothing specifically meets one's need for a field name, I believe
I'm not certain that this is a bug. Seems more like a deliberate decision that date (and numeric) detection has a higher precedence than dynamic templates. Certainly, changing the precedence would be a breaking change. See also elasticsearch/server/src/main/java/org/elasticsearch/index/mapper/DynamicFieldsBuilder.java Lines 83 to 105 in 0aff606
I do see benefits of enabling date detection by default. Essentially, the value is already telling us its type. There's a risk of false positives, but due to ignore_malformed, at least that doesn't lead to the whole document being rejected. It's also easy to fix by adding a mapping for the field. There's some performance overhead, but that only comes into play for new string fields that aren't in the mapping yet (this includes rollovers).
Really, you think it's deliberate? If so, I am sure there are good reasons, but it's very counter-intuitive to me that my custom setting would be ignored because of a default behavior that I consider a fallback, to be used only if nothing else is defined for a field.
Yes, that's right.
Currently, in the templates for logs-*-*, metrics-*-*, etc., date detection is set to false. These templates were introduced with #57629 and the discussion around the defaults happened here. Recently the discussion popped up to potentially change this default. This issue is to have a place for this discussion and persist the decision.
Reasons to keep it disabled
One of the initial reasons to keep it disabled was that it could lead to documents failing to ingest. For example, a field is detected as a date but follow-up documents contain different values. This concern has become less severe, as we have introduced ignore_malformed for all fields and the failure store is coming along. Having date_detection on by default could also have a performance impact.
If we change the default now, we also need to discuss whether some users would consider this a breaking change.
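To make the ingest concern concrete, here is a toy Python model of the scenario described above. It is only a sketch under loud assumptions: `looks_like_date` and `index_doc` are hypothetical helpers, the date check is a rough approximation (ISO 8601 or yyyy/MM/dd) rather than the real detection logic, and only string values are considered.

```python
from datetime import datetime


def looks_like_date(value: str) -> bool:
    """Rough stand-in for date detection (assumption: ISO 8601 or yyyy/MM/dd)."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        pass
    try:
        datetime.strptime(value, "%Y/%m/%d")
        return True
    except ValueError:
        return False


def index_doc(mapping, doc, date_detection, ignore_malformed):
    """Toy model of dynamic mapping; returns (accepted, ignored_fields)."""
    ignored = []
    for field, value in doc.items():
        if field not in mapping:
            # New string field: the first value seen decides the type.
            is_date = date_detection and looks_like_date(value)
            mapping[field] = "date" if is_date else "text"
        elif mapping[field] == "date" and not looks_like_date(value):
            if not ignore_malformed:
                return False, ignored  # the whole document is rejected
            ignored.append(field)      # only the bad value is dropped
    return True, ignored


mapping = {}
print(index_doc(mapping, {"shipped": "2024-05-01"}, True, True))
print(index_doc(mapping, {"shipped": "pending"}, True, True))
print(index_doc(mapping, {"shipped": "pending"}, True, False))
```

In this model the second document is still accepted when ignore_malformed is on, with only the shipped value ignored, while the third call shows the rejection that motivated the original default.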
ECS mappings enough?
Another change that has happened since the initial discussion is the introduction of dynamic ECS templates. These ECS templates contain a block for matching various names to date, for example *.timestamp. Is this enough? Can we encourage users who want automatic matching to use one of the names here for their fields?
Overwriting the default
Today it is already possible to overwrite the default by using logs@custom. As soon as data streams roll over, the new default is applied, and it will also be persisted during upgrades. If users are using an integration, logs-{your-dataset}@custom can be overwritten to make the change more local. But can we make a change even easier, for example by bringing it to the UI or making it a setting per data stream instead of the template? Could we have a switch in Stack Management to toggle date detection on and off?
There is also work ongoing by the team around @flash1293 to detect wrong mappings in fields. Can we detect that a field should be a date and recommend a mapping change?