Default LogsDB value for ignore_dynamic_beyond_limit
#115265
Conversation
Pinging @elastic/es-storage-engine (Team:StorageEngine)
@elasticmachine update branch
`index.mapping.total_fields.ignore_dynamic_beyond_limit`
public static final IndexVersion UPGRADE_TO_LUCENE_10_0_0 = def(9_000_00_0, Version.LUCENE_10_0_0);
public static final IndexVersion LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT = def(9_000_00_1, Version.LUCENE_10_0_0);
Do you know if this is okay to backport? I suspect there could be problems when backporting to 8.16.
The way I see it:
- No issue for users who have explicitly set the value (whether `false` or `true`), since an explicit setting overrides the default.
- Change in behavior for users relying on the default `false`: after upgrading, documents that would have been rejected (due to exceeding the field limit) will now be indexed with the extra fields ignored, and only for LogsDB.

While this change doesn't technically "break" anything, it may surprise users who were relying on documents being rejected in case of mapping issues. Moving from a default of `false` to a default of `true` means accepting documents that would previously have been rejected, so the change doesn't introduce new errors; if anything, we will have fewer indexing errors.

I am fine backporting only to 8.17 anyway.
I understand your reasoning and it makes sense. I suspect that, purely technically, this may be hard to backport even to 8.17. For example, `UPGRADE_TO_LUCENE_10_0_0` does not exist in 8.x. We may need guidance from someone on this.
Based on the recent versions in this file, shouldn't the new version be `9_001_00_0`, since we're not creating a patch version?

When backporting, I don't think the same version id can be used. Based on the latest index version in 8.x, I believe this should be `8_518_00_0`. However, after reading the comment in the `IndexVersions` class, I don't fully understand how the main branch should be updated after the backport has merged. I think a new index version constant needs to be added with version id `8_518_00_0`? We would then have two versions, for stateful and stateless. But I may be wrong here. @thecoop, would you be able to advise on index versioning and backporting, now that main uses a different major index version?

Additionally, I don't think we need to backport this to the 8.16 branch.
I would expect that when backporting this we get a compiler failure, because the index version I added here uses a Lucene version that does not exist in the other branch. In that case, I expect the backport just needs to add the appropriate new version (the next one), something like `8_518_00_0`, or a newer one if that id has already been taken by another backport.
When backporting, you need to introduce a new version constant `8_518_00_0`, which represents 'this change, but with Lucene 9' as a backport, and checks on the 8.x branch need to use this constant, like `indexVersion >= BACKPORT_CONSTANT`.
You then also need to make `main` aware of that new constant, so when you're checking whether this new functionality is present, the check needs to be something like `indexVersion.between(BACKPORT_CONSTANT, 9_000_00_0) || indexVersion >= MAIN_CONSTANT`.
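A minimal sketch of this backport-aware check, using plain ints in place of the real `IndexVersion` API. The class and constant names are illustrative; the version ids are the ones discussed in this thread:

```java
// Hypothetical sketch of the backport-aware index-version check described above.
// Plain ints stand in for real IndexVersion objects.
public class BackportVersionCheck {
    // 8.x backport constant: "this change, but with Lucene 9"
    static final int BACKPORT_CONSTANT = 8_518_00_0;
    // First 9.x index version id (main moved to a new major id)
    static final int FIRST_9_X = 9_000_00_0;
    // Constant added on main for this change
    static final int MAIN_CONSTANT = 9_000_00_1;

    // True if an index created at indexVersion has the new behavior,
    // whether it came from the 8.x backport or from main.
    static boolean hasNewDefault(int indexVersion) {
        boolean fromBackport = indexVersion >= BACKPORT_CONSTANT && indexVersion < FIRST_9_X;
        boolean fromMain = indexVersion >= MAIN_CONSTANT;
        return fromBackport || fromMain;
    }
}
```

With this shape, an 8.x index created before the backport and a 9.x index created before the change both report `false`, while the backported 8.x id and the new main id report `true`.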
@@ -176,7 +176,7 @@ private void forEncode(int token, int tokenBits, long[] in, DataOutput out) thro
 * Encode the given longs using a combination of delta-coding, GCD factorization and bit packing.
 */
 void encode(long[] in, DataOutput out) throws IOException {
-    assert in.length == numericBlockSize;
+    // assert in.length == numericBlockSize;
Why are these asserts commented out?
This is probably a mistake from merging `main`. This file is not expected to be touched here.
LGTM
    IndexVersions.UPGRADE_TO_LUCENE_10_0_0
) || indexVersionCreated.onOrAfter(IndexVersions.LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT);
return String.valueOf(isLogsDBIndexMode && isNewIndexVersion);
},
@thecoop @martijnvg I updated the check here to include a new version, `LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT_BACKPORT`, which uses Lucene 9.
When backporting this, I will just remove any reference to the constant required on main, `LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT`, and simply check that the version is `onOrAfter` `LOGSDB_DEFAULT_IGNORE_DYNAMIC_BEYOND_LIMIT_BACKPORT`.
IndexVersion LGTM
When ingesting logs, it's important to ensure that documents are not dropped due to mapping issues, including when dealing with dynamically mapped fields. Elasticsearch provides two key settings that help manage the total number of field mappings and handle situations where this limit might be exceeded:

1. **`index.mapping.total_fields.limit`**: This setting defines the maximum number of fields allowed in an index. Once this limit is reached, any further mapped fields cause indexing to fail.
2. **`index.mapping.total_fields.ignore_dynamic_beyond_limit`**: This setting determines whether Elasticsearch should ignore any dynamically mapped fields that exceed the limit defined by `index.mapping.total_fields.limit`. If set to `false`, indexing fails once the limit is surpassed. If set to `true`, Elasticsearch continues indexing the document but silently ignores any additional dynamically mapped fields beyond the limit.

To prevent indexing failures due to dynamic mapping issues, especially in logs where the schema might change frequently, we change the default value of **`index.mapping.total_fields.ignore_dynamic_beyond_limit` from `false` to `true` in LogsDB**. This change ensures that even when the number of dynamically mapped fields exceeds the set limit, documents are still indexed, and the additional fields are simply ignored rather than causing an indexing failure. This adjustment is important for LogsDB, where dynamically mapped fields may be common and we want to avoid documents being dropped.
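For reference, both settings can also be set explicitly at index creation; a minimal sketch in Elasticsearch console syntax (the index name and the limit value of 100 are illustrative, not defaults):

```console
PUT /my-logs-index
{
  "settings": {
    "index.mode": "logsdb",
    "index.mapping.total_fields.limit": 100,
    "index.mapping.total_fields.ignore_dynamic_beyond_limit": true
  }
}
```

An explicit value like this always takes precedence over the mode-specific default discussed above.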