-
Notifications
You must be signed in to change notification settings - Fork 0
Add default sort for message.template_id field in logsdb indices #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @visz11, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new capability for LogsDB indices, allowing users to configure a default sort order that includes the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
WalkthroughAdds a new index setting index.logsdb.sort_on_message_template, updates default LogsDB index-sort assembly to optionally include message.template_id, extends LogsdbIndexModeSettingsProvider to derive this from settings/mappings, and introduces corresponding unit, integration, and REST tests. Also registers the setting in built-in index settings. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Template as Index Template/Mapping
participant Provider as LogsdbIndexModeSettingsProvider
participant Settings as IndexSettings
participant SortConfig as IndexSortConfig
Client->>Template: Create data stream/index with mappings
Template->>Settings: Apply index settings (includes logsdb flags)
Settings->>Provider: Resolve MappingHints
Note over Provider: Determine sortOnHostName and sortOnMessageTemplate<br/>- from settings<br/>- or mapping (PatternText on message)
Provider-->>Settings: MappingHints{ sortOnHostName, sortOnMessageTemplate, ... }
Settings->>SortConfig: Request default sort specs (LogsDB mode)
SortConfig->>SortConfig: Build specs list<br/>[host.name?] + [message.template_id?] + @timestamp
SortConfig-->>Settings: FieldSortSpec[]
Settings-->>Client: Index created with index.sort.*
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a default sort on the message.template_id
field for logsdb
indices, which is a valuable enhancement for log data management. The implementation is solid, adding a new index setting and automatically enabling it when a pattern_text
mapping is used for the message
field. The changes are well-tested across different levels. My review includes a suggestion to improve code consistency for maintainability and a fix for a minor typo in an integration test.
{ | ||
"_doc": { | ||
"properties": { | ||
"@timestmap": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hostnameSpec.missingValue = "_last"; | ||
hostnameSpec.mode = MultiValueMode.MIN; | ||
HOSTNAME_TIMESTAMP_SORT = new FieldSortSpec[] { hostnameSpec, timeStampSpec }; | ||
MESSAGE_PATTERN_SPEC = new FieldSortSpec("message.template_id"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency and readability, it would be better to explicitly define the properties for MESSAGE_PATTERN_SPEC
, similar to how HOSTNAME_SPEC
is defined. While the default values would be correctly applied, being explicit makes the configuration clearer. You can use a constructor overload and set the mode
property to make it compact.
MESSAGE_PATTERN_SPEC = new FieldSortSpec("message.template_id"); | |
MESSAGE_PATTERN_SPEC = new FieldSortSpec("message.template_id", SortOrder.ASC, "_last"); | |
MESSAGE_PATTERN_SPEC.mode = MultiValueMode.MIN; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
x-pack/plugin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProvider.java (3)
60-66
: Mapping includes expanded to message fields — OK; consider narrowing for perfLooks good for enabling message/template detection. Optionally narrow to just properties.message and properties.message.template_id (and their _doc.* variants) to reduce mapping payload processed.
258-265
: Early-return condition became stricter; restore host-only fast pathPreviously, host-name add/sort being set would short-circuit without creating a MapperService. Now it requires message-template too, potentially increasing overhead. Relax the condition and propagate the message-template flag independently.
- if (IndexSettings.LOGSDB_SORT_ON_HOST_NAME.get(indexTemplateAndCreateRequestSettings) - && IndexSettings.LOGSDB_ADD_HOST_NAME_FIELD.get(indexTemplateAndCreateRequestSettings) - && IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings)) { - // Settings for adding and sorting on [host.name] are already set, propagate them. - return new MappingHints(hasSyntheticSourceUsage, true, true, true, true); - } + if (IndexSettings.LOGSDB_SORT_ON_HOST_NAME.get(indexTemplateAndCreateRequestSettings) + && IndexSettings.LOGSDB_ADD_HOST_NAME_FIELD.get(indexTemplateAndCreateRequestSettings)) { + // Settings for adding and sorting on [host.name] are already set, propagate them. + boolean sortOnMsg = IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings); + return new MappingHints(hasSyntheticSourceUsage, true, true, sortOnMsg, true); + }
309-317
: Derive sortOnMessageTemplate from message.template_id presence/doc valuesChecking the mapper type of message is an indirect signal. Prefer verifying the concrete sub-field used for sorting (message.template_id) exists and has doc values. This avoids false positives if message is PatternText but template_id isn’t available.
- boolean sortOnMessageTemplate = IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings) - || (messageField instanceof PatternTextFieldMapper); + boolean sortOnMessageTemplate = IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings) + || (mapperService.mappingLookup().getMapper("message.template_id") instanceof KeywordFieldMapper tfm + && tfm.fieldType().hasDocValues());server/src/main/java/org/elasticsearch/index/IndexSortConfig.java (1)
110-125
: Avoid reusing mutable static FieldSortSpec instancesThe static FieldSortSpec objects (HOSTNAME_SPEC, MESSAGE_PATTERN_SPEC, TIMESTAMP_SPEC) are mutable. Returning them directly risks accidental mutation leaking across callers/tests. Prefer adding copies to the list.
Apply this diff within the assembly to copy specs before adding:
- List<FieldSortSpec> sortSpecs = new ArrayList<>(3); - if (IndexSettings.LOGSDB_SORT_ON_HOST_NAME.get(settings)) { - sortSpecs.add(HOSTNAME_SPEC); - } - if (IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(settings)) { - sortSpecs.add(MESSAGE_PATTERN_SPEC); - } - sortSpecs.add(TIMESTAMP_SPEC); + List<FieldSortSpec> sortSpecs = new ArrayList<>(3); + if (IndexSettings.LOGSDB_SORT_ON_HOST_NAME.get(settings)) { + var s = new FieldSortSpec(HOSTNAME_SPEC.field); + s.order = HOSTNAME_SPEC.order; + s.missingValue = HOSTNAME_SPEC.missingValue; + s.mode = HOSTNAME_SPEC.mode; + sortSpecs.add(s); + } + if (IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(settings)) { + var s = new FieldSortSpec(MESSAGE_PATTERN_SPEC.field); + s.order = MESSAGE_PATTERN_SPEC.order; + s.missingValue = MESSAGE_PATTERN_SPEC.missingValue; + s.mode = MESSAGE_PATTERN_SPEC.mode; + sortSpecs.add(s); + } + { + var s = new FieldSortSpec(TIMESTAMP_SPEC.field); + s.order = TIMESTAMP_SPEC.order; + s.missingValue = TIMESTAMP_SPEC.missingValue; + s.mode = TIMESTAMP_SPEC.mode; + sortSpecs.add(s); + }Optionally, add a small helper to reduce repetition:
private static FieldSortSpec copy(FieldSortSpec src) { var s = new FieldSortSpec(src.field); s.order = src.order; s.missingValue = src.missingValue; s.mode = src.mode; return s; }Also applies to: 155-165
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
server/src/main/java/org/elasticsearch/common/settings/IndexScopedSettings.java
(1 hunks)server/src/main/java/org/elasticsearch/index/IndexSettings.java
(1 hunks)server/src/main/java/org/elasticsearch/index/IndexSortConfig.java
(3 hunks)server/src/test/java/org/elasticsearch/index/IndexSortSettingsTests.java
(1 hunks)x-pack/plugin/logsdb/src/internalClusterTest/java/org/elasticsearch/xpack/logsdb/LogsdbSortConfigIT.java
(1 hunks)x-pack/plugin/logsdb/src/javaRestTest/java/org/elasticsearch/xpack/logsdb/patterntext/PatternTextBasicLicenseIT.java
(1 hunks)x-pack/plugin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProvider.java
(6 hunks)x-pack/plugin/logsdb/src/test/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProviderTests.java
(2 hunks)x-pack/plugin/logsdb/src/yamlRestTest/resources/rest-api-spec/test/80_index_sort_defaults.yml
(1 hunks)
🔇 Additional comments (12)
x-pack/plugin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProvider.java (3)
212-220
: MappingHints expansion — LGTM
298-299
: Message field lookup — LGTMRetrieving the mapper for "message" after filtering mappings is correct.
160-163
: Guard message-template sort behind templating being allowedIf templating is disabled by license, enabling LOGSDB_SORT_ON_MESSAGE_TEMPLATE can create inconsistent settings relative to the actual mapping. Guard this with the license check.
Apply this diff:
- if (mappingHints.sortOnMessageTemplate) { - additionalSettings.put(IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.getKey(), true); - } + if (mappingHints.sortOnMessageTemplate) { + if (licenseService.allowPatternTextTemplating(isTemplateValidation)) { + additionalSettings.put(IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.getKey(), true); + } else { + LOGGER.debug( + "creation of index [{}]: skipping message-template sort because templating is disabled", + indexName + ); + } + }To double-check wiring of the new setting through the default index-sort assembly, run:
server/src/main/java/org/elasticsearch/common/settings/IndexScopedSettings.java (1)
200-201
: Built-in setting registration looks correctLOGSDB_SORT_ON_MESSAGE_TEMPLATE is properly added to the built-in index settings set.
server/src/main/java/org/elasticsearch/index/IndexSettings.java (1)
671-677
: New LogsDB setting is well-definedBoolean, IndexScope, PrivateIndex, Final — consistent with other LogsDB flags.
x-pack/plugin/logsdb/src/test/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProviderTests.java (2)
981-984
: License gating assertion is appropriateAsserting DISABLE_TEMPLATING when pattern_text is present under basic license is correct.
999-1009
: Propagation of message-template sort flag verifiedGood coverage that the new flag passes through provider settings.
server/src/test/java/org/elasticsearch/index/IndexSortSettingsTests.java (2)
335-354
: Unit test for message-template sort defaults looks goodValidates two-spec sort: message.template_id asc/min, @timestamp desc/max.
355-378
: Combined hostname + message + timestamp sort test is solidCovers ordering, mode, and missing semantics for all three fields.
x-pack/plugin/logsdb/src/yamlRestTest/resources/rest-api-spec/test/80_index_sort_defaults.yml (2)
132-186
: REST spec: message.template_id + timestamp defaultsGood coverage asserting fields, order, mode, and missing; matches server defaults.
189-241
: REST spec: host.name + message.template_id + timestamp defaultsValidates full tri-field sort defaults. Looks correct.
x-pack/plugin/logsdb/src/javaRestTest/java/org/elasticsearch/xpack/logsdb/patterntext/PatternTextBasicLicenseIT.java (1)
1-145
: New IT validates basic-license behavior and LogsDB sort defaultsAsserts templating disabled and correct sort fields on backing indices; covers rollover. LGTM.
final String mapping = """ | ||
{ | ||
"_doc": { | ||
"properties": { | ||
"@timestmap": { | ||
"type": "date" | ||
}, | ||
"host.name": { | ||
"type": "keyword" | ||
}, | ||
"message": { | ||
"type": "pattern_text" | ||
}, | ||
"test_id": { | ||
"type": "text", | ||
"store": true | ||
} | ||
} | ||
} | ||
} | ||
"""; | ||
|
||
final DocWithId[] orderedDocs = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix mapping typo: @timestmap → @timestamp
The mapping defines "@timestmap" (typo). Use "@timestamp" to align with data stream timestamp and test assertions.
- "@timestmap": {
+ "@timestamp": {
"type": "date"
},
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
final String mapping = """ | |
{ | |
"_doc": { | |
"properties": { | |
"@timestmap": { | |
"type": "date" | |
}, | |
"host.name": { | |
"type": "keyword" | |
}, | |
"message": { | |
"type": "pattern_text" | |
}, | |
"test_id": { | |
"type": "text", | |
"store": true | |
} | |
} | |
} | |
} | |
"""; | |
final DocWithId[] orderedDocs = { | |
final String mapping = """ | |
{ | |
"_doc": { | |
"properties": { | |
"@timestamp": { | |
"type": "date" | |
}, | |
"host.name": { | |
"type": "keyword" | |
}, | |
"message": { | |
"type": "pattern_text" | |
}, | |
"test_id": { | |
"type": "text", | |
"store": true | |
} | |
} | |
} | |
} | |
"""; | |
final DocWithId[] orderedDocs = { |
🤖 Prompt for AI Agents
In
x-pack/plugin/logsdb/src/internalClusterTest/java/org/elasticsearch/xpack/logsdb/LogsdbSortConfigIT.java
around lines 82 to 104 the mapping contains a typo: the field is defined as
"@timestmap" but should be "@timestamp". Update the mapping string to rename
"@timestmap" to "@timestamp" (keeping type "date") so it matches the data stream
timestamp field used by the test assertions.
/refacto-visz |
1 similar comment
/refacto-visz |
Refacto is reviewing this PR. Please wait for the review comments to be posted. |
Code Review: LogsDB Sort Configuration Enhancement👍 Well Done
📁 Selected files for review (9)
🎯 Custom Instructions
📝 Additional Comments
|
&& IndexSettings.LOGSDB_ADD_HOST_NAME_FIELD.get(indexTemplateAndCreateRequestSettings) | ||
&& IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic Condition Error
Boolean logic uses AND operator requiring both hostname sorting AND hostname field addition to be true simultaneously. This creates impossible condition where message template sorting cannot be enabled independently. Logic should use OR operator to allow either hostname or message template sorting to trigger the enhanced mapping hints.
if ((IndexSettings.LOGSDB_SORT_ON_HOST_NAME.get(indexTemplateAndCreateRequestSettings)
&& IndexSettings.LOGSDB_ADD_HOST_NAME_FIELD.get(indexTemplateAndCreateRequestSettings))
|| IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings)) {
Commitable Suggestion
&& IndexSettings.LOGSDB_ADD_HOST_NAME_FIELD.get(indexTemplateAndCreateRequestSettings) | |
&& IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings)) { | |
if ((IndexSettings.LOGSDB_SORT_ON_HOST_NAME.get(indexTemplateAndCreateRequestSettings) | |
&& IndexSettings.LOGSDB_ADD_HOST_NAME_FIELD.get(indexTemplateAndCreateRequestSettings)) | |
|| IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings)) { |
Standards
- Algorithm-Correctness-Boolean-Logic
- Business-Rule-Feature-Independence
- Logic-Verification-Conditional-Logic
&& IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings)) { | ||
// Settings for adding and sorting on [host.name] are already set, propagate them. | ||
return new MappingHints(hasSyntheticSourceUsage, true, true, true); | ||
return new MappingHints(hasSyntheticSourceUsage, true, true, true, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent Return Values
Hard-coded return statement assumes all boolean flags are true when condition is met. Logic inconsistency occurs because method returns true for sortOnMessageTemplate regardless of actual LOGSDB_SORT_ON_MESSAGE_TEMPLATE setting value. This bypasses proper setting evaluation and creates incorrect mapping configuration.
boolean sortOnHostName = IndexSettings.LOGSDB_SORT_ON_HOST_NAME.get(indexTemplateAndCreateRequestSettings);
boolean addHostNameField = IndexSettings.LOGSDB_ADD_HOST_NAME_FIELD.get(indexTemplateAndCreateRequestSettings);
boolean sortOnMessageTemplate = IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings);
return new MappingHints(hasSyntheticSourceUsage, sortOnHostName, addHostNameField, sortOnMessageTemplate, true);
Commitable Suggestion
return new MappingHints(hasSyntheticSourceUsage, true, true, true, true); | |
boolean sortOnHostName = IndexSettings.LOGSDB_SORT_ON_HOST_NAME.get(indexTemplateAndCreateRequestSettings); | |
boolean addHostNameField = IndexSettings.LOGSDB_ADD_HOST_NAME_FIELD.get(indexTemplateAndCreateRequestSettings); | |
boolean sortOnMessageTemplate = IndexSettings.LOGSDB_SORT_ON_MESSAGE_TEMPLATE.get(indexTemplateAndCreateRequestSettings); | |
return new MappingHints(hasSyntheticSourceUsage, sortOnHostName, addHostNameField, sortOnMessageTemplate, true); |
Standards
- Algorithm-Correctness-Return-Logic
- Business-Rule-Setting-Evaluation
- Logic-Verification-State-Consistency
{ | ||
"_doc": { | ||
"properties": { | ||
"@timestmap": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo Field Reference
Field name contains typo '@timestmap' instead of '@timestamp' in mapping configuration. Incorrect field reference prevents proper timestamp handling causing mapping failures. Index creation reliability affected by configuration validation errors.
Standards
- ISO-IEC-25010-Functional-Correctness-Appropriateness
- DbC-Precondition-Validation
hostnameSpec.missingValue = "_last"; | ||
hostnameSpec.mode = MultiValueMode.MIN; | ||
HOSTNAME_TIMESTAMP_SORT = new FieldSortSpec[] { hostnameSpec, timeStampSpec }; | ||
MESSAGE_PATTERN_SPEC = new FieldSortSpec("message.template_id"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Magic String Constant
Hardcoded string "message.template_id" creates coupling and maintenance risk. Field name changes require hunting through code. Extract to named constant for centralized management and refactoring safety.
Standards
- Clean-Code-Constants
- Maintainability-Quality-Coupling
- Refactoring-Extract-Constant
Fixes elastic#135031
Summary by CodeRabbit
New Features
Tests