Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add index level setting to unmap fields beyond total fields limit #14939

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

gaobinlong
Copy link
Collaborator

@gaobinlong gaobinlong commented Jul 24, 2024

Description

This PR introduces a new index level setting index.mapping.total_fields.unmap_fields_beyond_limit, the default value is false, if it's set to true, then all the new detected fields beyond the index.mapping.total_fields.limit will not be added to the mapping, not indexed, not searchable but still be stored in the _source field.

Example usage:

  1. Create an index
PUT index1
{
  "settings": {
    "index.mapping.total_fields.limit":3,
    "index.mapping.total_fields.unmap_fields_beyond_limit":true
  }
}
  1. Index a document
POST index1/_doc/1
{
  "field1": "field1",
  "field2": 10000,
  "field3": [
    1,
    2,
    3
  ],
  "field4": {
    "field5": "field5"
  }
}
  1. Check the document
GET index1/_doc/1
{
  "_index": "index1",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "field1": "field1",
    "field2": 10000,
    "field3": [
      1,
      2,
      3
    ],
    "field4": {
      "field5": "field5"
    }
  }
}
  1. Check the mapping
GET index1/_mapping
{
  "index1": {
    "mappings": {
      "properties": {
        "field1": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "field2": {
          "type": "long"
        }
      }
    }
  }
}

, we can see that the fields field3, field4 and field5 are not in the mapping because the total fields limit is 3.

Related Issues

#14031

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Copy link
Contributor

✅ Gradle check result for 1307156: SUCCESS

Copy link
Contributor

✅ Gradle check result for e13f314: SUCCESS

Copy link

codecov bot commented Jul 24, 2024

Codecov Report

Attention: Patch coverage is 91.83673% with 4 lines in your changes missing coverage. Please review.

Project coverage is 71.95%. Comparing base (d442f7c) to head (9495dae).
Report is 1 commits behind head on main.

Files Patch % Lines
...va/org/opensearch/index/mapper/DocumentParser.java 90.90% 1 Missing and 1 partial ⚠️
...ava/org/opensearch/index/mapper/MappingLookup.java 93.33% 0 Missing and 1 partial ⚠️
...java/org/opensearch/index/mapper/ParseContext.java 83.33% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #14939      +/-   ##
============================================
- Coverage     71.95%   71.95%   -0.01%     
+ Complexity    63128    63092      -36     
============================================
  Files          5196     5196              
  Lines        295311   295357      +46     
  Branches      42677    42683       +6     
============================================
+ Hits         212496   212528      +32     
+ Misses        65436    65380      -56     
- Partials      17379    17449      +70     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gaobinlong gaobinlong added backport 2.x Backport to 2.x branch Indexing Indexing, Bulk Indexing and anything related to indexing enhancement Enhancement or improvement to existing feature or request labels Jul 24, 2024
return true;
}

List<ObjectMapper> newObjectMappers = new ArrayList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method and the one below look expensive to be happening on every new field addition. Can you optimize these?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vikasvb90 , I've made some optimization for it, the main point is to add a new field dynamicFieldCount to the class MappingLookup, when a new dynamic mapper is added, we increase the dynamicFieldCount, then we check if the current total fields count exceed the limit, so we don't need to calculate the dynamic field count from the dynamicMappers every time the check happens. Please help to take a look, thank you!

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Copy link
Contributor

✅ Gradle check result for 52d051d: SUCCESS

Copy link
Contributor

✅ Gradle check result for 814f715: SUCCESS

Copy link
Contributor

❌ Gradle check result for 553f304: ABORTED

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Copy link
Contributor

❕ Gradle check result for 456e8ba: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Copy link
Contributor

❌ Gradle check result for 9495dae: ABORTED

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

✅ Gradle check result for 9495dae: SUCCESS

Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Copy link
Contributor

❕ Gradle check result for 9f15262: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@opensearch-trigger-bot
Copy link
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added the stalled Issues that have stalled label Sep 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing stalled Issues that have stalled
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants