Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store dynamic mapping updates as builders #90674

Merged
merged 17 commits into from
Sep 19, 2023

Conversation

romseygeek
Copy link
Contributor

Currently, newly created dynamic mappers as stored on the parser context as
instantiated mappers. This means that when we build the mapping update from
the collected dynamic mappers, we need to convert them back into builders
again, and immediately re-build them. This commit changes the context to
instead store just the builders, which results in a small simplification to the
dynamic update build logic and removes another method on ObjectMapper.Builder.

@romseygeek romseygeek added :Search Foundations/Mapping Index mappings, including merging and defining field types >refactoring v8.6.0 labels Oct 5, 2022
@romseygeek romseygeek self-assigned this Oct 5, 2022
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Oct 5, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@javanna javanna self-requested a review October 6, 2022 10:46
@kingherc kingherc added v8.7.0 and removed v8.6.0 labels Nov 16, 2022
@rjernst rjernst added v8.8.0 and removed v8.7.0 labels Feb 8, 2023
@gmarouli gmarouli added v8.9.0 and removed v8.8.0 labels Apr 26, 2023
@quux00 quux00 added v8.11.0 and removed v8.10.0 labels Aug 16, 2023
Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me of #87866 . I am rusty on mappings code, and remember vaguely some performance regression I introduced with #87866 but I can't remember the root cause.

There are some subtleties that I'd like to better understand. It feels like a good change, I am just a bit paranoid that it may break something. I'd like to have a deeper look.

}
}
if (dynamicMappers.containsKey(fullName)) {
// we've got the same dynamic mapper added in two ways, maybe due to a difference
// in dot notation: { "a" : { "b" : "value1" }, "a.b" : "value2" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this possible in practice? we either expand dots or subobjects are disallowed? Would it be possible to instead replace this with an assertion to enforce that it never really happens?

ObjectMapper.Builder builder = context.getDynamicObjectMapper(fullName);
if (builder != null) {
// we re-use builder instances so if the builder has already been
// added we don't need to do so again
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would cause this situation in practice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the input document has mixed object and dot notation for fields then we can end up creating the same parent object builder twice.

for (Mapper mapper : context.getDynamicMappers()) {
rootBuilder.addDynamic(mapper.name(), null, mapper, context);
}
context.getDynamicMappers().forEach((name, builder) -> rootBuilder.addDynamic(name, null, builder, context));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just cosmetic?

// we could potentially skip the step of adding these to the dynamic mappers, because their parent is already added to
// that list, and what is important is that all of the intermediate objects are added to the dynamic object mappers so that
// they can be looked up once sub-fields need to be added to them. For simplicity, we treat these like any other object
addDynamicMapper(submapper);
addDynamicMapper(fullName + "." + submapper.name, submapper);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the full name is actually the path? How does this work with subobjects false? Is it then empty and it's all in the leaf name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This never gets called with subobjects:false because we're explicitly dealing with subobjects here.

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea here a lot :) can you merge in main and resolve conflicts so we have a clean diff for a final review?

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, to me it seems like this should be faster even after going over it a couple times :) Could still be missing something too obviously but I think it's a reasonable risk to take 🤞

@romseygeek
Copy link
Contributor Author

@elasticmachine update branch

@romseygeek romseygeek merged commit 9a87670 into elastic:main Sep 19, 2023
@romseygeek romseygeek deleted the mapper/dynamic-builders branch September 19, 2023 13:38
@felixbarny
Copy link
Member

This conflicts with #96235. I'm trying to get the size of all dynamic mappers in DocumentParserContext#addDynamicMapper by calling a new method in Mapper#mapperSize that also takes into account the size of the child mappers, for example if a field has multi-fields. This is important to evaluate whether or not the field limit has been reached. I don't think the mapperSize method can easily be implemented on top of Mapper.Builder as it doesn't have access to its children.

@felixbarny
Copy link
Member

Maybe it's not as difficult to implement mapperSize in Mapper.Bulider as it first seemed. While there's no access to child mappers/multi-fields in the abstract Mapper.Builder class itself, we can access that in the abstract classes ObjectMapper.Builder and FieldMapper.Builder.

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Jan 3, 2024
elasticsearchmachine pushed a commit that referenced this pull request Jan 4, 2024
Reverts #90674

The revert is not perfectly clean as there are some minor adjustments to
account for later changes.

This is in contrast with:
#103858

closes: #103011
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Jan 4, 2024
Reverts elastic#90674

The revert is not perfectly clean as there are some minor adjustments to
account for later changes.

This is in contrast with:
elastic#103858

closes: elastic#103011
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Jan 4, 2024
Reverts elastic#90674

The revert is not perfectly clean as there are some minor adjustments to
account for later changes.

This is in contrast with:
elastic#103858

closes: elastic#103011
elasticsearchmachine pushed a commit that referenced this pull request Jan 4, 2024
Reverts #90674

The revert is not perfectly clean as there are some minor adjustments to
account for later changes.

This is in contrast with:
#103858

closes: #103011
elasticsearchmachine pushed a commit that referenced this pull request Jan 4, 2024
Reverts #90674

The revert is not perfectly clean as there are some minor adjustments to
account for later changes.

This is in contrast with:
#103858

closes: #103011
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>refactoring :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.