Add size or number of fields limit on dynamic mapping #11443

Closed
yanjunh opened this issue Jun 1, 2015 · 13 comments
Labels: >enhancement, help wanted, adoptme, :Search Foundations/Mapping (Index mappings, including merging and defining field types), Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch)

Comments

@yanjunh
Contributor

yanjunh commented Jun 1, 2015

Object mappings are dynamic by default. The mapping size explodes when the incoming document stream contains dynamic keys such as UUID strings, time strings, etc. For example, {"fields":{"de305d54-75b4-431b-adb2-eb6b9e546014":{...}}}. It can kill the cluster very fast at a high indexing rate. We need a way to limit the mapping by size or by number of fields. When the limit is reached, the object mapping behavior changes from dynamic:true to dynamic:false.
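
To make the growth pattern concrete, here is a minimal sketch that simulates it outside Elasticsearch. It only approximates dynamic mapping by collecting the dotted paths of leaf fields; the helper names `field_paths` and `simulate_dynamic_mapping` are hypothetical and not part of any Elasticsearch API.

```python
import uuid


def field_paths(doc, prefix=""):
    """Yield the dotted path of every leaf field in a JSON-like dict."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from field_paths(value, path)
        else:
            yield path


def simulate_dynamic_mapping(docs):
    """Return the set of leaf field paths a dynamic:true mapping would pick up."""
    mapping = set()
    for doc in docs:
        mapping.update(field_paths(doc))
    return mapping


# Stable keys: the UUID is a value, so the mapping stops growing immediately.
stable_docs = [
    {"fields": {"id": str(uuid.uuid4()), "status": "ok"}} for _ in range(1000)
]
# UUIDs used as keys: every document introduces a brand-new field path.
uuid_key_docs = [
    {"fields": {str(uuid.uuid4()): {"status": "ok"}}} for _ in range(1000)
]

stable_count = len(simulate_dynamic_mapping(stable_docs))
exploded_count = len(simulate_dynamic_mapping(uuid_key_docs))
print(stable_count, exploded_count)
```

In this simulation the number of fields for the UUID-keyed documents grows linearly with the number of documents, which is the unbounded growth the issue describes.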

@clintongormley
Contributor

Hi @yanjunh

This feels like the wrong approach. Why would we index field foo-bar-123 and reject foo-bar-124? Why index the first field at all, if you're just going to throw away the next field arbitrarily?

I'm afraid the proposed solution is not a solution at all. Instead, you need to change how you're indexing the data, or how you're adding fields.
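
One possible reading of "change how you're indexing the data" is to move the dynamic key into a value before the document is sent, so the set of field names stays fixed. The sketch below is an assumption about what such a client-side reshaping could look like; `keys_to_values`, `container`, and `key_name` are hypothetical names, not something proposed in this thread.

```python
def keys_to_values(doc, container="fields", key_name="key"):
    """Rewrite {container: {k: {...}}} as {container: [{key_name: k, ...}]}.

    The dynamic key becomes a value stored under `key_name`, so the mapping
    only ever sees a fixed set of field names.
    """
    reshaped = dict(doc)
    entries = doc.get(container, {})
    reshaped[container] = [
        # Non-dict values are wrapped under a fixed "value" field.
        {**(value if isinstance(value, dict) else {"value": value}), key_name: key}
        for key, value in entries.items()
    ]
    return reshaped


doc = {"fields": {"de305d54-75b4-431b-adb2-eb6b9e546014": {"status": "ok"}}}
reshaped = keys_to_values(doc)
print(reshaped)
```

Whether an array of objects like this is queryable in the way a given use case needs depends on how the field is mapped, so treat this as a starting point rather than a recommendation.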

@nik9000
Member

nik9000 commented Jun 2, 2015

I think the idea is to catch mistakes and prevent the cluster from becoming unstable.

@clintongormley
Contributor

@nik9000 Sure, but this is still the wrong solution. We're not going to add a setting which arbitrarily rejects fields if X many fields already exist.

@yanjunh
Contributor Author

yanjunh commented Jun 2, 2015

@clintongormley We have been telling developers about the limitation, but we can't prevent them from producing unfriendly data. We thought about detecting it when data flows through Logstash, but that's not a good place. What's the recommended way to guard against unbounded mapping growth?

@nik9000
Member

nik9000 commented Jun 2, 2015

> What's the recommended way to guard against unbounded mapping growth?

Strong language.

Seriously though, for my use case I turn mapping to non-dynamic and only add fields "intentionally" but that's not always possible.

Also - I imagine this is something you could catch in code review and by giving all developers their own instance to test against. But I'm still for adding some extra defence against it to Elasticsearch.
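
As an illustration of turning the mapping non-dynamic and adding fields intentionally, the sketch below builds a create-index request body with an explicit `dynamic` setting. It assumes the typeless mapping layout and field types (`text`, `date`) of recent Elasticsearch versions, which differ from the releases current when this thread was written; `explicit_mapping` and the field names are hypothetical.

```python
import json


def explicit_mapping(properties, dynamic="strict"):
    """Build a create-index body with an explicit 'dynamic' setting.

    dynamic=False   -> unknown fields are ignored and not added to the mapping
    dynamic="strict" -> documents containing unknown fields are rejected
    """
    allowed = dynamic is True or dynamic is False or dynamic == "strict"
    if not allowed:
        raise ValueError("dynamic must be True, False, or 'strict'")
    return {"mappings": {"dynamic": dynamic, "properties": properties}}


# Hypothetical fields, declared up front instead of discovered dynamically.
body = explicit_mapping(
    {"message": {"type": "text"}, "@timestamp": {"type": "date"}},
    dynamic=False,
)
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of an index-creation request; sending it is left out here to keep the sketch free of client-library assumptions.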

@yanjunh
Contributor Author

yanjunh commented Jun 2, 2015

@nik9000 Forgive my wording
We have no control over what people log. So far I can only set alerts for when a problem happens. Even so, it's often too late because the data is coming in so fast.

@clintongormley
Contributor

@yanjunh I'll reopen the issue for further discussion.

@clintongormley clintongormley reopened this Jun 2, 2015
@clintongormley clintongormley added discuss :Search Foundations/Mapping Index mappings, including merging and defining field types labels Jun 2, 2015
@jpountz
Contributor

jpountz commented Jun 3, 2015

Going from dynamic:true to dynamic:false is not an option in my opinion. If you index the same set of documents twice, you would get different mappings depending on the order in which documents have been indexed.

I would be open to discuss switching from dynamic:true to dynamic:strict however, so that the cause of the issue would be propagated to the client which is indexing the document. I can see value in admins being able to protect against client-side bugs and/or require consumers of Elasticsearch to follow some good practices.
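
The order dependence can be shown with a small simulation. This is a sketch under stated assumptions, not Elasticsearch behavior: `index_all`, `limit`, and `on_limit` are hypothetical, documents are flat dicts, and "reaching the limit" is modeled as a cap on the number of top-level field names.

```python
class FieldLimitExceeded(Exception):
    """Raised in 'strict' mode when a document would exceed the field limit."""


def index_all(docs, limit, on_limit):
    """Simulate dynamic mapping with a field limit.

    on_limit="false":  once the limit is hit, new fields are silently ignored.
    on_limit="strict": the offending document is rejected with an error.
    """
    mapping = []
    for doc in docs:
        new_fields = [key for key in doc if key not in mapping]
        if len(mapping) + len(new_fields) > limit:
            if on_limit == "strict":
                raise FieldLimitExceeded(f"rejected document: {doc!r}")
            # dynamic:false behavior: keep only as many new fields as still fit.
            new_fields = new_fields[: limit - len(mapping)]
        mapping.extend(new_fields)
    return sorted(mapping)


docs = [{"a": 1}, {"b": 2}, {"c": 3}]

# Same documents, different order, different final mapping.
forward = index_all(docs, limit=2, on_limit="false")
backward = index_all(list(reversed(docs)), limit=2, on_limit="false")
print(forward, backward)

# With "strict", the client that sent the offending document gets an error.
try:
    index_all(docs, limit=2, on_limit="strict")
except FieldLimitExceeded as exc:
    print(exc)
```

In the silent variant nobody learns that a field was dropped, while in the strict variant the failure surfaces at the client that sent the document.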

@nik9000
Member

nik9000 commented Jun 3, 2015

> I would be open to discuss switching from dynamic:true to dynamic:strict however, so that the cause of the issue would be propagated to the client which is indexing the document.

If dynamic:strict rejects the document then I'm +1 on that. So long as the setting is dynamically updatable, it shouldn't prevent people from doing this if they want; it would just get in their way and remind them it's a bad idea.

@yanjunh
Contributor Author

yanjunh commented Jun 3, 2015

I agree with dynamic:strict. My original thought was to keep the good part of the document and just drop the bad field. Dropping the whole document might be a better idea. That will force the data producers to structure their data in the right way. The data consumers can then effectively search and aggregate on properly structured data. In either approach, being able to protect the search cluster from bad data can be very useful.

@segalziv

+1 for setting a limit on the number of (dynamic) fields, and moving to dynamic:false when that limit is reached.

I'm also trying to protect Elasticsearch from becoming unstable due to too many fields.

@breml
Contributor

breml commented Nov 15, 2015

We use a custom analyzer to mark dynamically added fields, which allows us to monitor the count of added fields. More explanation in the following blog post: https://breml.github.io/blog/2015/10/28/tagging-of-dynamically-added-fields-in-elasticsearch/
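
Separately from the analyzer-based tagging described in the post, the total number of mapped fields can be watched by walking a mapping document. The sketch below is a rough illustration only: it assumes the `properties` / `fields` layout that Elasticsearch uses for object fields and multi-fields, and `count_mapped_fields`, the sample mapping, and the threshold are hypothetical.

```python
def count_mapped_fields(properties):
    """Count fields in a mapping 'properties' dict.

    Object fields, their sub-fields, and multi-fields are all counted.
    """
    total = 0
    for definition in properties.values():
        total += 1
        total += count_mapped_fields(definition.get("properties", {}))
        total += count_mapped_fields(definition.get("fields", {}))
    return total


# Hypothetical mapping fragment with one UUID-keyed object field.
sample_properties = {
    "message": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
    "fields": {
        "properties": {
            "de305d54-75b4-431b-adb2-eb6b9e546014": {
                "properties": {"status": {"type": "keyword"}}
            }
        }
    },
}

ALERT_THRESHOLD = 4  # hypothetical value, chosen only for this example
field_count = count_mapped_fields(sample_properties)
if field_count > ALERT_THRESHOLD:
    print(f"mapping has {field_count} fields, above {ALERT_THRESHOLD}")
```

A check like this only detects growth after the fact, which is the limitation raised earlier in the thread.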

@ppf2
Member

ppf2 commented Jan 6, 2016

+1 on providing an option for the safeguard. Another use case is key-value logging, using the LS KV filter with dynamic mappings so that these fields are automatically extracted and generated. Having a limit the user can set can help prevent an unintentional explosion in mappings when someone decides to use UUIDs as keys, etc.

@javanna javanna added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 16, 2024
8 participants