Search 'fields' option design + implementation #55363
Comments
Pinging @elastic/es-search (:Search/Search)
This commit adds the capability to `FieldTypeLookup` to retrieve a field's paths in the _source. When retrieving a field's values, we consult these source paths to make sure we load the relevant values. This allows us to handle requests for field aliases and multi-fields. We also retrieve values that were copied into the field through copy_to. To me this is what users would expect out of the API, and it's consistent with what comes back from `docvalues_fields` and `stored_fields`. However it does add some complexity, and was not something flagged as important from any of the clients I spoke to about this API. I'm looking for feedback on this point. Relates to #55363.
This PR replaces the marker interface with the method FieldMapper#parsesArrayValue. I find this cleaner and it will help with the fields retrieval work (elastic#55363). The refactor also ensures that only field mappers can declare they parse array values. Previously other types like ObjectMapper could implement the marker interface and be passed array values, which doesn't make sense.
I thought more about the question of whether we should apply
Tagging @jpountz @jimczi @javanna @nik9000 in case they have thoughts on the above, happy to discuss here!
+1 to support
@jtibshirani when you say "by default", does that mean it can be disabled? From the point of view of "when you retrieve, you get what you sent; when you search and aggregate, you search on and get back what was indexed", I am torn: I would personally expect the raw value loaded from source. Though if users can control what they get, I would not mind the default being normalized.
@javanna I was indeed wondering if we could make the behavior configurable, but I don't have immediate plans to do so. It's always nice to avoid configuration options and have strong defaults. I have also been on the fence about this; I can see arguments both ways. I suggest that we move forward with normalizing the values for now. I'm going to ask the teams planning to use this feature (SQL, ML, Kibana) to try to integrate with it before we ship it, and I have a short list of questions to ask them, which includes keyword normalization. The questions are tracked in the issue description.
We discussed how geo fields should be returned with @talevy and @imotov. A summary of our discussion:
An additional note: now that we'll return points in geojson format by default, for consistency we should accept this format when indexing points. We currently don't allow geojson; the work to add support is tracked in #47815.
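For readers unfamiliar with the format, here is a small sketch of what a point looks like in GeoJSON. The helper name `to_geojson_point` and the sample coordinates are made up for illustration, and the exact shape the API returns is not shown in this thread, so treat this only as a reminder of the standard GeoJSON layout, in which longitude comes before latitude.

```python
from typing import Any, Dict


def to_geojson_point(lat: float, lon: float) -> Dict[str, Any]:
    """Build a GeoJSON Point (hypothetical helper, for illustration only).

    GeoJSON orders coordinates as [longitude, latitude], which is the
    reverse of the common "lat, lon" way of writing a point.
    """
    return {"type": "Point", "coordinates": [lon, lat]}


# Sample values are arbitrary.
point = to_geojson_point(lat=41.12, lon=-71.34)
print(point)
```

The ordering is the part that tends to trip people up when switching between a "lat, lon" representation and GeoJSON.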
The feature branch was merged in #60100. I'll open new issues/PRs for the follow-up improvements mentioned in the description.
Original issue: #49028
Feature branch: field-retrieval
Docs: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/search-fields.html
Motivation
Often a user wants to retrieve a particular set of fields during a search. Currently, we don't support this usage pattern in a good way. In short, given a list of fields, there is no easy way to load all of their values:
Better field retrieval support is becoming even more important now that we're introducing more field types that don’t fit the typical pattern, like `constant_keyword` and the proposed runtime fields (#48063).
Feature Summary
We plan to add a new `fields` section to the search request, which users would specify instead of using source filtering to load fields from source:
Both full field names and wildcard patterns are accepted. Only leaf fields are returned; the API will not allow for fetching object values. The fields are returned as a flat list in the `fields` section in each hit, the same as we do for `docvalue_fields` and `script_fields`.
Overall, the API gives a friendly way to load fields from source:
… `fields`, then we’ll consult the mappings to find and return the right value.
… `format` parameter as we do for `docvalue_fields` to allow for adjusting the format of the results.
Some clarifications:
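The request shape and the flat per-hit response described in this summary can be sketched roughly as follows. This is a minimal Python illustration, not the Elasticsearch implementation: `iter_leaves` and `fetch_fields` are hypothetical helpers, wildcard matching is approximated with the standard library's `fnmatch`, the sample document and field names are made up, and mappings (aliases, multi-fields, `copy_to`, formats) are ignored entirely.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, Iterator, List, Tuple


def iter_leaves(obj: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) for every non-null leaf in a source document."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from iter_leaves(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        # Arrays do not add a path segment; each element shares the field path.
        for item in obj:
            yield from iter_leaves(item, prefix)
    elif obj is not None:
        yield prefix, obj


def fetch_fields(source: Dict[str, Any], patterns: List[str]) -> Dict[str, List[Any]]:
    """Return a flat {field_path: [values]} map for leaves matching any pattern."""
    result: Dict[str, List[Any]] = {}
    for path, value in iter_leaves(source):
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            result.setdefault(path, []).append(value)
    return result


# A search request body using the `fields` section (sample field names).
search_request = {
    "query": {"match_all": {}},
    "fields": ["user.id", "http.response.*", "tags"],
}

# A made-up source document standing in for one hit.
source = {
    "user": {"id": "kimchy", "name": "Shay"},
    "http": {"response": {"status_code": 200, "bytes": 1024}},
    "tags": ["a", "b"],
}

hit_fields = fetch_fields(source, search_request["fields"])
print(hit_fields)
```

Requesting an object path such as `user` on its own matches nothing in this sketch, mirroring the rule that only leaf fields are returned, and every value comes back wrapped in a list so that single-valued and multi-valued fields have the same shape.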
Implementation Plan
… `fields` section in the search request that fetches values from source. (Add a simple 'fetch fields' phase. #55639)
… `ignore_malformed`. (Allow field mappers to retrieve fields from source. #56928)
… `ignore_above`. (Fix casting of scaled_float in sorts (backport of #57207) #57385)
… `format` parameter. (Deprecte Rounding#round (backport #57845) #57893)
… `null_value`. (Respect null_value parameter in the fields retrieval API. #58623)
Future improvements:
… `FieldMapper#lookupValues` to `MappedFieldType`. (?)
… `_size`.
… `source` documents to speed up `source` access #52591.
… `inner_hits`.
Open Questions
… `_source` that have been disabled in the mappings (`enabled: false`)?
… `keyword` fields, should we apply the `normalizer` or return the original value?