[DOCS] Move search API's docvalue_fields examples (#57760)
Changes:

* Condenses and relocates the `docvalue_fields` example to the 'Run a search'
   page.
* Adds docs for the `docvalue_fields` request body parameter.
* Updates several related xrefs.

Co-authored-by: debadair <debadair@elastic.co>
jrodewig and debadair committed Jun 11, 2020
1 parent 8237f6c commit bfe7850
Showing 14 changed files with 174 additions and 105 deletions.
2 changes: 1 addition & 1 deletion docs/plugins/mapper-size.asciidoc
@@ -86,7 +86,7 @@ GET my_index/_search
{ref}/search-request-body.html#request-body-search-script-fields[script field]
to return the `_size` field in the search response.
<5> Uses a
{ref}/search-request-body.html#request-body-search-docvalue-fields[doc value
{ref}/run-a-search.html#docvalue-fields[doc value
field] to return the `_size` field in the search response. Doc value fields are
useful if
{ref}/modules-scripting-security.html#allowed-script-types-setting[inline
@@ -20,10 +20,10 @@ The top_hits aggregation returns regular search hits, because of this many per h
* <<request-body-search-highlighting,Highlighting>>
* <<request-body-search-explain,Explain>>
* <<request-body-search-queries-and-filters,Named filters and queries>>
* <<search-fields,Source filtering>>
* <<source-filtering,Source filtering>>
* <<request-body-search-stored-fields,Stored fields>>
* <<request-body-search-script-fields,Script fields>>
* <<request-body-search-docvalue-fields,Doc value fields>>
* <<docvalue-fields,Doc value fields>>
* <<request-body-search-version,Include versions>>
* <<request-body-search-seq-no-primary-term,Include Sequence Numbers and Primary Terms>>

2 changes: 1 addition & 1 deletion docs/reference/mapping/fields/source-field.asciidoc
@@ -76,7 +76,7 @@ stored.
WARNING: Removing fields from the `_source` has similar downsides to disabling
`_source`, especially the fact that you cannot reindex documents from one
Elasticsearch index to another. Consider using
<<search-fields,source filtering>> instead.
<<source-filtering,source filtering>> instead.

The `includes`/`excludes` parameters (which also accept wildcards) can be used
as follows:
2 changes: 1 addition & 1 deletion docs/reference/mapping/params/store.asciidoc
@@ -9,7 +9,7 @@ Usually this doesn't matter. The field value is already part of the
<<mapping-source-field,`_source` field>>, which is stored by default. If you
only want to retrieve the value of a single field or of a few fields, instead
of the whole `_source`, then this can be achieved with
<<search-fields,source filtering>>.
<<source-filtering,source filtering>>.

In certain situations it can make sense to `store` a field. For instance, if
you have a document with a `title`, a `date`, and a very large `content`
4 changes: 2 additions & 2 deletions docs/reference/mapping/types/nested.asciidoc
@@ -180,8 +180,8 @@ For instance, if a string field within a nested document has
during the highlighting, these offsets will not be available during the main highlighting
phase. Instead, highlighting needs to be performed via
<<nested-inner-hits,nested inner hits>>. The same consideration applies when loading
fields during a search through <<request-body-search-docvalue-fields, `docvalue_fields`>>
or <<request-body-search-stored-fields, `stored_fields`>>.
fields during a search through <<docvalue-fields,
`docvalue_fields`>> or <<request-body-search-stored-fields, `stored_fields`>>.
=============================================

9 changes: 7 additions & 2 deletions docs/reference/redirects.asciidoc
@@ -910,5 +910,10 @@ See <<paginate-search-results>>.
[role="exclude",id="request-body-search-source-filtering"]
==== Source filtering
See <<search-fields>>.
////
See <<source-filtering, source filtering>>.
[role="exclude",id="request-body-search-docvalue-fields"]
==== Doc value fields
See <<docvalue-fields, doc value fields>>.
////
73 changes: 2 additions & 71 deletions docs/reference/search/request/docvalue-fields.asciidoc
@@ -1,73 +1,4 @@
[[request-body-search-docvalue-fields]]
==== Doc value Fields
==== Doc value fields

Allows to return the <<doc-values,doc value>> representation of a field for each hit, for
example:

[source,console]
--------------------------------------------------
GET /_search
{
"query" : {
"match_all": {}
},
"docvalue_fields" : [
"my_ip_field", <1>
{
"field": "my_keyword_field" <2>
},
{
"field": "my_date_field",
"format": "epoch_millis" <3>
}
]
}
--------------------------------------------------

<1> the name of the field
<2> an object notation is supported as well
<3> the object notation allows to specify a custom format

Doc value fields can work on fields that have doc-values enabled, regardless of whether they are stored

`*` can be used as a wild card, for example:

[source,console]
--------------------------------------------------
GET /_search
{
"query" : {
"match_all": {}
},
"docvalue_fields" : [
{
"field": "*_date_field", <1>
"format": "epoch_millis" <2>
}
]
}
--------------------------------------------------

<1> Match all fields ending with `field`
<2> Format to be applied to all matching fields.

Note that if the fields parameter specifies fields without docvalues it will try to load the value from the fielddata cache
causing the terms for that field to be loaded to memory (cached), which will result in more memory consumption.

[float]
====== Custom formats

While most fields do not support custom formats, some of them do:

- <<date,Date>> fields can take any <<mapping-date-format,date format>>.
- <<number,Numeric>> fields accept a https://docs.oracle.com/javase/8/docs/api/java/text/DecimalFormat.html[DecimalFormat pattern].

By default fields are formatted based on a sensible configuration that depends
on their mappings: `long`, `double` and other numeric fields are formatted as
numbers, `keyword` fields are formatted as strings, `date` fields are formatted
with the configured `date` format, etc.

NOTE: On its own, `docvalue_fields` cannot be used to load fields in nested
objects -- if a field contains a nested object in its path, then no data will
be returned for that docvalue field. To access nested fields, `docvalue_fields`
must be used within an <<request-body-search-inner-hits, `inner_hits`>> block.
See <<docvalue-fields, doc value fields>>.
2 changes: 1 addition & 1 deletion docs/reference/search/request/inner-hits.asciidoc
@@ -74,7 +74,7 @@ Inner hits also supports the following per document features:
* <<request-body-search-explain,Explain>>
* <<request-body-search-source-filtering,Source filtering>>
* <<request-body-search-script-fields,Script fields>>
* <<request-body-search-docvalue-fields,Doc value fields>>
* <<docvalue-fields,Doc value fields>>
* <<request-body-search-version,Include versions>>
* <<request-body-search-seq-no-primary-term,Include Sequence Numbers and Primary Terms>>

2 changes: 1 addition & 1 deletion docs/reference/search/request/source-filtering.asciidoc
@@ -1,4 +1,4 @@
[[request-body-search-source-filtering]]
==== Source filtering

See <<search-fields>>.
See <<source-filtering, source filtering>>.
4 changes: 2 additions & 2 deletions docs/reference/search/request/stored-fields.asciidoc
@@ -3,7 +3,7 @@

WARNING: The `stored_fields` parameter is about fields that are explicitly marked as
stored in the mapping, which is off by default and generally not recommended.
Use <<search-fields,source filtering>> instead to select
Use <<source-filtering,source filtering>> instead to select
subsets of the original source document to be returned.

Allows to selectively load specific stored fields for each document represented
@@ -62,5 +62,5 @@ GET /_search
}
--------------------------------------------------

NOTE: <<search-fields,`_source`>> and <<request-body-search-version, `version`>> parameters cannot be activated if `_none_` is used.
NOTE: <<source-filtering,`_source`>> and <<request-body-search-version, `version`>> parameters cannot be activated if `_none_` is used.

120 changes: 101 additions & 19 deletions docs/reference/search/search-fields.asciidoc
@@ -1,12 +1,43 @@
[discrete]
[[search-fields]]
=== Return fields in a search
=== Retrieve selected fields

By default, each hit in the search response includes the document
<<mapping-source-field,`_source`>>, which is the entire JSON object that was
provided when indexing the document. If you only need certain fields in the
search response, you can use the `_source` parameter to restrict what parts of
the source are returned. This is called _source filtering_.
provided when indexing the document. If you only need certain source fields in
the search response, you can use <<source-filtering,source filtering>> to
restrict what parts of the source are returned.

Returning fields using only the document source has some limitations:

* The `_source` field does not include <<multi-fields, multi-fields>> or
<<alias, field aliases>>. Likewise, a field in the source does not contain
values copied using the <<copy-to,`copy_to`>> mapping parameter.
* Since the `_source` is stored as a single field in Lucene, the whole source
object must be loaded and parsed, even if only a small number of fields are
needed.

To avoid these limitations, you can:

* Use the <<docvalue-fields, `docvalue_fields`>>
parameter to get values for selected fields. This can be a good
choice when returning a fairly small number of fields that support doc values,
such as keywords and dates.
* Use the <<request-body-search-stored-fields, `stored_fields`>> parameter to get the values for specific stored fields. (Fields that use the <<mapping-store,`store`>> mapping option.)

You can find more detailed information on each of these methods in the
following sections:

* <<source-filtering>>
* <<docvalue-fields>>
* <<stored-fields>>

[discrete]
[[source-filtering]]
=== Source filtering

You can use the `_source` parameter to select what fields of the source are
returned. This is called _source filtering_.

.*Example*
[%collapsible]
@@ -91,23 +122,74 @@ GET /_search
----
====

Returning fields using only the document source has some limitations:

* The `_source` field does not include <<multi-fields, multi-fields>> or
<<alias, field aliases>>. Likewise, a field in the source does not contain
values copied using the <<copy-to,`copy_to`>> mapping parameter.
* Since the `_source` is stored as a single field in Lucene, the whole source
object must be loaded and parsed, even if only a small number of fields are
needed.
[discrete]
[[docvalue-fields]]
=== Doc value fields

{es} supports some alternative methods for returning fields that help avoid
these downsides:
You can use the <<docvalue-fields,`docvalue_fields`>> parameter to return
<<doc-values,doc values>> for one or more fields in the search response.

* The <<request-body-search-docvalue-fields, `docvalue_fields`>>
parameter allows for loading fields from their doc values. This can be a good
choice when returning a fairly small number of fields that support doc values,
such as keywords and dates.
* It's also possible to store an individual field's values by using the
Doc values store the same values as the `_source` but in an on-disk,
column-based structure that's optimized for sorting and aggregations. Since each
field is stored separately, {es} only reads the field values that were requested
and can avoid loading the whole document `_source`.

Doc values are stored for supported fields by default. However, doc values are
not supported for <<text,`text`>> or
{plugins}/mapper-annotated-text-usage.html[`text_annotated`] fields.

.*Example*
[%collapsible]
====
The following search request uses the `docvalue_fields` parameter to
retrieve doc values for the following fields:
* Fields with names starting with `my_ip`
* `my_keyword_field`
* Fields with names ending with `_date_field`
[source,console]
----
GET /_search
{
"query": {
"match_all": {}
},
"docvalue_fields": [
"my_ip*", <1>
{
"field": "my_keyword_field" <2>
},
{
"field": "*_date_field",
"format": "epoch_millis" <3>
}
]
}
----
<1> Wildcard pattern used to match field names, specified as a string.
<2> Wildcard pattern used to match field names, specified as an object.
<3> With the object notation, you can use the `format` parameter to specify a
format for the field's returned doc values. <<date,Date fields>> support a
<<mapping-date-format,date `format`>>. <<number,Numeric fields>> support a
https://docs.oracle.com/javase/8/docs/api/java/text/DecimalFormat.html[DecimalFormat
pattern]. Other field datatypes do not support the `format` parameter.
====

TIP: You cannot use the `docvalue_fields` parameter to retrieve doc values for
nested objects. If you specify a nested object, the search returns an empty
array (`[ ]`) for the field. To access nested fields, use the
<<request-body-search-inner-hits, `inner_hits`>> parameter's `docvalue_fields`
property.


[discrete]
[[stored-fields]]
=== Stored fields

It's also possible to store an individual field's values by using the
<<mapping-store,`store`>> mapping option. You can use the
<<request-body-search-stored-fields, `stored_fields`>> parameter to return
<<request-body-search-stored-fields, `stored_fields`>> parameter to include
these stored values in the search response.
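
A request body like the `docvalue_fields` example in this file can also be assembled in client code. The following is a minimal Python sketch, not part of the commit: the helper name `build_docvalue_search` and its `(pattern, format)` tuple convention are hypothetical, and it only builds the JSON body without sending it to {es}.

```python
import json


def build_docvalue_search(fields, query=None):
    """Build a search request body that asks for doc values.

    Hypothetical helper for illustration only. Each item in `fields` is
    either a field-name pattern (str), emitted as a plain string, or a
    (pattern, format) tuple, emitted in object notation.
    """
    docvalue_fields = []
    for item in fields:
        if isinstance(item, str):
            # String notation: just the field name or wildcard pattern.
            docvalue_fields.append(item)
        else:
            # Object notation: `field` is required, `format` is optional.
            pattern, fmt = item
            entry = {"field": pattern}
            if fmt is not None:
                entry["format"] = fmt
            docvalue_fields.append(entry)
    return {
        "query": query if query is not None else {"match_all": {}},
        "docvalue_fields": docvalue_fields,
    }


# Mirrors the field names used in the example above.
body = build_docvalue_search(
    ["my_ip*", ("my_keyword_field", None), ("*_date_field", "epoch_millis")]
)
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of a `GET /_search` request. Only the object notation carries a `format`, which matches the callouts in the example.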
51 changes: 51 additions & 0 deletions docs/reference/search/search.asciidoc
@@ -270,6 +270,35 @@ If `true`, returns document version as part of a hit. Defaults to `false`.
[[search-search-api-request-body]]
==== {api-request-body-title}

[[search-docvalue-fields-param]]
`docvalue_fields`::
(Optional, array of strings and objects)
Array of wildcard (`*`) patterns. The request returns doc values for field names
matching these patterns in the `hits.fields` property of the response.
+
You can specify items in the array as a string or object.
See <<docvalue-fields>>.
+
.Properties of `docvalue_fields` objects
[%collapsible]
====
`field`::
(Required, string)
Wildcard pattern. The request returns doc values for field names matching this
pattern.
`format`::
(Optional, string)
Format in which the doc values are returned.
+
For <<date,date fields>>, you can specify a <<mapping-date-format,date
`format`>>. For <<number,numeric fields>>, you can specify a
https://docs.oracle.com/javase/8/docs/api/java/text/DecimalFormat.html[DecimalFormat
pattern].
+
For other field datatypes, this parameter is not supported.
====

[[request-body-search-explain]]
`explain`::
(Optional, boolean) If `true`, returns detailed information about score
@@ -533,6 +562,28 @@ Original JSON body passed for the document at index time.
+
You can use the `_source` parameter to exclude this property from the response
or specify which source fields to return.

`fields`::
+
--
(object)
Contains field values for the documents. These fields must be specified in the
request using one or more of the following request parameters:

* <<search-docvalue-fields-param,`docvalue_fields`>>
* <<request-body-search-script-fields,`script_fields`>>
* <<request-body-search-stored-fields,`stored_fields`>>

This property is returned only if one or more of these parameters are set.
--
+
.Properties of `fields`
[%collapsible%open]
======
`<field>`::
(array)
Key is the field name. Value is the value for the field.
======
=====
====
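
As described above, each hit carries a `fields` object only when the request set `docvalue_fields`, `script_fields`, or `stored_fields`, and each value in it is an array. A minimal Python sketch of reading that property from an already-parsed response follows. It is not part of the commit: the helper names and the sample response data are invented for illustration.

```python
def doc_values_by_hit(response):
    """Map each hit's `_id` to its `fields` object.

    Hits without a `fields` property map to an empty dict, since the
    property is returned only when the request asked for it.
    """
    result = {}
    for hit in response.get("hits", {}).get("hits", []):
        result[hit["_id"]] = hit.get("fields", {})
    return result


def first_value(fields, name, default=None):
    """Return the first array element for `name`, or `default` if absent."""
    values = fields.get(name) or []
    return values[0] if values else default


# Invented sample data shaped like a search response body.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "fields": {
                    "my_keyword_field": ["alpha"],
                    "my_date_field": ["1591833600000"],
                },
            },
            {"_id": "2"},  # no `fields` property on this hit
        ]
    }
}

by_hit = doc_values_by_hit(sample_response)
print(first_value(by_hit["1"], "my_keyword_field"))
```

Treating every value as an array, even for single-valued fields, keeps the calling code the same for both cases.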

@@ -219,7 +219,7 @@ The configured weight for a suggestion is returned as `_score`. The
return the full document `_source` by default. The size of the `_source`
can impact performance due to disk fetch and network transport overhead.
To save some network overhead, filter out unnecessary fields from the `_source`
using <<search-fields, source filtering>> to minimize
using <<source-filtering, source filtering>> to minimize
`_source` size. Note that the _suggest endpoint doesn't support source
filtering but using suggest on the `_search` endpoint does:

2 changes: 1 addition & 1 deletion docs/reference/snapshot-restore/apis/put-repo-api.asciidoc
@@ -113,7 +113,7 @@ source-only snapshots that take up to 50% less space on disk.
+
Source-only snapshots are only supported if the <<mapping-source-field,`_source`
field>> is enabled and no
<<search-fields,source-filtering>> is applied.
<<source-filtering,source-filtering>> is applied.
+
WARNING: Source-only snapshots contain stored fields and index metadata. They do
not include index or doc values structures and are not searchable when restored.
