Commit ff13359

[DOCS] Addresses feedback on data frame intro. (#369)
This PR: * Addresses feedback. * Amends terminology.
1 parent a911747 commit ff13359

1 file changed

+17
-18
lines changed

docs/en/stack/ml/dataframes.asciidoc

Lines changed: 17 additions & 18 deletions
@@ -3,45 +3,46 @@
 
 beta[]
 
-A _{dataframe}_ is a transformation of a dataset by certain rules. You can think
-of it like a spreadsheet or a data table that makes your data ready to be analyzed
-and organized.
+A _{dataframe}_ is a transformation of data that has been indexed in elasticsearch.
+Use data frames to _pivot_ your data into a new entity centric index for example.
+By transforming and summarizing your data, it becomes possible to visualize and
+analyze it in alternative and interesting ways.
 
-A lot of {es} datasets are organized as a stream of events: each event is a individual
+A lot of {es} indices are organized as a stream of events: each event is an individual
 document, for example a single item purchase. {dataframe-transforms-cap} enable
 you to summarize this data, bringing it into an organized, more analysis friendly
 format. For example, you can summarize all the purchases of a single customer (see
 the example below).
 
-The {dataframe} feature enables you to define a _pivot_ which is a set of features
-that transform the dataset into a different, more digestible format. Pivoting
-results in a summary of your dataset (which is the {dataframe} itself).
+The {dataframe} feature enables you to define a pivot which is a set of features
+that transform the index into a different, more digestible format. Pivoting
+results in a summary of your data (which is the {dataframe} itself).
 
 Defining a pivot consist of two main parts. First, you select one or more fields
-that your dataset will be grouped by. Principally you can select categorical
+that your data will be grouped by. Principally you can select categorical
 fields (terms) for grouping. You can also select numerical fields, in this case,
 the field values will be bucketed using an interval you specify.
 
 The second step is deciding how you want to aggregate the grouped data. When
-using aggregations, you practically ask questions about the dataset. There are
+using aggregations, you practically ask questions about the index. There are
 different types of aggregations, each with its own purpose and output. To learn
 more about the supported aggregations and group-by fields, see
 {ref}/data-frame-transform-pivot.html[Pivot resources].
 
 As an optional step, it's also possible to add a query to further limit the
 scope of the aggregation.
 
-IMPORTANT: In 7.2, you can build {dataframes} on the top of a static dataset.
+IMPORTANT: In 7.2, you can build {dataframes} on the top of a static indices.
 When new data comes into the index, you have to perform the transformation again
 on the altered data.
 
 .Example
 
-Imagine that you run a webshop that sells clothes. Every order creates a
-document that contains a unique order ID, the name and the category of the
-ordered product, its price, the ordered quantity, the exact date of the order,
-and some customer information (name, gender, location, etc). Your dataset
-contains all the transactions from last year.
+Imagine that you run a webshop that sells clothes. Every order creates a document
+that contains a unique order ID, the name and the category of the ordered product,
+its price, the ordered quantity, the exact date of the order, and some customer
+information (name, gender, location, etc). Your dataset contains all the transactions
+from last year.
 
 If you want to check the sales in the different categories in your last fiscal year,
 define a {dataframe} that is grouped by the product categories (women's shoes, men's
@@ -53,6 +54,4 @@ shows the number of sold items in every product category in the last year.
 image::ml/images/ml-dataframepivot.jpg["Example of a data frame pivot in {kib}"]
 
 IMPORTANT: Creating a {dataframe} leaves your source index intact. A new index will
-be created dedicated to the {dataframe}.
-
-TIP: Using {dataframes} does not require {dfeeds}.
+be created dedicated to the {dataframe}.
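
For readers of this diff, the pivot described in the changed text (group documents by a field, then aggregate each group) can be sketched in plain Python. This is only an illustration of the idea under stated assumptions, not the {es} implementation or its API: the `orders` documents, their field names, and the `pivot` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical order documents; the field names are assumptions for illustration.
orders = [
    {"order_id": 1, "category": "women's shoes", "quantity": 2},
    {"order_id": 2, "category": "men's clothing", "quantity": 1},
    {"order_id": 3, "category": "women's shoes", "quantity": 1},
]


def pivot(docs, group_by, field):
    """Group docs by the `group_by` field and sum `field` within each group."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc[group_by]] += doc[field]
    # The source list is left untouched; the summary is a new structure.
    return dict(totals)


sold_per_category = pivot(orders, "category", "quantity")
print(sold_per_category)
```

The helper returns a new dictionary rather than modifying `orders`, loosely mirroring the note in the diff that the source index stays intact and the summary is written to a new index.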
