Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add top/bottom docs. #1048

Merged
merged 10 commits into from
Jul 12, 2024
2 changes: 2 additions & 0 deletions search/processingmodules.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,7 @@ strings <strings/strings>
subnet <subnet/subnet>
taint <taint/taint>
time <time/time>
top/bottom <topbottom/topbottom>
transaction <transaction/transaction>
Trim Modules (list) <trim/trim>
truncate <truncate/truncate>
Expand Down Expand Up @@ -139,6 +140,7 @@ words <words/words>
* [subnet](subnet/subnet) - extract & filter based on IP subnets.
* [taint](taint/taint) - taint tracking.
* [time](time/time) - convert strings to time enumerated values, and vice versa.
* [top/bottom](topbottom/topbottom) - display the top/bottom N values.
* [transaction](transaction/transaction) - group multiple entries into single-entry "transactions" based on keys.
* [Trim Modules (list)](trim/trim) - various string trim functions.
* trim - remove leading and trailing Unicode code points.
Expand Down
76 changes: 76 additions & 0 deletions search/topbottom/topbottom.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# Top / Bottom Modules

The `top` and `bottom` modules show the top or bottom N values of a given enumerated value (or values) or entry. For example, to show the top 10 values of the enumerated value `foo`:

```gravwell
tag=foo json foo | top -n 10 foo
```

The output is sorted in descending order for the `top` module and in ascending order for the `bottom` module.

The modules use the same syntax. Each optionally takes one or more enumerated values. The modules will emit the top/bottom values of the given value. If no enumerated value is provided, the `DATA` field is used.
If multiple enumerated values are provided, the modules will sort by the top/bottom values of the enumerated values in the provided order. If two entries have the same value for a given enumerated value, the second enumerated value will be used.

For example, given:

| foo | bar |
|-----|-----|
| 1 | 1 |
| 10 | 9 |
| 10 | 200 |
| 10 | 100 |

and the query:

```gravwell
tag=foo json foo bar | top -n 2 foo bar
```

The output will be:

| foo | bar |
|-----|-----|
| 10 | 200 |
| 10 | 100 |

In the example above, three of the four original entries all have the same value 10 for `foo`. In order to find the top 2 as requested in the query, `top` then looks at the values of `bar`, and uses the top values 200 and 100 to determine which two entries to keep.

Values must be numeric or able to be cast to a number. Non-numeric values are ignored.

The top and bottom modules are functionally equivalent to sorting with a limit. For example:

```gravwell
tag=foo json foo | top -n 10 foo
```

will produce the same result as

```gravwell
tag=foo json foo | sort by foo desc | limit 10
```

The `top` and `bottom` modules are however far more performant than using sort/limit.

## Flags

- `-n <number>`: The "-n" flag specifies the number of entries to return. The default is 10.

## Examples

To get the top 10 values of the DATA field:

```gravwell
tag=gravwell top
```

To get the top 10 values of foo:

```gravwell
tag=gravwell json foo | top foo
```

To get the bottom 300 values of foo, and where foo is equal, bar:

```gravwell
tag=gravwell json foo bar | bottom -n 300 foo bar
```