Initial new profiling work #74

Merged
merged 6 commits into master from blt-new_perf on May 20, 2021
Conversation

blt
Contributor

@blt blt commented Mar 10, 2021

This commit introduces a new performance test, `file_to_blackhole`. What we have
here is not ready for prime time yet, but the basic stubs of a repeatable
process are in place. We have indications that users are struggling with
topologies whose throughput is capped by the file source, which is why I've
pointed the output at a blackhole rather than an http sink. As much as possible
this work focuses on the file source alone.

Throughput looks to fall off pretty rapidly as line sizes increase. See
`notes.md` in this commit for details.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
@blt blt requested review from lukesteensen and binarylogic March 10, 2021 22:24
Comment on lines +56 to +70
| 10 | 25 Kb/sec | 5 Kb | 250 Kb/sec | 250 Kb/sec | 267 Kb/sec | 0 Kb/sec |
| 10 | 250 Kb/sec | 5 Kb | 2.5 Mb/sec | 2.5 Mb/sec | 2.68 Mb/sec | 0 Mb/sec |
| 10 | 2.5 Mb/sec | 5 Kb | 25 Mb/sec | 25 Mb/sec | 26.8 Mb/sec | 0 Mb/sec |
| 10 | 2.5 Mb/sec | 10 Kb | 25 Mb/sec | 25 Mb/sec | 26.2 Mb/sec | 0 Mb/sec |
| 10 | 2.5 Mb/sec | 100 Kb | 25 Mb/sec | 25 Mb/sec | 25.7 Mb/sec | 0 Mb/sec |
| 10 | 2.5 Mb/sec | 100 Kb | 25 Mb/sec | 25 Mb/sec | 25.7 Mb/sec | 0 Mb/sec |
| 10 | 2.5 Mb/sec | 105 Kb | 25 Mb/sec | 23.9 Mb/sec | 24.6 Mb/sec | 1.1 Mb/sec |
| 10 | 2.5 Mb/sec | 110 Kb | 25 Mb/sec | 22 Mb/sec | 22.06 Mb/sec | 3 Mb/sec |
| 10 | 2.5 Mb/sec | 125 Kb | 25 Mb/sec | 17 Mb/sec | 17.04 Mb/sec | 9 Mb/sec |
| 10 | 2.5 Mb/sec | 150 Kb | 25 Mb/sec | 12.3 Mb/sec | 12.6 Mb/sec | 12.7 Mb/sec |
| 10 | 2.5 Mb/sec | 200 Kb | 25 Mb/sec | 6 Mb/sec | 6.15 Mb/sec | 19 Mb/sec |
| 10 | 2.5 Mb/sec | 300 Kb | 25 Mb/sec | 3 Mb/sec | 3.08 Mb/sec | 22 Mb/sec |
| 10 | 2.5 Mb/sec | 400 Kb | 25 Mb/sec | 1.53 Mb/sec | 1.57 Mb/sec | 23.47 Mb/sec |
| 10 | 2.5 Mb/sec | 500 Kb | 25 Mb/sec | 1.05 Mb/sec | 1.07 Mb/sec | 23.95 Mb/sec |
| 10 | 2.5 Mb/sec | 900 Kb | 25 Mb/sec | 231 Kb/sec | 237 Kb/sec | 24.77 Mb/sec |
Contributor Author


@lukesteensen @binarylogic this is probably the most relevant chunk for our ongoing conversation.

@blt blt requested a review from jszwedko March 10, 2021 22:26
@blt
Contributor Author

blt commented Mar 10, 2021

@tsantero this is also potentially relevant to what you folks were experiencing.

@blt
Contributor Author

blt commented Mar 10, 2021

Relevant flamegraphs: https://gist.github.com/blt/87b90377ae8d8c6e8a08e28c6c437037

The size in the file names refers to the line size that file_gen produces.
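
For context, `file_gen` writes fixed-size lines into target log files at a configured per-file rate. Here is a minimal sketch of that pattern (illustrative only, not `file_gen`'s actual code; the output path and parameter values are made up):

```rust
use std::fs::OpenOptions;
use std::io::{BufWriter, Write};
use std::thread::sleep;
use std::time::Duration;

fn main() -> std::io::Result<()> {
    // Example knobs mirroring the ones discussed in this PR.
    let line_size_bytes = 10 * 1024; // "10 Kb" lines
    let bytes_per_second = 2_500_000; // ~2.5 Mb/sec for this one file
    let lines_per_second = bytes_per_second / line_size_bytes;

    // Hypothetical output path; file_gen manages many such files.
    let file = OpenOptions::new()
        .create(true)
        .append(true)
        .open("logs/generated-10kb.log")?;
    let mut out = BufWriter::new(file);

    // Payload is one byte short so the trailing newline lands on the target size.
    let line = "a".repeat(line_size_bytes - 1);

    loop {
        // Crude pacing: write one second's worth of lines, then sleep a second.
        for _ in 0..lines_per_second {
            writeln!(out, "{}", line)?;
        }
        out.flush()?;
        sleep(Duration::from_secs(1));
    }
}
```

As the tables in this PR suggest, `duplicates` is the number of such files written in parallel and `bytes_per_second` is the per-file rate, so the total offered load is their product.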

@blt
Contributor Author

blt commented Mar 10, 2021

The diff doesn't mean much. The two flamegraphs for each line size are pretty radically different, which in itself is interesting.

@binarylogic
Contributor

That was fast, very nice. @lukesteensen has a much better understanding of the file source performance characteristics. I'm curious what he thinks.

@blt
Contributor Author

blt commented Mar 11, 2021 via email

@tsantero

@blt thanks for tagging me here. I'll be able to spend some time with this tonight and report back anything interesting.

@blt
Contributor Author

blt commented Mar 11, 2021

@tsantero rad, I appreciate it. If you pull 0.12 and flip debug logs on you'll get utilization numbers -- vectordotdev/vector#6634 -- that'll show us which part of your topology is consuming the most time.

Member

@lukesteensen lukesteensen left a comment


This looks great! Really happy to see this progressing.


```toml
[sources.file]
type = "file"
max_line_bytes = 102400
```
Member


Unless I'm missing something, I think this should explain why we see a drop-off right at 100kb line size 😄 . If we add another zero it should align better with the generated data.

That being said, I'm not terribly worried about extremely large individual lines. My assumption is that 100kb is a reasonable upper limit for "normal" use cases.

Contributor Author


D'oh, good eye. I thought I had that set to 1MB.

Contributor Author

@blt blt Mar 12, 2021


Well, that's good: correcting the line size limit actually makes the results I'm seeing more explicable. I'll add a section to the notes tomorrow.

I can for sure produce a situation where vector falls behind and never catches up, and it happens cleanly, without my `max_line_bytes` bug muddying the waters. If I have `file_gen` write 10 files with 500 Kb lines and adjust `max_line_bytes` to actually be 1 MB, the effect still exists, except now the throughput from the file source is 5 Mb/sec where I'd expect 25 Mb/sec (10 files at 2.5 Mb/sec each). Likewise, if `file_gen` produces 255 files with 100 Kb lines, the throughput from the file source is ~460 Mb/s where `file_gen` is producing at ~630 Mb/s.

Contributor Author


If I adjust `file_gen` to produce 255 files with 10 Kb lines, the numbers look like 600 Mb/s from `file_gen` and 400 Mb/s from the file source. Excellent result. I'll produce a clearer table tomorrow, but these numbers are promising.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Comment on lines +76 to +144
### Source Exploration

The main entry point for the file source is the `file_source` function in
`src/sources/file.rs`. This wires up the user config, the data directory and
the output `Pipeline`. The interior return value is a boxed `async move`;
`FileServer` from the `file_server` crate is set up inside a
[`spawn_blocking`](https://docs.rs/tokio/1.3.0/tokio/task/fn.spawn_blocking.html)
task. Possibly `block_in_place` would be a better fit, but realistically we'd
want to be async all the way down. `FileServer` is adapted from
[cernan](https://github.com/postmates/cernan/tree/master/src/source/file) code,
notably written before Rust's async story had matured. That said, absent other
indications of blocking, I don't see how that alone would explain the fall-off
once line lengths go beyond a certain limit.

## 2021/03/12

As noted by Luke Steensen, the drop-off above 100 Kb is partially explicable
owing to a bug in the vector config. I had intended the `max_line_bytes`
setting to be 1 MB but it was set to 100 KB. The vector config is now adjusted
to:

```toml
data_dir = "data"

[sources.internal]
type = "internal_metrics"

[sources.file]
type = "file"
max_line_bytes = 1024000
fingerprint.strategy = "device_and_inode"
include = ["logs/*.log"]

[sinks.prometheus]
type = "prometheus_exporter"
inputs = ["internal"]
address = "0.0.0.0:9001"

[sinks.blackhole]
type = "blackhole"
inputs = ["file"]
```

New results, with `bytes_per_second` held to 2.5 Mb/sec except where noted:

| `duplicates` | `bytes_per_second` | `maximum_line_size_bytes` | `file_gen` bytes/sec throughput | file source `sum(rate(vector_processed_bytes_total[1m]))` | delta `file_gen`, `file` |
| ------------ | ------------------ | ------------------------- | ------------------------------- | ---------------------------------------------------------- | ------------------------ |
| 10 | 2.5 Mb/sec | 10 Kb | 25 Mb/sec | 25 Mb/sec | 0 Kb/sec |
| 10 | 25 Mb/sec | 10 Kb | 246 Mb/sec | 219 Mb/sec | 27 Mb/sec |
| 10 | 50 Mb/sec | 10 Kb | 475 Mb/sec | 221 Mb/sec | 254 Mb/sec |
| 10 | 2.5 Mb/sec | 100 Kb | 25 Mb/sec | 25 Mb/sec | 0 Kb/sec |
| 50 | 2.5 Mb/sec | 10 Kb | 125 Mb/sec | 125 Mb/sec | 0 Mb/sec |
| 75 | 2.5 Mb/sec | 10 Kb | 187 Mb/sec | 187 Mb/sec | 0 Mb/sec |
| 85 | 2.5 Mb/sec | 10 Kb | 212 Mb/sec | 212 Mb/sec | 0 Mb/sec |
| 90 | 2.5 Mb/sec | 10 Kb | 225 Mb/sec | 222 Mb/sec | 3 Mb/sec |
| 95 | 2.5 Mb/sec | 10 Kb | 237 Mb/sec | 216 Mb/sec | 19 Mb/sec |
| 100 | 2.5 Mb/sec | 10 Kb | 250 Mb/sec | 221 Mb/sec | 29 Mb/sec |
| 255 | 2.5 Mb/sec | 10 Kb | 616 Mb/sec | 222 Mb/sec | 394 Mb/sec |
| 510 | 100 Kb/sec | 1 Kb | 48 Mb/sec | 40 Mb/sec | 8 Mb/sec |

One thing stood out in the tests: if I allowed vector to develop a backlog and
then stopped `file_gen`, vector would gradually catch up, with throughput
inevitably falling to zero once all the bytes were processed. Based on the
above table there are two dimensions that cause the file source to fall behind:
the total number of files and the byte throughput of those files. 510 files of
relatively meager throughput -- 48 Mb/sec -- forced the file source to fall
behind by 8 Mb/sec, while a relatively small number of files -- 10 -- producing
at 50 Mb/sec caused the file source to fall behind by 254 Mb/sec. The
progenitor code for `file-source` was concerned with fairness rather than
throughput, and I believe that shows in today's data.
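
To make the fairness/throughput trade-off concrete, here is a minimal sketch of
the kind of round-robin read loop described above. This is illustrative only,
not the actual `FileServer` implementation, and `max_read_bytes` is a made-up
per-pass cap:

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};

/// Illustrative round-robin pass: read at most `max_read_bytes` from each
/// file before moving on. This keeps quiet files from being starved
/// (fairness) but caps how quickly any single busy file can be drained
/// (throughput), consistent with the fall-behind pattern in the table.
fn fair_poll_pass(
    readers: &mut [BufReader<File>],
    max_read_bytes: usize,
) -> std::io::Result<usize> {
    let mut total_read = 0;
    for reader in readers.iter_mut() {
        let mut read_this_file = 0;
        let mut line = String::new();
        while read_this_file < max_read_bytes {
            line.clear();
            let n = reader.read_line(&mut line)?;
            if n == 0 {
                break; // at EOF for now; pick this file up again next pass
            }
            read_this_file += n;
            // ... hand `line` off to the rest of the pipeline here ...
        }
        total_read += read_this_file;
    }
    Ok(total_read)
}
```

Raising the per-pass cap, or letting busy files read until dry, favors
throughput at the cost of fairness; that is roughly the tuning question this
data raises.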
Contributor Author


@lukesteensen new results with the config bug corrected. Thanks again for catching that.

blt added 4 commits March 15, 2021 12:05
This case intends to demonstrate the difference between a multi-hop
remap config and a single-hop one. The remap program in `vector-nohop` has
problems, but the basic idea is here.

Once the nohop is corrected we should also generate a differential flamegraph.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
@blt blt requested a review from tobz May 20, 2021 18:26
@blt blt marked this pull request as ready for review May 20, 2021 18:27

@tobz tobz left a comment


Nothing egregious jumps out here.

As discussed in chat, we know this is a first pass, originally designed for manual benchmarking, so we can continue to refine, standardize, etc.

@blt blt merged commit 37d17c3 into vectordotdev:master May 20, 2021
@blt blt deleted the blt-new_perf branch May 20, 2021 18:42