
Rally benchmark kubernetes.pod #8443

Merged
merged 8 commits into elastic:main on Nov 21, 2023

Conversation

Contributor

@aspacca aspacca commented Nov 9, 2023

Enhancement

Proposed commit message

Add artifacts for elastic-package rally benchmark

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • [ ] I have added an entry to my package's changelog.yml file.
  • [ ] I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist


How to test this PR locally

From the kubernetes package root (remember to bring up the elastic-package stack first):
./elastic-package benchmark rally --benchmark pod-benchmark -v
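
The pod-benchmark scenario referenced by the command is a YAML file shipped with the package. A rough sketch of what it likely contains, reconstructed from the parameters table in the results below (the exact file location, typically the data stream's _dev/benchmark/rally/ folder, and the nesting of the keys are assumptions):

description: Benchmark 20000 kubernetes.pod events ingested
data_stream:
  name: pod
corpora:
  generator:
    total_events: 20000
    template:
      path: ./pod-benchmark/template.ndjson
      type: gotext
    config:
      path: ./pod-benchmark/config.yml
    fields:
      path: ./pod-benchmark/fields.yml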

Related issues

Screenshots

--- Benchmark results for package: kubernetes - START ---
╭──────────────────────────────────────────────────────────────────────────────────────────────────╮
│ info                                                                                             │
├────────────────────────┬─────────────────────────────────────────────────────────────────────────┤
│ benchmark              │                                                           pod-benchmark │
│ description            │                          Benchmark 20000 kubernetes.pod events ingested │
│ run ID                 │                                    0b2e5e14-cb99-40ad-89ce-75a8efc88700 │
│ package                │                                                              kubernetes │
│ start ts (s)           │                                                              1699527935 │
│ end ts (s)             │                                                              1699527968 │
│ duration               │                                                                     33s │
│ generated corpora file │ /Users/andreaspacca/.elastic-package/tmp/rally_corpus/corpus-2032669756 │
╰────────────────────────┴─────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────────────────────────────────────────╮
│ parameters                                                        │
├─────────────────────────────────┬─────────────────────────────────┤
│ package version                 │                          1.53.0 │
│ data_stream.name                │                             pod │
│ corpora.generator.total_events  │                           20000 │
│ corpora.generator.template.path │ ./pod-benchmark/template.ndjson │
│ corpora.generator.template.raw  │                                 │
│ corpora.generator.template.type │                          gotext │
│ corpora.generator.config.path   │      ./pod-benchmark/config.yml │
│ corpora.generator.config.raw    │                           map[] │
│ corpora.generator.fields.path   │      ./pod-benchmark/fields.yml │
│ corpora.generator.fields.raw    │                           map[] │
╰─────────────────────────────────┴─────────────────────────────────╯
╭───────────────────────╮
│ cluster info          │
├───────┬───────────────┤
│ name  │ elasticsearch │
│ nodes │             1 │
╰───────┴───────────────╯
╭────────────────────────────────────────────────────────╮
│ data stream stats                                      │
├────────────────────────────┬───────────────────────────┤
│ data stream                │ metrics-kubernetes.pod-ep │
│ approx total docs ingested │                     20000 │
│ backing indices            │                         1 │
│ store size bytes           │                  37748933 │
│ maximum ts (ms)            │             1699531530064 │
╰────────────────────────────┴───────────────────────────╯
╭───────────────────────────────────────╮
│ disk usage for index .ds-metrics-kube │
│ rnetes.pod-ep-2023.11.09-000001 (for  │
│ all fields)                           │
├──────────────────────────────┬────────┤
│ total                        │  15 MB │
│ inverted_index.total         │ 4.1 MB │
│ inverted_index.stored_fields │ 5.3 MB │
│ inverted_index.doc_values    │ 5.0 MB │
│ inverted_index.points        │ 332 kB │
│ inverted_index.norms         │    0 B │
│ inverted_index.term_vectors  │    0 B │
│ inverted_index.knn_vectors   │    0 B │
╰──────────────────────────────┴────────╯
╭──────────────────────────────────────────────────────────────────────────────────╮
│ pipeline metrics-kubernetes.pod-1.53.0 stats in node -u0THNwnRLeH1Qslb_aclw      │
├──────────────────────────────────────────┬───────────────────────────────────────┤
│ Totals                                   │ Count: 20000 | Failed: 0 | Time: 10ms │
│ pipeline (metrics-kubernetes.pod@custom) │  Count: 20000 | Failed: 0 | Time: 3ms │
╰──────────────────────────────────────────┴───────────────────────────────────────╯
╭─────────────────────────────────────────────────────────────────────────────────────────────╮
│ rally stats                                                                                 │
├────────────────────────────────────────────────────────────────┬────────────────────────────┤
│ Cumulative indexing time of primary shards                     │     0.3889666666666667 min │
│ Min cumulative indexing time across primary shards             │                      0 min │
│ Median cumulative indexing time across primary shards          │   0.006458333333333333 min │
│ Max cumulative indexing time across primary shards             │    0.06888333333333334 min │
│ Cumulative indexing throttle time of primary shards            │                      0 min │
│ Min cumulative indexing throttle time across primary shards    │                      0 min │
│ Median cumulative indexing throttle time across primary shards │                    0.0 min │
│ Max cumulative indexing throttle time across primary shards    │                      0 min │
│ Cumulative merge time of primary shards                        │   0.021766666666666667 min │
│ Cumulative merge count of primary shards                       │                         49 │
│ Min cumulative merge time across primary shards                │                      0 min │
│ Median cumulative merge time across primary shards             │  0.0004583333333333333 min │
│ Max cumulative merge time across primary shards                │  0.0021833333333333336 min │
│ Cumulative merge throttle time of primary shards               │                      0 min │
│ Min cumulative merge throttle time across primary shards       │                      0 min │
│ Median cumulative merge throttle time across primary shards    │                    0.0 min │
│ Max cumulative merge throttle time across primary shards       │                      0 min │
│ Cumulative refresh time of primary shards                      │                 0.0678 min │
│ Cumulative refresh count of primary shards                     │                       1753 │
│ Min cumulative refresh time across primary shards              │                      0 min │
│ Median cumulative refresh time across primary shards           │  0.0013416666666666668 min │
│ Max cumulative refresh time across primary shards              │   0.015716666666666667 min │
│ Cumulative flush time of primary shards                        │     0.9700000000000001 min │
│ Cumulative flush count of primary shards                       │                       1575 │
│ Min cumulative flush time across primary shards                │ 3.3333333333333335e-05 min │
│ Median cumulative flush time across primary shards             │               0.022075 min │
│ Max cumulative flush time across primary shards                │    0.07748333333333333 min │
│ Total Young Gen GC time                                        │                     0.04 s │
│ Total Young Gen GC count                                       │                         14 │
│ Total Old Gen GC time                                          │                        0 s │
│ Total Old Gen GC count                                         │                          0 │
│ Store size                                                     │     0.05837011896073818 GB │
│ Translog size                                                  │   0.0002489052712917328 GB │
│ Heap used for segments                                         │                       0 MB │
│ Heap used for doc values                                       │                       0 MB │
│ Heap used for terms                                            │                       0 MB │
│ Heap used for norms                                            │                       0 MB │
│ Heap used for points                                           │                       0 MB │
│ Heap used for stored fields                                    │                       0 MB │
│ Segment count                                                  │                        466 │
│ Total Ingest Pipeline count                                    │                      20019 │
│ Total Ingest Pipeline time                                     │                    1.046 s │
│ Total Ingest Pipeline failed                                   │                          0 │
│ Min Throughput                                                 │            23290.68 docs/s │
│ Mean Throughput                                                │            23290.68 docs/s │
│ Median Throughput                                              │            23290.68 docs/s │
│ Max Throughput                                                 │            23290.68 docs/s │
│ 50th percentile latency                                        │       780.1949169999994 ms │
│ 100th percentile latency                                       │      1200.2340830000016 ms │
│ 50th percentile service time                                   │       780.1949169999994 ms │
│ 100th percentile service time                                  │      1200.2340830000016 ms │
│ error rate                                                     │                     0.00 % │
╰────────────────────────────────────────────────────────────────┴────────────────────────────╯

--- Benchmark results for package: kubernetes - END   ---
Done

@aspacca aspacca self-assigned this Nov 9, 2023
@aspacca aspacca requested review from a team as code owners November 9, 2023 11:24
@elasticmachine

elasticmachine commented Nov 9, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-11-21T01:07:56.095+0000

  • Duration: 56 min 19 sec

Test stats 🧪

Test Results: 97 passed, 0 failed, 0 skipped (97 total)

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine

elasticmachine commented Nov 9, 2023

🌐 Coverage report

Name           Metrics % (covered/total)   Diff
Packages       100.0% (1/1) 💚
Files          100.0% (1/1) 💚
Classes        100.0% (1/1) 💚
Methods        96.386% (80/83) 👎          -3.614
Lines          100.0% (22/22) 💚           7.899
Conditionals   100.0% (0/0) 💚

"node":{
"uid": "{{ $uId }}" ,
"hostname":"host-{{ $nodeid }}",
"name":"host-{{ $nodeid }}",
Contributor

Let me check my understanding. If you run the tool to generate, say, 1000 events, there will be 90 different node names in these events, but 1000 different node uids? So the same nodes (with the same name) will each have a different uid. Is this assumption correct?

Contributor Author

I copied the assets from @gizas elastic/elastic-integration-corpus-generator-tool#111
but yes, according to the config your assumption is correct.

If we need 90 $nodeId values and 90 $uId values that always match, we would have to add cardinality: 90 to both of them, as sketched below.
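
For illustration only, a minimal sketch of what "cardinality: 90 on both" could look like in the generator config.yml, mirroring the field entries quoted later in this thread (the exact keys are an assumption, and as discussed below this only works if uid is defined as a field rather than produced by a function call):

  - name: nodeid
    cardinality: 90
  - name: uId
    cardinality: 90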

Contributor

Doesn't cardinality just mean that there will be 90 different values? How can we know that they will match? Or we could directly link the name with the uid by setting name: host-{{uid}}.

The reason I am mentioning this is that if the uid and name don't match, it cannot really be a realistic test.

Contributor

Does it affect the test if we just remove one of the fields to avoid this scenario?

Contributor

But I agree, we are going to come up against these types of "correlated" fields all the time. We should probably find a way to deal with them.

Contributor Author

Doesn't cardinality just mean that there will be 90 different values? How can we know that they will match?

They will match because cardinal values are generated sequentially and cycled in the same order for both fields:

cardinality: 3
value1
value2
value3

Or we could directly link the name with the uid by setting name: host-{{uid}}.

Since uid is a call to a function, we cannot actually apply cardinality to it (it is not a defined field; sorry for initially suggesting we could).

We could still have name: host-{{uid}}, but then we would end up with as many hosts as events generated.

The reason I am mentioning this is that if the uid and name don't match, it cannot really be a realistic test.

Does uid have to be a real uid? (sorry for the dumb question)
We could make it a field in the format word1-word2-word3-word4 and make name and hostname a "link" to it.

With uid defined as a field, we could apply a cardinality to it, along the lines of the sketch below.
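
A minimal sketch of that idea, reusing the config.yml and template shapes already shown in this thread (nodeuid is a hypothetical field name, and the mechanism for generating a word1-word2-word3-word4 value from fields.yml is assumed rather than confirmed):

config.yml:
  - name: nodeuid
    cardinality: 90

template.ndjson fragment:
"node":{
"uid": "{{ $nodeuid }}",
"hostname": "host-{{ $nodeuid }}",
"name": "host-{{ $nodeuid }}",

With uid, hostname and name all derived from the same generated field, every event for a given node repeats the same triple, and only 90 distinct nodes appear in the corpus.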

Contributor

Late to the game but trying to catch up.
Summarising the above:

1) For now the number of nodes does not affect the test.
2) The assumption we made is that this test will produce around 10,000 unique pod names and, given the limit of 110 pods per node, that is how we arrived at 90 nodes (10,000 / 110 ≈ 90). (See https://github.com/elastic/elastic-integration-corpus-generator-tool/pull/111/files#diff-d34fb1cf4866915a96be5b7ef33896bfecb1d98d1571dc32204396cc8aef5255R38)
So I would say keep it like that for now.
3) We have the same problem with pod.name and pod.uid. We need to correlate specific pod names with pod.uids, and there the mismatch affects the tests because uids are dimensions (https://github.com/elastic/rally-tracks/blob/master/tsdb_k8s_queries/pod-template.json#L663). I will update pod.name to be a "link" to pod.uid; it is simpler and should work.

Contributor

I am running some more tests and will update the uids shortly.

To add to the above: see the block below, which produces 10,000 pod names (based on rangeofid) and distributes them across 90 nodes (based on the limit of 110 pods per node):

  - name: rangeofid
    range:
      min: 0
      max: 10000
  - name: nodeid
    range:
      min: 1
      max: 90

Then the key is that we produce the result with gotext -t 8640000, which is the number of events we get when scraping every 10 seconds.
See info

So ideally this would be driven by the number of events (the user gives e.g. the number of pods, and we calculate the number of nodes).

Contributor

@aspacca and @gizas I believe that having each generated pod event carry a unique node uid does not make sense. We need to apply cardinality. I don't see why uid needs to be a real uid in these tests.

We could make it a field in the format word1-word2-word3-word4 and make name and hostname a "link" to it.

This makes sense to me.

Contributor

@aspacca, @MichaelKatsoulis here is my updated template, where I removed nodeid and instead divide the pod's rangeofid by 110. This gives me a fixed node id for each group of pods:
elastic/elastic-integration-corpus-generator-tool@fe73f69

This way pods 0-109 land on node-0, pods 110-219 on node-1, and so on.
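
A rough sketch of that idea as it would appear in the gotext template (the integer-division helper div and the exact syntax are assumptions; the linked commit has the authoritative version):

"hostname": "host-{{ div $rangeofid 110 }}",
"name": "host-{{ div $rangeofid 110 }}",

So pods with rangeofid 0-109 all resolve to host-0, 110-219 to host-1, and so on.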

@gizas gizas mentioned this pull request Nov 14, 2023
Contributor Author

aspacca commented Nov 16, 2023

@gizas all good here? :)

I've applied the latest change related to node uid.

@aspacca aspacca merged commit 9c5cd35 into elastic:main Nov 21, 2023
1 check passed
6 participants