we need a basic perf test in ZOS as primitive #1454

Closed
Tracked by #1470
despiegk opened this issue Aug 25, 2023 · 7 comments

@despiegk
Contributor

despiegk commented Aug 25, 2023

Performance Tests to do

  • every 5 min (network)
    • ping each of 12 well-chosen points around the world 3 times and average the results
      • gives 12 results
    • pick 12 random public nodes from our grid over IPv4
      • run a small iperf test against each, one over UDP and one over TCP
      • gives 24 (or more) results and a good idea of how well the network is connected
      • this means iperf needs to be reachable over IPv4 (this might cause issues in the future, fine for now)
    • pick 12 random public nodes from our grid over IPv6 and run the same tests on that network
  • every 6h
    • 2x perf test: download/upload a 1MB file to 3 high-performance points and keep the average of the 2 tests
      • gives 3 results
    • run a CPU measurement test, just to have our own metrics on what the performance is
      • didn't maxux have something for this?
      • will of course be influenced by workloads already running
    • report on memory used, CPU used, disk used, ...; we probably already have all these minor things
  • the results go into clean JSON and are cached in memory for when someone asks for them
  • we sign the results with the TPM chip and also provide the public key, so we can guarantee they come from this machine
    • if there is no TPM, we need to do something else
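
The "3x ping, make avg" step above could be sketched roughly as follows. This is only an illustration in Python: the function names, the target labels, and the convention of using `None` for a lost ping are assumptions, not anything the node actually implements, and the sample values are made up.

```python
from statistics import mean


def average_rtt(samples_ms):
    """Average the successful round-trip times; None if every ping failed."""
    ok = [s for s in samples_ms if s is not None]
    return round(mean(ok), 3) if ok else None


def summarize_pings(raw):
    """Map each target to the average of its ping samples (one result per target)."""
    return {target: average_rtt(samples) for target, samples in raw.items()}


# Hypothetical samples: 3 pings per target, None marks a lost packet.
raw = {
    "target-a": [10.0, 12.0, 14.0],
    "target-b": [30.0, None, 50.0],
    "target-c": [None, None, None],
}
summary = summarize_pings(raw)
print(summary)
```

With 12 targets this yields the 12 results mentioned in the list; a target that never answers is kept as `None` rather than being averaged in as zero.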

RMB functions we can ask for

  • performance_stats_get()
    • gets the performance from the cache
  • pubkey_get()
    • returns the public key, so we can verify the results returned by the performance monitor
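
A minimal sketch of the "cached in mem" idea behind `performance_stats_get()`, assuming each task stores its latest result under its name. The `PerfCache` class, the `store` method, and the `timestamp` field are hypothetical names chosen for illustration; signing and `pubkey_get()` are left out of the sketch.

```python
import json
import time


class PerfCache:
    """Keeps only the latest result of each performance task in memory."""

    def __init__(self):
        self._results = {}

    def store(self, name, description, result, now=None):
        # "timestamp" is an assumed field, added so callers can judge freshness.
        self._results[name] = {
            "name": name,
            "description": description,
            "timestamp": int(now if now is not None else time.time()),
            "result": result,
        }

    def performance_stats_get(self, name=None):
        """Return all cached results as JSON, or a single task when name is given."""
        if name is None:
            return json.dumps(list(self._results.values()), sort_keys=True)
        return json.dumps(self._results.get(name), sort_keys=True)


cache = PerfCache()
cache.store("cpu-benchmark", "example entry", {"single": 2.331, "multi": 60.806}, now=0)
one = json.loads(cache.performance_stats_get("cpu-benchmark"))
everything = json.loads(cache.performance_stats_get())
missing = json.loads(cache.performance_stats_get("no-such-task"))
print(one["result"], len(everything), missing)
```

Asking for an unknown task returns JSON `null` here; a real handler might prefer to return an error instead.
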
@despiegk despiegk transferred this issue from threefoldtech/zos Aug 27, 2023
@muhamadazmy muhamadazmy self-assigned this Aug 28, 2023
@muhamadazmy muhamadazmy moved this from Accepted to In Progress in 3.12.x Aug 30, 2023
@ramezsaeed ramezsaeed mentioned this issue Sep 25, 2023
27 tasks
@muhamadazmy
Member

Since results are fetched over RMB, they are automatically signed with the node keys and therefore verified (no TPM needed).

@muhamadazmy muhamadazmy moved this from In Progress to In Verification in 3.12.x Oct 16, 2023
@khaledyoussef24
Contributor

khaledyoussef24 commented Oct 23, 2023

Results are updated automatically and node performance is monitored by a cron job; the schedule for each task can also be changed manually.

@xmonader
Contributor

This needs another verification @ramezsaeed
threefoldtech/zos#2111
threefoldtech/zos#2106
threefoldtech/zos#2113
threefoldtech/zos#2112
threefoldtech/zos#2114

@ramezsaeed ramezsaeed added this to the 3.12 milestone Nov 22, 2023
@ramezsaeed ramezsaeed moved this from Done to In Verification in 3.12.x Nov 27, 2023
@muhamadazmy
Member

  • For the network test right now we:
    • Find public nodes as per spec
    • Run iperf speed tests (upload/download) every 6 hours
  • Test the usability of the farm public IPv4 addresses every 7 hours
  • Run CPU tests every 6 hours

We also need to:

  • Run faster ping tests every 30 min
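
The intervals listed in this comment can be expressed as a small schedule table. The sketch below is illustrative only and is not how the node schedules its tasks: the `due_tasks` helper is hypothetical, and `ping` is an assumed name for the planned 30-minute test.

```python
from datetime import datetime, timedelta

# Intervals taken from the comment above; "ping" is the planned test.
INTERVALS = {
    "iperf": timedelta(hours=6),
    "public-ip-validation": timedelta(hours=7),
    "cpu-benchmark": timedelta(hours=6),
    "ping": timedelta(minutes=30),
}


def due_tasks(now, last_run):
    """Return the names of tasks that never ran or whose interval has elapsed."""
    due = []
    for name, interval in INTERVALS.items():
        last = last_run.get(name)
        if last is None or now - last >= interval:
            due.append(name)
    return sorted(due)


now = datetime(2023, 11, 27, 12, 0)
last_run = {
    "iperf": now - timedelta(hours=6),
    "public-ip-validation": now - timedelta(hours=2),
    "cpu-benchmark": now - timedelta(hours=1),
}
print(due_tasks(now, last_run))  # ['iperf', 'ping']
```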

@khaledyoussef24
Contributor

khaledyoussef24 commented Nov 27, 2023

  • verified from our side
  • Name: public-ip-validation
  • Schedule: 4 times a day
      "185.206.122.50/24": {
        "reason": "ip is already assigned to a contract",
        "state": "skipped"
      },
      "185.206.122.52/24": {
        "reason": "",
        "state": "valid"
      },
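
To read a report like the fragment above, one option is to group the addresses by their `state`. This sketch assumes the fragment sits inside a flat JSON object keyed by IP; the enclosing structure is not shown in the output above, and `group_by_state` is a hypothetical helper.

```python
import json

# The two entries from the fragment above, wrapped in an assumed enclosing object.
report_json = """
{
  "185.206.122.50/24": {"reason": "ip is already assigned to a contract", "state": "skipped"},
  "185.206.122.52/24": {"reason": "", "state": "valid"}
}
"""


def group_by_state(report):
    """Collect the IPs of a public-ip-validation report under each state."""
    groups = {}
    for ip, entry in report.items():
        groups.setdefault(entry["state"], []).append(ip)
    return groups


groups = group_by_state(json.loads(report_json))
print(groups)
```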


iperf test runs

cpu testing

{
    "description": "Measures the performance of the node CPU by reporting the timespent of computing a task in seconds.",
    "name": "cpu-benchmark",
    "result": {
      "multi": 60.806,
      "single": 2.331,
      "threads": 24,
      "workloads": 64
    },

iperf testing

    "description": "Test public nodes network performance with both UDP and TCP over IPv4 and IPv6",
    "name": "iperf",
    "result": [
      {
        "cpu_report": {
          "host_system": 0,
          "host_total": 0,
          "host_user": 0,
          "remote_system": 0,
          "remote_total": 0,
          "remote_user": 0
        },
        "download_speed": 0,
        "error": "the server is busy running a test. try again later",
        "node_id": 124,
        "node_ip": "88.99.30.200",
        "test_type": "tcp",
        "upload_speed": 0
      },

If the server is busy, the download and upload speeds are reported as 0 and the error field gives the reason.
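
Because a busy server shows up as zero speeds plus an `error` string, anyone aggregating these results probably wants to set such entries aside first. A rough sketch, assuming the field names from the output above; the second entry, its node ID, address, and speed values are invented for illustration, and the helper names are hypothetical.

```python
def split_results(results):
    """Separate usable iperf entries from those that carry an error message."""
    ok = [r for r in results if not r.get("error")]
    failed = [r for r in results if r.get("error")]
    return ok, failed


def mean_speed(results, field):
    """Average one speed field over the given entries; None if there are none."""
    values = [r[field] for r in results]
    return sum(values) / len(values) if values else None


results = [
    # Entry from the output above: the server was busy, so speeds are 0.
    {"node_id": 124, "node_ip": "88.99.30.200", "test_type": "tcp",
     "download_speed": 0, "upload_speed": 0,
     "error": "the server is busy running a test. try again later"},
    # Hypothetical successful entry with made-up values.
    {"node_id": 999, "node_ip": "192.0.2.10", "test_type": "tcp",
     "download_speed": 100.0, "upload_speed": 50.0, "error": ""},
]
ok, failed = split_results(results)
print(mean_speed(ok, "download_speed"), [r["node_id"] for r in failed])
```

Averaging over all entries instead would drag the mean down with zeros that say nothing about the link itself.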

Command used: zos.perf.get_all
Results for a single task can also be fetched by using the task name.

https://app.testlodge.com/a/26076/projects/40893/suites/234919?expand_section=420793#case_4137370

@ramezsaeed ramezsaeed moved this from In Verification to In Progress in 3.12.x Nov 28, 2023
@ramezsaeed
Contributor

ramezsaeed commented Nov 29, 2023

Basic feature verified and working fine now.
Work on the reported issues and some test result enhancements will continue in the 3.13 release.

@github-project-automation github-project-automation bot moved this from In Progress to Done in 3.12.x Nov 29, 2023