This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Update Reference Hardware Specs #13317

Merged (4 commits) on Jun 28, 2023


ggwpez
Member

@ggwpez ggwpez commented Feb 6, 2023

Updating to the new hardware specs. The CPU got slower but the disk got faster. This was the trade-off for choosing cloud VMs.
The new numbers were generated on a cloud reference server with:

./target/production/substrate benchmark machine --dev --memory-duration 60 --verify-duration 60 --hash-duration 60 --disk-duration 60

Value changes:

| Bench | Old | New | Relative Change | Unit |
|---|---|---|---|---|
| Blake2256 | 1029.0 | 783.27 | -23.88 % | MiBs |
| Sr25519Verify | 666 | 560.67 | -15.82 % | MiBs |
| Memcopy | 14.323 | 11.49 | -19.78 % | GiBs |
| Disk Seq Write | 450 | 950 | +111.11 % | MiBs |
| Disk Rnd Write | 200 | 420 | +110.0 % | MiBs |

I rounded the new disk speed down a bit (971->950, 445->420) since the disk benches are known to not be as consistent as the CPU ones.
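
The relative changes in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch, not part of the PR: the `SCORES` dictionary and the `relative_change` helper are names made up for illustration, and the numbers are copied from the PR description.

```python
# Old and new reference scores as quoted in the PR description.
SCORES = {
    "Blake2256": (1029.0, 783.27),
    "Sr25519Verify": (666.0, 560.67),
    "Memcopy": (14.323, 11.49),
    "Disk Seq Write": (450.0, 950.0),
    "Disk Rnd Write": (200.0, 420.0),
}


def relative_change(old: float, new: float) -> float:
    """Percentage change from old to new, rounded to two decimals."""
    return round((new - old) / old * 100.0, 2)


for name, (old, new) in SCORES.items():
    # The "+" format flag prints an explicit sign, as in the table.
    print(f"{name}: {relative_change(old, new):+.2f} %")
```

A negative value means the new reference score is lower than the old one (the CPU and memory benches); a positive value means it is higher (the disk benches).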

Closes #13308. Marking as noteworthy so this is mentioned in the change log.

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
@github-actions github-actions bot added the A0-please_review Pull request needs code review. label Feb 6, 2023
@ggwpez ggwpez added B5-clientnoteworthy D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit labels Feb 6, 2023
@99Kies

99Kies commented Feb 6, 2023

@ggwpez Which part of the node's operation do the Disk Seq Write and Disk Rnd Write values mainly affect? My current Disk Seq Write and Disk Rnd Write values are 290.10 MiBs and 125.07 MiBs respectively. Can I still run a Polkadot validator node?

@ggwpez
Member Author

ggwpez commented Feb 6, 2023

> Which part of the node's operation do the Disk Seq Write and Disk Rnd Write values mainly affect? My current Disk Seq Write and Disk Rnd Write values are 290.10 MiBs and 125.07 MiBs respectively. Can I still run a Polkadot validator node?

You are probably using a network drive and not an NVMe SSD as mentioned in the wiki. In AWS it is probably possible to use multiple disks like in GCP. We should extend the explanations in the wiki for that. In GCP I used 4 network disks to achieve performance close to a local disk. cc @bakhtin
Polkadot will update to use the same stats as soon as this merges.

@bkchr
Member

bkchr commented Feb 6, 2023

Why do we need faster disks instead of a faster CPU?

@ggwpez
Member Author

ggwpez commented Feb 6, 2023

> Why do we need faster disks instead of a faster CPU?

It would be nice to have a faster CPU as well, but the most widely available cloud providers only offer server CPUs, like Xeon and EPYC. These are mostly inferior in single-thread speed when compared to consumer hardware like Intel i7 or i9.
That is the trade-off for allowing machines to be virtual instead of bare metal.
Most validators used VMs anyway despite the old bare-metal recommendation, so this should help them to pick good VMs.

@bkchr
Member

bkchr commented Feb 6, 2023

Okay, ty for the explanation!

@ggwpez ggwpez mentioned this pull request Feb 8, 2023
@99Kies

99Kies commented Feb 8, 2023

[image attachment]

@ggwpez My Disk Seq Write and Disk Rnd Write values are 290.10 MiBs and 125.07 MiBs.

My main question is whether I can use this configuration to run a Polkadot/Kusama validator node. I mainly want to avoid having my assets slashed, which should depend on the node's block production. So as long as my CPU and memory are up to par, can I run the node without worrying about my assets getting slashed?


And I think we should offer three configurations: minimum, medium and maximum. That would make it easier for validators to select the most cost-effective machine. 🤔

@ggwpez
Member Author

ggwpez commented Feb 8, 2023

> My main question is whether I can use this configuration to run a Polkadot/Kusama validator node. I mainly want to avoid having my assets slashed, which should depend on the node's block production. So as long as my CPU and memory are up to par, can I run the node without worrying about my assets getting slashed?

I cannot tell you that on a case-by-case basis. These are just recommendations; they are neither hard requirements nor exhaustive.
They allow for quickly sorting out bad hosting providers and VM types. That is mostly the intended use case.

For more concrete advice you can ask in the 1KV Program Matrix/Discord chat. They often talk about their server hardware.
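
As a rough illustration of "quickly sorting out" a machine, the disk scores reported in this thread can be compared against the new reference values. This is only a sketch under stated assumptions: `below_reference` is a hypothetical helper, the 10 % tolerance is an arbitrary value chosen for the example, and none of this is claimed to be the logic of the `benchmark machine` command.

```python
# New reference disk scores from this PR (MiBs, as quoted in the description).
REFERENCE = {"Disk Seq Write": 950.0, "Disk Rnd Write": 420.0}

# Disk scores reported by one commenter in this thread (MiBs).
MEASURED = {"Disk Seq Write": 290.10, "Disk Rnd Write": 125.07}

# Assumption for this sketch: how far below the reference a score may
# fall before it is flagged. Not taken from the PR.
TOLERANCE_PERCENT = 10.0


def below_reference(measured, reference, tolerance_percent):
    """Return the benchmarks whose score is below reference minus tolerance."""
    failed = []
    for name, ref in reference.items():
        threshold = ref * (1.0 - tolerance_percent / 100.0)
        # A missing measurement is treated as 0.0 and therefore flagged.
        if measured.get(name, 0.0) < threshold:
            failed.append(name)
    return failed


print(below_reference(MEASURED, REFERENCE, TOLERANCE_PERCENT))
# prints ['Disk Seq Write', 'Disk Rnd Write']
```

A flagged benchmark only says the machine falls short of the recommendation; as noted above, it is not a verdict on whether a validator on that machine would be slashed.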

@ggwpez ggwpez requested a review from shawntabrizi February 10, 2023 15:38
@the-right-joyce the-right-joyce added B1-note_worthy Changes should be noted in the release notes T0-node This PR/Issue is related to the topic “node”. C1-low PR touches the given topic and has a low impact on builders. and removed B5-clientnoteworthy labels Feb 13, 2023
@shawntabrizi
Member

shawntabrizi commented Feb 20, 2023

What is the sTPS after this change? (as compared to 1,500)

@99Kies

99Kies commented Feb 21, 2023

@ggwpez I can't find the Matrix/Discord link you mentioned. I would like to confirm whether the machines running Kusama are also held to this standard. Aren't there many parachains on the Kusama network? Does Kusama require the same machine performance as Polkadot?

@shawntabrizi
Member

@99Kies if this PR goes in, you should probably assume that both Polkadot and Kusama are updated to the same standard. Generally speaking, Kusama will always be as close to Polkadot as possible in these kinds of things, as that is the purpose of the canary network.

@bkchr
Member

bkchr commented Feb 22, 2023

BTW the discussion around this change continues in the issue: #13308

We will not merge this before we have solved the weight noise issue.

@ggwpez
Member Author

ggwpez commented Feb 23, 2023

@99Kies see https://thousand-validators.kusama.network/#/getting-started. The Matrix channel is #KusamaValidatorLounge:polkadot.builders.

> What is the sTPS after this change? (as compared to 1,500)

Currently getting a build error in the VM ref hardware image 🤦‍♂️. Will need to update the DB weights first.

@stale

stale bot commented Mar 25, 2023

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the A3-stale label Mar 25, 2023
@ggwpez ggwpez removed the A3-stale label Mar 25, 2023
@stale

stale bot commented Apr 24, 2023

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the A3-stale label Apr 24, 2023
@ggwpez ggwpez removed the A3-stale label Apr 25, 2023
@stale

stale bot commented May 25, 2023

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the A3-stale label May 25, 2023
@ggwpez ggwpez removed the A3-stale label May 25, 2023
@stale

stale bot commented Jun 24, 2023

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the A3-stale label Jun 24, 2023
@stale stale bot removed the A3-stale label Jun 25, 2023
@kianenigma
Contributor

Not sure if the aforementioned issues are solved, but this seems like something that we want to see merged.

@ggwpez
Member Author

ggwpez commented Jun 25, 2023

@oleg-plakida just to check: Substrate and Polkadot are updated? Only Cumulus remains, right?

@mateo-moon
Contributor

> @oleg-plakida just to check: Substrate and Polkadot are updated? Only Cumulus remains, right?

Rococo and Cumulus remain.

@ggwpez
Member Author

ggwpez commented Jun 27, 2023

bot rebase

@paritytech-processbot

Rebased

@ggwpez ggwpez requested a review from a team June 28, 2023 08:21
@ggwpez
Member Author

ggwpez commented Jun 28, 2023

bot merge

@paritytech-processbot

Error: Required status check "pr-custom-review" is cancelled.

@ggwpez
Member Author

ggwpez commented Jun 28, 2023

bot merge

@paritytech-processbot paritytech-processbot bot merged commit fe2d513 into master Jun 28, 2023
@paritytech-processbot paritytech-processbot bot deleted the oty-reference-hardware branch June 28, 2023 21:16
nathanwhit pushed a commit to nathanwhit/substrate that referenced this pull request Jul 19, 2023
* Remove Polkadot Wiki

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* Update requirements for new ref hardware

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* Add test

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

---------

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: parity-processbot <>
Labels
A0-please_review Pull request needs code review. B1-note_worthy Changes should be noted in the release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit T0-node This PR/Issue is related to the topic “node”.
Development

Successfully merging this pull request may close these issues.

Update hardware requirements for benchmark machine
8 participants