Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log monitoring errors as warning and reduce verbosity #5287

Merged
merged 2 commits into from
Mar 21, 2023

Conversation

nflaig
Copy link
Member

@nflaig nflaig commented Mar 21, 2023

Motivation

Similar to changes introduced in #5285, the monitoring service is just a sidecar to the main functionality/process and should be considered non-critical, meaning it should be fine here if we just log this as a warning.

Description

Log monitoring errors as warn instead of error and reduce verbosity (no stacktrace).

The expected errors here are

  • invalid domain, cannot resolve
  • timeout, remote service is unresponsive
  • dns resolution issues

The error message is enough to convey the issue, the stacktrace is not needed. Based on my observations using this for several weeks the errors I usually see are becasue of timeouts which are kinda expected since we send request 24/7 every minute and the remote service might be unresponsive sometimes.

@nflaig nflaig requested a review from a team as a code owner March 21, 2023 14:14
@nflaig nflaig added scope-logging Issue about logs: hygiene, format issues, improvements. scope-ux Issues for CLI UX or general consumer UX. labels Mar 21, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Mar 21, 2023

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 4997c15 Previous: 5fde8c5 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 916.18 us/op 1.0351 ms/op 0.89
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 71.769 us/op 59.898 us/op 1.20
BLS verify - blst-native 1.3561 ms/op 1.2594 ms/op 1.08
BLS verifyMultipleSignatures 3 - blst-native 2.7603 ms/op 2.5697 ms/op 1.07
BLS verifyMultipleSignatures 8 - blst-native 5.6305 ms/op 5.4533 ms/op 1.03
BLS verifyMultipleSignatures 32 - blst-native 23.275 ms/op 19.614 ms/op 1.19
BLS aggregatePubkeys 32 - blst-native 27.969 us/op 26.494 us/op 1.06
BLS aggregatePubkeys 128 - blst-native 122.74 us/op 102.74 us/op 1.19
getAttestationsForBlock 83.224 ms/op 75.959 ms/op 1.10
isKnown best case - 1 super set check 290.00 ns/op 272.00 ns/op 1.07
isKnown normal case - 2 super set checks 280.00 ns/op 270.00 ns/op 1.04
isKnown worse case - 16 super set checks 305.00 ns/op 274.00 ns/op 1.11
CheckpointStateCache - add get delete 6.1060 us/op 6.8110 us/op 0.90
validate gossip signedAggregateAndProof - struct 2.9172 ms/op 3.0399 ms/op 0.96
validate gossip attestation - struct 1.4026 ms/op 1.4100 ms/op 0.99
pickEth1Vote - no votes 1.5577 ms/op 1.4910 ms/op 1.04
pickEth1Vote - max votes 18.370 ms/op 14.026 ms/op 1.31
pickEth1Vote - Eth1Data hashTreeRoot value x2048 11.170 ms/op 10.381 ms/op 1.08
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 21.227 ms/op 18.528 ms/op 1.15
pickEth1Vote - Eth1Data fastSerialize value x2048 983.59 us/op 863.86 us/op 1.14
pickEth1Vote - Eth1Data fastSerialize tree x2048 7.9828 ms/op 8.4557 ms/op 0.94
bytes32 toHexString 773.00 ns/op 880.00 ns/op 0.88
bytes32 Buffer.toString(hex) 478.00 ns/op 483.00 ns/op 0.99
bytes32 Buffer.toString(hex) from Uint8Array 665.00 ns/op 706.00 ns/op 0.94
bytes32 Buffer.toString(hex) + 0x 464.00 ns/op 482.00 ns/op 0.96
Object access 1 prop 0.24000 ns/op 0.24000 ns/op 1.00
Map access 1 prop 0.18300 ns/op 0.19000 ns/op 0.96
Object get x1000 10.548 ns/op 10.202 ns/op 1.03
Map get x1000 0.76300 ns/op 0.79000 ns/op 0.97
Object set x1000 86.523 ns/op 86.509 ns/op 1.00
Map set x1000 67.583 ns/op 60.091 ns/op 1.12
Return object 10000 times 0.32680 ns/op 0.33490 ns/op 0.98
Throw Error 10000 times 4.8211 us/op 5.7849 us/op 0.83
fastMsgIdFn sha256 / 200 bytes 3.9040 us/op 4.1000 us/op 0.95
fastMsgIdFn h32 xxhash / 200 bytes 344.00 ns/op 355.00 ns/op 0.97
fastMsgIdFn h64 xxhash / 200 bytes 515.00 ns/op 554.00 ns/op 0.93
fastMsgIdFn sha256 / 1000 bytes 13.209 us/op 13.062 us/op 1.01
fastMsgIdFn h32 xxhash / 1000 bytes 587.00 ns/op 505.00 ns/op 1.16
fastMsgIdFn h64 xxhash / 1000 bytes 706.00 ns/op 606.00 ns/op 1.17
fastMsgIdFn sha256 / 10000 bytes 111.28 us/op 114.27 us/op 0.97
fastMsgIdFn h32 xxhash / 10000 bytes 2.2630 us/op 2.2030 us/op 1.03
fastMsgIdFn h64 xxhash / 10000 bytes 1.6170 us/op 1.5750 us/op 1.03
enrSubnets - fastDeserialize 64 bits 2.2730 us/op 2.0530 us/op 1.11
enrSubnets - ssz BitVector 64 bits 730.00 ns/op 687.00 ns/op 1.06
enrSubnets - fastDeserialize 4 bits 235.00 ns/op 243.00 ns/op 0.97
enrSubnets - ssz BitVector 4 bits 685.00 ns/op 806.00 ns/op 0.85
prioritizePeers score -10:0 att 32-0.1 sync 2-0 129.66 us/op 151.67 us/op 0.85
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 171.68 us/op 211.78 us/op 0.81
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 206.25 us/op 249.13 us/op 0.83
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 384.79 us/op 400.64 us/op 0.96
prioritizePeers score 0:0 att 64-1 sync 4-1 478.56 us/op 479.63 us/op 1.00
array of 16000 items push then shift 1.7790 us/op 2.0090 us/op 0.89
LinkedList of 16000 items push then shift 10.378 ns/op 12.925 ns/op 0.80
array of 16000 items push then pop 124.89 ns/op 120.46 ns/op 1.04
LinkedList of 16000 items push then pop 10.624 ns/op 11.809 ns/op 0.90
array of 24000 items push then shift 2.6768 us/op 2.7860 us/op 0.96
LinkedList of 24000 items push then shift 9.7990 ns/op 11.623 ns/op 0.84
array of 24000 items push then pop 93.679 ns/op 113.25 ns/op 0.83
LinkedList of 24000 items push then pop 9.2000 ns/op 11.691 ns/op 0.79
intersect bitArray bitLen 8 14.198 ns/op 16.637 ns/op 0.85
intersect array and set length 8 110.19 ns/op 113.24 ns/op 0.97
intersect bitArray bitLen 128 46.358 ns/op 59.595 ns/op 0.78
intersect array and set length 128 1.3037 us/op 1.4959 us/op 0.87
Buffer.concat 32 items 2.8580 us/op 3.6010 us/op 0.79
Uint8Array.set 32 items 2.3610 us/op 3.5980 us/op 0.66
pass gossip attestations to forkchoice per slot 4.2646 ms/op 3.9472 ms/op 1.08
computeDeltas 3.4413 ms/op 4.3416 ms/op 0.79
computeProposerBoostScoreFromBalances 1.8529 ms/op 2.1033 ms/op 0.88
altair processAttestation - 250000 vs - 7PWei normalcase 3.3956 ms/op 3.7421 ms/op 0.91
altair processAttestation - 250000 vs - 7PWei worstcase 5.5236 ms/op 5.2757 ms/op 1.05
altair processAttestation - setStatus - 1/6 committees join 151.17 us/op 169.43 us/op 0.89
altair processAttestation - setStatus - 1/3 committees join 289.88 us/op 298.70 us/op 0.97
altair processAttestation - setStatus - 1/2 committees join 387.07 us/op 401.50 us/op 0.96
altair processAttestation - setStatus - 2/3 committees join 486.65 us/op 467.24 us/op 1.04
altair processAttestation - setStatus - 4/5 committees join 678.54 us/op 672.83 us/op 1.01
altair processAttestation - setStatus - 100% committees join 811.25 us/op 785.12 us/op 1.03
altair processBlock - 250000 vs - 7PWei normalcase 18.767 ms/op 23.620 ms/op 0.79
altair processBlock - 250000 vs - 7PWei normalcase hashState 29.984 ms/op 32.425 ms/op 0.92
altair processBlock - 250000 vs - 7PWei worstcase 51.105 ms/op 62.671 ms/op 0.82
altair processBlock - 250000 vs - 7PWei worstcase hashState 71.734 ms/op 76.277 ms/op 0.94
phase0 processBlock - 250000 vs - 7PWei normalcase 2.0524 ms/op 2.7342 ms/op 0.75
phase0 processBlock - 250000 vs - 7PWei worstcase 29.369 ms/op 34.619 ms/op 0.85
altair processEth1Data - 250000 vs - 7PWei normalcase 495.39 us/op 705.97 us/op 0.70
vc - 250000 eb 1 eth1 1 we 0 wn 0 - smpl 15 8.9470 us/op 11.871 us/op 0.75
vc - 250000 eb 0.95 eth1 0.1 we 0.05 wn 0 - smpl 219 24.503 us/op 36.242 us/op 0.68
vc - 250000 eb 0.95 eth1 0.3 we 0.05 wn 0 - smpl 42 12.262 us/op 16.722 us/op 0.73
vc - 250000 eb 0.95 eth1 0.7 we 0.05 wn 0 - smpl 18 9.7380 us/op 15.586 us/op 0.62
vc - 250000 eb 0.1 eth1 0.1 we 0 wn 0 - smpl 1020 116.35 us/op 133.69 us/op 0.87
vc - 250000 eb 0.03 eth1 0.03 we 0 wn 0 - smpl 11777 711.70 us/op 818.23 us/op 0.87
vc - 250000 eb 0.01 eth1 0.01 we 0 wn 0 - smpl 16384 924.22 us/op 917.92 us/op 1.01
vc - 250000 eb 0 eth1 0 we 0 wn 0 - smpl 16384 907.25 us/op 933.02 us/op 0.97
vc - 250000 eb 0 eth1 0 we 0 wn 0 nocache - smpl 16384 2.5182 ms/op 3.1007 ms/op 0.81
vc - 250000 eb 0 eth1 1 we 0 wn 0 - smpl 16384 1.5413 ms/op 1.6369 ms/op 0.94
vc - 250000 eb 0 eth1 1 we 0 wn 0 nocache - smpl 16384 4.0139 ms/op 5.5271 ms/op 0.73
Tree 40 250000 create 348.26 ms/op 478.29 ms/op 0.73
Tree 40 250000 get(125000) 196.53 ns/op 205.21 ns/op 0.96
Tree 40 250000 set(125000) 1.0402 us/op 1.1787 us/op 0.88
Tree 40 250000 toArray() 21.173 ms/op 22.856 ms/op 0.93
Tree 40 250000 iterate all - toArray() + loop 21.571 ms/op 22.865 ms/op 0.94
Tree 40 250000 iterate all - get(i) 78.416 ms/op 74.933 ms/op 1.05
MutableVector 250000 create 12.296 ms/op 10.921 ms/op 1.13
MutableVector 250000 get(125000) 6.6680 ns/op 6.6700 ns/op 1.00
MutableVector 250000 set(125000) 278.63 ns/op 326.33 ns/op 0.85
MutableVector 250000 toArray() 3.7316 ms/op 3.8860 ms/op 0.96
MutableVector 250000 iterate all - toArray() + loop 3.4127 ms/op 4.3031 ms/op 0.79
MutableVector 250000 iterate all - get(i) 1.5309 ms/op 1.5415 ms/op 0.99
Array 250000 create 2.9138 ms/op 3.2786 ms/op 0.89
Array 250000 clone - spread 1.2332 ms/op 1.3332 ms/op 0.92
Array 250000 get(125000) 0.62900 ns/op 0.64500 ns/op 0.98
Array 250000 set(125000) 0.70300 ns/op 0.83500 ns/op 0.84
Array 250000 iterate all - loop 93.128 us/op 103.56 us/op 0.90
effectiveBalanceIncrements clone Uint8Array 300000 48.603 us/op 41.756 us/op 1.16
effectiveBalanceIncrements clone MutableVector 300000 389.00 ns/op 414.00 ns/op 0.94
effectiveBalanceIncrements rw all Uint8Array 300000 174.63 us/op 178.34 us/op 0.98
effectiveBalanceIncrements rw all MutableVector 300000 88.054 ms/op 136.25 ms/op 0.65
phase0 afterProcessEpoch - 250000 vs - 7PWei 123.82 ms/op 122.27 ms/op 1.01
phase0 beforeProcessEpoch - 250000 vs - 7PWei 47.319 ms/op 40.050 ms/op 1.18
altair processEpoch - mainnet_e81889 313.32 ms/op 370.12 ms/op 0.85
mainnet_e81889 - altair beforeProcessEpoch 66.877 ms/op 86.724 ms/op 0.77
mainnet_e81889 - altair processJustificationAndFinalization 21.117 us/op 24.943 us/op 0.85
mainnet_e81889 - altair processInactivityUpdates 5.8405 ms/op 7.5082 ms/op 0.78
mainnet_e81889 - altair processRewardsAndPenalties 73.609 ms/op 61.609 ms/op 1.19
mainnet_e81889 - altair processRegistryUpdates 4.7080 us/op 6.0620 us/op 0.78
mainnet_e81889 - altair processSlashings 1.0510 us/op 823.00 ns/op 1.28
mainnet_e81889 - altair processEth1DataReset 1.0960 us/op 823.00 ns/op 1.33
mainnet_e81889 - altair processEffectiveBalanceUpdates 2.1347 ms/op 1.4275 ms/op 1.50
mainnet_e81889 - altair processSlashingsReset 6.7410 us/op 8.3580 us/op 0.81
mainnet_e81889 - altair processRandaoMixesReset 7.1550 us/op 13.163 us/op 0.54
mainnet_e81889 - altair processHistoricalRootsUpdate 1.2400 us/op 1.9770 us/op 0.63
mainnet_e81889 - altair processParticipationFlagUpdates 7.7590 us/op 4.7630 us/op 1.63
mainnet_e81889 - altair processSyncCommitteeUpdates 1.8360 us/op 1.3880 us/op 1.32
mainnet_e81889 - altair afterProcessEpoch 149.37 ms/op 166.62 ms/op 0.90
phase0 processEpoch - mainnet_e58758 392.53 ms/op 489.28 ms/op 0.80
mainnet_e58758 - phase0 beforeProcessEpoch 181.87 ms/op 226.50 ms/op 0.80
mainnet_e58758 - phase0 processJustificationAndFinalization 26.219 us/op 32.326 us/op 0.81
mainnet_e58758 - phase0 processRewardsAndPenalties 71.398 ms/op 101.08 ms/op 0.71
mainnet_e58758 - phase0 processRegistryUpdates 10.138 us/op 16.752 us/op 0.61
mainnet_e58758 - phase0 processSlashings 826.00 ns/op 1.6980 us/op 0.49
mainnet_e58758 - phase0 processEth1DataReset 922.00 ns/op 2.0670 us/op 0.45
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.1834 ms/op 4.1093 ms/op 0.29
mainnet_e58758 - phase0 processSlashingsReset 5.3330 us/op 10.032 us/op 0.53
mainnet_e58758 - phase0 processRandaoMixesReset 5.8060 us/op 12.172 us/op 0.48
mainnet_e58758 - phase0 processHistoricalRootsUpdate 639.00 ns/op 1.7560 us/op 0.36
mainnet_e58758 - phase0 processParticipationRecordUpdates 4.5720 us/op 10.919 us/op 0.42
mainnet_e58758 - phase0 afterProcessEpoch 104.37 ms/op 117.19 ms/op 0.89
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.3963 ms/op 2.1250 ms/op 0.66
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.4799 ms/op 3.6037 ms/op 0.41
altair processInactivityUpdates - 250000 normalcase 24.431 ms/op 44.222 ms/op 0.55
altair processInactivityUpdates - 250000 worstcase 27.463 ms/op 37.049 ms/op 0.74
phase0 processRegistryUpdates - 250000 normalcase 9.7300 us/op 14.827 us/op 0.66
phase0 processRegistryUpdates - 250000 badcase_full_deposits 404.94 us/op 434.07 us/op 0.93
phase0 processRegistryUpdates - 250000 worstcase 0.5 183.24 ms/op 220.62 ms/op 0.83
altair processRewardsAndPenalties - 250000 normalcase 78.429 ms/op 85.893 ms/op 0.91
altair processRewardsAndPenalties - 250000 worstcase 82.992 ms/op 85.295 ms/op 0.97
phase0 getAttestationDeltas - 250000 normalcase 9.3686 ms/op 9.1746 ms/op 1.02
phase0 getAttestationDeltas - 250000 worstcase 10.193 ms/op 8.9715 ms/op 1.14
phase0 processSlashings - 250000 worstcase 4.6746 ms/op 4.3943 ms/op 1.06
altair processSyncCommitteeUpdates - 250000 235.44 ms/op 211.59 ms/op 1.11
BeaconState.hashTreeRoot - No change 319.00 ns/op 296.00 ns/op 1.08
BeaconState.hashTreeRoot - 1 full validator 65.494 us/op 55.453 us/op 1.18
BeaconState.hashTreeRoot - 32 full validator 747.46 us/op 570.60 us/op 1.31
BeaconState.hashTreeRoot - 512 full validator 7.7766 ms/op 6.9536 ms/op 1.12
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 87.828 us/op 66.636 us/op 1.32
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 1.1985 ms/op 1.0002 ms/op 1.20
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 16.035 ms/op 14.040 ms/op 1.14
BeaconState.hashTreeRoot - 1 balances 58.183 us/op 54.294 us/op 1.07
BeaconState.hashTreeRoot - 32 balances 510.43 us/op 509.65 us/op 1.00
BeaconState.hashTreeRoot - 512 balances 6.1669 ms/op 5.3842 ms/op 1.15
BeaconState.hashTreeRoot - 250000 balances 90.245 ms/op 84.075 ms/op 1.07
aggregationBits - 2048 els - zipIndexesInBitList 31.052 us/op 30.975 us/op 1.00
regular array get 100000 times 53.569 us/op 36.699 us/op 1.46
wrappedArray get 100000 times 60.399 us/op 39.868 us/op 1.51
arrayWithProxy get 100000 times 20.582 ms/op 20.621 ms/op 1.00
ssz.Root.equals 1.2530 us/op 743.00 ns/op 1.69
byteArrayEquals 872.00 ns/op 712.00 ns/op 1.22
shuffle list - 16384 els 11.334 ms/op 8.5604 ms/op 1.32
shuffle list - 250000 els 129.26 ms/op 125.80 ms/op 1.03
processSlot - 1 slots 10.651 us/op 12.139 us/op 0.88
processSlot - 32 slots 1.5659 ms/op 1.6515 ms/op 0.95
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 51.935 ms/op 50.654 ms/op 1.03
getCommitteeAssignments - req 1 vs - 250000 vc 4.5822 ms/op 3.4596 ms/op 1.32
getCommitteeAssignments - req 100 vs - 250000 vc 4.7663 ms/op 4.7035 ms/op 1.01
getCommitteeAssignments - req 1000 vs - 250000 vc 7.3295 ms/op 5.6233 ms/op 1.30
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 10.420 ns/op 7.1100 ns/op 1.47
state getBlockRootAtSlot - 250000 vs - 7PWei 1.6070 us/op 1.0547 us/op 1.52
computeProposers - vc 250000 12.376 ms/op 14.924 ms/op 0.83
computeEpochShuffling - vc 250000 122.02 ms/op 126.67 ms/op 0.96
getNextSyncCommittee - vc 250000 257.14 ms/op 223.84 ms/op 1.15

by benchmarkbot/action

wemeetagain
wemeetagain previously approved these changes Mar 21, 2023
expect(logger.error).to.have.been.calledWithMatch("Failed to send client stats", {}, error);
function assertError(error: {message: string}): void {
// errors are not thrown and need to be asserted based on the log
expect(logger.warn).to.have.been.calledWithMatch("Failed to send client stats", {reason: error.message});
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertions here are a bit brittle due to checking the error messages, but should not break unless the log level is changed and having this stricter assertion is important to ensure we are actually testing correctly.

@wemeetagain wemeetagain merged commit de53bad into unstable Mar 21, 2023
@wemeetagain wemeetagain deleted the nflaig/reduce-monitoring-log-verbosity branch March 21, 2023 15:56
@wemeetagain
Copy link
Member

🎉 This PR is included in v1.7.0 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope-logging Issue about logs: hygiene, format issues, improvements. scope-ux Issues for CLI UX or general consumer UX.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants