-
Notifications
You must be signed in to change notification settings - Fork 720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] - KES metrics are not being reported on tag 1.19.0 #1742
Comments
Alongside my 1.19.0 test I rotated the KES keys using a local Docker Ubuntu container running the same v1.19.0. Those newly created KES keys seem to be wrong, as they result in KES not reporting. Uploading the old KES keys to a v1.19.0 block producer results in KES reporting normally. So something is fishy with KES creation in v1.19.0 |
Old KES keys produce this trace: Aug 20 20:22:59 ip-172-31-33-176 cardano-node[88751]: [ip-172-3:cardano.node.Forge:Debug:52] [2020-08-20 20:22:59.71 UTC] OptSkip (OptCons (WrapForgeStateInfo {unwrapForgeStateInfo = KESInfo {kesStartPeriod = KESPeriod 40, kesEndPeriod = KESPeriod 102, kesEvolution = 9}}) OptNil) New KES keys don't produce a cardano.node.Forge:Debug trace. |
That debug log was there for testing a specific problem with KES rotation. Debug logs are not expected to be stable and are expected to be added/removed at any point. I can confirm the KES metrics that we monitor show correctly after redeploying with a new KES key:
|
I also regenerated KES keys and deployed them before checking this (they were initially deployed at KES period 0) |
I will close the issue. There appears to be no problem. |
The key evolution code was refactored in 1.19.0. Now, such a debug message is only traced when the key actually evolved instead of once at the start of each slot (if the node is configured to produce blocks). As @disassembler said, debug logs are not expected to be stable. |
I don't care about the debug logs. It was an attempt to give you some insight. The metrics are not present. Plain and simple. I uploaded new KES and opcert with the start KES period being the present KES period. And the KES metrics don't show up. |
Ah, reading your messages, I had the impression that you were talking about the missing trace messages. Is #1748 what you mean? |
No, EKG > Prometheus > Grafana. cardano_node_Forge_metrics_remainingKESPeriods_int is not present. Could it be, that this trace is actually necessary to get a KES metric at all. |
I'm still testing and I'm getting closer. I will keep you updated. |
Ah
Yeah, I'm now thinking that that's indeed the reason. That's also the cause of #1748. Thanks for the info! I'll see what I can do about it (e.g., trace it once on startup, even when the key didn't have to evolve). |
@kevinhammond Closing a bug report without proper investigation shouldn't be the norm, just saying. |
I agree with @MarcelKlammer , I was also a bit shocked how quickly it was closed without looking into details. |
@mrBliss I can confirm the behavior now. Just regenerated KES with --kes-period 48 instead of 49 (current) and node v1.19.0 now has KES metrics EKG/Prometheus/Grafana showing up. |
A fix for this: IntersectMBO/ouroboros-network#2529. |
2529: Always trace ForgeStateInfo, even when it didn't change r=mrBliss a=mrBliss In `cardano-node`, the KES information (grafana metrics and LiveView) will only be shown when `ForgeStateInfo` has been traced. When (re)starting a node, that won't happen until the KES key has evolved, which, for mainnet, can take up to 36 hours. Related issues: IntersectMBO/cardano-node#1742, IntersectMBO/cardano-node#1748. Fix it by tracing the `ForgeStateInfo`, whether it was updated or unchanged. This means it will be traced once every slot, i.e., second, if the node is configured to produce blocks. It is up to the tracers in `cardano-node` to suppress duplicate messages. Co-authored-by: Thomas Winant <thomas@well-typed.com>
Hi Marcel. We did do the investigation (we were concerned that there might be a KES key acceptance problem, which would have needed urgent attention). I was told it was expected behaviour, but of course the issue was with logging rather than generation, as you have noted. We've seen quite a few things coming up with LiveView, unfortunately. |
On KES rotation the starting period is now "current period -1" to accomomate for the following 1.19.0 issue, so the KES period has already evolved to keep the KES metrics: IntersectMBO/cardano-node#1742
* Update cntools.library On KES rotation the starting period is now "current period -1" to accomomate for the following 1.19.0 issue, so the KES period has already evolved to keep the KES metrics: IntersectMBO/cardano-node#1742 * Update patch version number * Update cntools-changelog.md Co-authored-by: Priyank <3169068+rdlrt@users.noreply.github.com>
Exernal
Summary
KES reporting not available on v1.19.0
Steps to reproduce
Update your block producer to v1.19.0. See metrics. No KES
Expected behavior
KES should be reported.
System info (please complete the following information):
cardano-node 1.19.0 - linux-x86_64 - ghc-8.6
git rev 4814003
cardano-cli 1.19.0 - linux-x86_64 - ghc-8.6
git rev 4814003
The text was updated successfully, but these errors were encountered: