update icq healthcheck #36

Open
baabeetaa opened this issue Dec 16, 2023 · 1 comment

Comments

@baabeetaa
Contributor

  • qsd_ibc_channel_state: 0 = init, 1 = open, 2 = closed. If it's not 1 we have issues: 0 means the channel has been reopened and hermes is not relaying the channel handshake; 2 means the channel is closed and needs reopening (and relayers need reconfiguring with the new channels).
  • qsd_ibc_commitments and qsd_ibc_acks: 0 = no packets/acks pending. A value that is non-zero and not decreasing means relaying issues.
  • qsd_icq_oldest_emission_distance: the age of the current oldest ICQ request. This should not normally exceed 200.
  • qsd_icq_historic_queue: size of the queue of unanswered ICQ queries. It should generally be below 100 (and mostly below 10). If it is persistently above this, there is some issue.
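
The checks in the list above could be sketched roughly as follows. This is only an illustration, not the project's actual healthcheck: the function name `evaluate_snapshot` and the flat `dict` input are assumptions, and real metrics may carry per-chain or per-channel labels that need handling first. The commitments/acks rule is left out here because it needs more than one sample.

```python
from typing import Dict, List

# Channel state encoding as described in the list above.
CHANNEL_STATES = {0: "init", 1: "open", 2: "closed"}

# Thresholds taken from the list above; treat them as starting points.
MAX_EMISSION_DISTANCE = 200
MAX_HISTORIC_QUEUE = 100


def evaluate_snapshot(metrics: Dict[str, float]) -> List[str]:
    """Return a list of problems found in a single metrics snapshot.

    `metrics` is a hypothetical flat mapping of metric name to value.
    """
    problems: List[str] = []

    state = int(metrics["qsd_ibc_channel_state"])
    if state != 1:
        name = CHANNEL_STATES.get(state, "unknown")
        problems.append(f"channel state is {state} ({name}), expected 1 (open)")

    distance = metrics["qsd_icq_oldest_emission_distance"]
    if distance > MAX_EMISSION_DISTANCE:
        problems.append(
            f"oldest icq request distance {distance:g} is over {MAX_EMISSION_DISTANCE}"
        )

    queue = metrics["qsd_icq_historic_queue"]
    if queue > MAX_HISTORIC_QUEUE:
        problems.append(f"icq historic queue {queue:g} is over {MAX_HISTORIC_QUEUE}")

    return problems
```

An empty list means the snapshot looks healthy. The queue rule in the list says "persistently", so in practice this check would be applied over several consecutive snapshots before alerting.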
@baabeetaa
Contributor Author

If packets are not relayed, channels will time out and close after six hours.
qsd_ibc_commitments and qsd_ibc_acks are the appropriate metrics to watch; they should always return to zero (they sometimes peak at around 100 before dropping back down to zero).
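
One way to tell a normal spike from a stuck backlog is to look at a short window of samples instead of a single value. The sketch below is an assumption about how that could be done: `backlog_is_stalled` is a hypothetical helper, and the window length and sampling interval are not specified anywhere in this issue.

```python
from typing import Sequence


def backlog_is_stalled(samples: Sequence[float], min_samples: int = 3) -> bool:
    """Return True if a pending-packet count looks stuck.

    `samples` holds values of qsd_ibc_commitments or qsd_ibc_acks,
    oldest first. The backlog is treated as stalled when it never
    reached zero in the window and the latest value is not lower
    than the first one.
    """
    if len(samples) < min_samples:
        # Not enough history to judge a trend.
        return False
    if min(samples) == 0:
        # The backlog drained (or started empty) within the window.
        return False
    return samples[-1] >= samples[0]
```

A window such as `[100, 60, 20]` is a peak that is draining and is not flagged, while `[40, 40, 45]` is flagged. Since channels time out after six hours, the window should be much shorter than that.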

qsd_icq_oldest_emission_distance is probably the surest indicator that something is wrong. Many ICQ responses will fail if the delegate channel is closed.
If it starts growing past 500, there is some issue (either funds, RPC down, or channel closure).
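
As a rough sketch, the two figures mentioned in this issue (200 as the normal upper bound, 500 as a clear sign of trouble) could map to alert levels like this. The function name, the level names, and the choice to use 200 as a warning threshold are assumptions made for illustration.

```python
# Possible causes named in this issue, shown alongside a critical alert.
POSSIBLE_CAUSES = ("funds", "rpc down", "channel closure")


def emission_distance_severity(
    distance: float, warn: float = 200, critical: float = 500
) -> str:
    """Classify qsd_icq_oldest_emission_distance as ok, warning or critical."""
    if distance >= critical:
        return "critical"
    if distance > warn:
        return "warning"
    return "ok"


def describe(distance: float) -> str:
    """Build a short alert message for the given distance."""
    level = emission_distance_severity(distance)
    if level == "critical":
        return f"critical: distance {distance:g}; check " + ", ".join(POSSIBLE_CAUSES)
    return f"{level}: distance {distance:g}"
```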
