(Almost) all tx errors, except for "packet messages are redundant", are considered as "Tx Failure" in Prometheus metrics #1500

Closed
freak12techno opened this issue Sep 15, 2024 · 5 comments

Comments

@freak12techno

freak12techno commented Sep 15, 2024

So I had this error:

2024-09-15T21:55:29.546751Z	error	Error sending messages	{"path_name": "cosmoshub-neutron", "src_chain_id": "cosmoshub-4", "dst_chain_id": "neutron-1", "src_client_id": "07-tendermint-1119", "dst_client_id": "07-tendermint-0", "error": "rpc error: code = Unknown desc = rpc error: code = Unknown desc = account sequence mismatch, expected 22, got 23: incorrect account sequence [cosmos/cosmos-sdk@v0.50.9/x/auth/ante/sigverify.go:290] with gas used: '442806': unknown request"}

It should have been reported in metrics under its own cause, since it is listed here:

legacyerrors.ErrWrongSequence,

but somehow it isn't:

Image

Also, since the server started, almost all tx_errors are either "packet messages are redundant" or "Tx Failure", with a single exception:

# TYPE cosmos_relayer_tx_errors_total counter
cosmos_relayer_tx_errors_total{cause="Tx Failure",chain="cosmoshub-4",path_name="cosmoshub-neutron"} 14
cosmos_relayer_tx_errors_total{cause="Tx Failure",chain="cosmoshub-4",path_name="cosmoshub-osmosis"} 7
cosmos_relayer_tx_errors_total{cause="Tx Failure",chain="jackal-1",path_name="jackal-osmosis"} 1
cosmos_relayer_tx_errors_total{cause="Tx Failure",chain="neutron-1",path_name="cosmoshub-neutron"} 7
cosmos_relayer_tx_errors_total{cause="Tx Failure",chain="neutron-1",path_name="neutron-osmosis"} 4
cosmos_relayer_tx_errors_total{cause="Tx Failure",chain="osmosis-1",path_name="cosmoshub-osmosis"} 40
cosmos_relayer_tx_errors_total{cause="Tx Failure",chain="osmosis-1",path_name="neutron-osmosis"} 8
cosmos_relayer_tx_errors_total{cause="Tx Failure",chain="osmosis-1",path_name="osmosis-sentinel"} 9
cosmos_relayer_tx_errors_total{cause="incorrect account sequence",chain="cosmoshub-4",path_name="cosmoshub-osmosis"} 1
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="bitsong-2b",path_name="bitsong-osmosis"} 17
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="cosmoshub-4",path_name="cosmoshub-neutron"} 322
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="cosmoshub-4",path_name="cosmoshub-osmosis"} 1232
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="jackal-1",path_name="jackal-osmosis"} 59
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="neutron-1",path_name="cosmoshub-neutron"} 310
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="neutron-1",path_name="neutron-osmosis"} 331
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="osmosis-1",path_name="bitsong-osmosis"} 17
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="osmosis-1",path_name="cosmoshub-osmosis"} 1321
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="osmosis-1",path_name="jackal-osmosis"} 59
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="osmosis-1",path_name="neutron-osmosis"} 297
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="osmosis-1",path_name="osmosis-sentinel"} 16
cosmos_relayer_tx_errors_total{cause="packet messages are redundant",chain="sentinelhub-2",path_name="osmosis-sentinel"} 7

Seems like something is wrong here.

JFYI, I am on the latest commit.

@Reecepbcups
Member

Reecepbcups commented Sep 16, 2024

@freak12techno The account sequence error is fixed automatically at runtime: the relayer parses the expected sequence value from the error message and submits another tx with it. So no event should be emitted for this anymore, as there is nothing the event consumer would need to do.

code:

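func (cc *CosmosProvider) handleAccountSequenceMismatchError(sequenceGuard *WalletState, err error) {
	if sequenceGuard == nil {
		panic("sequence guard not configured")
	}
	// Look for a sequence value in the error text; give up if the message does not match.
	matches := accountSeqRegex.FindStringSubmatch(err.Error())
	if len(matches) == 0 {
		return
	}
	// The first capture group holds the sequence to use next.
	nextSeq, err := strconv.ParseUint(matches[1], 10, 64)
	if err != nil {
		return
	}
	// Store it so the next tx is built with the corrected sequence.
	sequenceGuard.NextAccountSequence = nextSeq
}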

@freak12techno
Author

@Reecepbcups Yeah, but I checked the logs for the period shown in my screenshot, and the error I provided is the only one in them. That's why I think it's misclassified.

@freak12techno
Author

Actually, I found one more of these errors, and it seems to match the Tx Failure cause:

2024-09-16T20:31:32.428250Z	error	Error sending messages	{"path_name": "osmosis-sentinel", "src_chain_id": "sentinelhub-2", "dst_chain_id": "osmosis-1", "src_client_id": "07-tendermint-0", "dst_client_id": "07-tendermint-2", "error": "rpc error: code = Unknown desc = rpc error: code = Unknown desc = TrustedHeight {2 18035353} must be less than header height {2 18035353}: invalid header height [cosmos/ibc-go/v7@v7.4.1/modules/light-clients/07-tendermint/header.go:67] With gas wanted: '0' and gas used: '2306' : unknown request"}

Image

Do you think it might be the one that actually is represented by this metric value?

@jtieri
Member

jtieri commented Sep 17, 2024

That seems correct. As @Reecepbcups pointed out, the relayer should catch account sequence errors and attempt to broadcast a new tx with the correct sequence number, so I don't think those should end up being reported in metrics.

@freak12techno
Author

Gotcha. I don't have any more questions here, so I'll close this. Thank you both for elaborating, that was really helpful!
