feat: add simulate errors metrics #3846
use crate::error::ErrorDetail::*;

match e.detail() {
    GrpcStatus(detail) => detail.status.code().to_string(),
This one is to prevent error_descriptions like
gRPC call send_tx_simulate failed with status: status: Unknown, message: "spendable balance 2950583uosmo is smaller than 3000001uosmo: insufficient funds: insufficient funds [osmosis-labs/osmosis/v22/x/txfees/keeper/feedecorator.go:294] With gas wanted: '300000000' and gas used: '529599' ", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "x-cosmos-block-height": "13728350"} }
from ending up in the label, and to kinda group them by error message. Not sure if that can be done in a better way; please let me know if so.
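To make the intent of the match arm concrete, here is a minimal, self-contained sketch of the idea: collapse a gRPC failure into its status code so the label stays short, instead of carrying the full message with balances and block heights. The `Code` and `ErrorDetail` types below are hypothetical stand-ins written for this illustration, not the actual Hermes or tonic types, and `error_label` is an assumed helper name.

```rust
use std::fmt;

// Hypothetical stand-in for a gRPC status code (not tonic's real `Code` type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Code {
    Unknown,
    Unavailable,
}

impl fmt::Display for Code {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Code::Unknown => "Unknown",
            Code::Unavailable => "Unavailable",
        };
        f.write_str(name)
    }
}

// Hypothetical stand-in for the error detail matched on in the diff.
// `message` is never read here: dropping it is the point of the sketch.
#[allow(dead_code)]
enum ErrorDetail {
    GrpcStatus { code: Code, message: String },
    Other(String),
}

// Derive a short label: gRPC failures are reduced to their status code,
// while any other error keeps its own description.
fn error_label(detail: &ErrorDetail) -> String {
    match detail {
        ErrorDetail::GrpcStatus { code, .. } => code.to_string(),
        ErrorDetail::Other(description) => description.clone(),
    }
}
```

With this shape, two simulation failures that differ only in balances or gas values map to the same label, which keeps the number of distinct label values small.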
Also, I am not sure if there's a way to get the code of the underlying gRPC error, as they are all Unknown. If there is, I guess it would be better to use that one (and its text description), but I'm not sure if it is possible.
@romac can you take a look? Not sure who to ask for a review here, as this is my first PR towards this repo.
Wow, that was quick, thank you! Will take a look next week, but looks good already at first glance :)
I think as long as gRPC is used for the simulation, Hermes will get tonic’s Status error, which won’t contain the SDK/ibc-go error code, so it loses some error information. I don’t see how to improve the labels for the simulation errors. This is already a great indicator to spot if something is not working properly, thank you!! I’m not sure if it would be possible to use the RPC to simulate the Tx, thus getting a Tendermint's
As far as I know, tx simulation is a feature of the SDK, not Comet/Tendermint, and is only available as a gRPC call. We could issue the very same gRPC call over RPC via ABCI, but that wouldn't change the response format we get.
@ljoss17 anything to fix here, or is it good as it is?
Great work, thank you!!
One small nit in the changelog, and could you move the changelog entry to features/ibc-telemetry?
.changelog/unreleased/features/3845-add-simulate-errors-metric.md
Co-authored-by: Luca Joss <43531661+ljoss17@users.noreply.github.com> Signed-off-by: Sergey <83376337+freak12techno@users.noreply.github.com>
@ljoss17 done
Great work!!
@freak12techno Thanks so much for the PR! 🙏
Closes: #3845
Description
Adds a simulate_metric_total metric indicating when a transaction simulation fails, with the following labels: recoverable (can the execution continue if this happened?), account, and error description.
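As a rough illustration of how a counter with these three labels behaves, here is a minimal sketch that uses a plain `HashMap` in place of a real metrics backend. The type and method names (`SimulateErrorLabels`, `SimulateErrorCounter`, `record`, `get`) and the account value used in any usage are assumptions made up for this example; the PR itself wires the metric into Hermes' telemetry, which is not reproduced here.

```rust
use std::collections::HashMap;

// Hypothetical label set mirroring the three labels from the description.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct SimulateErrorLabels {
    recoverable: bool,
    account: String,
    error_description: String,
}

// Hypothetical in-memory counter: one running total per distinct label set.
#[derive(Default)]
struct SimulateErrorCounter {
    counts: HashMap<SimulateErrorLabels, u64>,
}

impl SimulateErrorCounter {
    // Increment the total for this combination of labels.
    fn record(&mut self, recoverable: bool, account: &str, error_description: &str) {
        let labels = SimulateErrorLabels {
            recoverable,
            account: account.to_string(),
            error_description: error_description.to_string(),
        };
        *self.counts.entry(labels).or_insert(0) += 1;
    }

    // Read the current total for a label combination (0 if never recorded).
    fn get(&self, recoverable: bool, account: &str, error_description: &str) -> u64 {
        let labels = SimulateErrorLabels {
            recoverable,
            account: account.to_string(),
            error_description: error_description.to_string(),
        };
        self.counts.get(&labels).copied().unwrap_or(0)
    }
}
```

Each distinct combination of label values becomes its own time series, which is why a short, stable error description matters more here than a detailed one.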
I have a few of these simulation failures in my logs:
and with these, this is roughly what the metric looks like locally: