Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ordered channels more resilient in the face of failing packets #3610

Merged
merged 48 commits into from
Feb 28, 2024

Conversation

seanchen1991
Copy link
Contributor

@seanchen1991 seanchen1991 commented Sep 13, 2023

Closes: #3540

Description


PR author checklist:

  • Added changelog entry, using unclog.
  • Added tests: integration (for Hermes) or unit/mock tests (for modules).
  • Linked to GitHub issue.
  • Updated code comments and documentation (e.g., docs/).
  • Tagged one reviewer who will be the one responsible for shepherding this PR.

Reviewer checklist:

  • Reviewed Files changed in the GitHub PR explorer.
  • Manually tested (in case integration/unit/mock tests are absent).

@seanchen1991 seanchen1991 marked this pull request as ready for review November 10, 2023 15:01
@romac
Copy link
Member

romac commented Nov 13, 2023

The new test is failing with

Error: generic error

Caused by:
    error raised while creating client for chain ibcconsumer: failed when building client state

    Caused by:
        gRPC call `query_staking_params` failed with status: status: Unimplemented, message: "unknown service cosmos.staking.v1beta1.Query", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }

The ibcconsumer chain probably needs ccv_consumer to be set to true in its config.

@seanchen1991
Copy link
Contributor Author

The ibcconsumer chain probably needs ccv_consumer to be set to true in its config.

Does this need to be set in the test itself, or in an actual config file?

@romac
Copy link
Member

romac commented Nov 13, 2023

I actually wonder why this test is run at all using a consumer chain?

@ljoss17 Is that expected? Do we run other ICA tests using CCV chains?

@ljoss17
Copy link
Contributor

ljoss17 commented Nov 13, 2023

Does this need to be set in the test itself, or in an actual config file?

There are helper methods to bootstrap configurations for CCV chain tests: https://github.com/informalsystems/hermes/blob/master/tools/test-framework/src/util/interchain_security.rs#L18C13-L18C13
This is an example on how to use them:

@ljoss17 Is that expected? Do we run other ICA tests using CCV chains?

The issue happened when relaying for Stride, so I think it would be nicer to be able to run the test with CCV chains as we already have a job using Stride https://github.com/informalsystems/hermes/blob/master/.github/workflows/integration.yaml#L465. But I don't think it is necessary

Copy link
Member

@romac romac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @seanchen1991! 🙏

@romac romac marked this pull request as draft January 12, 2024 10:42
@romac
Copy link
Member

romac commented Jan 12, 2024

Thanks @ljoss17 for reminding me of some outstanding issues with that PR:

After looking more into the ordered channel resilience PR, I found a few issues.

When testing locally with the modified version, the test doesn’t pass because Hermes doesn’t consider the packet relaying as an Error in this match

match self.send_from_operational_data::<S>(&odata) {

This can be seen by the submitted log below.
The cause is the following:

  1. https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/link/relay_path.rs#L783 call send_from_operation_data
  2. https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/link/relay_sender.rs#L97 call submit
  3. https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/chain/cosmos.rs#L1048 call send_messages_and_wait_commit
  4. https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/chain/cosmos.rs#L719 call do_send_messages_and_wait_commit
  5. https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/chain/cosmos/batch.rs#L93 call send_batched_messages_and_wait_check_tx
  6. https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/chain/cosmos/retry.rs#L48 call send_tx_with_account_sequence_retry
  7. https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/chain/cosmos/retry.rs#L77 call do_send_tx_with_account_sequence_retry
  8. https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/chain/cosmos/retry.rs#L150 return the response as an Ok() so the error isn’t caught by the relay_path match
    match self.send_from_operational_data::<S>(&odata) {
2023-11-15T12:44:49.184516Z INFO spawn:chain{chain=ibcconsumer}:client{client=07-tendermint-2}:connection{connection=connection-1}:channel{channel=channel-2}:worker.packet.cmd{src_chain=ibcconsumer src_port=icacontroller-stride1798cf2q3gmw5gkyv0u6wtqfy38fap2darsl0p3 src_channel=channel-2 dst_chain=ibcprovider}:relay{odata=ae8fba35 ->Destination @0-132; len=1}: ibc_relayer::link::relay_path: submitted
2023-11-15T12:44:49.184485Z INFO spawn:chain{chain=ibcconsumer}:client{client=07-tendermint-2}:connection{connection=connection-1}:channel{channel=channel-2}:worker.packet.cmd{src_chain=ibcconsumer src_port=icacontroller-stride1798cf2q3gmw5gkyv0u6wtqfy38fap2darsl0p3 src_channel=channel-2 dst_chain=ibcprovider}:relay{odata=ae8fba35 ->Destination @0-132; len=1}: ibc_relayer::link::relay_sender: response(s): 1; Error with code 21:B139F35FBDBA7EA960F723B9A90510C3FC6D7A6C2F5BDC38E5B165FB570E1AFE target_chain=ibcprovider

From my understanding and a little bit of testing manually, one solution would be to do some code matching around here for a code 21 (out of order packet) here

and return a new error, e.g. OutofOrderPacket.
This could then be caught here https://github.com/informalsystems/hermes/blob/master/crates/relayer/src/link/relay_sender.rs#L98 and return a new error, e.g. OutOfOrderSend. And this new error could be caught here triggering the packet clearing.

This is the diff I used to play around for this suggestion which is now able to trigger the special clearing, but it gets this error:

2023-11-15T13:46:10.063080Z ERROR spawn:chain{chain=ibcconsumer}:client{client=07-tendermint-2}:connection{connection=connection-1}:channel{channel=channel-2}:worker.packet.cmd{src_chain=ibcconsumer src_port=icacontroller-stride1hleh224w70xp8kmnjaywdfshhy2ek42r283592 src_channel=channel-2 dst_chain=ibcprovider}:relay{odata=5c460800 ->Destination @0-131; len=1}: ibc_relayer::link::relay_path: 5/5 retries exhausted. Giving up
Show diff
diff --git a/crates/relayer/src/chain/cosmos/retry.rs b/crates/relayer/src/chain/cosmos/retry.rs
index 4d0091fc9..5e995b13e 100644
--- a/crates/relayer/src/chain/cosmos/retry.rs
+++ b/crates/relayer/src/chain/cosmos/retry.rs
@@ -137,6 +137,13 @@ async fn do_send_tx_with_account_sequence_retry(
                     Ok(response)
                 }
 
+                Code::Err(code) if code.get() == 21u32 => {
+                    warn!("packet out of order errror");
+
+                    Err(Error::non_provable_data())
+
+                },
+
                 // Gas estimation succeeded, but broadcast_tx_sync failed with unrecoverable error.
                 Code::Err(code) => {
                     // Do not increase the account s.n. since CheckTx step of broadcast_tx_sync has failed.
diff --git a/crates/relayer/src/link/error.rs b/crates/relayer/src/link/error.rs
index adf8142ec..3ac278a5e 100644
--- a/crates/relayer/src/link/error.rs
+++ b/crates/relayer/src/link/error.rs
@@ -76,6 +76,9 @@ define_error! {
                 format!("chain error when sending messages: {0}", e.event)
             },
 
+        OutOfOrderSend
+            |_| { "tried to send a packet out of order" },
+
         MissingChannelId
             { chain_id: ChainId }
             |e| {
diff --git a/crates/relayer/src/link/relay_path.rs b/crates/relayer/src/link/relay_path.rs
index a821eb8cb..8cf904111 100644
--- a/crates/relayer/src/link/relay_path.rs
+++ b/crates/relayer/src/link/relay_path.rs
@@ -671,17 +671,18 @@ impl<ChainA: ChainHandle, ChainB: ChainHandle> RelayPath<ChainA, ChainB> {
 
                     return Ok(reply);
                 }
-                Err(LinkError(error::LinkErrorDetail::Send(e), _)) => {
-                    warn!(
+                Err(LinkError(error::LinkErrorDetail::Send(_), _)) | Err(LinkError(error::LinkErrorDetail::OutOfOrderSend(_), _)) => {
+                    /*warn!(
                         "Determining whether we need to retry in response to: {}",
                         e.event
-                    );
+                    );*/
 
                     // if on an ordered channel, perform a packet clearing, but only if we
                     // are not in the middle of another packet clearing process
                     if self.ordered_channel() && i == 0 && !odata.tracking_id.is_clearing() {
                         // Do we need to specify the height for the packet clearing?
                         // Probably not, since no progress will have been made on the clearing process
+                        tracing::warn!("special clear");
                         self.relay_pending_packets(None)?;
                     }
 
@@ -698,6 +699,7 @@ impl<ChainA: ChainHandle, ChainB: ChainHandle> RelayPath<ChainA, ChainB> {
                     }
                 }
                 Err(e) => {
+                    tracing::warn!("other link error: {e}");
                     // Unrecoverable error, propagate up the stack
                     return Err(e);
                 }
diff --git a/crates/relayer/src/link/relay_sender.rs b/crates/relayer/src/link/relay_sender.rs
index c4d29fe91..48fbbe5d0 100644
--- a/crates/relayer/src/link/relay_sender.rs
+++ b/crates/relayer/src/link/relay_sender.rs
@@ -7,6 +7,7 @@ use ibc_relayer_types::events::IbcEvent;
 
 use crate::chain::handle::ChainHandle;
 use crate::chain::tracking::TrackedMsgs;
+use crate::error::Error;
 use crate::link::error::LinkError;
 use crate::link::RelaySummary;
 use crate::util::pretty::{PrettyCode, PrettyEvents};
@@ -95,9 +96,14 @@ impl Submit for AsyncSender {
     type Reply = AsyncReply;
 
     fn submit(target: &impl ChainHandle, msgs: TrackedMsgs) -> Result<Self::Reply, LinkError> {
-        let a = target
-            .send_messages_and_wait_check_tx(msgs)
-            .map_err(LinkError::relayer)?;
+        let a = match target
+            .send_messages_and_wait_check_tx(msgs) {
+                Ok(r) => Ok(r),
+                Err(e) if e.detail().to_string() == Error::non_provable_data().detail().to_string() => {
+                    Err(LinkError::out_of_order_send())
+                },
+                Err(e) => Err(LinkError::relayer(e))
+            }?;
         let reply = AsyncReply { responses: a };
 
         // Note: There may be errors in the reply, for example:
diff --git a/crates/relayer/src/worker.rs b/crates/relayer/src/worker.rs
index 915c49ce6..0c4050a92 100644
--- a/crates/relayer/src/worker.rs
+++ b/crates/relayer/src/worker.rs
@@ -1,6 +1,6 @@
 use alloc::sync::Arc;
 use core::fmt::{Display, Error as FmtError, Formatter};
-use ibc_relayer_types::core::ics04_channel::channel::Ordering;
+//use ibc_relayer_types::core::ics04_channel::channel::Ordering;
 use serde::{Deserialize, Serialize};
 use std::sync::Mutex;
 use tracing::error;
@@ -124,9 +124,9 @@ pub fn spawn_worker_tasks<ChainA: ChainHandle, ChainB: ChainHandle>(
 
             match link_res {
                 Ok(link) => {
-                    let channel_ordering = link.a_to_b.channel().ordering;
+                    let _channel_ordering = link.a_to_b.channel().ordering;
                     let should_clear_on_start =
-                        packets_config.clear_on_start || channel_ordering == Ordering::Ordered;
+                        packets_config.clear_on_start;// || channel_ordering == Ordering::Ordered;
 
                     let (cmd_tx, cmd_rx) = crossbeam_channel::unbounded();
                     let link = Arc::new(Mutex::new(link));
diff --git a/tools/integration-test/src/tests/interchain_security/ica_ordered_channel.rs b/tools/integration-test/src/tests/interchain_security/ica_ordered_channel.rs
index 42dd0877d..05b61175c 100644
--- a/tools/integration-test/src/tests/interchain_security/ica_ordered_channel.rs
+++ b/tools/integration-test/src/tests/interchain_security/ica_ordered_channel.rs
@@ -66,6 +66,8 @@ impl TestOverrides for IcaOrderedChannelTest {
         config.mode.packets.clear_on_start = false;
         config.mode.packets.clear_interval = 0;
 
+        config.mode.clients.misbehaviour = false;
+
         update_relayer_config_for_consumer_chain(config);
     }
 

Comment on lines 680 to 689
// If on an ordered channel, perform a packet clearing, but only if we
// are not in the middle of another packet clearing process
if self.ordered_channel() && i == 0 && !odata.tracking_id.is_clearing() {
warn!("Failed to relay to ordered channel, attempting to recover by clearing packets");

// We do need to specify the height for the packet clearing,
// since no progress will have been made on the clearing process
self.relay_pending_packets(None)?;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo, we should not check on error but rather when we attempt to clear a packet on an IBC event.
Here are some tentative diffs:

index 2eb85f633..4d4471143 100644
--- a/crates/relayer/src/worker/packet.rs
+++ b/crates/relayer/src/worker/packet.rs
@@ -17,12 +17,14 @@ use ibc_proto::ibc::apps::fee::v1::{IdentifiedPacketFees, QueryIncentivizedPacke
 use ibc_proto::ibc::core::channel::v1::PacketId;
 use ibc_relayer_types::applications::ics29_fee::events::IncentivizedPacket;
 use ibc_relayer_types::applications::transfer::{Amount, Coin, RawCoin};
+use ibc_relayer_types::core::ics04_channel::channel::Ordering;
 use ibc_relayer_types::core::ics04_channel::events::WriteAcknowledgement;
 use ibc_relayer_types::core::ics04_channel::packet::Sequence;
 use ibc_relayer_types::events::{IbcEvent, IbcEventType};
 use ibc_relayer_types::Height;

 use crate::chain::handle::ChainHandle;
+use crate::chain::requests::{IncludeProof, QueryHeight, QueryNextSequenceReceiveRequest};
 use crate::config::filter::FeePolicy;
 use crate::event::source::EventBatch;
 use crate::foreign_client::HasExpiredOrFrozenError;
@@ -186,7 +188,7 @@ pub fn spawn_incentivized_packet_cmd_worker<ChainA: ChainHandle, ChainB: ChainHa
 /// Receives worker commands and handles them accordingly.
 ///
 /// Given an `IbcEvent` command, updates the schedule and initiates
-/// packet clearing if the `should_clear_on_start` flag has been toggled.
+/// packet clearing if the `should_clear_on_start` flag has been set or dependent packets must be cleared on ordred channels.
 ///
 /// Given a `NewBlock` command, checks if packet clearing should occur
 /// and performs it if so.
@@ -205,7 +207,19 @@ fn handle_packet_cmd<ChainA: ChainHandle, ChainB: ChainHandle>(
     // Handle packet clearing which is triggered from a command
     let (do_clear, maybe_height) = match &cmd {
         WorkerCmd::IbcEvents { batch } => {
-            if *should_clear_on_start {
+            let channel_ordering = link.a_to_b.channel().ordering;
+            let (_next_seq, _) = link.a_to_b.dst_chain().query_next_sequence_receive(
+                    QueryNextSequenceReceiveRequest {
+                        port_id: link.a_to_b.dst_port_id().clone(),
+                        channel_id: link.a_to_b.dst_channel_id().clone(),
+                        height: QueryHeight::Specific(batch.height),
+                    },
+                    IncludeProof::No,
+                )
+                .map_err(|e| LinkError::query(link.a_to_b.dst_chain().id(), e))?;
+            // TODO - if _next_seq is smaller than the sequence of the first packet in batch, then we should clear
+            let ordered_clear_required = channel_ordering == Ordering::Ordered;
+            if *should_clear_on_start || ordered_clear_required {
                 (true, Some(batch.height))
             } else {
                 (false, None)

@calvinaco
Copy link

calvinaco commented Feb 15, 2024

Hi @seanchen1991 @romac @ancazamfir

Thanks for the great work @seanchen1991 !

I noticed this PR seems to be inactive for a while. We are now working on adding ICA support in next Cronos upgrade and making ORDERED channel more resilient in hermes is important to us.

May I ask what is the latest status of this PR? Thank you!

Signed-off-by: Sean Chen <seanchen11235@gmail.com>
@romac
Copy link
Member

romac commented Feb 20, 2024

@calvinaco Thanks for your interest in this PR! The PR is not quite ready yet and we haven't had time to get back to work more on it yet. We will report back when that's the case, hopefully by the end of next week.

@romac romac added this to the v1.9 milestone Feb 20, 2024
Copy link
Contributor

@ljoss17 ljoss17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, left two comments. Thanks for finalising this @romac !!

}

impl BinaryChannelTest for IcaOrderedChannelTest {
fn run<ChainA: ChainHandle, ChainB: ChainHandle>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove most of the sleeps in the test

Comment on lines 211 to 215
// Check that the ICA account's balance has been debited the sent amount.
chains.node_a.chain_driver().assert_eventual_wallet_amount(
&ica_address.as_ref(),
&stake_denom.with_amount(ica_fund - 3 * amount).as_ref(),
)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should move this in the supervisor block to insure that the clearing is done, and maybe add an assertion that the amount is correctly received

@ljoss17 ljoss17 marked this pull request as ready for review February 28, 2024 13:37
@romac romac dismissed ancazamfir’s stale review February 28, 2024 16:32

Suggested solution was adopted

@romac romac merged commit c453676 into master Feb 28, 2024
33 checks passed
@romac romac deleted the sean/ordered-channels-resilience branch February 28, 2024 16:32
zbuc added a commit to penumbra-zone/hermes that referenced this pull request Apr 9, 2024
* Update release-template.md to include a workflow with the comms team (informalsystems#3827)

* Update link to IBC website (informalsystems#3834)

* fix: use the consensus state at client latest height in status CLI (informalsystems#3829)

* Use the consensus state at client latest height in status CLI

* Add changelog

* Index fetched data by the given chain name to account for mismatch between name in chain registry and chain identifier (informalsystems#3808)

* Index fetched data by the given chain name to account for mismatch between name in chain registry and chain identifier

* Show output when fetching chain data

* fix: add syncing check for gRPC node (informalsystems#3833)

* Add syncing check for gRPC node.

* Fix comment.

* Add changelog

* Use cosmos.nix S3 cache on CI (informalsystems#3842)

* Bump ics23 from 0.11.0 to 0.11.1 (informalsystems#3839)

Bumps [ics23](https://github.com/cosmos/ics23) from 0.11.0 to 0.11.1.
- [Release notes](https://github.com/cosmos/ics23/releases)
- [Changelog](https://github.com/cosmos/ics23/blob/master/CHANGELOG.md)
- [Commits](cosmos/ics23@rust/v0.11.0...rust/v0.11.1)

---
updated-dependencies:
- dependency-name: ics23
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build multi-platform image on macOS runner to speed up build (informalsystems#3843)

* Remove assumption that ICS-04 packet data is a valid UTF-8 string (informalsystems#3768)

* Do not assume JSON-encoded packet data by using the `packet_data_hex` attribute instead of deprecated `packet_data`

Relying on the `packet_data` attribute enforces a UTF-8 encoded payload (eg. JSON),
disallowing Protobuf-encoded payloads which we are starting to see in the wild.

The `packet_data` atttribute has been deprecated in favor of `packet_data_hex` since IBC-Go v1.0.0. [0]

[0]: https://github.com/cosmos/ibc-go/blob/fadf8f2b0ab184798d021d220d877e00c7634e26/CHANGELOG.md?plain=1#L1417

* Ensure packet data is encoded to/decoded from lowercase hex

* Refactor conversion from `RawObject` to `Packet`

* Revert change in JSON serialization of packet data case as hex

* Decode packets from `packet_data_hex` in NewBlock events as well

* Bump ibc-proto to v0.41.0

* Use branch of ibc-proto with support for invalid UTF-8 event attributes

* Update ibc-proto to v0.42.0 to finalize fix for non-UTF-8 packet data (informalsystems#3844)
  * Add legacy message to register ICA account for ibc-go versions prior to v8.1.0

---------

Co-authored-by: Luca Joss <luca@informal.systems>

* Fix clippy warnings

* Use latest nightly to run cargo-doc

* Include banner in README.md (informalsystems#3854)

The banner is similar to the rest of the IBC ecosystem repositories, eg [ibc-go](https://github.com/cosmos/ibc-go/blob/main/README.md)

Signed-off-by: Adi Seredinschi <adi@informal.systems>

* Bump jaxxstorm/action-install-gh-release from 1.10.0 to 1.11.0 (informalsystems#3848)

Bumps [jaxxstorm/action-install-gh-release](https://github.com/jaxxstorm/action-install-gh-release) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/jaxxstorm/action-install-gh-release/releases)
- [Commits](jaxxstorm/action-install-gh-release@v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: jaxxstorm/action-install-gh-release
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update `curve25519-dalek` to its latest version to fix `cargo doc` job on nightly (informalsystems#3855)

* Bump eyre from 0.6.11 to 0.6.12 (informalsystems#3851)

* Bump moka from 0.12.4 to 0.12.5 (informalsystems#3849)

* feat: add simulate errors metrics (informalsystems#3846)

* feat: add simulate errors metrics

* feat: add error message

* chore: add docs

* chore: add unclog entry

* chore: cargo fmt

* Update .changelog/unreleased/features/3845-add-simulate-errors-metric.md

Co-authored-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>
Signed-off-by: Sergey <83376337+freak12techno@users.noreply.github.com>

* chore: renamed unreleased file

* Update changelog entry

---------

Signed-off-by: Sergey <83376337+freak12techno@users.noreply.github.com>
Co-authored-by: Romain Ruetschi <romain@informal.systems>
Co-authored-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>

* Bump tendermint-proto from 0.34.0 to 0.34.1 (informalsystems#3861)

Bumps [tendermint-proto](https://github.com/informalsystems/tendermint-rs) from 0.34.0 to 0.34.1.
- [Release notes](https://github.com/informalsystems/tendermint-rs/releases)
- [Changelog](https://github.com/informalsystems/tendermint-rs/blob/v0.34.1/CHANGELOG.md)
- [Commits](informalsystems/tendermint-rs@v0.34.0...v0.34.1)

---
updated-dependencies:
- dependency-name: tendermint-proto
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Improve reliability of compatibility check (informalsystems#3835)

* Make ordered channels more resilient in the face of failing packets (informalsystems#3610)

* Start scaffolding ica_ordered_channel test

* Disable packet clearing

* Add ica_ordered_channel test

* Move some imports around

* Clean up imports

* Add sleep calls in between supervisor runs

* Formatting

* Fix compilation issues

* Emphasize wording in documentation

* Fill in code from discussion

* Rename TrakingId::ClearId to TrackingId::PacketClearing

* Compile ica ordered channel test under the ica feature flag

* Cargo fmt

* Move interchain_send_tx fn to test-framework crate

* Cargo fmt

* Update relayer config for consumer chain

* Move ica_ordered_channel test under the ica feature

* Move ica_transfer test under ica feature

* Check that ICA channel is eventually established using the supervisor

* Fix clippy warnings

* Improve logs

* Add changelog entry

* Fix compilation of ICA tests

* Add `force_disable_clear_on_start` config option, only available in test code

* Cleanup

* Check whether packet clear is needed instead of reacting to error when it fails

* Force disable clear on start in ICA ordered channel test

* Update changelog entry

* Improve ICA ordered channel test asserts

---------

Signed-off-by: Sean Chen <seanchen11235@gmail.com>
Co-authored-by: Romain Ruetschi <106849+romac@users.noreply.github.com>
Co-authored-by: Romain Ruetschi <romain@informal.systems>
Co-authored-by: Luca Joss <luca@informal.systems>

* Add `memo_overwrite` configuration (informalsystems#3863)

* Add configuration to overwrite relayer memo

* Add test for memo override

* Add 'memo_overwrite' config documentation

* Add changelog entry

* Recover from gas simulation failures on legacy chains (informalsystems#3793)

Closes: informalsystems#3792

Co-authored-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>

* Bump tempfile from 3.9.0 to 3.10.1 (informalsystems#3870)

* Bump secp256k1 from 0.28.1 to 0.28.2 (informalsystems#3869)

* Bump anyhow from 1.0.79 to 1.0.80 (informalsystems#3868)

* Bump thiserror from 1.0.56 to 1.0.57 (informalsystems#3866)

* Fix Rust toolchain nightly version to 2024-03-03 for cargo-doc CI job (informalsystems#3875)

* Add configuration to skip packet sequences when clearing (informalsystems#3862)

* Implement packet clearing filtering logic

* Add tests for packet clearing filter

* Add documentation

* Add changelog entry

* Add excluded sequences to `LinkParameters` struct

* Fix sequence filter by adding setting it to be per-channel

* Skip sequence filter test with Celestia due to the token filter module

* Small refactor

* Small cleanup

---------

Co-authored-by: Romain Ruetschi <romain@informal.systems>

* Improve out of gas error log (informalsystems#3874)

* Add additional information for out of gas error

* Add guide entry for troubleshooting gas errors

* Add changelog entry

* Apply suggestions from code review

Co-authored-by: Romain Ruetschi <romain@informal.systems>
Signed-off-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>

---------

Signed-off-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>
Co-authored-by: Romain Ruetschi <romain@informal.systems>

* Release Hermes v1.8.1 (informalsystems#3876)

* Build release changelog

* Bump version number

* Fix typos

* Update CHANGELOG.md

Co-authored-by: Romain Ruetschi <romain@informal.systems>
Signed-off-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>

* Update changelog

---------

Signed-off-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>
Co-authored-by: Romain Ruetschi <romain@informal.systems>

* Fix Docker image workflow

* Bump serde from 1.0.195 to 1.0.197 (informalsystems#3884)

Bumps [serde](https://github.com/serde-rs/serde) from 1.0.195 to 1.0.197.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](serde-rs/serde@v1.0.195...v1.0.197)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix parsing of IBC-Go version in health check and improve health check reporting (informalsystems#3888)

* Add Injective to chains running tests in CI (informalsystems#3886)

* Add Injective to chains running tests in CI

* Add changelog and Injective to multi-chains-test

* Fix ICS29 timeout fee test compatibility for ibc-go v8.1+

* Add compatibility to ICS29 tests for ibc-go v8.1+

* Fix typos

* Fix `clear packets` CLI bug where `counterparty_channel_id` cannot be found (informalsystems#3890)

* Use correct counterparty channel and port id when creating reverse link in packet clearing CLI

* Add changelog

* Change connection and handshake retry strategy to retry max 10 times over two blocks (5 times per block) (informalsystems#3864)

* Change connection and handshake retry strategy to retry max 10 times over two blocks (5 times per block)

* Add changelog entry

* Update 3864-handshake-retry.md

Co-authored-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>
Signed-off-by: Romain Ruetschi <github@romac.me>

* Release v1.8.2 (informalsystems#3891)

* Bump version to 1.8.2

* Fix warnings on latest nightly

* Create v1.8.2 changelog

* Rephrase changelog summary

* Reword changelog

* Update CHANGELOG.md

Co-authored-by: Luca Joss <luca@informal.systems>
Signed-off-by: Romain Ruetschi <romain@informal.systems>

* fix: fixed minimum-gas-prices healthcheck messages and make it more verbose/clear (informalsystems#3898)

* fix: fixed minimum-gas-prices healthcheck messages and make it more verbose/clear

* Update changelog entry

* Small refactor

---------

Co-authored-by: Romain Ruetschi <romain@informal.systems>

* Proceed to next block after a few retries if Hermes can't parse current block during event sourcing (informalsystems#3906)

Co-authored-by: Romain Ruetschi <romain@informal.systems>

* Use workspace dependencies (informalsystems#3907)

* Set `compat_mode` for pull mode in `hermes listen` command (informalsystems#3911)

* set compat_mode for pull mode

* add CHANGELOG

* Use constant backoff in handshake retry strategy (informalsystems#3900)

* Add action to determine and check the MSRV (informalsystems#3909)

* Use actions-rust-lang action for setting up Rust

* Add action to determine and check the MSRV

* Update MSRV to 1.71.1

* Check MSRV on CI

* Fix warning

* Update cargo-doc nightly

* Use latest cargo-msrv

* Update guide/src/quick-start/pre-requisites.md

Signed-off-by: Romain Ruetschi <github@romac.me>

---------

Signed-off-by: Romain Ruetschi <github@romac.me>

* Revert "Build multi-platform image on macOS runner to speed up build (informalsystems#3843)" (informalsystems#3892)

This reverts commit 7cfb234.

* Bump crossbeam-channel from 0.5.11 to 0.5.12 (informalsystems#3918)

Bumps [crossbeam-channel](https://github.com/crossbeam-rs/crossbeam) from 0.5.11 to 0.5.12.
- [Release notes](https://github.com/crossbeam-rs/crossbeam/releases)
- [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md)
- [Commits](crossbeam-rs/crossbeam@crossbeam-channel-0.5.11...crossbeam-channel-0.5.12)

---
updated-dependencies:
- dependency-name: crossbeam-channel
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump async-trait from 0.1.77 to 0.1.79 (informalsystems#3919)

Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.77 to 0.1.79.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](dtolnay/async-trait@0.1.77...0.1.79)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update release-template.md to consistently notify comms team (informalsystems#3908)

Trying to find a way for the marketing/comms team to get early notification, consistently, that a new release will happen.

@romac or Luca (not tagging b/c of holidays) if you prefer other approaches then happy to go with something else.

Signed-off-by: Adi Seredinschi <adi@informal.systems>

* Use pull event source when generating configuration with `hermes config auto` (informalsystems#3920)

* Use pull event source when generating configuration with `hermes config auto`

* Add changelog entry

* WIP: begin v2 Penumbra support in Hermes

This starts constructing a v2 of Penumbra support in Hermes, using the
multi-chain backend support we added.

We should be careful to write this code so that it's _maintainable_ as a set of
changes on top of upstream Hermes.  Thus, whenever I've encountered things that
should change about the Hermes code generally, I've marked them with
`TODO(extract)`. That way, we can note changes we'd like to make / need to make
along the way, send them upstream as small, self-contained PRs, and then rebase
this work once they're merged.

The architecture we upstreamed should allow us to keep all the Penumbra-related
code in a submodule, isolated from the rest of the Hermes code, so that future
rebasing won't generate conflicts -- and ideally we could even merge support
upstream, though that will probably need to wait until we can publish all the
Penumbra crates.

* add config fields probably needed for tm-related configs

* start trying to run a view server

* add cargo config needed to build penumbra crates

* bootstrap boxed view service, implement health check and subscribe

* implement some client and connection queries

* implement the majority of query and event parsing

* first pass at implementing transaciton building

* fix conversion to IbcRelay

* fixup remaining build errors, add compat mode to penumbra config

* add tendermint light client to penumbra endpoint

* implement penumbra chain::build_header

* implement penumbra chain::verify_header and check_misbheavior

* implement async tx submitting, add view_service_storage_dir config

* add config-preview-celestia.toml config

* extract client settings to a chain-agnostic structure

* channel creation working: implement workaround for loadbalancer issue, application status

* implement balance query using view client

* async packet relay working now, refactored into build_penumbra_tx

* build unbonding period using penumbra app parameters

* rename celestia config

* fixup a few missing penumbra branches in tests

* add example osmosis config

We already have example configs for a few chains, and our internal
developer docs reference the file `config-penumbra-osmosis.toml`,
so committed a lightly-sanitized version of same for clarity.

* Merge Astria support into Penumbra branch (#16)

* update rustfmt

* implement `AstriaEndpoint` (#1)

* begin AstriaEndpoint impl; update rustfmt

* impl send_messages and verify_header

* check_misbehaviour and bootstrap

* bump deps and work on query methods

* implement rest of queries, lint

* getting stuff running! hermes working but issue on astria

* increase default rpc timeout to 60s

* add proof specs for astria

* handle nonce in astria endpoint

* use no_prehash for proof spec for now

* update localnet config for osmosis

* cleanup

* fix astria address creation and cleanup

* implement ICS20Withdrawal on astria for transfer command

* merge w upstream (#2)

* merge with upstream

* add celestia config

* remove unused config

* bump astria, penumbra, and ibc-proto deps

* Resolve penumbra dependencies across astria/penumbra

* Use ibc-types v0.12.0

* Fix compilation with a bunch of todos

* cargo fmt

* Fix some test imports

* Test fixes for Penumbra/Astria

* Fix compilation and move penumbra/astria dependencies to recent versions

---------

Co-authored-by: elizabeth <elizabethjbinks@gmail.com>
Co-authored-by: noot <36753753+noot@users.noreply.github.com>

* Missing Astria fixes from previous merge

* Run CI on main branch instead of master (#20)

* clippy

* Change tokio features to include full...

* try to fix CI with different dependency versions

* config -> config.toml

* Use tokio_unstable in github actions

* Update min rust-version to 1.77.1

* msrv -> 1.77.1

* Fix clippy failures

* skip Cargo.lock in codespell

* Use tokio_unstable in all the GitHub Actions

* Use old version of penumbra imports for astria

* Further PR cleanup

* remove redundant imports

* Fix cargo-doc CI

* Try to fix integration test disk space error

* Use build cache and index page and unstable options for rust doc build

* Further CI fixes for cargo doc build

* Fix CI template

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Adi Seredinschi <adi@informal.systems>
Signed-off-by: Sergey <83376337+freak12techno@users.noreply.github.com>
Signed-off-by: Sean Chen <seanchen11235@gmail.com>
Signed-off-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>
Signed-off-by: Romain Ruetschi <github@romac.me>
Signed-off-by: Romain Ruetschi <romain@informal.systems>
Co-authored-by: Adi Seredinschi <adi@informal.systems>
Co-authored-by: Romain Ruetschi <romain@informal.systems>
Co-authored-by: Anca Zamfir <ancazamfir@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Luca Joss <luca@informal.systems>
Co-authored-by: Romain Ruetschi <106849+romac@users.noreply.github.com>
Co-authored-by: Sergey <83376337+freak12techno@users.noreply.github.com>
Co-authored-by: Luca Joss <43531661+ljoss17@users.noreply.github.com>
Co-authored-by: Sean Chen <seanchen11235@gmail.com>
Co-authored-by: Martin Dyring-Andersen <martin@dyring-andersen.dk>
Co-authored-by: Jayden Lee <41176085+tkxkd0159@users.noreply.github.com>
Co-authored-by: Henry de Valence <hdevalence@penumbralabs.xyz>
Co-authored-by: Ava Howell <ava@avahowell.me>
Co-authored-by: Conor Schaefer <conor@penumbralabs.xyz>
Co-authored-by: elizabeth <elizabethjbinks@gmail.com>
Co-authored-by: noot <36753753+noot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve resilience to failed packets on ordered channels
5 participants