Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relay on all paths defined in the configuration #742

Merged
merged 22 commits into from
Mar 30, 2021
Merged

Conversation

romac
Copy link
Member

@romac romac commented Mar 12, 2021

Closes: #748

Description

If the start-multi command is ran without options, spawn a supervisor per connection defined in the config.

This PR is not ready yet because in a 3 chains setup, some events attached to an object that is not relevant to a worker are forwarded to it anyways.

To fix this, we must perform some queries to ensure that the object attached to an event is relevant to the two chains the supervisor is responsible for.

TODO

  • Open issue to track progress on this PR
  • Ensure that the Object assigned to an event is relevant for the pair of chains the supervisor is responsible for.
  • Either avoid spawning workers for irrelevant events or let the worker exit gracefully if it's not relevant.
  • Think about how to deal with worker exiting because of internal error. Will be addressed in future PR that introduces a proper architecture.

For contributor use:

  • Updated the Unreleased section of CHANGELOG.md with the issue.
  • If applicable: Unit tests written, added test to CI.
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Updated relevant documentation (docs/) and code comments.
  • Re-reviewed Files changed in the Github PR explorer.

@romac romac requested a review from ancazamfir March 12, 2021 16:48
Comment on lines +10 to +17
/// Registry for keeping track of [`ChainHandle`]s indexed by a `ChainId`.
///
/// The purpose of this type is to avoid spawning multiple runtimes for a single `ChainId`.
pub struct Registry<'a> {
config: &'a Config,
handles: HashMap<ChainId, Box<dyn ChainHandle>>,
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

Copy link
Member

@adizere adizere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Should we update the changelog?

Also, it wasn't clear to me what is the expected behavior. Upon starting with no options, assuming a default configuration (with dev-env) is the expected behavior to just start the supervisor for both chains and then stop?

$ hermes start-multi
    Finished dev [unoptimized + debuginfo] target(s) in 0.17s
     Running `target/debug/hermes start-multi`
{"timestamp":"Mar 16 10:27:00.559","level":"INFO","fields":{"message":"spawning supervisor for chains ibc-0 and ibc-1"},"target":"ibc_relayer_cli::commands::start_multi"}
{"timestamp":"Mar 16 10:27:00.772","level":"INFO","fields":{"message":"running listener","chain.id":"ibc-0"},"target":"ibc_relayer::event::monitor"}
{"status":"success","result":"ok"}
{"timestamp":"Mar 16 10:27:00.875","level":"INFO","fields":{"message":"running listener","chain.id":"ibc-1"},"target":"ibc_relayer::event::monitor"}

@romac
Copy link
Member Author

romac commented Mar 16, 2021

That's an oversight on my part, I forgot to commit the code which waits for the threads to finish.

There are also other issues with this PR which we discussed with @ancazamfir, so I am going to put it back in draft until I address them (see TODO in PR description).

@romac romac marked this pull request as draft March 16, 2021 13:01
@romac romac changed the title Allow spawning a supervisor per connection defined in the config Relay for all paths defined in the configuration Mar 18, 2021
@romac romac changed the title Relay for all paths defined in the configuration Relay on all paths defined in the configuration Mar 21, 2021
@codecov-io

This comment has been minimized.

@romac
Copy link
Member Author

romac commented Mar 23, 2021

Looks good. Should we update the changelog?

Done.

@romac romac marked this pull request as ready for review March 24, 2021 18:53

let src_channel = src_chain.query_channel(src_port_id, src_channel_id, Height::zero())?;

// TODO: Check channel state?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we fix the TODO or is this for later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually more of a question than a TODO, was hoping you or @ancazamfir could advise me as to what to do here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the channel doesn't exist, this query returns an Uninitialized channel with zero connection hops. So we will see an error when we try to get the connection further down.

scripts/dev-env Outdated Show resolved Hide resolved
scripts/init-clients Outdated Show resolved Hide resolved
scripts/setup-chains Outdated Show resolved Hide resolved
@adizere
Copy link
Member

adizere commented Mar 25, 2021

Using the following config h/t @romac

[global]
timeout = '10s'
strategy = 'naive'
log_level = 'info'
[[chains]]
id = 'ibc0'
rpc_addr = 'tcp://localhost:26657'
grpc_addr = 'tcp://localhost:9090'
account_prefix = 'cosmos'
key_name = 'testkey'
store_prefix = 'ibc'
gas = 3000000
clock_drift = '5s'
trusting_period = '14days'
[chains.trust_threshold]
numerator = '1'
denominator = '3'
[[chains]]
id = 'ibc1'
rpc_addr = 'tcp://localhost:26557'
grpc_addr = 'tcp://localhost:9091'
account_prefix = 'cosmos'
key_name = 'testkey'
store_prefix = 'ibc'
gas = 3000000
clock_drift = '5s'
trusting_period = '14days'
[chains.trust_threshold]
numerator = '1'
denominator = '3'
[[chains]]
id = 'ibc2'
rpc_addr = 'tcp://localhost:26457'
grpc_addr = 'tcp://localhost:9092'
account_prefix = 'cosmos'
key_name = 'testkey'
store_prefix = 'ibc'
gas = 3000000
clock_drift = '5s'
trusting_period = '14days'
[chains.trust_threshold]
numerator = '1'
denominator = '3'
[[connections]]
a_chain = 'ibc0'
b_chain = 'ibc1'
[[connections.paths]]
a_port = 'transfer'
b_port = 'transfer'
[[connections]]
a_chain = 'ibc1'
b_chain = 'ibc2'
[[connections.paths]]
a_port = 'transfer'
b_port = 'transfer'
[[connections]]
a_chain = 'ibc2'
b_chain = 'ibc0'
[[connections.paths]]
a_port = 'transfer'
b_port = 'transfer'

@adizere
Copy link
Member

adizere commented Mar 25, 2021

Not a problem, but got me confused (thinking that an error occurred), but upon running:

./scripts/dev-env ~/.hermes/multi_config.toml ibc0 ibc1 ibc2

There is in the output some error-level messages:

...
Removing light client peers from configuration...
error: hermes fatal error: no peers configured for chain: ibc0
error: hermes fatal error: no peers configured for chain: ibc1
error: hermes fatal error: no peers configured for chain: ibc2
...

Again, this is not exactly an error: there are no peers configured for the chain (and therefore they cannot be removed). But others may also find it confusing.

Co-authored-by: Adi Seredinschi <adi@informal.systems>
relayer/src/supervisor.rs Outdated Show resolved Hide resolved
@romac
Copy link
Member Author

romac commented Mar 25, 2021

Not a problem, but got me confused (thinking that an error occurred), but upon running:

./scripts/dev-env ~/.hermes/multi_config.toml ibc0 ibc1 ibc2

There is in the output some error-level messages:

...
Removing light client peers from configuration...
error: hermes fatal error: no peers configured for chain: ibc0
error: hermes fatal error: no peers configured for chain: ibc1
error: hermes fatal error: no peers configured for chain: ibc2
...

Again, this is not exactly an error: there are no peers configured for the chain (and therefore they cannot be removed). But others may also find it confusing.

Yeah that's pretty bad UX, sorry about that. This should be a warning at most or even a different message

@romac
Copy link
Member Author

romac commented Mar 29, 2021

Again, this is not exactly an error: there are no peers configured for the chain (and therefore they cannot be removed). But others may also find it confusing.

Yeah that's pretty bad UX, sorry about that. This should be a warning at most or even a different message

Tracked in #782

Copy link
Collaborator

@ancazamfir ancazamfir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks Romain!

@romac romac merged commit a75cb58 into master Mar 30, 2021
@romac romac deleted the romac/load-chains2 branch March 30, 2021 12:08
@romac romac mentioned this pull request Apr 6, 2021
15 tasks
hu55a1n1 pushed a commit to hu55a1n1/hermes that referenced this pull request Sep 13, 2022
* Make supervisor::collect_events public

* Fix rustdoc warning

* Spawn a supervisor per connection defined in the config if start-multi is ran without options

* Make sure we don't spawn a runtime for the same chain twice

* Use a registry to avoid spawning a chain runtime multiple times

* Wait for all supervisor threads to finish

* Update scripts to spawn three chains

* Disable JSON output for start-multi command

* Show output and errors of hermes commands in init-clients script

* Cleanup

* Only send relevant events to the workers

* Fix typo

* Update scripts to allow spawning either 2 or 3 chains

* Send NewBlock events after all other events

* Update changelog

* Formatting

* Remove commented out code

Co-authored-by: Adi Seredinschi <adi@informal.systems>

* Fix WriteAck object and discard bogus events

* Fix for master merge

* Check channel and connection state when computing counterparty chain

Co-authored-by: Adi Seredinschi <adi@informal.systems>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Relay for all paths defined in the configuration
4 participants