Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structured logs for relayer logic #1491

Merged
merged 37 commits into from
Jan 17, 2022
Merged

Structured logs for relayer logic #1491

merged 37 commits into from
Jan 17, 2022

Conversation

adizere
Copy link
Member

@adizere adizere commented Oct 25, 2021

Closes: #1537

Description


For contributor use:

  • Added a changelog entry, using unclog.
  • If applicable: Unit tests written, added test to CI.
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Updated relevant documentation (docs/) and code comments.
  • Re-reviewed Files changed in the Github PR explorer.

@adizere adizere marked this pull request as draft October 25, 2021 16:23
@adizere adizere linked an issue Nov 23, 2021 that may be closed by this pull request
9 tasks
relayer/src/connection.rs Outdated Show resolved Hide resolved
@mzabaluev mzabaluev self-assigned this Dec 17, 2021
@adizere adizere marked this pull request as ready for review December 22, 2021 14:53
config.toml Show resolved Hide resolved
relayer/src/chain/cosmos.rs Outdated Show resolved Hide resolved
relayer/src/chain/cosmos.rs Outdated Show resolved Hide resolved
relayer/src/channel.rs Outdated Show resolved Hide resolved
relayer/src/chain/tx.rs Outdated Show resolved Hide resolved
relayer/src/link/relay_path.rs Show resolved Hide resolved
relayer/src/link/relay_path.rs Outdated Show resolved Hide resolved
relayer/src/link/relay_path.rs Outdated Show resolved Hide resolved
relayer/src/link/relay_path.rs Outdated Show resolved Hide resolved
relayer/src/supervisor/spawn.rs Show resolved Hide resolved
adizere and others added 7 commits December 23, 2021 09:25
Co-authored-by: Mikhail Zabaluev <mikhail@informal.systems>
Add tracking_id method to use in tracing instead.

Also, use less cryptic span names in send_messages_*.
Make the tracking ID construction parameter generic to accept any
Into<String> parameter.
Identify TrackedMsgs batches for creating channels and connections
with the specific message that is used.
Copy link
Contributor

@mzabaluev mzabaluev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I've made some changes that may need another look.

Copy link
Member Author

@adizere adizere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the adjustments Mikhail! I think we should just adjust a bit the tracking ids, because they can make the output lines very long -- I left suggestions for that.

Can we also signal the changes in the changelog? https://github.com/informalsystems/ibc-rs/blob/master/CONTRIBUTING.md#examples

@@ -1111,7 +1112,7 @@ impl ChainEndpoint for CosmosSdkChain {
) -> Result<Vec<Response>, Error> {
crate::time!("send_messages_and_wait_check_tx");

let span = span!(Level::DEBUG, "send", id = %tracked_msgs);
let span = span!(Level::DEBUG, "send_tx_check", id = %tracked_msgs.tracking_id());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is clearer!

relayer/src/channel.rs Outdated Show resolved Hide resolved
relayer/src/channel.rs Outdated Show resolved Hide resolved
relayer/src/channel.rs Outdated Show resolved Hide resolved
relayer/src/channel.rs Outdated Show resolved Hide resolved
relayer/src/connection.rs Outdated Show resolved Hide resolved
@@ -828,7 +828,7 @@ impl<ChainA: ChainHandle, ChainB: ChainHandle> Connection<ChainA, ChainB> {
.map_err(|e| ConnectionError::chain_query(self.dst_chain().id(), e))?;
let client_msgs = self.build_update_client_on_src(src_client_target_height)?;

let tm = TrackedMsgs::new(client_msgs, "create connection");
let tm = TrackedMsgs::new(client_msgs, "update client on source for ConnectionOpenTry");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a shorter way to formulate this without losing context. I guess we can keep it as it is.

relayer/src/connection.rs Outdated Show resolved Hide resolved
relayer/src/connection.rs Outdated Show resolved Hide resolved
relayer/src/link/relay_path.rs Show resolved Hide resolved
Copy link
Member Author

@adizere adizere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we do the release (merge #1712), let's merge this. Thanks Mikhail!

@adizere
Copy link
Member Author

adizere commented Jan 10, 2022

Post-merge artifact that might need our attention: when we encounter an account sequence number mismatch error, we see the following error message missing any tracking id:

2022-01-10T13:45:14.341974Z WARN ThreadId(1723) task PacketWorker(ibc-1:transfer/channel-0 -> ibc-0) encountered ignorable error: link errror: failed with underlying error: gRPC call failed with status: status: InvalidArgument, message: "account sequence mismatch, expected 86, got 85: incorrect account sequence: invalid request", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }

The task lifetime is better tracked with a tracing span,
which also reduces code repetition in tracking macros by putting in
the task name once.
@mzabaluev
Copy link
Contributor

@adizere The problem with that tracing entry is that it is produced higher up in the call stack than the specific task that creates the tracking id.

relayer/src/util/task.rs Outdated Show resolved Hide resolved
spawn_background_task receives a Span constructed by the caller.
This allows embedding contextual information for the task, which is
used to reduce repetition in logging macros for the workers.
Comment on lines +82 to +88
span!(
tracing::Level::ERROR,
"DetectMisbehaviorWorker",
client = %client.id,
src_chain = %client.src_chain.id(),
dst_chain = %client.dst_chain.id(),
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit more repetitive than the previous approach, but this allows filtering logs by any of the individual ID fields.

If this level of detail is not desirable, we could implement tracing::Value on the objects that were logged with Display (removing the issue of implementing a general-purpose formatting trait just for logging purposes), and resurrect the previous compact string representation.

Comment on lines +68 to +74
let _span = span!(
tracing::Level::DEBUG,
"DetectMisbehaviorFirstCheck",
client = %client.id,
src_chain = %client.src_chain.id(),
dst_chain = %client.dst_chain.id(),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be folded into the "DetectMisbehaviorWorker" span below, but I thought it would be useful to distinguish these contexts since the other one is performed in a background worker task.

mzabaluev and others added 5 commits January 11, 2022 17:34
With spans injecting all of the information that was formatted
using the RelayPath Display impl, this is redundant.
For the batch worker tasks, we should have a span with the task name
and other details, so the top-level span is not needed.
For other supervisor tasks, the span name is shortened.
We have the span now, these are redundant and bad style.
@adizere
Copy link
Member Author

adizere commented Jan 17, 2022

Mikhail and I had one more sync. session on this PR and we're both in agreement that this is ready for merging. It's a great first step in making the relayer's logs less grating on the eyes (and the codebase more idiomatic). The last nit is the changelog.

Cool stuff.

@mzabaluev mzabaluev merged commit 2757031 into master Jan 17, 2022
@mzabaluev mzabaluev deleted the adi/structured_logs branch January 17, 2022 14:42
hu55a1n1 pushed a commit to hu55a1n1/hermes that referenced this pull request Sep 13, 2022
* Printing tx hashes from SendPacket events.

This was inspired from commit b63335b
(see rpc.rs therein).

* log the tx hashes in ibc_channel event SendPacket

* Redo displaying for `OperationalData`

Add `OperationalInfo` that can hold the displayable data on the batch,
either borrowed or owned with transforming from first to the other.
Implement `Display` on the `OperationalInfo` instead of `OperationalData`
for clarity.

* Improve logging of operational data

Use the `odata` key in tracing.

* Use a tracing span for task log messages

The task lifetime is better tracked with a tracing span,
which also reduces code repetition in tracking macros by putting in
the task name once.

spawn_background_task receives a Span constructed by the caller.
This allows embedding contextual information for the task, which is
used to reduce repetition in logging macros for the workers.

* Erase Display impl on RelayPath, use spans instead

With spans injecting all of the information that was formatted
using the RelayPath Display impl, this is redundant.

* Erase [rest] prefixes from log messages

We have the span now, these are redundant and bad style.

Co-authored-by: Mikhail Zabaluev <mikhail@informal.systems>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Structured logs: add identifiers & tx hashes in log output
2 participants