Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

observability: tracing gum, automatically cross ref traceID #5079

Merged
merged 23 commits into from
Mar 15, 2022

Conversation

drahnr
Copy link
Contributor

@drahnr drahnr commented Mar 10, 2022

Follow up to #5067

Will provide a tracing::{warn,error,debug,trace,info} compatible set of macros, but tracks any occurrences of identifiers or aliases that match candidate_hash and derives a traceID = ... from those and provides that to the tracing::event!($args) as separate key value pair. This being in place should be sufficient for grafana to cross link logs to zipkin spans.

Bottom line using gum::warn! instead of tracing::warn! should be used to get implicit cross ref based on the candidate_hash derived trace identifier in the grafana tempo ui.

CC @lazam

@github-actions github-actions bot added the A0-please_review Pull request needs code review. label Mar 10, 2022
@drahnr drahnr self-assigned this Mar 10, 2022
@drahnr drahnr added B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Mar 10, 2022
@drahnr drahnr requested a review from sandreim March 10, 2022 16:42
@drahnr drahnr changed the title tracing gum, actually cross ref traceID WIP: tracing gum, actually cross ref traceID Mar 10, 2022
@drahnr drahnr marked this pull request as draft March 10, 2022 16:42
@drahnr drahnr changed the title WIP: tracing gum, actually cross ref traceID WIP: tracing gum, automatically cross ref traceID Mar 10, 2022
@rphmeier
Copy link
Contributor

rphmeier commented Mar 10, 2022

Have we already created or do we have plans to create per-candidate spans that follow it through all stages (collation, backing, approval, disputes)? Or is this cross-referencing orthogonal?

My concern is orphaned traces, where we have candidates logged in approval/disputes and either get linked into the void or to some backing span that is unrelated.

@drahnr
Copy link
Contributor Author

drahnr commented Mar 10, 2022

The linking meant here is between logs in Loki and spans in tempo, such that the grafana UI works properly.

We don't have spans beyond subsystems as far as I can remember, but as long as they all reference the same traced derived from a candidate hash, we should will be able to track them across stages by tag and stage annotations for the individual spans. Having orphans hence shouldn't be that big of an issue, at least I don't see how it would impair the debugging flow.

@rphmeier
Copy link
Contributor

rphmeier commented Mar 10, 2022

The linking meant here is between logs in Loki and spans in tempo, such that the grafana UI works properly.
We don't have spans beyond subsystems as far as I can remember.

Approval and disputes are subsystems. and my point is that there are few/no spans there, but many log lines referring to candidate hash. Unclear what should happen if/when we add approval/disputes spans.

@drahnr
Copy link
Contributor Author

drahnr commented Mar 10, 2022

The cross linking is also bounded by time, which I did not mention, so this should further restrict the amount of query results

@drahnr drahnr force-pushed the bernhard-tracing-gum branch 2 times, most recently from 7c6ca63 to 49cddbf Compare March 11, 2022 09:56
@drahnr drahnr marked this pull request as ready for review March 11, 2022 15:27
@drahnr drahnr added this to the v0.9.19 milestone Mar 11, 2022
@drahnr drahnr changed the title WIP: tracing gum, automatically cross ref traceID observability: tracing gum, automatically cross ref traceID Mar 11, 2022
@drahnr drahnr requested a review from ordian March 11, 2022 15:48
Copy link
Member

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear if candidate_hash based traceID is the only one we'll be interested in. Looks good otherwise.

Cargo.lock Outdated Show resolved Hide resolved
node/gum/proc-macro/src/types.rs Outdated Show resolved Hide resolved
node/gum/proc-macro/Cargo.toml Outdated Show resolved Hide resolved
node/gum/README.md Show resolved Hide resolved
@drahnr drahnr force-pushed the bernhard-tracing-gum branch from dbc133f to cdab19e Compare March 14, 2022 13:54
@drahnr
Copy link
Contributor Author

drahnr commented Mar 15, 2022

It's not clear if candidate_hash based traceID is the only one we'll be interested in. Looks good otherwise.

For the time being, we'll stick to candidate hash, and check if we can have multiple identically keyed tags on a span, it has to be traceID for the log line after all.

Copy link
Contributor

@sandreim sandreim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ❤️ it - adds no extra LoCs. Otherwise, I think we should add a basic integration test in this PR or a followup.

node/gum/README.md Outdated Show resolved Hide resolved
node/gum/proc-macro/src/lib.rs Show resolved Hide resolved
let Args { target, comma, mut values, fmt } = args;

// find a value or alias called `candidate_hash`.
let maybe_candidate_hash = values.iter_mut().find(|value| value.as_ident() == "candidate_hash");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a followup PR: Is it possible to remove the hardcoding here to derive the trace_id from something specified by the user ?

Copy link
Contributor Author

@drahnr drahnr Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically yes, but that would add additional syntax. Open to discuss something along the lines of foo @ traceID though, out of scope for this PR though

}
})
} else {
Ok(quote! {
Copy link
Contributor

@sandreim sandreim Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a fall back to relay_parent add value ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure yet, it could also be more confusing since all of a sudden you have mix of relay chain and parachain hashes in the same view.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think relay_parent might be another tracing id in conjunction with candidate_hash as it is available in more places than candidate_hash. But I also think that these IDs should be kept separately, as fallback would only mess things up.

Copy link
Contributor

@sandreim sandreim Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would mess things up, having a secondary trace id is what makes sense here. Falling back to a secondary trace id is what I really meant.

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
@drahnr
Copy link
Contributor Author

drahnr commented Mar 15, 2022

bot merge

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants