observability: tracing gum, automatically cross ref traceID #5079
Have we already created, or do we plan to create, per-candidate spans that follow a candidate through all stages (collation, backing, approval, disputes)? Or is this cross-referencing orthogonal? My concern is orphaned traces, where candidates are logged in approval/disputes and get linked either into the void or to some unrelated backing span.
The linking meant here is between logs in Loki and spans in Tempo, so that the Grafana UI works properly. We don't have spans beyond subsystems as far as I can remember, but as long as they all reference the same traceID derived from a candidate hash, we should be able to track them across stages by tag and stage annotations on the individual spans. Hence orphans shouldn't be that big of an issue; at least I don't see how they would impair the debugging flow.
Approval and disputes are subsystems, and my point is that there are few or no spans there, but many log lines referring to the candidate hash. It is unclear what should happen if/when we add approval/disputes spans.
The cross linking is also bounded by time, which I did not mention, so this should further restrict the number of query results.
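To make the "same traceID derived from a candidate hash" idea concrete, here is a minimal sketch. It assumes the identifier is simply the first 16 bytes of a 32-byte hash read as a big-endian `u128`; the actual derivation used in the codebase may differ. `CandidateHash`, `trace_id_from_hash`, and `trace_id_hex` are illustrative names, not taken from this PR.

```rust
/// Hypothetical stand-in for a 32-byte candidate hash.
type CandidateHash = [u8; 32];

/// Derive a 128-bit trace identifier from the first 16 bytes of the hash.
/// Assumption: truncation + big-endian; the real derivation may differ.
fn trace_id_from_hash(hash: &CandidateHash) -> u128 {
    let mut buf = [0u8; 16];
    buf.copy_from_slice(&hash[..16]);
    u128::from_be_bytes(buf)
}

/// Render the identifier as 32 lowercase hex characters, so a log line
/// and a span can carry the identical string.
fn trace_id_hex(id: u128) -> String {
    format!("{:032x}", id)
}

fn main() {
    let mut hash: CandidateHash = [0u8; 32];
    hash[15] = 0x2a;
    let id = trace_id_from_hash(&hash);
    println!("traceID={}", trace_id_hex(id));
}
```

Because the derivation is deterministic, any subsystem that logs the same candidate hash ends up with the same identifier, which is what lets logs and spans be joined without shared state.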
It's not clear if the `candidate_hash`-based traceID is the only one we'll be interested in. Looks good otherwise.
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
For the time being, we'll stick to candidate hash, and check if we can have multiple identically keyed tags on a span, it has to be |
I ❤️ it - adds no extra LoCs. Otherwise, I think we should add a basic integration test in this PR or a followup.
    let Args { target, comma, mut values, fmt } = args;

    // find a value or alias called `candidate_hash`.
    let maybe_candidate_hash = values.iter_mut().find(|value| value.as_ident() == "candidate_hash");
As a followup PR: is it possible to remove the hardcoding here and derive the trace_id from something specified by the user?
Technically yes, but that would add additional syntax. Open to discussing something along the lines of `foo @ traceID`, though that is out of scope for this PR.
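For readers following the thread, the hardcoded lookup can be modelled on plain data. This is only a sketch: the real code operates on parsed macro arguments inside a proc macro, whereas `Value`, `with_trace_id`, and the `trace_id_of(...)` placeholder expression below are hypothetical simplifications.

```rust
/// Hypothetical, simplified model of one `key = value` macro argument.
#[derive(Debug, Clone, PartialEq)]
struct Value {
    ident: String,
    expr: String,
}

/// Forward the argument list, appending a `traceID` pair when a
/// `candidate_hash` key is present; otherwise forward it unchanged.
/// Assumption: where the extra pair is placed is arbitrary in this sketch.
fn with_trace_id(mut values: Vec<Value>) -> Vec<Value> {
    // The key name is hardcoded, mirroring the behaviour under review.
    let found = values.iter().find(|v| v.ident == "candidate_hash").cloned();
    if let Some(v) = found {
        values.push(Value {
            ident: "traceID".to_string(),
            // `trace_id_of` is a placeholder name, not a real function.
            expr: format!("trace_id_of({})", v.expr),
        });
    }
    values
}

fn main() {
    let args = vec![
        Value { ident: "candidate_hash".into(), expr: "receipt.hash()".into() },
        Value { ident: "peer".into(), expr: "peer_id".into() },
    ];
    for v in with_trace_id(args) {
        println!("{} = {}", v.ident, v.expr);
    }
}
```

Making the key user-selectable would mean replacing the string literal with something parsed from the macro input, which is where the additional syntax mentioned above would come in.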
            }
        })
    } else {
        Ok(quote! {
Would a fallback to `relay_parent` add value?
Not sure yet. It could also be more confusing, since all of a sudden you have a mix of relay chain and parachain hashes in the same view.
I think `relay_parent` might be another tracing ID in conjunction with `candidate_hash`, as it is available in more places than `candidate_hash`. But I also think that these IDs should be kept separate, as a fallback would only mess things up.
Yeah, that would mess things up, having a secondary trace id is what makes sense here. Falling back to a secondary trace id is what I really meant.
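A rough sketch of the "separate, not fallback" idea discussed in this thread: the `relay_parent`-derived identifier gets its own key and never fills in for a missing `candidate_hash`. Nothing like this is part of the PR; `trace_fields` and the `relayParentTraceID` key name are invented purely for illustration.

```rust
/// Build the trace-related key-value pairs for one log event.
/// Assumption: `relayParentTraceID` is a made-up key for the secondary id.
fn trace_fields(
    candidate: Option<u128>,
    relay_parent: Option<u128>,
) -> Vec<(&'static str, String)> {
    let mut fields = Vec::new();
    // Primary id: only ever derived from the candidate hash.
    if let Some(id) = candidate {
        fields.push(("traceID", format!("{:032x}", id)));
    }
    // Secondary id: kept under its own key, never substituted for `traceID`.
    if let Some(id) = relay_parent {
        fields.push(("relayParentTraceID", format!("{:032x}", id)));
    }
    fields
}

fn main() {
    for (key, value) in trace_fields(Some(0x2a), Some(0x07)) {
        println!("{}={}", key, value);
    }
}
```

With this shape, a view keyed on `traceID` only ever contains candidate-derived identifiers, which avoids the mix of relay chain and parachain hashes raised above.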
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
bot merge |
Follow up to #5067

Provides a set of macros compatible with `tracing::{warn,error,debug,trace,info}` that tracks any occurrences of identifiers or aliases matching `candidate_hash`, derives a `traceID = ...` from those, and passes it to `tracing::event!($args)` as a separate key-value pair. Having this in place should be sufficient for Grafana to cross-link logs to Zipkin spans.

Bottom line: use `gum::warn!` instead of `tracing::warn!` to get implicit cross-referencing, based on the `candidate_hash`-derived trace identifier, in the Grafana Tempo UI.

CC @lazam
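As a toy illustration of the behaviour described above, the following `macro_rules!` sketch adds a `traceID` to the output whenever the call starts with a `candidate_hash = ...` key. It is not the `gum` implementation: it returns a formatted `String` instead of emitting a `tracing` event, only recognises the key in leading position, and `warn_with_trace!` and `trace_id` are hypothetical names with an assumed truncation-based derivation.

```rust
/// Hypothetical derivation: first 16 bytes of the hash, big-endian.
fn trace_id(hash: &[u8; 32]) -> u128 {
    let mut buf = [0u8; 16];
    buf.copy_from_slice(&hash[..16]);
    u128::from_be_bytes(buf)
}

/// Toy stand-in for a `warn!`-style macro that injects a trace id.
macro_rules! warn_with_trace {
    // A leading `candidate_hash = <expr>` key triggers the extra pair.
    (candidate_hash = $hash:expr, $($arg:tt)+) => {
        format!("WARN traceID={:032x} {}", trace_id(&$hash), format!($($arg)+))
    };
    // No candidate hash: format the message unchanged.
    ($($arg:tt)+) => {
        format!("WARN {}", format!($($arg)+))
    };
}

fn main() {
    let mut hash = [0u8; 32];
    hash[15] = 1;
    println!("{}", warn_with_trace!(candidate_hash = hash, "slow validation: {}ms", 7));
    println!("{}", warn_with_trace!("no candidate in scope"));
}
```

The call sites look the same as ordinary logging calls, which matches the PR's goal of getting the cross reference implicitly rather than by adding the key by hand.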