Clarification of naming and structure used in Logs Data Model #4175

Open
arno-jueschke opened this issue Aug 1, 2024 · 4 comments
Labels
spec:logs (Related to the specification/logs directory)
triage:deciding:needs-info (Not enough information. Left open to provide the author with time to add more details)

Comments

@arno-jueschke

In the Logs Data Model (https://opentelemetry.io/docs/specs/otel/logs/data-model/) specification, a log record is defined as containing the following fields:

Timestamp
ObservedTimestamp
TraceId
SpanId
TraceFlags
SeverityText
SeverityNumber
Body
Resource
InstrumentationScope
Attributes

The protobuf definition (https://github.com/open-telemetry/opentelemetry-proto/blob/v1.3.2/opentelemetry/proto/logs/v1/logs.proto) for LogRecord uses these fields:

time_unix_nano
observed_time_unix_nano
severity_number
severity_text
body
attributes
flags
trace_id
span_id

The example of a log record in JSON (https://github.com/open-telemetry/opentelemetry-proto/blob/v1.3.2/examples/logs.json) uses:

timeUnixNano
observedTimeUnixNano
severityNumber
severityText
traceId
spanId
body
attributes

Suppose someone wants to store log records as JSON documents in a log file, as compliant as possible with the Logs Data Model, and the log records are emitted from several components.

  • Which field naming (out of the 3 variants above) should be used? (see also Note 1)
  • Are InstrumentationScope and Resource expected to be part of each individual log record, or outside it as in the protobuf definition and the example JSON? (see also Note 2)

Notes:

  1. For semantic conventions (e.g., https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/README.md) the definition is more precise (strict).
  2. The Logs Data Model explains for InstrumentationScope and Resource that "Multiple occurrences of events coming from the same scope can happen across time and they all have the same value".

What did you expect to see?
Guidance on usage of consistent naming

@arno-jueschke arno-jueschke added the spec:logs Related to the specification/logs directory label Aug 1, 2024
@svrnm
Member

svrnm commented Aug 5, 2024

Hey @arno-jueschke,

thank you for raising this issue. The difference in notation you see comes from different requirements:

So, to answer your question, it depends on the guidelines and best practices of the solutions you are using. Based on your mention of "JSON documents in a log file", I would suggest you follow the JSON mapping (camelCase) as suggested in the OTLP spec.
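
As a rough illustration of the camelCase suggestion: the JSON names listed in the issue look like the protobuf snake_case names converted to lowerCamelCase. Below is a minimal Python sketch of that conversion. `to_lower_camel` is a hypothetical helper written for this example, not part of any OpenTelemetry library, and it assumes the input names are all lowercase.

```python
def to_lower_camel(name: str) -> str:
    """Convert a snake_case field name to lowerCamelCase.

    Hypothetical helper for illustration; assumes lowercase input.
    """
    head, *rest = name.split("_")
    # Keep the first segment as-is and capitalize each following segment.
    return head + "".join(part.capitalize() for part in rest)


# Field names taken from the protobuf LogRecord list in the issue.
proto_fields = [
    "time_unix_nano",
    "observed_time_unix_nano",
    "severity_number",
    "severity_text",
    "trace_id",
    "span_id",
    "body",
    "attributes",
]

json_fields = [to_lower_camel(f) for f in proto_fields]
print(json_fields)
```

Single-word names such as `body` and `attributes` are unchanged, which matches the JSON example linked in the issue.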

The same goes for your second question: it depends on what you use and on your use cases. Storing Instrumentation Scope and Resource with each individual record has different advantages and disadvantages compared with grouping them, or with storing them in a separate place and creating a relationship. You need to do that analysis yourself, depending on what you'd like to accomplish: whether storage size matters more to you than quick access, or whether you need to convert back and forth between formats, will lead to different answers.
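
To make the trade-off concrete, here is a small Python sketch that takes the grouped layout (`resourceLogs` > `scopeLogs` > `logRecords`, as in the example JSON linked in the issue) and copies the resource and scope into each record. The flattened shape and the `flatten_resource_logs` name are assumptions made for this example; the flattened layout is not something defined by OTLP.

```python
from typing import Any, Dict, Iterator


def flatten_resource_logs(payload: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one self-contained dict per log record.

    Hypothetical helper: the output shape is an illustration only.
    """
    for resource_logs in payload.get("resourceLogs", []):
        resource = resource_logs.get("resource", {})
        for scope_logs in resource_logs.get("scopeLogs", []):
            scope = scope_logs.get("scope", {})
            for record in scope_logs.get("logRecords", []):
                # Resource and scope are repeated on every record.
                yield {"resource": resource, "scope": scope, **record}


# Grouped layout: resource and scope are stored once for both records.
grouped = {
    "resourceLogs": [
        {
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": "checkout"}}
                ]
            },
            "scopeLogs": [
                {
                    "scope": {"name": "my.library"},
                    "logRecords": [
                        {"severityText": "INFO", "body": {"stringValue": "first"}},
                        {"severityText": "WARN", "body": {"stringValue": "second"}},
                    ],
                }
            ],
        }
    ]
}

flat = list(flatten_resource_logs(grouped))
print(len(flat))
```

The grouped form stores shared data once, while the flattened form makes each line readable on its own at the cost of repetition.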

@svrnm svrnm added the triage:deciding:needs-info Not enough information. Left open to provide the author with time to add more details label Aug 5, 2024
@arno-jueschke
Author

Hello @svrnm ,

thank you for the answer. To summarize, the logs data model specifies the content from a conceptual point of view. The concrete field names depend on the used technology and the conventions there.

Is the same true for the semantic conventions?

@svrnm
Member

svrnm commented Aug 9, 2024

> thank you for the answer. To summarize, the logs data model specifies the content from a conceptual point of view. The concrete field names depend on the used technology and the conventions there.

That's my understanding, yes. But I am also only inferring that from reading the specification.

> Is the same true for the semantic conventions?

I don't know, that's a question worth asking in the sem conv repo.

@github-actions github-actions bot added the triage:followup Needs follow up during triage label Sep 19, 2024
@mtwo
Member

mtwo commented Oct 1, 2024

Hey @arno-jueschke! Given that you're writing log files, I'd just write them in the OTLP JSON format, as that's consistent with OTLP and is already the format that the Collector OTLP file exporter writes data to disk with.

If this answers your question, can you close this issue?
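
For anyone landing here later, a minimal Python sketch of producing a log line in the OTLP JSON shape follows. It uses only the standard library. `make_export_line` is a hypothetical helper, the attribute and scope values are made up, and writing one export payload per line is an assumption of this sketch, so check the output of your Collector version before relying on it.

```python
import io
import json
import time


def make_export_line(body: str, service_name: str, scope_name: str) -> str:
    """Build one OTLP-JSON-shaped payload as a single line of text.

    Hypothetical helper for illustration only.
    """
    # 64-bit integers are written as decimal strings in the JSON mapping.
    now = str(time.time_ns())
    payload = {
        "resourceLogs": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name", "value": {"stringValue": service_name}}
                    ]
                },
                "scopeLogs": [
                    {
                        "scope": {"name": scope_name},
                        "logRecords": [
                            {
                                "timeUnixNano": now,
                                "observedTimeUnixNano": now,
                                "severityNumber": 9,  # 9 corresponds to INFO
                                "severityText": "INFO",
                                "body": {"stringValue": body},
                                "attributes": [],
                            }
                        ],
                    }
                ],
            }
        ]
    }
    return json.dumps(payload, separators=(",", ":"))


# An in-memory buffer stands in for the log file here.
buffer = io.StringIO()
buffer.write(make_export_line("user logged in", "checkout", "my.library") + "\n")
print(buffer.getvalue())
```

Swapping `io.StringIO()` for `open(path, "a", encoding="utf-8")` would append to a real file instead.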

@mtwo mtwo removed the triage:followup Needs follow up during triage label Oct 1, 2024