Provide a way to send Liberty Audit Logs to OpenTelemetry #29229

Open
34 of 51 tasks
donbourne opened this issue Jul 29, 2024 · 6 comments
Labels: Design Approved; easej; Epic (used to track Feature Epics that are following the UFO process); Feature; ID Required; In Progress (items that are in active development); release:25002-beta; target:beta (the Epic or Issue is targeted for the next beta); target:ga (the Epic is ready for focal approvals, after which it can GA); target:25002-beta; target:25003; team:Lumberjack; Translation - Not Required (feature does not require translation)

Comments

@donbourne
Member

donbourne commented Jul 29, 2024

Description

We need a way for users to be able to direct their Liberty audit and access logs to OpenTelemetry.


Documents

When available, add links to required feature documents. Use "N/A" to mark particular documents which are not required by the feature.


Process Overview

General Instructions

The process steps occur roughly in the order as presented. Process steps occasionally overlap.

Each process step has a number of tasks which must be completed or must be marked as not applicable ("N/A").

Unless otherwise indicated, the tasks are the responsibility of the feature owner or a delegate of the feature owner.

If you need assistance, reach out to the OpenLiberty/release-architect.

Important: Labels are used to trigger particular steps and must be added as indicated.


Prioritization (Complete Before Development Starts)

The OpenLiberty/chief-architect and area leads are responsible for prioritizing the features and determining which features are being actively worked on.

Prioritization


Design (Complete Before Development Starts)

Design preliminaries determine whether a formal design, which will be provided by an Upcoming Feature Overview (UFO) document, must be created and reviewed. A formal design is required if the feature requires any of the following: UI, Serviceability, SVT, Performance testing, or non-trivial documentation/ID. Furthermore, each identified item places a blocking requirement on another team so it must be identified early in the process. The feature owner may check-off the item if they know it doesn't apply, but otherwise they should work with the focal point to determine what work, if any, will be necessary and make them aware of it.

Design Preliminaries

  • UI requirements identified, or N/A. (Feature owner and UI focal point)
  • Accessibility requirements identified, or N/A. (Feature owner and Accessibility focal point)
  • ID requirements identified, or N/A. (Feature owner and ID focal point)
    • Refer to Documenting Open Liberty.
    • Feature owner adds label ID Required, if non-trivial documentation needs to be created by the ID team.
    • ID adds label ID Required - Trivial, if no design will be performed and only trivial ID updates are needed.
  • Serviceability requirements identified, or N/A. (Feature owner and Serviceability focal point)
  • SVT requirements identified, or N/A. (Feature owner and SVT focal point)
  • Performance testing requirements identified, or N/A. (Feature owner and Performance focal point)

Design

  • POC Design / UFO review requested.
    • Feature owner adds label Design Review Request
  • POC Design / UFO review scheduled.
    • Follow the instructions in POC-Forum repo
  • POC Design / UFO review completed.
  • POC / UFO Review follow-ons completed.
  • POC Design / UFO approval requested.
    • Feature owner adds label Design Approval Request
  • Design / UFO approved. (OpenLiberty/chief-architect) or N/A
    • (OpenLiberty/chief-architect) adds label Design Approved
    • Add the public link to the UFO in Box to the Documents section.
    • The UFO must always accurately reflect the final implementation of the feature. Any changes must be first approved. Afterwards, update the UFO by creating a copy of the original approved slide(s) at the end of the deck and prepend "OLD" to the title(s). A single updated copy of the slide(s) should take the original's place, and have its title(s) prepended with "UPDATED".

No Design

  • No Design requested.
    • Feature owner adds label No Design Approval Request
  • No Design / No UFO approved. (OpenLiberty/chief-architect) or N/A
    • Approver adds label No Design Approved
  • Feature / Capability stabilization or discontinuation or N/A
    • Feature owner adds label Product Management Approval Request and notifies OpenLiberty/product-management
    • Approver adds label Product Management Approved (OpenLiberty/product-management)
    • Note: For stabilized, superseded, and discontinued feature/capability, skip the Beta section of the template (you may delete it). Otherwise, proceed as normal.

FAT Documentation


Implementation

A feature must be prioritized before any implementation work may begin to be delivered (inaccessible/no-ship). However, a design focused approach should still be applied to features, and developers should think about the feature design prior to writing and delivering any code.
Besides being prioritized, a feature must also be socialized (or No Design Approved) before any beta code may be delivered. All new Liberty content must be inaccessible in our GA releases until it is Feature Complete by either marking it kind=noship or beta fencing it.
Code may not GA until this feature has obtained the Design Approved or No Design Approved label, along with all other tasks outlined in the GA section.

Feature Development Begins

  • Add the In Progress label

Legal and Translation

To avoid last-minute blockers and significant disruptions to the feature, the legal items need to be done as early in the feature process as possible, either in design or as early in development as possible. Similarly, translation is to be done concurrently with development. All items below MUST be completed before beta and GA are requested.

Innovation (Complete 1 week before Beta & GA Feature Complete Date)

  • Consider whether any aspects of the feature may be patentable. If any are identified, disclosures have been submitted.

Legal (Complete before Beta & GA Feature Complete Date)

  • Changed or new open source libraries are cleared and approved, or N/A. (Legal Release Services/Cass Tucker/Release PM).

Translation (Complete by Beta & GA Feature Complete Date)

  • PII (Program Integrated Information) updates are merged (i.e. all English strings due for translation have been delivered), or N/A.

Beta

In order to facilitate early feedback from users, all new features and functionality should first be released as part of a beta release.

Beta Code

  • Beta fence the functionality
    • E.g. kind=beta, ibm:beta, ProductInfo.getBetaEdition()
  • Beta development complete and feature ready for inclusion in a beta release
    • Add label target:beta and the appropriate target:YY00X-beta (where YY00X is the targeted beta version) to the feature issue.
      • Note: This is expected to be done only once, for the initial beta that includes this feature. You do not need to add a target:YY00(X+1)-beta, target:YY00(X+2)-beta, etc. label for each additional beta that includes this feature.
  • Feature delivered into beta

Beta Blog (Complete by beta eGA)

  • Beta blog issue created and populated using the Open Liberty BETA blog post template.
    • Add a link to the beta blog issue in the Documents section.
    • Note: This is for inclusion into the overall beta release blog post. If, in addition, you'd also like to create a dedicated blog post about your feature, then follow the "Standalone Feature Blog Post" instructions under the Other Deliverables section.
    • A feature may have multiple beta blogs associated with it. This is especially useful for features that are continuously adding functionality each release and want to advertise what is new since the previous beta.
      • Each beta blog issue should have the appropriate target:YY00X-beta label added to it.
      • Include each beta blog issue in the Documents section.

GA

A feature is ready to GA after it is Feature Complete and has obtained all necessary Focal Point Approvals.

Feature Complete

  • Feature implementation and tests completed.
    • All PRs are merged.
    • All related/child issues are closed.
    • All stop ship issues are completed.
  • Legal: all necessary approvals granted.
  • Innovation: IP identified and any applicable disclosures submitted
  • Translation: Feature may only proceed to GA if it has either Translation - Not Required, Translation - Complete, or Translation - Missing label
    • If the feature does not have anything that required translation, the feature owner adds the label Translation - Not Required.
    • If all translation has been delivered to release branch, feature owner adds label Translation - Complete.
    • If missing translation does not cause a break in functionality, nor a security or production outage risk, feature owner adds label Translation - Missing.
      • Once all missing translations are delivered, the Translation - Missing label is replaced with Translation - Complete.
    • If missing translation could cause a break in functionality or a security or production outage risk, feature owner adds the Translation - Blocked label.
      • Features with Translation - Blocked may NOT proceed to GA until the label has been replaced with either Translation - Missing or Translation - Complete.
    • For further guidance, contact Globalization focal point or the Release Architect.
  • GA development complete and feature ready for inclusion in a GA release
    • Add label target:ga and the appropriate target:YY00X (where YY00X is the targeted GA version).
    • Inclusion in a release requires the completion of all Focal Point Approvals.

Focal Point Approvals (Complete by Feature Complete Date)

These occur only after GA of this feature is requested (by adding a target:ga label). GA of this feature may not occur until all approvals are obtained.

All Features

  • APIs/Externals - Externals have been reviewed or N/A. (OpenLiberty/externals-approvers)
    • Approver adds label focalApproved:externals
  • Demo - Demo is scheduled for an upcoming EOI or N/A. (OpenLiberty/demo-approvers)
    • Add comment @OpenLiberty/demo-approvers Demo scheduled for EOI [Iteration Number] to this issue.
    • Approver adds label focalApproved:demo.
  • FAT - All Tests complete and running successfully in SOE or N/A. (OpenLiberty/fat-approvers)
    • Approver adds label focalApproved:fat.

Design Approved Features

  • ID - Documentation is complete or N/A. (OpenLiberty/id-approvers)
    • Approver adds label focalApproved:id.
    • NOTE: If only trivial documentation changes are required, you may reach out to the ID Feature Focal to request an ID Required - Trivial label. Unlike features with a regular ID requirement, those with the ID Required - Trivial label do not have a hard requirement for a Design/UFO.

  • InstantOn - InstantOn capable or N/A. (OpenLiberty/instantOn-approvers)
    • Approver adds label focalApproved:instantOn.
  • Performance - Performance testing is complete or N/A. (OpenLiberty/performance-approvers)
    • Approver adds label focalApproved:performance.
  • Serviceability - Serviceability has been addressed or N/A. (OpenLiberty/serviceability-approvers)
    • Approver adds label focalApproved:sve.
  • STE - Skills Transfer Education chart deck is complete or N/A. (OpenLiberty/ste-approvers)
    • Approver adds label focalApproved:ste.
  • SVT - System Verification Test is complete or N/A. (OpenLiberty/svt-approvers)
    • Approver adds label focalApproved:svt.

Remove Beta Fencing (Complete by Feature Complete Date)

  • Beta guards are removed, or N/A
    • Only after all necessary Focal Point Approvals have been granted.

GA Blog (Complete by Friday after GM)

  • GA Blog issue created and populated using the Open Liberty GA release blog post template.
    • Add a link to the GA Blog issue in the Documents section.
    • Note: This is for inclusion into the overall release blog post. If, in addition, you'd also like to create a dedicated blog post about your feature, then follow the "Standalone Feature Blog Post" instructions under the Other Deliverables section.

Post GM (Complete before GA)

  • After confirming this feature has been included in the GM driver, feature owner closes this issue.

Post GA


Other Deliverables


donbourne added the Epic, team:Lumberjack, and Feature labels Jul 29, 2024
malincoln added and removed the Prioritization - Requested label (the feature is being requested to be added to the backlog for prioritization) Sep 5, 2024
pgunapal added the In Progress and ID Required labels Oct 1, 2024
@donbourne
Member Author

donbourne commented Nov 3, 2024

Logs vs. Events

  • In addition to logs, OpenTelemetry has the concept of events (from: https://opentelemetry.io/docs/specs/otel/logs/event-api/):

    • All Events have a event.name attribute, and all Events with the same event.name MUST conform to the same schema for both their Attributes and their Body.
  • It is appropriate to use the Event API when these properties fit your requirements:

    • Logging from a shared library that must run in many applications.
    • A semantic convention needs to be defined. We do not define semantic conventions for LogRecords that are not Events.
    • Analysis by an observability platform is the intended use case. For example: statistics, indexing, machine learning, session replay.
    • Normalizing logging and having a consistent schema across a large application is helpful.
  • If any of these properties fit your requirements, we recommend using the Event API. Events are described in more detail in the semantic conventions.

  • Events API looks like it will be rolled into the Logs Instrumentation API (https://opentelemetry.io/docs/specs/otel/logs/event-api/#logs-instrumentation-api-development):

    Logs Instrumentation API Development
    [!NOTE] We are currently in the process of defining a new Logs Instrumentation API.

    The intent is that this Logs Instrumentation API will incorporate the current functionality of this existing Events API and once it is defined and implemented, the Events API usage will be migrated, deprecated, renamed and eventually removed.

    No further work is scheduled for the current Events API definition at this time.

  • Semantic conventions for events (https://opentelemetry.io/docs/specs/semconv/general/events/)

    • Recommendations for defining events:

      • Use the payload (body) to represent the details of the event instead of a collection of standard attributes.
      • Events SHOULD be generated / produced / recorded using the Event API to ensure that the event is created using the configured SDK instance.
      • The Event API is not yet available in all OpenTelemetry SDKs.
      • TODO: Add deep link to the compliance matrix of the Event API when it exists.
      • It’s NOT RECOMMENDED to prefix the payload (body) fields with the event.name to avoid redundancy and to keep the event definition clean.
      • The events SHOULD document their semantic conventions including event name, attributes, and the payload.
    • Recommendations on using attributes vs. body fields:

      • If the field should be comparable across every type of record, it should be an attribute.
      • If the field is specific to the event itself, then it should be a body field.
      • Unless the same event.name exists on two events, anything in two event bodies is not comparable to each other.
  • Events API is here:

  • A data model exists for access logs that looks like it is done entirely with log attributes (note that this is not part of the spec and is provided as an example):

My thoughts

  • The Events API seems to fit our use case better than logs, because:

    • audit and access logs have predetermined schemas that don't change for a given event name
    • Analysis by an observability platform is the intended use case
    • it is foreseeable that a semantic convention will be eventually defined for access log records
  • That said, we can't rely on the events API today:

    • Semantic conventions for Events are "experimental"
    • Events API is "development"
    • Events API will eventually go away: "The intent is that this Logs Instrumentation API will incorporate the current functionality of this existing Events API and once it is defined and implemented, the Events API usage will be migrated, deprecated, renamed and eventually removed."
  • That implies we have a choice:

    • wait for Events API to stabilize before adding access and audit logs to mpTelemetry
    • implement access and audit logs now using logging bridge (putting everything into attributes)
    • given that events are emitted as log records, use the logging bridge to log audit and access logs, use the body as described for events, and use attributes only for fields that are common across events. This would require us to make an assumption about what the body should look like, because it is the Events API that takes key/value pairs and represents them in the body of the log record when the event is emitted (see https://javadoc.io/doc/io.opentelemetry/opentelemetry-api-incubator/latest/io/opentelemetry/api/incubator/events/EventBuilder.html)
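
To make the third option concrete, here is a minimal sketch of the attribute/body split using plain Java collections. It is not Liberty code and does not call the OpenTelemetry API; the class name, the `COMMON_FIELDS` set, and the `outcome` field are hypothetical and chosen only for illustration. Which fields actually count as "common across events" is exactly the assumption this option would force us to make.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: not Liberty code and not the OpenTelemetry API.
final class AuditRecordSplitter {

    // Assumption: the fields treated as comparable across every record type.
    static final Set<String> COMMON_FIELDS = Set.of("event.name", "event.sequence_number");

    // Holds the two halves of a split record.
    static final class Split {
        final Map<String, Object> attributes = new LinkedHashMap<>();
        final Map<String, Object> body = new LinkedHashMap<>();
    }

    // Common fields go to attributes; everything else goes to the body.
    static Split split(Map<String, Object> auditRecord) {
        Split result = new Split();
        for (Map.Entry<String, Object> field : auditRecord.entrySet()) {
            Map<String, Object> target =
                COMMON_FIELDS.contains(field.getKey()) ? result.attributes : result.body;
            target.put(field.getKey(), field.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("event.name", "SECURITY_AUTHN");   // illustrative value
        record.put("event.sequence_number", 42L);     // illustrative value
        record.put("outcome", "success");             // illustrative event-specific field
        Split result = split(record);
        System.out.println("attributes = " + result.attributes);
        System.out.println("body       = " + result.body);
    }
}
```

In a real implementation the two maps would be handed to whatever log record builder the logging bridge uses; the sketch only shows where the dividing line would sit.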

@donbourne
Member Author

comments on first draft of UFO

p. 5
    rephrase... currently it looks like you're saying the OTel community and others are our customers

p. 10
    - I thought the CADF format was a JSON format. Not clear what the first bullet means by saying "either CADF or JSON format"
    - OpenTelemetry Attribute semantic naming convention -> OpenTelemetry Semantic Convention's Attribute Naming guidelines (https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/attribute-naming.md)

p. 11, 38-
    - should we include 'audit' in the prefix?

- will trace and span ids be included in audit events?
- need to call out consideration of logs vs. events - https://github.com/OpenLiberty/open-liberty/issues/29229#issuecomment-2453626481

@donbourne
Member Author

UFO socialization comments:

page 9
    should say audit-2.0 in bottom box on diagram

p. 10
    perhaps we should match how OTel Events represent data in the body, even if we're not using the Events API due to it not being stable

p. 11
    may be missing audit records related to REST

p. 12
    we need to carefully consider event (body) vs attribute representation of fields
    check if "event.name" is stable attribute name
    consider using event.sequence_number and event.time instead of camel case

    Alasdair: need to clarify what otel says on dots vs. underscores in right column
        Alasdair suggests we should just use dots for hierarchy, but Don thinks otel has different guidance for that

    Jared: do we need both ibm_datetime and ibm_audit_eventTime?
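
The dots vs. underscores question was not settled in the review. Purely to make it concrete, the sketch below shows one possible mapping, under two assumptions that are not decisions from this thread: underscores in the existing field names mark hierarchy levels and become dots, and camel-case words inside a level become snake_case. The class and method names are hypothetical.

```java
// Hypothetical sketch of one possible naming rule; not an agreed convention.
final class AttributeNames {

    // Maps e.g. "ibm_audit_eventTime" to "ibm.audit.event_time".
    // Limitation: runs of capitals (acronyms) are split letter by letter.
    static String toDotted(String legacyName) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < legacyName.length(); i++) {
            char c = legacyName.charAt(i);
            if (c == '_') {
                // Assumption: a legacy underscore marks a hierarchy level.
                out.append('.');
            } else if (Character.isUpperCase(c)) {
                // A capital inside a level starts a new word within that level.
                if (i > 0 && legacyName.charAt(i - 1) != '_') {
                    out.append('_');
                }
                out.append(Character.toLowerCase(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDotted("ibm_audit_eventTime"));
        System.out.println(toDotted("ibm_datetime"));
    }
}
```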

p. 13
    need to have one format for timestamps
        could consider a local timezone for the event_time
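
As an illustration of "one format", the sketch below renders the same instant with a single ISO 8601 formatter from `java.time`, once in UTC and once with a local offset. The choice of ISO 8601 and the class and method names are assumptions made for this example, not a decision from the review.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch: one formatter for every timestamp field.
final class AuditTimestamps {

    // Assumption: ISO 8601 with an explicit offset is the single format.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ISO_OFFSET_DATE_TIME;

    // Renders the instant in UTC.
    static String formatUtc(Instant time) {
        return FORMAT.format(time.atOffset(ZoneOffset.UTC));
    }

    // Renders the same instant in a given zone; the offset keeps it unambiguous.
    static String formatInZone(Instant time, ZoneId zone) {
        return FORMAT.format(time.atZone(zone));
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(formatUtc(now));
        System.out.println(formatInZone(now, ZoneId.systemDefault()));
    }
}
```

Because the offset is part of the string, a local-time rendering can still be converted back to the same instant, so using a local timezone for the event time would not lose information.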

p. 19
    for day 2 ops, can you change list of sources dynamically? 
        yes

p. 24
    are early events (from before logging starts) buffered and then sent to audit?
        we don't have that concept in Liberty - logging starts before things that need to be audited
        
p. 28 
    don't say "for the upcoming beta"; say "in an upcoming beta"

we don't want audit-1.0/2.0 to enable mpTelemetry-2.0 (that would require an audit-3.0)
    Jared: feature design should mention how we're enabling this

p. 29
    need more detail on this slide for what we will test

p. 30
    ensuring events are captured should be a function test, not a system test
    need to be clear about system test cases we need
    may want a long run for this for SVT

@NottyCode
Member

@fmhwong the link to the UFO appears broken.

@pgunapal
Member

@NottyCode Updated the UFO link... not sure how it became broken.

@pgunapal
Member

ID doc issue: OpenLiberty/docs#7829

Projects
Status: Observability