App Analytics dashboards #131

Closed
115 of 116 tasks
anirudha opened this issue Oct 13, 2021 · 13 comments
Labels: enhancement, v1.3.0

Comments

@anirudha
Collaborator

anirudha commented Oct 13, 2021

Overview

Application Analytics allows OpenSearch Dashboards users to select logs, traces, and metrics to be part of an application that can be monitored for overall health and visualized on one page. It also allows users to quickly pivot between traces, metrics, and logs to dig into the source of any issues that arise.

Problem

Currently, users have no way of identifying which trace and log data belong to which application, because all trace data ends up in the same index while log data goes into separate, independent indices. An application, in this proposal, is made up of user-selected logs, metrics, and traces. The current solution forces users to sift through large volumes of data to find what they need. In addition, this data is spread across Trace Analytics, Event Analytics, and Dashboards, so there is no comprehensive, at-a-glance view of an application’s health.

Requirements

Home Page

  1. Users are able to view existing applications and some high-level information (Name, Composition, Current Availability, Availability Metrics)
  2. Users are able to create/rename/duplicate/delete applications

Create Page

Application Information
  1. Users are able to specify name of new application
  2. Users are able to optionally provide a description of their application
Composition
  1. Users are able to configure base query
  2. Users are able to select services and entities to include in the application
  3. Users are able to constrain application to specific trace groups
Availability
  1. Users are able to set custom levels of availability according to query conditions

Application Details

Overview Tab
  1. Users are able to view Latency by Trace Group
  2. Users are able to view default charts: Application Composition Map, Error Rate, Throughput
  3. Users are able to add new charts
Services Tab
  1. Users are able to view information about specific or all services in a table
Traces & Spans Tab
  1. Users are able to view general information about traces and spans in two tables
  2. Users are able to view more detailed information by selecting a trace or span
Log Events
  1. Users are able to query and analyze logs for their application
  2. Users are able to create visualizations that are saved as metrics
Panels Metrics
  1. Users are able to view metrics (visualizations) created from the log events tab
Configuration
  1. Users are able to edit their application composition and availability settings

UX/UI Design

Figma Link
Home Page
[Screenshot: Home Page]

Overview Tab
[Screenshot: Overview Tab]

Services Tab
[Screenshot: Services Tab]

Configuration Tab
[Screenshot: Configuration Tab]

Development Tasks

Home Page

Create Page

Application Information
Composition

Application Details

Overview Tab (Dashboard from Trace Analytics)
Services Tab (Services from Trace Analytics)
Traces & Spans Tab (Traces and Span Details Table from Trace Analytics)
Log Events (Log Explorer from Event Analytics)
Panels Metrics (Custom Panel View from Operational Panels)
Configuration
Backend
Known Bugs

Solution

Relationship to Other Modules

Dashboard, Trace, Span Detail, Service from Trace Analytics, Log Explorer from Event Analytics, and Panels from Operational Panels will be re-used in Application Analytics. We will need to refactor the components to become more modular. We also need to make sure only the selected services are filtered through when displaying information.

Architecture Diagram

Data Model

Application
  • name: User-defined name of application
  • description (optional): User-defined description of the application
  • baseQuery: PPL query to filter the application by
  • servicesEntities: List of services and entities included in application
  • traceGroups: List of trace groups included in application
  • availabilityLevels (temporarily not added): User-defined thresholds that signify custom levels of availability of application
    • label
    • levelPriority
    • color
    • condition

Model

{
    "properties": {
        "name": {
            "type": "keyword"
        },
        "description": {
            "type": "text"
        },
        "baseQuery": {
            "type": "text"
        },
        "servicesEntities": {
            "type": "nested"
        },
        "traceGroups": {
            "type": "nested"
        }
    }
}

Example

{
  "properties": {
    "name": "Application 1",
    "description": "This is my app",
    "baseQuery": "source = flight_logs",
    "servicesEntities": ["Payment"],
    "traceGroups" : ["Payment.auto"]
    }
}
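
To make the data model concrete, here is a minimal Python sketch of the application document. The field names come from the data model above; the `Application` class, its `validate` method, and both validation rules are assumptions added for illustration and are not part of the plugin.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Application:
    """Hypothetical in-memory form of the application document."""

    name: str
    baseQuery: str
    description: str = ""  # optional in the data model
    servicesEntities: List[str] = field(default_factory=list)
    traceGroups: List[str] = field(default_factory=list)

    def validate(self) -> None:
        # Assumed rules (not stated in the design): the name must be
        # non-empty and the base query must begin with a source command.
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if not self.baseQuery.strip().lower().startswith("source"):
            raise ValueError("baseQuery must start with a source command")


app = Application(
    name="Application 1",
    description="This is my app",
    baseQuery="source = flight_logs",
    servicesEntities=["Payment"],
    traceGroups=["Payment.auto"],
)
app.validate()

# Plain dict with the same keys as the example document above.
document = asdict(app)
```

Keeping the optional description as an empty-string default means a document without a description still serializes with every key present.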

Options for Structure of Availability Levels

Option 1 - List of objects with all of the fields
Pros: Simple

"availabilityLevels": [
    {
        "label": "Available",
        "color": "#50C878",
        "condition": "If otherwise",
        "levelPriority": 0
    },
    {
        "label": "Unavailable",
        "color": "#FF0000",
        "condition": "when errorRate() >= 2",
        "levelPriority": 1
    }
]

Option 2 - List of objects with key/value pairs (key: levelPriority; value: object containing label, color, and condition)
Pros: Easier to sort

"availabilityLevels": [
    {
        "0": {
            "content": {
                "label": "Available",
                "color": "#50C878",
                "condition": "If otherwise"
            }
        }
    },
    {
        "1": {
            "content": {
                "label": "Unavailable",
                "color": "#FF0000",
                "condition": "when errorRate() >= 2"
            }
        }
    }
]
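
The trade-off between the two options can be sketched by sorting the levels by priority in each shape. This is only an illustration: it assumes Option 2 is serialized as a list of single-key objects whose key is the levelPriority as a string, and both helper function names are hypothetical.

```python
from typing import Dict, List

# Option 1: flat list, priority stored inside each object.
option_1: List[dict] = [
    {"label": "Unavailable", "color": "#FF0000",
     "condition": "when errorRate() >= 2", "levelPriority": 1},
    {"label": "Available", "color": "#50C878",
     "condition": "If otherwise", "levelPriority": 0},
]

# Option 2 (assumed shape): each entry is keyed by its priority.
option_2: List[Dict[str, dict]] = [
    {"1": {"content": {"label": "Unavailable", "color": "#FF0000",
                       "condition": "when errorRate() >= 2"}}},
    {"0": {"content": {"label": "Available", "color": "#50C878",
                       "condition": "If otherwise"}}},
]


def labels_by_priority_option_1(levels: List[dict]) -> List[str]:
    ordered = sorted(levels, key=lambda level: level["levelPriority"])
    return [level["label"] for level in ordered]


def labels_by_priority_option_2(levels: List[Dict[str, dict]]) -> List[str]:
    # Flatten the single-key objects into (priority, content) pairs first.
    pairs = [(int(key), value["content"])
             for entry in levels
             for key, value in entry.items()]
    ordered = sorted(pairs, key=lambda pair: pair[0])
    return [content["label"] for _, content in ordered]


result_1 = labels_by_priority_option_1(option_1)
result_2 = labels_by_priority_option_2(option_2)
```

Under these assumptions both shapes sort with a single call; Option 2 needs an extra flattening step because the priority lives in the key rather than in the object.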

API Design

APP_ANALYTICS_API_PREFIX='/api/observability/application'

  • Fetch all Applications
GET ${APP_ANALYTICS_API_PREFIX}/

RESPONSE BODY
{
  "statusCode": 200,  
  "message": "Ready", 
  "applications": [
    {
      "id": "2FF3GW3H8",
      "name": "Application 1",
      "servicesEntities": ["Payment", "Users"],
      "traceGroups" : ["Payment.auto", "Users.admin"]
    },
    {
      "id": "2FHEP953H",
      "name": "Application 2",
      "servicesEntities": ["Purchase"],
      "traceGroups" : ["Purchase.source"]
    }
  ]
}  
  • Fetch Application by ID
GET ${APP_ANALYTICS_API_PREFIX}/{application_id} 
    
    RESPONSE BODY
    {
      "statusCode": 200,
      "message": "Application Fetched",
      "application": {
            "name": "Application 1",
            "description": "Description for the application",
            "base_query": "source = opensearch_sample_database_flights",
            "services_entities": [
                "Payment",
                "Users",
                "Purchase"
            ],
            "trace_groups": [
                "Payment.auto",
                "Users.admin",
                "Purchase.source"
            ]
        }
    }
  • Create Application
POST ${APP_ANALYTICS_API_PREFIX}/
    
    REQUEST BODY
    {
        "name": "Application 1",
        "description": "Description for the application",
        "base_query": "source = opensearch_sample_database_flights",
        "services_entities": [
            "Payment",
            "Users",
            "Purchase"
        ],
        "trace_groups": [
            "Payment.auto",
            "Users.admin",
            "Purchase.source"
        ]
    }
    
    RESPONSE BODY
    {
      "statusCode": 200, 
      "message": "Application Created", 
      "newAppId": "2FG6FWGY5" // New application ID
    }
  • Update Application - all fields are updatable
PATCH ${APP_ANALYTICS_API_PREFIX}/{application_id} 
    
    REQUEST BODY
    {
      "appId": "2FG6FWGY5", // application id to be updated
      "base_query": "source = opensearch_sample_database_flights" // fields to update
    } 
    
    RESPONSE BODY
    {
      "statusCode": 200, 
      "message": "Application Updated"
    }
  • Duplicate Application
POST ${APP_ANALYTICS_API_PREFIX}/clone 
    
    REQUEST BODY
    {
      "appId": "2FG6FWGY5", // application id to be duplicated
      "name":"Application 1 (copy)" // name for the duplicated application
    } 
    
    RESPONSE BODY
    {
      "statusCode": 200, 
      "message": "Application Cloned",  
      "newAppId": "2FFAG7HAQ" // newly duplicated application id  
    }
  • Delete Application
DELETE ${APP_ANALYTICS_API_PREFIX}/{application_ids}
    
    RESPONSE BODY
    {
      "statusCode": 204, 
      "message": "Application Deleted"
    }
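
The request and response shapes above can be exercised with a small in-memory stand-in for the backend. This is a sketch only: the `ApplicationStore` class and its ID scheme are hypothetical, no HTTP layer is involved, and error handling for unknown IDs is omitted.

```python
import itertools
from copy import deepcopy
from typing import Dict, List


class ApplicationStore:
    """In-memory stand-in; response shapes follow the API design above."""

    def __init__(self) -> None:
        self._apps: Dict[str, dict] = {}
        self._counter = itertools.count(1)

    def _new_id(self) -> str:
        # Hypothetical ID scheme; real IDs such as "2FG6FWGY5" would be
        # generated by the backend.
        return f"app-{next(self._counter)}"

    def create(self, body: dict) -> dict:
        app_id = self._new_id()
        self._apps[app_id] = deepcopy(body)
        return {"statusCode": 200, "message": "Application Created",
                "newAppId": app_id}

    def fetch(self, app_id: str) -> dict:
        return {"statusCode": 200, "message": "Application Fetched",
                "application": deepcopy(self._apps[app_id])}

    def update(self, body: dict) -> dict:
        # Every key except appId is treated as a field to update.
        fields = {k: v for k, v in body.items() if k != "appId"}
        self._apps[body["appId"]].update(fields)
        return {"statusCode": 200, "message": "Application Updated"}

    def clone(self, body: dict) -> dict:
        duplicate = deepcopy(self._apps[body["appId"]])
        duplicate["name"] = body["name"]
        new_id = self._new_id()
        self._apps[new_id] = duplicate
        return {"statusCode": 200, "message": "Application Cloned",
                "newAppId": new_id}

    def delete(self, app_ids: List[str]) -> dict:
        for app_id in app_ids:
            del self._apps[app_id]
        return {"statusCode": 204, "message": "Application Deleted"}


store = ApplicationStore()
created = store.create({
    "name": "Application 1",
    "description": "Description for the application",
    "base_query": "source = opensearch_sample_database_flights",
    "services_entities": ["Payment", "Users", "Purchase"],
    "trace_groups": ["Payment.auto", "Users.admin", "Purchase.source"],
})
app_id = created["newAppId"]
store.update({"appId": app_id, "base_query": "source = flight_logs"})
cloned = store.clone({"appId": app_id, "name": "Application 1 (copy)"})
fetched_copy = store.fetch(cloned["newAppId"])["application"]
deleted = store.delete([app_id, cloned["newAppId"]])
```

The clone is taken after the update, so the copy carries the updated base query along with its new name.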

Autocomplete Logic

Possible PPL commands: source, dedup, eval, fields, head, rare, rename, sort, stats, top, where

For Base Query on Create Page
  • Not allowed commands: stats, head, rare, top
  • ex: source = opensearch_dashboard_sample_flights | where DestCityName = 'Venice'
For Availability Level Condition on Create Page (conditions cannot use a full PPL query)
  • Allowed commands: where, eval, rename
  • Condition must evaluate to a boolean
For Filter on Log Events Tab
  • Not allowed commands: source
  • ex: where OriginCityName = 'Rome' | stats avg(AvgTicketPrice)
For Filter on Traces & Spans Tab (pseudo; uses a filter instead of a PPL query)
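
The per-context command rules above could be checked along these lines. This is a rough sketch, not the plugin's autocomplete code: the context names and both functions are hypothetical, and the naive split on "|" does not handle pipes inside quoted strings.

```python
import re
from typing import Dict, List, Set

PPL_COMMANDS: Set[str] = {
    "source", "dedup", "eval", "fields", "head", "rare",
    "rename", "sort", "stats", "top", "where",
}

# Allowed commands per input field, derived from the lists above.
ALLOWED: Dict[str, Set[str]] = {
    "base_query": PPL_COMMANDS - {"stats", "head", "rare", "top"},
    "availability_condition": {"where", "eval", "rename"},
    "log_events_filter": PPL_COMMANDS - {"source"},
}


def commands_in(query: str) -> List[str]:
    """Return the leading command word of each pipe-separated segment."""
    commands = []
    for segment in query.split("|"):
        match = re.match(r"\s*([A-Za-z]+)", segment)
        if match:
            commands.append(match.group(1).lower())
    return commands


def disallowed_commands(query: str, context: str) -> List[str]:
    """List the commands in the query that the given context rejects."""
    return [cmd for cmd in commands_in(query) if cmd not in ALLOWED[context]]


base_ok = disallowed_commands(
    "source = opensearch_dashboard_sample_flights | where DestCityName = 'Venice'",
    "base_query",
)
base_bad = disallowed_commands("source = flights | stats count()", "base_query")
filter_ok = disallowed_commands(
    "where OriginCityName = 'Rome' | stats avg(AvgTicketPrice)",
    "log_events_filter",
)
```

An autocomplete implementation would more likely filter suggestions as the user types; validating the finished query, as here, is just the simplest way to show the same rules.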
anirudha added the enhancement label Oct 13, 2021
@ryn9

ryn9 commented Nov 16, 2021

@anirudha
Are these solution patterns being developed with multi-tenancy and various RBAC in mind?
Are there issues and/or other docs that could be linked back to on these solution patterns?

@anirudha
Collaborator Author

Yes, the observability plugin supports RBAC and multi-tenancy using the security plugin.

@ryn9

ryn9 commented Nov 24, 2021

@anirudha additionally, will the plugin support customizable index patterns, etc.?
The trace analytics plug-in required use of otel-v1-apm-span-* and otel-v1-apm-service-map*, which will not work in all environments, especially when multi-tenancy comes into play.

@eugenesk24
Contributor

@ryn9 Hi Ryan, I talked to Ani about your question. We are planning on supporting regex-based index matching in PPL soon; I think he said next release. As to your second question, I'm a little confused how multi-tenancy will affect multiple index matching? I think the security plugin should handle any access restrictions for indices. Please feel free to elaborate.

@ryn9

ryn9 commented Dec 1, 2021

@eugenesk24

As to your second question, I'm a little confused how multi-tenancy will affect multiple index matching?

I was trying to say that requiring specific index patterns for plugin use can have an impact on what is probably the most common way to handle multi-tenancy + RBAC: role-based access to specific indices. So I am hoping that static index names are not being considered like they were with the initial trace analytics plugin.

To note: plugins haven't always been designed with multi-tenancy + RBAC in mind, so I am hoping to find out more details about how the observability plugin is being designed with this in mind.

@anirudha
Collaborator Author

anirudha commented Dec 6, 2021

Hi @ryn9 ,
Could you please elaborate in some detail on the use case you are looking for? For now, here are the details.

  • The Observability plugin supports the same tenancy structures as the rest of Dashboards, e.g. link
  • RBAC (role-based access control) will follow the rules configured by your cluster admin on the OpenSearch setup.
  • Currently we don't support index patterns for wildcard indices; this will be supported in the next few releases. The wildcards will match any permissioned index available to the user for their roles.

The trace analytics plug-in required use of otel-v1-apm-span-* and otel-v1-apm-service-map*, which will not work in all environments, especially when multi-tenancy comes into play.

Could you explain this with an example? From what I know, access to these is based only on your roles and not tenancy.

@ryn9

ryn9 commented Dec 7, 2021

Could you explain this with an example? From what I know, access to these is based only on your roles and not tenancy.

Sorry to confuse - 'tenancy' was probably the wrong word here.

The trace analytics plug-in required use of otel-v1-apm-span-* and otel-v1-apm-service-map*, which will not work in all environments. For example, we support different groups, who (via role access) should only have access to certain data. That is, we wouldn't want to let userA and userB both have access to these index patterns, as we don't want to allow them access to each other's data.

@WassimDhib

Hi

Same problem here: https://discuss.opendistrocommunity.dev/t/trace-analysis-and-multi-tenant/4602/2
The Trace Analytics plugin requires the indices:data/read/search permission on all otel-v1-apm-span-* indices,
so either your user has permissions for all of those indices, or nothing works.

I think that supporting index patterns for wildcard indices is the solution.

@ryn9

ryn9 commented Dec 20, 2021

@anirudha

Currently we don't support index patterns for wildcard indices; this will be supported in the next few releases. The wildcards will match any permissioned index available to the user for their roles.

Does that mean the plugins would also be moving away from requiring specific index names?
Perhaps create different 'views' that, upon creation, could use different index patterns?

@ryn9

ryn9 commented Jan 20, 2022

@anirudha @eugenesk24 - wondering if you folks have confirmed index pattern support for PPL for 1.3 yet.
It doesn't answer all of the questions above, but I believe is an important step for PPL adoption.

@eugenesk24
Contributor

Sorry for the late reply @ryn9! We have added support for multiple indices through comma-separated index names and wildcard matching.

@ryn9

ryn9 commented Feb 17, 2022

@eugenesk24 .. awesome. I am excited to see it :)

anirudha added the v1.3.0 label Mar 9, 2022
@eugenesk24
Contributor

Merged to main for 1.3 release

joshuali925 pushed a commit to joshuali925/observability that referenced this issue Jul 20, 2022
…ubmodules/3rdparty/libbpf-140b902

build(deps): bump 3rdparty/libbpf from `5b9d079` to `140b902`