[FEATURE] Visualization Catalog #33

Closed
Swiddis opened this issue Jun 30, 2023 · 3 comments · Fixed by #34
Labels
enhancement New feature or request untriaged

Comments

@Swiddis
Collaborator

Swiddis commented Jun 30, 2023

Is your feature request related to a problem?

As we develop more integrations using standard SS4O, we will likely see a lot of overlap in visualization functionality, particularly for visualizations that apply in multiple contexts. Most integrations involving an HTTP component could get value out of a "Status codes over time" visualization. An example of this already exists in the Nginx integration and is being duplicated for the upcoming Apache integration. If visualizations are not shared in one place, effort is wasted as the same functionality is reinvented.

What solution would you like?

The solution to this problem is to collect visualizations in one central location, which I'll call the "Visualization Catalog". Integration developers can then directly find tools that are useful for their own data, and have a better way to share their tools with others. The main challenge is organizing the catalog and making it discoverable by developers. The catalog itself can just live as a directory in this repository. I don't think it needs to be implemented as a full API, as long as we provide an index and instructions for searching it.

For organizing these visualizations, the most important thing to know is which fields they need access to in order to run. In #31 (implemented via #32), we introduce a tool that looks at actual sample data records to determine which fields are required. With some extra work, this tool might be extended to visualizations, to determine which components a visualization requires, and to connect the data and the visualizations through one common schema in the middle. Adding some level of search functionality to the tool would help developers find visualizations that match their data.
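As a rough sketch of the matching idea described above: flatten a sample record into dotted field paths, then keep only the catalog entries whose required fields are all present. The function names, field names, and catalog entries here are hypothetical and are not taken from the tool in #31.

```python
from typing import Any, Iterator


def field_paths(record: dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield a dotted path for every leaf field in a nested sample record."""
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from field_paths(value, path)
        else:
            yield path


def usable_visualizations(catalog: dict[str, set[str]], record: dict[str, Any]) -> list[str]:
    """Return catalog entries whose required fields all appear in the record."""
    available = set(field_paths(record))
    return sorted(name for name, required in catalog.items() if required <= available)


# Hypothetical sample record and catalog entries, for illustration only.
sample = {
    "@timestamp": "2023-06-30T00:00:00Z",
    "http": {"response": {"status_code": 200}},
}
catalog = {
    "status_codes_over_time": {"@timestamp", "http.response.status_code"},
    "errors_by_container": {"http.response.status_code", "container.name"},
}
print(usable_visualizations(catalog, sample))  # ['status_codes_over_time']
```

A plain subset test keeps the check cheap enough to run against every catalog entry, though it ignores field types, which a real tool would probably also need to compare.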

A basic implementation plan:

  1. Organize existing SS4O visualizations in a new folder for the visualization catalog. Optionally, structure them by component, to provide a minimal level of immediate searchability.
  2. Introduce documentation on developing visualizations, including what the major types are and how the different fields work.
  3. Extend the CLI tool from #31 ([FEATURE] Utilities CLI for developing integrations) to dissect a visualization and determine what types of data it can work with. This can be used to check individual visualizations.
  4. Enhance the CLI with search functionality that lets users quickly find visualizations in the standard schema.
  5. Enhance the CLI with the ability to automatically compile a list of visualizations into a dashboard. This can be somewhat complex, but the basic principle is to generate the Saved Object file that can be imported into OpenSearch Dashboards.
  6. As a final step, we may also consider adding this to the actual catalog API with more complex functionality, but I'm not sure if this would be productive.

These steps are not intended to all be done at once. I think the first 3 steps will be enough to get started. If we find more demand and the project becomes unwieldy, steps 4-6 are natural extensions.
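Step 3 could start from something like the sketch below, which lists the fields a visualization's aggregations refer to. As the open questions in this thread note, the structure is not formally specified, so the layout here is an assumption: `attributes.visState` is taken to be a JSON string containing an `aggs` list whose entries may carry `params.field`. The function name and the sample object are hypothetical.

```python
import json


def fields_used(saved_object: dict) -> set[str]:
    """Collect field names referenced by a visualization's aggregations.

    Assumes attributes.visState is a JSON string holding an "aggs" list
    whose entries may carry params.field. Anything else is ignored.
    """
    vis_state = json.loads(saved_object.get("attributes", {}).get("visState", "{}"))
    fields = set()
    for agg in vis_state.get("aggs", []):
        field = agg.get("params", {}).get("field")
        if field:
            fields.add(field)
    return fields


# Hypothetical object, shaped like one line of an exported .ndjson file.
line = json.dumps({
    "type": "visualization",
    "id": "example-status-codes",
    "attributes": {
        "title": "Status codes over time",
        "visState": json.dumps({
            "type": "histogram",
            "aggs": [
                {"id": "1", "type": "count", "schema": "metric", "params": {}},
                {"id": "2", "type": "date_histogram", "schema": "segment",
                 "params": {"field": "@timestamp"}},
                {"id": "3", "type": "terms", "schema": "group",
                 "params": {"field": "http.response.status_code"}},
            ],
        }),
    },
})
print(sorted(fields_used(json.loads(line))))
```

Fields referenced in other places, such as saved queries or filters, would not be found by this sketch.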

What alternatives have you considered?

  • We can continue to keep things unorganized, which is easy but leads to duplicated effort.
  • We can introduce minimal organization without tooling, which is hard for developers to use.

Do you have any additional context?

@Swiddis Swiddis added enhancement New feature or request untriaged labels Jun 30, 2023
@Swiddis
Collaborator Author

Swiddis commented Jun 30, 2023

Some open questions:

  • What is the actual structure of visualizations? Is this documented? I'd like to not do everything by reverse-engineering code in the OpenSearch Dashboards project, but I can't find any proper specification.
  • Versioning and Updates: How will updates and versioning of visualizations be handled within the Visualization Catalog? Will there be a process in place to ensure that outdated or deprecated visualizations are removed or marked as such?
  • Integration Testing: How will the visualizations in the catalog be tested and validated to ensure their accuracy and compatibility with different data sources and scenarios?

@YANG-DB
Member

YANG-DB commented Jun 30, 2023

@Swiddis your proposal makes perfect sense, and it correlates with the idea behind composing the fields into dedicated functions:

Each file may have a corresponding visual representation in the form of a dashboard, visualization, or view.
These correspondences can be assembled by an integration builder into a meaningful composition that represents the component the integration is observing.

Currently, each mapping field set (component mapping) has a dictionary file that describes the fields and their relationships, if any exist:

From this perspective, the visualization folder would be organized in the same (or a similar) way as the schema folder, to reflect this correspondence.

After this folder is created and each component is associated with its visual elements, it will be possible to construct the integration in the following steps:

  1. Identify the subject that the integration resource relates to (cloud / http / k8s / O/s java / windows ...)
  2. Find / search all the relevant categories among the existing field set categories or labels.
  3. Compose the most appropriate mapping components from the previous step.
  4. Assemble the visual elements that correspond to the components chosen in the previous step.
  5. Review - Validate - Test
  6. Publish
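Steps 2 to 4 above might look roughly like this once the catalog mirrors the schema folder. This is only a sketch: the in-memory index, the visualization titles, and the `assemble` helper are hypothetical stand-ins for whatever the catalog folder and tooling end up being.

```python
# Hypothetical catalog index: component name -> visualization titles.
CATALOG = {
    "http": ["Status codes over time", "Requests by method"],
    "cloud": ["Requests by region"],
    "container": ["Restarts per container"],
}


def assemble(components: list[str], catalog: dict[str, list[str]]) -> tuple[list[str], list[str]]:
    """Return (visualizations, missing) for the selected mapping components.

    `missing` lists the selected components that have no catalog entry yet.
    """
    visuals: list[str] = []
    missing: list[str] = []
    for component in components:
        if component in catalog:
            visuals.extend(catalog[component])
        else:
            missing.append(component)
    return visuals, missing


visuals, missing = assemble(["http", "cloud", "k8s"], CATALOG)
print(visuals)  # ['Status codes over time', 'Requests by method', 'Requests by region']
print(missing)  # ['k8s']
```

Reporting the missing components separately lets the integration builder see early which parts still need schema or visualization work before the review step.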

This will remove many obstacles for non-UX experts and allow fast and efficient creation of integrations for any type of resource.

If the schema is incomplete, the process is much harder, since the integration needs to define and create a new schema component, which must be validated and consolidated with the following:

  • OTEL protocol
  • OpenSearch Schema support
  • ECS schema

@Swiddis
Collaborator Author

Swiddis commented Jun 30, 2023

I agree with applying this when a visualization is restricted to one component, but I'm not sure it's so simple when a visualization references multiple components. A visualization correlating http and container may try to show something like which containers return the most internal server errors, so I don't think visualizations divide as cleanly as the components themselves do.
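One possible way to handle this, sketched under the assumption that a field's component can be read off the first segment of its dotted name, is to record the full set of components a visualization touches and treat it as applicable only when the integration provides all of them. The helper names and field names are hypothetical.

```python
def components_of(required_fields: set[str]) -> frozenset[str]:
    """Infer components from the first dotted segment of each field name.

    Fields without a dot (such as "@timestamp") are treated as shared
    base fields and do not count toward any component.
    """
    return frozenset(f.split(".", 1)[0] for f in required_fields if "." in f)


def applicable(required_fields: set[str], integration_components: set[str]) -> bool:
    """A visualization applies only if every component it touches is present."""
    return components_of(required_fields) <= integration_components


# Hypothetical fields for a visualization correlating http and container.
errors_by_container = {"@timestamp", "http.response.status_code", "container.name"}

print(sorted(components_of(errors_by_container)))  # ['container', 'http']
print(applicable(errors_by_container, {"http"}))  # False
print(applicable(errors_by_container, {"http", "container"}))  # True
```

Under this approach a multi-component visualization would not live in any single component folder; it would be indexed by its component set instead.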

2 participants