Integrate Vega Vis into VisBuilder Proposal #7067

Closed
3 tasks
Tracked by #6796
ananzh opened this issue Jun 19, 2024 · 10 comments
Assignees
Labels
proposal v2.16.0 visbuilder-next visbuilder with vega integration

Comments

@ananzh
Member

ananzh commented Jun 19, 2024

Background

The primary problem we are addressing is the need for more advanced and customizable data visualization capabilities in OpenSearch Dashboards. While VisBuilder reached General Availability (GA) in version 2.15, it is currently limited to a few chart types and lacks the comprehensive set of controls necessary for complex visualizations. Enhancing VisBuilder to incorporate more complex controls will provide users with powerful tools for data analysis and reporting, thereby improving the overall user experience and functionality of OpenSearch Dashboards. Additionally, from a technical perspective, we aim to streamline the visualization process by consolidating the multiple existing libraries (such as timeline, vislib, and vega) into a single, cohesive library. This unification will simplify the development and maintenance of visualizations, ensuring consistency and ease of use for developers and users alike.

Requirements and Considerations

Requirements

Technical Requirements:

  • Integrate Vega and Vega-Lite visualizations into OpenSearch Dashboards, supporting a wide range of aggregations, like metric and bucket aggregations as defined in METRIC_TYPES and BUCKET_TYPES in AggConfigs for OpenSearch data.
  • Allow users to create and customize visualizations using the Vega editor.
  • Ensure seamless migration from existing visualization tools to Vega in VisBuilder.

Non-Technical Requirements:

  • Enhance user experience by providing a more flexible and powerful visualization tool through adding more controls.
  • Ensure that the new tool is intuitive and easy to use for users transitioning from older tools.

Considerations and Optimizations

Optimizations:

  • Flexibility and Customization: Optimize for the ability to create highly customized visualizations.
  • User Experience: Ensure that the tool is easy to use and integrates smoothly with existing workflows.
  • Extensibility: Design the solution to be easily extensible for future enhancements and new features.
  • Reuse Existing Components: Leverage existing components like expression, embeddable, and VegaVisEditor to lower integration risks and ensure compatibility with other parts of the system.
  • Performance: Maintain the performance of the current VisBuilder when using Vega visualizations.

Non-Prioritized Aspects:

  • Latency: While performance is important, we do not prioritize ultra-low latency over customization and flexibility.
  • Redundancy: Focus is on functionality and user experience rather than on high redundancy.

Out of Scope

  • Backend Data Processing Enhancements: This design does not cover improvements or changes to the backend data processing capabilities. The focus is strictly on the visualization layer.
  • New Data Sources Integration: Integrating new data sources is currently outside the scope; the design assumes existing data sources are sufficient.
  • Non-Vega Visualizations: Enhancements or changes to non-Vega visualization tools are not covered, as the focus is on integrating Vega.
  • A new vega type vis directly in VisBuilder: This is implementable, but the benefit of integrating the entire vega vis into VisBuilder is not clear.

Current Workflow

VisLib in VisBuilder Workflow

Vega Vis Workflow

  • Spec Parsing: The Vega spec JSON file is parsed and validated. This ensures the spec adheres to the Vega or Vega-Lite schema.
  • Data Retrieval: If the spec includes OpenSearch queries (usually in the data.url section), these queries are executed against the OpenSearch cluster. The results (raw response) are fetched and prepared for use in the visualization.
  • Context Integration: OpenSearch Dashboards-specific context (like index pattern, time range filters, dashboard filters) is applied to the spec. Special placeholders like %timefield%, %context%, etc., are replaced with actual values.
  • Data Transformation: Any data transformations specified in the Vega spec are applied to the retrieved data. This might include operations like filtering, aggregating, or calculating new fields.
  • Vega Runtime Compilation: The parsed spec is compiled into a runtime representation that Vega can execute.
    This compilation process resolves data sources, scales, and other components defined in the spec.
  • Rendering: The Vega runtime executes the compiled spec, which generates the actual Canvas elements.
  • Integration with OpenSearch Dashboards: The rendered visualization is integrated into the OpenSearch Dashboards interface. This includes handling interactions like zooming, panning, and tooltips.
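As a rough illustration of the data retrieval and context integration steps above, the sketch below shows the general shape of a Vega-Lite spec whose query lives under data.url and uses the %context% and %timefield% placeholders. The index name, field names, aggregation name, and interval are assumptions chosen for illustration; they are not taken from this proposal.

```typescript
// Illustrative spec only: index, fields, agg name and interval are assumptions.
const exampleSpec = {
  $schema: 'https://vega.github.io/schema/vega-lite/v5.json',
  data: {
    url: {
      // Ask the Vega plugin to apply the dashboard query, filters and time range.
      '%context%': true,
      // Field that the time range filter is applied to.
      '%timefield%': '@timestamp',
      index: 'my-logs-*',
      body: {
        aggs: {
          time_buckets: {
            date_histogram: { field: '@timestamp', fixed_interval: '12h' },
          },
        },
        size: 0,
      },
    },
    // Tell Vega where the rows live inside the raw response.
    format: { property: 'aggregations.time_buckets.buckets' },
  },
  mark: 'line',
  encoding: {
    x: { field: 'key', type: 'temporal' },
    y: { field: 'doc_count', type: 'quantitative' },
  },
};

console.log(JSON.stringify(exampleSpec.data.url, null, 2));
```

Each bucket of the date histogram becomes one datum, so the encoding can read key and doc_count directly without any transform.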

Proposed Design

Key Deliveries for 2.16

Note: This is not a complete version. It is for demo purposes only.

2024-06-18_16-13-08.mp4

1. Vega Integration in VisBuilder

  • Extend the existing visualization slice to include Vega-specific state and actions.
  • Implement a set of reusable utility functions that generate Vega specifications:
    • buildDataUrl: Constructs the data URL for OpenSearch queries
    • parseAggStructure: Parses the aggregation structure for easier transformation
    • generateTransform: Creates Vega transform steps based on the aggregation structure
    • buildEncoding: Generates encoding specifications for visual properties
    • buildVegaSpec: Assembles the complete Vega specification
  • Support complex bucket aggregations through dynamic transformation of OpenSearch aggregations to Vega-compatible format.
  • Implement actions to update Vega state based on user interactions (e.g., setVegaTooltip, setVegaAggs, setVegaTransforms, setVegaEncoding).
  • Modify the toExpression method to use Vega rendering when enabled.
  • Ensure seamless integration with existing OpenSearch Dashboards components and workflows.

2. Advanced setting to allow user to use vega to create visualizations in VisBuilder

This includes modifications in VisBuilder so that each chart type can use either the visualization expression or the vega expression. The main purpose is to avoid breaking the existing user experience. New controls will only be added to the vega vis.

Screenshot 2024-06-15 at 4 42 13 PM

3. Easy migration from VisLib visualizations created by VisBuilder to vega vis. Allow embedding both kinds of visualization in Dashboard.

Allow saving as either a vislib vis or a vega vis: the only difference in the URL is the useVegaRendering value in the style state, which decides whether the visualization expression or the vega expression is used. When useVegaRendering is true, VisBuilder renders with vega and the toggle is turned on.

/vis-builder/edit/471fa110-2ba8-11ef-b457-4707dd1c36d9#?
_q=(filters:!(),query:(language:kuery,query:''))&
_a=(metadata:(editor:(errors:(),state:loading)),
style:(addLegend:!t,addTooltip:!t,legendPosition:right,type:area,useVegaRendering:!f), // different part
ui:(),visualization:(activeVisualization:(aggConfigParams:!(),name:area),
indexPattern:ff959d40-b880-11e8-a6d9-e546fe2bba5f,searchField:''))&
_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15d,to:now))

The same applies when embedding in Dashboard: when the visualization is saved with useVegaRendering set to true, the vega vis is embedded in Dashboard.
Screenshot 2024-06-17 at 10 18 13 AM
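The role of the useVegaRendering flag described above can be sketched as a small selector. The StyleState shape and the pickRenderer name are hypothetical, and the fallback to vislib when the flag is missing is an assumption (for example, for objects saved before the flag existed), not something this proposal specifies.

```typescript
// Hypothetical, trimmed-down style state; only the flag matters here.
interface StyleState {
  type: string;
  useVegaRendering?: boolean;
}

type Renderer = 'vega' | 'vislib';

// Choose the expression path from the saved style state.
// Assumption: a missing flag falls back to the existing vislib rendering.
function pickRenderer(style: StyleState): Renderer {
  return style.useVegaRendering === true ? 'vega' : 'vislib';
}

console.log(pickRenderer({ type: 'area', useVegaRendering: true })); // "vega"
console.log(pickRenderer({ type: 'area', useVegaRendering: false })); // "vislib"
console.log(pickRenderer({ type: 'area' })); // "vislib"
```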

4. More controls for line chart (optional). Use the line chart as an example and bring all the controls from the line visualization into the VisBuilder vega line chart. Optionally, add 1-2 new controls.

Implementation Details regarding the VegaSpecBuilder Class

Method 1: Passing the Whole Aggregation (Aggs) as Input

Key Differences from Static Vega Spec Input:

  • Dynamic State Management: Utilize the visualization slice in Redux to manage Vega-specific state, enabling real-time updates based on user interactions.
  • Context Integration: Incorporate OpenSearch Dashboards context (e.g., index patterns, time ranges) directly into the Vega specification through the visualization state.
  • Data Retrieval: Leverage the modified opensearchaggs function to obtain aggregations directly, ensuring consistency with OpenSearch Dashboards' data model.
  • Data Transformation: Implement custom utility functions to dynamically generate Vega transforms based on the aggregation structure, providing more flexibility than static transforms.
  • Spec Generation: Dynamically construct the entire Vega specification using utility functions, allowing for on-the-fly adjustments based on user inputs and data changes.
  • Integration with Existing Tools: Seamlessly switch between traditional VisBuilder rendering and Vega rendering, maintaining compatibility with existing visualizations.

1. Extend the Visualization Slice for context integration

We'll extend the existing visualization slice to include Vega-specific state and actions:

import { createSlice, PayloadAction } from '@reduxjs/toolkit';
import { CreateAggConfigParams } from '../../../../../data/common';
import { VisBuilderServices } from '../../../types';
import { setActiveVisualization } from './shared_actions';

export interface VegaState {
  dataUrl: any;
  transforms: any[];
  encoding: any;
  aggs: any;
  indexPattern: string | null;
  metrics: any[];
  buckets: any[];
  tooltip: any;
  timeField: string;
  split: any[];
  group: any[];
  segment: any[];
  type: string;
  useVegaRendering: boolean;
}

export interface VisualizationState {
  indexPattern?: string;
  searchField: string;
  activeVisualization?: {
    name: string;
    aggConfigParams: CreateAggConfigParams[];
    draftAgg?: CreateAggConfigParams;
  };
  vega: VegaState;
}

// ... (keep existing initial state and preloaded state logic)

export const slice = createSlice({
  name: 'visualization',
  initialState,
  reducers: {
    // ... (keep existing reducers)

    // Add Vega-specific reducers
    setVegaTooltip: (state, action: PayloadAction<any>) => {
       state.vega.tooltip = action.payload;
    },
    setVegaAggs: (state, action: PayloadAction<any>) => {
       state.vega.aggs = action.payload;
    },
    setVegaTransforms: (state, action: PayloadAction<any[]>) => {
       state.vega.transforms = action.payload;
    },
    setVegaEncoding: (state, action: PayloadAction<any>) => {
       state.vega.encoding = action.payload;
    },
    // Add more Vega-specific reducers as needed
  },
  // ... (keep existing extra reducers)
});

// Export actions
export const {
  // ... (keep existing action exports)
  setVegaTooltip,
  setVegaAggs,
  setVegaTransforms,
  setVegaEncoding,
} = slice.actions;

2. Data Retrieval with proper aggregations: Utilize opensearchaggs to retrieve aggs directly

Update the opensearchaggs function to return the constructed aggregations:

export const modifiedOpensearchaggs = () => ({
  // ... (keep existing properties)

  async fn(input, args, { inspectorAdapters, abortSignal }) {
    // ... (keep existing logic)

    // Return the constructed aggs using toDsl
    const constructedAggs = aggs.toDsl(args.metricsAtAllLevels);

    return constructedAggs;
  }
});

3. Data Transformation

In Vega, data transformation is expressed through the spec's transform array. Its purpose is similar to that of tabifyAggResponse, which aims to flatten nested structures for visualization. The main difference in approach is that tabifyAggResponse creates a complete tabular representation of the data up front, while the Vega transform describes a series of steps that are applied on the fly during rendering. This can make the Vega approach more memory-efficient and potentially faster for large datasets, because the flattened dataset does not have to be materialized ahead of time. A further comparison:

  • Output format: tabifyAggResponse produces a tabular format with rows and columns, while the Vega transform creates a series of steps to transform the data within Vega.
  • Naming conventions: tabifyAggResponse uses numeric IDs (e.g., 2-1) for column names, while the Vega transform uses more descriptive names based on the aggregation structure.
  • Handling of buckets: tabifyAggResponse creates separate rows for each bucket combination, while the Vega transform uses flatten operations to handle nested buckets.
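To make the tabular side of this comparison concrete, here is a minimal sketch of the idea behind flattening nested buckets into rows. It is a simplified stand-in written for illustration, not the real tabifyAggResponse: the flattenBuckets name is hypothetical, it assumes at most one bucket aggregation per nesting level, and it uses the aggregation IDs as column names.

```typescript
type Row = Record<string, string | number>;

// Simplified stand-in for the idea behind tabifyAggResponse (not the real code):
// every combination of nested bucket keys becomes one flat row.
function flattenBuckets(aggs: Record<string, any>, parent: Row = {}): Row[] {
  const rows: Row[] = [];
  for (const [aggId, agg] of Object.entries(aggs)) {
    if (!agg || !Array.isArray(agg.buckets)) continue;
    for (const bucket of agg.buckets) {
      const row: Row = { ...parent, [aggId]: bucket.key };
      const nested: Record<string, any> = {};
      for (const [key, value] of Object.entries<any>(bucket)) {
        if (value && typeof value === 'object') {
          if ('value' in value) {
            row[key] = value.value; // metric aggregation
          } else if (Array.isArray(value.buckets)) {
            nested[key] = value; // nested bucket aggregation
          }
        }
      }
      if (Object.keys(nested).length > 0) {
        rows.push(...flattenBuckets(nested, row));
      } else {
        rows.push(row);
      }
    }
  }
  return rows;
}

// Sample response fragment: date histogram "2" > terms "3" > avg "1".
const sampleAggregations = {
  '2': {
    buckets: [
      {
        key: 1719774000000,
        doc_count: 23,
        '3': { buckets: [{ key: 'CN', doc_count: 3, '1': { value: 5069.333333333333 } }] },
      },
    ],
  },
};

const rows = flattenBuckets(sampleAggregations);
console.log(rows);
```

With this input the result is a single row holding the timestamp under column 2, CN under column 3, and the average under column 1. A Vega transform expresses the same flattening as declarative steps inside the spec instead.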

Here we will add two utility functions:

  • parseAggStructure: This function recursively parses the aggregation structure to create a simplified representation.
  • generateTransform: This function generates the Vega transform steps based on the parsed aggregation structure.

function parseAggStructure(aggs: any, path: string[] = []): any {
  const result: any = {};
  
  for (const [key, value] of Object.entries(aggs)) {
    if (key === 'buckets' && Array.isArray(value)) {
      result.buckets = value[0]; // Take the first bucket as a sample
    } else if (typeof value === 'object' && value !== null) {
      result[key] = parseAggStructure(value, [...path, key]);
    } else {
      result[key] = value;
    }
  }
  
  return result;
}

function generateTransform(aggStructure: any, aggs: any): any[] {
  const transform: any[] = [];
  const flattenStack: string[] = [];

  function getFieldName(aggId: string): string {
    return aggs[aggId]?.terms?.field || aggs[aggId]?.date_histogram?.field || aggId;
  }

  function traverse(obj: any, path: string[] = [], depth: number = 0) {
    for (const [key, value] of Object.entries(obj)) {
      if (key === 'buckets') {
        const parentKey = path[path.length - 1];
        const fieldName = getFieldName(parentKey);
        
        if (parentKey !== '2') { // Skip the top-level bucket agg
          transform.push({ calculate: `datum['${parentKey}']['buckets']`, as: fieldName });
          transform.push({ flatten: [fieldName] });
          flattenStack.push(fieldName);
        
          // Add key calculation
          transform.push({ calculate: `datum['${fieldName}'].key`, as: fieldName });
        } else {
          // For bucket agg, use 'key' directly
          transform.push({ calculate: "datum['key']", as: fieldName });
        }
        
        traverse(value, [...path, key], depth + 1);
        
        if (parentKey !== '2') {
          flattenStack.pop();
        }
      } else if (typeof value === 'object' && value !== null) {
        traverse(value, [...path, key], depth + 1);
      } else if (key === 'value' && depth === Object.keys(aggStructure).length - 1) {
        // Only add metric calculation for the deepest level
        const metricKey = path[path.length - 2];
        const fieldName = aggs[metricKey]?.avg?.field || `${metricKey}_value`;
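        const parent = flattenStack[flattenStack.length - 1];
        // After flattening, the bucket object lives under datum['<field>']
        const source = parent ? `datum['${parent}']` : 'datum';
        transform.push({ calculate: `${source}['${metricKey}']['value']`, as: `avg_${fieldName}` });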
      }
    }
  }

  traverse(aggStructure);
  return transform;
}

Use these functions in the Vega utility functions in the next sub-section:

const buildTransforms = (aggs: any) => {
  const aggStructure = parseAggStructure(aggs);
  // generateTransform also needs the aggs themselves to resolve field names
  return generateTransform(aggStructure, aggs);
};

Example Result: Given the following aggregation:

"aggs": {
  "2": {
    "date_histogram": {
      "field": "timestamp",
      "fixed_interval": "12h",
      "time_zone": "America/Los_Angeles",
      "min_doc_count": 1
    },
    "aggs": {
      "3": {
        "terms": {
          "field": "geo.dest",
          "order": { "_count": "desc" },
          "size": 5
        },
        "aggs": {
          "1": {
            "avg": { "field": "bytes" }
          }
        }
      }
    }
  }
}

The datum structure would be:

{
  "3": {
    "buckets": [
      {
        "1": { "value": 5069.333333333333 },
        "key": "CN",
        "doc_count": 3
      },
      // ... other buckets
    ]
  },
  "key_as_string": "2024-06-30T12:00:00.000-07:00",
  "key": 1719774000000,
  "doc_count": 23
}

The generated transform would be:

[
  {
    "calculate": "datum['key']",
    "as": "timestamp"
  },
  {
    "calculate": "datum['3']['buckets']",
    "as": "geo.dest"
  },
  {
    "flatten": ["geo.dest"]
  },
  {
    "calculate": "datum['geo.dest'].key",
    "as": "geo.dest"
  },
  {
    "calculate": "datum['geo.dest']['1']['value']",
    "as": "avg_bytes"
  }
]

4. Create Vega Utility Functions

Create utility functions in a separate file:

// vegaUtils.ts

export const buildDataUrl = (indexPattern: string, timeField: string, aggs: any) => {
  return {
    context: true,
    timefield: timeField,
    index: indexPattern,
    body: {
      aggs: aggs,
      size: 0,
    },
  };
};

export const buildTransforms = (metrics: any[], buckets: any[]) => {
  // Implementation of buildTransforms logic
};

export const buildEncoding = (metrics: any[], buckets: any[], fieldsMap: any) => {
  // Implementation of buildEncoding logic
};

export const buildVegaSpec = (state: VisualizationState) => {
  const { vega } = state;
  // Fields are read directly from VegaState as declared in step 1
  const dataUrl = buildDataUrl(vega.indexPattern!, vega.timeField, vega.aggs);
  const transforms = buildTransforms(vega.metrics, vega.buckets);
  // Note: fieldsMap is not declared on VegaState in step 1 and would need to be added there
  const encoding = buildEncoding(vega.metrics, vega.buckets, vega.fieldsMap);

  return {
    $schema: "https://vega.github.io/schema/vega-lite/v5.json",
    data: { url: dataUrl },
    transform: transforms,
    mark: { type: vega.type, point: true },
    encoding: encoding,
  };
};

5. Update toExpression Method

Modify the toExpression method to use the new utility functions:

const toExpression = async (params) => {
  const state = store.getState().visualization;
  if (state.vega.useVegaRendering) {
    const vegaSpec = buildVegaSpec(state);
    let vis = await createVis('vega', state.activeVisualization!.aggConfigParams, state.indexPattern!, params.searchContext);
    vis.params = {
      spec: JSON.stringify(vegaSpec),
    };

    const vega_expression = await buildPipeline(vis, {
      timefilter: params.timefilter,
      timeRange: params.timeRange,
      abortSignal: undefined,
      visLayers: undefined,
      visAugmenterConfig: undefined,
    });
    return vega_expression;
  }
  // ... (existing non-Vega rendering logic)
};

Method 2: Construct Aggs

Method 2 follows a similar structure to Method 1, but instead of passing the whole aggregation, it constructs the aggregation from individual components (metrics, segment, group, split). The main difference lies in the setVegaAggs reducer and the buildVegaSpec utility function:

// In the visualization slice
setVegaAggs: (state, action: PayloadAction<{metrics: any[], segment: any[], group: any[], split: any[]}>) => {
  const { metrics, segment, group, split } = action.payload;
  // Fields are written directly on VegaState as declared in Method 1, step 1
  state.vega.metrics = metrics;
  state.vega.segment = segment;
  state.vega.group = group;
  state.vega.split = split;
  // Construct aggs from these components
  state.vega.aggs = constructAggs(metrics, segment, group, split);
},

// In vegaUtils.ts
export const constructAggs = (metrics: any[], segment: any[], group: any[], split: any[]) => {
  // Logic to construct aggs from individual components
};
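One possible shape for the constructAggs logic is sketched below. The input interfaces are hypothetical simplifications of the real agg configs, and the nesting order (segment outermost, then group, then split, with metrics at the deepest level) is an assumption chosen only so that the output resembles the example aggregation shown under Method 1.

```typescript
// Hypothetical, simplified inputs; real agg configs carry more fields.
interface MetricInput { id: string; type: string; field: string }
interface BucketInput { id: string; type: string; params: Record<string, unknown> }
type DslAggs = Record<string, Record<string, unknown>>;

function constructAggsSketch(
  metrics: MetricInput[],
  segment: BucketInput[],
  group: BucketInput[],
  split: BucketInput[]
): DslAggs {
  // Metrics sit at the deepest level of the aggregation tree.
  let current: DslAggs = {};
  for (const metric of metrics) {
    current[metric.id] = { [metric.type]: { field: metric.field } };
  }
  // Assumed nesting order, outermost first: segment, group, split.
  const outerToInner = [...segment, ...group, ...split];
  // Wrap from the innermost bucket outwards.
  for (const bucket of [...outerToInner].reverse()) {
    current = { [bucket.id]: { [bucket.type]: bucket.params, aggs: current } };
  }
  return current;
}

const exampleAggs = constructAggsSketch(
  [{ id: '1', type: 'avg', field: 'bytes' }],
  [{ id: '2', type: 'date_histogram', params: { field: 'timestamp', fixed_interval: '12h' } }],
  [{ id: '3', type: 'terms', params: { field: 'geo.dest', size: 5 } }],
  []
);
console.log(JSON.stringify(exampleAggs, null, 2));
```

Because the nesting order is decided in one place, any inconsistency between the stored components and the generated aggs is confined to this function, which is the maintenance risk listed for this method.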

Method 3: Passing Formatted Data to Vega Spec

This method involves passing pre-formatted data directly to the Vega spec. This method requires modifications to the buildVegaSpec function:

// In vegaUtils.ts
export const buildVegaSpec = (state: VisualizationState, formattedData: any[]) => {
  return {
    $schema: "https://vega.github.io/schema/vega-lite/v5.json",
    data: { values: formattedData },
    // ... other spec properties
  };
};

// In the component where the Vega spec is created
const formattedData = await getFormattedDataFromOpensearchaggs(/* params */);
const vegaSpec = buildVegaSpec(state, formattedData);
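A fuller sketch of this approach, under stated assumptions, is shown below. The FormattedRow shape, its field names, and the buildInlineSpec name are hypothetical; the point is only that the rows are embedded in the spec through data.values, so no transform is needed.

```typescript
// Hypothetical row shape produced by a formatting step outside Vega.
interface FormattedRow {
  timestamp: number;
  avg_bytes: number;
  dest: string;
}

function buildInlineSpec(rows: FormattedRow[], markType: 'line' | 'area' = 'line') {
  return {
    $schema: 'https://vega.github.io/schema/vega-lite/v5.json',
    // The rows travel inside the spec itself instead of being fetched via data.url.
    data: { values: rows },
    mark: { type: markType, point: true },
    encoding: {
      x: { field: 'timestamp', type: 'temporal' },
      y: { field: 'avg_bytes', type: 'quantitative' },
      color: { field: 'dest', type: 'nominal' },
    },
  };
}

const inlineSpec = buildInlineSpec([
  { timestamp: 1719774000000, avg_bytes: 5069.333333333333, dest: 'CN' },
]);

// The serialized spec grows with the number of rows it carries.
console.log(JSON.stringify(inlineSpec).length);
```

Since every row is serialized into the spec, the spec size scales with the dataset, which is the memory concern listed for this method.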

3. Pros and Cons

Method 1: Passing Whole Aggregation

  • Pros:
    • Maintains consistency with existing aggregation structures
    • Efficient for complex aggregations
  • Cons:
    • Less flexibility for custom aggregations

Method 2: Construct Aggs

  • Pros:
    • Offers more flexibility in aggregation construction
    • Allows for fine-grained control over each aggregation component
  • Cons:
    • More complex to implement and maintain
    • Potential for inconsistencies if not carefully managed

Method 3: Passing Formatted Data

  • Pros:
    • Simplifies the Vega spec
    • Allows for pre-processing and custom data formatting
  • Cons:
    • Potential memory issues with large datasets
    • Less efficient for frequently updating data
    • May not scale well for very large datasets

Conclusion

After considering all three methods, we decided to proceed with Method 1: Passing Whole Aggregation. This approach offers the best balance between maintaining consistency with existing OpenSearch Dashboards structures and providing efficient handling of complex aggregations. It avoids the potential scalability and performance issues of Method 3 while being less complex to implement and maintain than Method 2. Method 1 aligns well with the current OpenSearch Dashboards architecture and will likely provide the smoothest integration path for Vega visualizations within the existing framework. It also leaves room for future optimizations and extensions if needed.

How to Test / How to Make the Transfer Robust

To ensure the robustness and accuracy of the VegaSpecBuilder implementation, we should create a series of test cases that cover various combinations of metrics and buckets. These test cases will help verify that the VegaSpecBuilder can correctly handle different visualization configurations.

Test Cases

  • 1 Metric 1 Bucket:

    • 1 Metric + 1 Segment: Verify that the VegaSpecBuilder correctly sets the x-axis to the segment field and the y-axis to the metric field.
    • 1 Metric + 1 Split: Ensure that the VegaSpecBuilder creates separate charts for each split value.
    • 1 Metric + 1 Group: Check that the VegaSpecBuilder generates separate lines (or other marks) for each group value within a single chart.
  • 2 Metrics 1 Bucket:

    • 2 Metrics + 1 Segment: Confirm that the VegaSpecBuilder supports multiple y-axes for the different metrics while using the segment field for the x-axis.
    • 2 Metrics + 1 Split: Ensure that separate charts are created for each split value, with each chart containing the two metrics.
    • 2 Metrics + 1 Group: Verify that the VegaSpecBuilder generates separate lines (or other marks) for each group value within a single chart, displaying both metrics.
  • 1 Metric 3 Buckets:

    • 1 Metric + 1 Split + 1 Group + 1 Segment: Test that the VegaSpecBuilder correctly uses the segment field for the x-axis, creates separate lines (or other marks) for each group value, and generates separate charts for each split value.
  • 1 Metric 4 Buckets:

    • 1 Metric + 1 Split + 2 Group + 1 Segment
  • 2 Metrics 4 Buckets:

    • 2 Metrics + 1 Split + 2 Group + 1 Segment
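The single-metric cases above could be written as table-driven checks along these lines. The buildEncodingSketch function, the Dimensions shape, and the field names are hypothetical stand-ins for the real builder; the mapping used (segment to x, metric to y, group to color, split to facet) follows the expectations stated in the test cases.

```typescript
// Hypothetical, simplified description of one chart configuration.
interface Dimensions {
  metric: string;
  segment?: string;
  group?: string;
  split?: string;
}

interface Channel {
  field: string;
  type: 'quantitative' | 'nominal' | 'temporal';
}

// Stand-in for the real encoding builder, covering single-metric charts only.
function buildEncodingSketch(d: Dimensions): Record<string, Channel> {
  const encoding: Record<string, Channel> = {
    y: { field: d.metric, type: 'quantitative' },
  };
  if (d.segment) encoding.x = { field: d.segment, type: 'temporal' };
  if (d.group) encoding.color = { field: d.group, type: 'nominal' };
  if (d.split) encoding.facet = { field: d.split, type: 'nominal' };
  return encoding;
}

const cases: Array<{ name: string; input: Dimensions; channels: string[] }> = [
  { name: '1 metric + 1 segment', input: { metric: 'avg_bytes', segment: 'timestamp' }, channels: ['x', 'y'] },
  { name: '1 metric + 1 split', input: { metric: 'avg_bytes', split: 'os' }, channels: ['facet', 'y'] },
  { name: '1 metric + 1 group', input: { metric: 'avg_bytes', group: 'dest' }, channels: ['color', 'y'] },
  {
    name: '1 metric + 1 split + 1 group + 1 segment',
    input: { metric: 'avg_bytes', segment: 'timestamp', group: 'dest', split: 'os' },
    channels: ['color', 'facet', 'x', 'y'],
  },
];

for (const testCase of cases) {
  const actual = Object.keys(buildEncodingSketch(testCase.input)).sort();
  if (actual.join(',') !== testCase.channels.join(',')) {
    throw new Error(`${testCase.name}: got ${actual.join(',')}`);
  }
}
console.log(`${cases.length} cases passed`);
```

The two-metric cases are left out of this sketch because they need layered or repeated views, which a single encoding object cannot express.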

Future Extension Discussion

Supporting Multiple Query Languages (DQL, PPL, SQL)

Extend the VegaSpecBuilder to handle different query languages:

buildPPLQuery() {
   this.pplQuery = ...
}

buildPPLQuerySpec(pplQuery) {
  return {
    data: {
      url: {
        index: this.indexPattern.title,
        body: {
          query: {
            source: {
              query: this.pplQuery,
            },
          },
          size: 0,
        },
      },
      format: this.format
    },
  };
}

buildSQLQuerySpec(sqlQuery) {
  ...
}

buildWithQuerySpec(queryType = 'dql', query = '') {
  let dataUrl;
  if (queryType === 'dql') {
    dataUrl = this.buildDataUrl();
  } else if (queryType === 'ppl') {
    dataUrl = this.buildPPLQuerySpec(query);
  } else if (queryType === 'sql') {
    dataUrl = this.buildSQLQuerySpec(query);
  }
  return build(this.data)
}
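An alternative to the if/else chain is a small registry keyed by query language, so that supporting a new language means registering one more builder. This is a sketch under assumptions: the DataBuilderRegistry class, the index name, and the sample PPL string are hypothetical, and the request body shapes simply mirror the placeholders used in this proposal rather than verified request formats.

```typescript
type QueryLanguage = 'dql' | 'ppl' | 'sql';
type DataBuilder = (query: string) => Record<string, unknown>;

// Hypothetical registry: each language contributes its own data builder.
class DataBuilderRegistry {
  private readonly builders = new Map<QueryLanguage, DataBuilder>();

  register(language: QueryLanguage, builder: DataBuilder): this {
    this.builders.set(language, builder);
    return this;
  }

  build(language: QueryLanguage, query: string): Record<string, unknown> {
    const builder = this.builders.get(language);
    if (!builder) {
      throw new Error(`No data builder registered for "${language}"`);
    }
    return builder(query);
  }
}

const indexTitle = 'my-logs-*'; // hypothetical index pattern title

// Body shapes mirror the placeholders in this proposal; they are not verified formats.
const registry = new DataBuilderRegistry()
  .register('dql', () => ({ url: { '%context%': true, index: indexTitle, body: { size: 0 } } }))
  .register('ppl', (query) => ({
    url: { index: indexTitle, body: { query: { source: { query } }, size: 0 } },
  }));

const pplData = registry.build('ppl', 'source = my-logs | stats count()');
console.log(JSON.stringify(pplData));
```

Asking for a language with no registered builder, such as sql here, throws an error instead of silently producing an empty spec.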

Handling Multiple Queries and Data Sources

Handle multiple queries and data sources by extending the buildVegaSpec method:

buildMultiQuerySpec(queries) {
  this.dataWithMultipleQuery = queries.map((query, index) => ({
    name: `data${index + 1}`,
    url: {
      index: this.indexPattern.title,
      body: {
        query: query.format === 'ppl' ? {
          source: {
            query: this.buildPPLQuery(),
          },
        } : {
          sql: {
            query: this.buildSQLQuery(),
          },
        },
        size: 0,
      },
    },
    format: this.format
  }));

  return build(this.dataWithMultipleQuery);
}

2.16 Timeline and Task BreakDowns

  • Integrate vega vis in VisBuilder
  • Convert existing vis charts to vega in VisBuilder
  • Add more controls for line chart in VisBuilder

FAQ

@ananzh ananzh self-assigned this Jun 19, 2024
@ananzh ananzh changed the title Integrate Vega Vis into VisBuilder Proposal [Draft] Integrate Vega Vis into VisBuilder Proposal Jun 19, 2024
@YANG-DB
Member

YANG-DB commented Jun 19, 2024

@ananzh very nice !
I would add another important capability: allowing the community to contribute generic vis tools as part of the out-of-the-box vis tools catalog.

@YANG-DB
Member

YANG-DB commented Jun 19, 2024

I strongly recommend reviewing the vega-altair engine used to do this same transformation from a high level language (python) into the vega spec (json)

@YANG-DB
Member

YANG-DB commented Jun 19, 2024

Another suggestion is to integrate the existing open source vega-editor to replace our existing vega JSON editor, which would simplify the actual vega editing for advanced vis builders.

@ashwin-pc
Member

ashwin-pc commented Jun 20, 2024

zooming in and out

This exists in the tool today.

Toggle in VisBuilder to allow user to display vislib vis or vega vis in VisBuilder, to save as vislib vis or vega vis and to embed either vislib vis or vega vis in Dashboard .

We should not have a toggle in the UI since for most users Vega is an implementation detail. Only advanced users would care about it. If we want to maintain the experience for users, we should either try to match the experience or keep an advanced settings toggle to allow the user to go back to the older experience.

A new vega type vis directly in VisBuilder

Why do we need this as opposed to just redirecting the user to the vega editor? If we do it this way, we should allow the user to switch back and carry context from vega back to the other chart types. Right now if I switch between line and bar and go back to line, the line chart carries over the changes that it can from the bar chart. With this vega type can we do that?

VegaSpecBuilder Class

In this class you are also constructing the query, but it's very specific to DSL. How would this work with PPL and SQL? They each support a limited subset of aggregations and do not support all the agg types.

Supporting Multiple Query Languages (DQL, PPL, SQL)

If we aren't integrating VisBuilder into Discover, we might not need this. Would like to hear from the others about this, but my reasoning is that the user never has to enter the query that is used to fetch the data from the backend. If that's the case, the language we use under the hood does not matter. The only exception to this is datasources that don't support visualizations in other languages. In that scenario I'd like this to be a little more modular so that when other languages are added, it's not on the VisType to manually update itself to support all the new languages.

One approach here could be to allow the VisType to specify which languages it supports, so that they all have to support DQL by default but can optionally specify which other languages they support. But what would be even nicer is if the VisType did not have to know anything about the language used under the hood and only worried about the dataframe that came back and mapped it to the Vis, leaving the query language part to the framework. But this might be trickier.

@virajsanghvi
Collaborator

  • Is the problem to solve reflected in the requirements? If so, why is this the best way to solve this?
  • What are the expectations for migrated visuals? Should they look exactly like they did pre-vega?
  • VegaSpecBuilder - a little unclear on how this fits in at a high level
  • Do we want a toggle for Vega-Lite vs just having everything render that way?
  • How do you get to Vega Vis type from other visualizations?
  • What features do we want to add to line chart? Can we be specific? - is this specific to vega integration?
  • Are there more charts than just Pie? - Should visual types be part of this if its specific to vega integration?
  • "1.Minimum Changed Customer Experience" - what is changing?
  • Is VegaSpecBuilder the state of the configuration? Or does it just operate one way (config -> vega spec)?
  • Should the builder be in state or the vega spec? The builder pattern seems to mutate state vs rely on building a new spec.
  • Why be able to set the state explicitly and set particular options? Why not take one approach or the other?

@virajsanghvi
Collaborator

  • VegaSpecBuilder - for building queries - should different languages contribute a definition of how to build the query? Do we want these centrally located?
  • Can you summarize what the alternative proposal was in comparison to the specbuilder?

@ananzh
Member Author

ananzh commented Jun 20, 2024

A hard-coded mapping for demo purposes

export const createVegaSpec = (styleState, dimensions, valueAxes, aggConfigs, indexPattern, searchContext) => {
  const { addLegend, addTooltip, type } = styleState;
  const { x, y } = dimensions;
  const index = indexPattern.title;
  const timeField = searchContext.timeRange ? searchContext.timeRange.field : "@timestamp"; // Use the time range field or default to "@timestamp"

  const dateHistogram = aggConfigs.aggs.find(agg => agg.schema === 'segment');
  const metric = aggConfigs.aggs.find(agg => agg.schema === 'metric');
  const metricType = metric.type.name;

  const dataUrl = {
    context: true,
    timefield: timeField,
    index: index,
    body: {
      aggs: {
        1: {
          date_histogram: {
            field: dateHistogram.params.field.displayName,
            fixed_interval: "3h", // hard coded for now
            time_zone: "America/Los_Angeles", // can be dynamic if required
            min_doc_count: dateHistogram.params.min_doc_count,
            extended_bounds: dateHistogram.params.extended_bounds,
          },
          aggs: {
            2: {
              [metricType]: {
                field: metric.params.field.displayName
              }
            }
          }
        }
      },
      size: 0
    }
  };

  const vegaSpec = {
    $schema: "https://vega.github.io/schema/vega-lite/v5.json",
    data: {
      url: dataUrl,
      format: {
        property: "aggregations.1.buckets"
      }
    },
    transform: [
      {
        calculate: "datum.key",
        as: "timestamp"
      },
      {
        calculate: `datum[2].value`,
        as: metric.params.field.displayName
      }
    ],
    layer: [
      {
        mark: {
          type: "line" // or dynamic type if needed
        }
      },
      {
        mark: {
          type: "circle",
          tooltip: addTooltip
        }
      }
    ],
    encoding: {
      x: {
        field: "timestamp",
        type: "temporal",
        axis: {
          title: timeField
        }
      },
      y: {
        field: metric.params.field.displayName,
        type: "quantitative",
        axis: {
          title: metric.params.field.displayName
        }
      },
      color: {
        datum: metric.params.field.displayName,
        type: "nominal"
      }
    }
  };

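  // Vega-Lite draws a legend for the color channel by default,
  // so it has to be disabled explicitly when addLegend is false.
  vegaSpec.encoding.color.legend = addLegend
    ? { title: metric.params.field.displayName }
    : null;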

  return vegaSpec;
};

@virajsanghvi
Collaborator

virajsanghvi commented Jun 26, 2024

Can you speak to the difference between the options? I'm not really sure from reading.

From method 1: cons

which might not be flexible for dynamic changes.

Are there specific cases you're worried about?

we should create a series of test cases that cover various combinations of metrics and buckets

Just to be clear, we should have test cases for all known combinations, right? And can we prevent unknown combos from being used in the product in some way?

Also, do we clearly understand the expected input/output of these cases?
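
One way to keep unknown combinations out of the product is an explicit allow-list that is checked before a spec is built. The sketch below is only an illustration: `SUPPORTED_COMBINATIONS`, `validateCombination`, and the aggregation names listed are hypothetical and are not taken from VisBuilder.

```javascript
// Hypothetical allow-list. The vis types and aggregation names are
// illustrative only, not the set VisBuilder actually supports.
const SUPPORTED_COMBINATIONS = new Map([
  ['line', { metrics: ['count', 'avg', 'sum'], buckets: ['date_histogram', 'terms'] }],
  ['bar', { metrics: ['count', 'avg', 'sum'], buckets: ['date_histogram', 'terms', 'histogram'] }],
]);

// Returns a list of problems; an empty list means the combination is known.
function validateCombination(visType, metricTypes, bucketTypes) {
  const supported = SUPPORTED_COMBINATIONS.get(visType);
  if (!supported) {
    return [`Unsupported visualization type: ${visType}`];
  }
  const errors = [];
  for (const metric of metricTypes) {
    if (!supported.metrics.includes(metric)) {
      errors.push(`Metric "${metric}" is not supported for ${visType}`);
    }
  }
  for (const bucket of bucketTypes) {
    if (!supported.buckets.includes(bucket)) {
      errors.push(`Bucket "${bucket}" is not supported for ${visType}`);
    }
  }
  return errors;
}
```

The same table could drive the test matrix, so that every entry in the allow-list has at least one test case with a known expected spec.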

VegaSpecBuilder

Should we be storing unserializable state in redux?

Also, building the spec is calculated state, is this the right thing to store?

@ashwin-pc
Member

Create a vega slice

Why do we need a slice? Slices are for state that needs to be stored globally and accessed across the app. The Vega spec is only needed by the visualization, right? Can't we just create the spec there?

Send modular API to update VegaBuilder Class

Do we need to update both the slice and the aggconfig, or can we update just the aggconfig? My assumption was that the spec could be constructed whenever we want from the style state and the agg config.
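
A minimal sketch of that assumption, treating the spec as a value derived from the style state and the agg config rather than as stored state. The function name `buildSpec` and the input shapes are hypothetical simplifications, not the real VisBuilder types.

```javascript
// Hypothetical input shapes:
//   styleState = { type: 'line' | 'bar' | ..., addTooltip: boolean }
//   aggs       = [{ schema: 'metric' | 'segment', field: string }]
function buildSpec(styleState, aggs) {
  const metric = aggs.find((agg) => agg.schema === 'metric');
  const segment = aggs.find((agg) => agg.schema === 'segment');
  if (!metric || !segment) {
    return null; // nothing to draw until both axes are configured
  }
  return {
    $schema: 'https://vega.github.io/schema/vega-lite/v5.json',
    mark: { type: styleState.type, tooltip: Boolean(styleState.addTooltip) },
    encoding: {
      x: { field: segment.field, type: 'temporal' },
      y: { field: metric.field, type: 'quantitative' },
    },
  };
}
```

Because the function has no side effects, the same inputs always give the same spec, so a component could recompute it on demand (for example inside React's `useMemo`) instead of keeping a second copy in the store.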

Separate buckets
Both methods need to separate bucket aggregations into distinct categories: group, split, and segment. This separation is necessary because each type of aggregation serves a different purpose in the visualization:

Can you give a little more detail about this? I'm not sure I fully understood why we need it.
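
For reference, the separation described in the quoted passage could look like the sketch below. It assumes each agg carries a `schema` string, as in the proposal's `aggConfigs.aggs.find(agg => agg.schema === 'segment')`. The helper name `separateBuckets` is hypothetical, and the role comments reflect the usual reading (segment for the x-axis, group for split series, split for split charts), which the proposal itself would need to confirm.

```javascript
const BUCKET_SCHEMAS = ['group', 'split', 'segment'];

// Partition bucket aggregations by schema; metric aggs are ignored here.
function separateBuckets(aggs) {
  const separated = {
    group: [],   // presumably: series within one chart (color)
    split: [],   // presumably: one chart per bucket (facets)
    segment: [], // presumably: the x-axis buckets
  };
  for (const agg of aggs) {
    if (BUCKET_SCHEMAS.includes(agg.schema)) {
      separated[agg.schema].push(agg);
    }
  }
  return separated;
}
```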

VegaSpecBuilder

How does this work for different vis types? Don't the encodings and specs change between vis types? For example, a pie chart and a bar chart will encode the data differently, right?
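
To make the concern concrete, here is a sketch of how the mark and encoding differ between two vis types in Vega-Lite: a bar chart uses the `bar` mark with `x`/`y` channels, while a pie chart uses the `arc` mark with `theta` and `color`. The lookup table and the function name `buildMarkAndEncoding` are hypothetical and only illustrate that a single builder needs a per-vis-type branch somewhere.

```javascript
// Hypothetical per-vis-type mapping; the field names are supplied by the caller.
const MARK_AND_ENCODING_BY_VIS_TYPE = new Map([
  ['bar', (categoryField, valueField) => ({
    mark: { type: 'bar' },
    encoding: {
      x: { field: categoryField, type: 'nominal' },
      y: { field: valueField, type: 'quantitative' },
    },
  })],
  ['pie', (categoryField, valueField) => ({
    mark: { type: 'arc' },
    encoding: {
      theta: { field: valueField, type: 'quantitative' },
      color: { field: categoryField, type: 'nominal' },
    },
  })],
]);

function buildMarkAndEncoding(visType, categoryField, valueField) {
  const build = MARK_AND_ENCODING_BY_VIS_TYPE.get(visType);
  if (!build) {
    throw new Error(`No Vega-Lite mapping for vis type "${visType}"`);
  }
  return build(categoryField, valueField);
}
```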

const vegaSpecBuilder = useTypedSelector(state => state.vega.specBuilder);

State should not be used to retrieve a function. Why can't vegaSpecBuilder be a simple function?

The Difference

In this section I didn't understand the difference between the two methods. What is method 2? I couldn't follow the pros and cons of each approach well enough to know which one is better. An example might help.

Overall, the approach here could benefit from a block diagram explaining how the flow works as information is passed across the various components.

@ananzh ananzh changed the title [Draft] Integrate Vega Vis into VisBuilder Proposal Integrate Vega Vis into VisBuilder Proposal Jul 2, 2024
@anirudha

anirudha commented Jul 4, 2024

| if we aren't integrating VisBuilder into Discover

How will SQL/PPL users build visualizations?

How will the Discover IA for visualizations be handled with multi-language support?

How will we achieve the cohesion tenet without SQL/PPL support for visualizations?

ananzh added a commit to ananzh/OpenSearch-Dashboards that referenced this issue Jul 19, 2024
In this PR, we add the capability for Visbuilder to generate dynamic Vega and Vega-Lite
specifications based on user settings and aggregation configurations.

* developed functions buildVegaSpecViaVega and buildVegaSpecViaVegaLite
that can create either Vega or Vega-Lite specifications depending on the complexity
of the visualization.
* added VegaSpec and VegaLiteSpec interfaces to provide better type checking
* broken down the specification building into smaller, reusable components
(like buildEncoding, buildMark, buildLegend, buildTooltip) to make the code
more maintainable and easier to extend.
* added flattenDataHandler to prepare and transform data for use in Vega visualizations

Issue Resolve
opensearch-project#7067

Signed-off-by: Anan Zhuang <ananzh@amazon.com>
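
As an illustration of the kind of flattening the commit describes, the sketch below turns a nested aggregation response into flat rows that a Vega data source can consume. It is not the actual `flattenDataHandler`: the function name `flattenBuckets`, its parameters, and the output row shape are hypothetical, and the response shape is assumed to match the `aggregations.1.buckets` layout used in the proposal's example spec.

```javascript
// Assumed response shape:
//   { aggregations: { '1': { buckets: [{ key, doc_count, '2': { value } }] } } }
function flattenBuckets(response, bucketId, metricId) {
  const buckets = response?.aggregations?.[bucketId]?.buckets ?? [];
  return buckets.map((bucket) => ({
    timestamp: bucket.key,
    count: bucket.doc_count,
    // Missing metric values become null instead of throwing.
    value: bucket[metricId]?.value ?? null,
  }));
}
```

With rows in this shape the spec can use plain field names (`timestamp`, `value`) instead of `calculate` transforms over `datum[2].value`.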
opensearch-trigger-bot bot pushed a commit that referenced this issue Jul 23, 2024
* [VisBuilder] Add Capability to generate dynamic vega

In this PR, we add the capability for Visbuilder to generate dynamic Vega and Vega-Lite
specifications based on user settings and aggregation configurations.

* developed functions buildVegaSpecViaVega and buildVegaSpecViaVegaLite
that can create either Vega or Vega-Lite specifications depending on the complexity
of the visualization.
* added VegaSpec and VegaLiteSpec interfaces to provide better type checking
* broken down the specification building into smaller, reusable components
(like buildEncoding, buildMark, buildLegend, buildTooltip) to make the code
more maintainable and easier to extend.
* added flattenDataHandler to prepare and transform data for use in Vega visualizations

Issue Resolve
#7067

Signed-off-by: Anan Zhuang <ananzh@amazon.com>

* fix PR comments

* update file and functions names
* fix type errors
* fix area chart

Signed-off-by: Anan Zhuang <ananzh@amazon.com>

* add unit tests

Signed-off-by: Anan Zhuang <ananzh@amazon.com>

* enable embeddable for useVega

Signed-off-by: Anan Zhuang <ananzh@amazon.com>

* remove buildVegaScales due to split it to smaller modules

Signed-off-by: Anan Zhuang <ananzh@amazon.com>

* fix date for vega

Signed-off-by: Anan Zhuang <ananzh@amazon.com>

* fix test

Signed-off-by: Anan Zhuang <ananzh@amazon.com>

* Changeset file for PR #7288 created/updated

---------

Signed-off-by: Anan Zhuang <ananzh@amazon.com>
Co-authored-by: opensearch-changeset-bot[bot] <154024398+opensearch-changeset-bot[bot]@users.noreply.github.com>
(cherry picked from commit faaa45c)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
ananzh pushed a commit that referenced this issue Jul 23, 2024
* [VisBuilder] Add Capability to generate dynamic vega

In this PR, we add the capability for Visbuilder to generate dynamic Vega and Vega-Lite
specifications based on user settings and aggregation configurations.

* developed functions buildVegaSpecViaVega and buildVegaSpecViaVegaLite
that can create either Vega or Vega-Lite specifications depending on the complexity
of the visualization.
* added VegaSpec and VegaLiteSpec interfaces to provide better type checking
* broken down the specification building into smaller, reusable components
(like buildEncoding, buildMark, buildLegend, buildTooltip) to make the code
more maintainable and easier to extend.
* added flattenDataHandler to prepare and transform data for use in Vega visualizations

Issue Resolve
#7067


---------



(cherry picked from commit faaa45c)

Signed-off-by: Anan Zhuang <ananzh@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: opensearch-changeset-bot[bot] <154024398+opensearch-changeset-bot[bot]@users.noreply.github.com>
@ananzh ananzh added visbuilder-next visbuilder with vega integration and removed vis builder vega labels Jul 31, 2024