sc-juho/databricks-grafana

 
 

Databricks - Grafana Data Source Backend Plugin

Grafana Databricks integration allowing direct connection to Databricks to query and visualize Databricks data in Grafana.

Get started with the plugin

Set up the Databricks Data Source

Requirements

  • Grafana Version >= 9.1.0

If you are using an earlier Grafana version, try the v1.1.7 release of this plugin, which is the last release that supports Grafana > 7.0.

Install the Data Source

  1. Install the plugin into the Grafana plugin folder:

    grafana-cli --pluginUrl https://github.com/mullerpeter/databricks-grafana/releases/latest/download/mullerpeter-databricks-datasource.zip plugins install mullerpeter-databricks-datasource

    or

    cd /var/lib/grafana/plugins/
    wget https://github.com/mullerpeter/databricks-grafana/releases/latest/download/mullerpeter-databricks-datasource.zip
    unzip mullerpeter-databricks-datasource.zip
  2. Edit the Grafana configuration file to allow unsigned plugins:

      • Linux: /etc/grafana/grafana.ini
      • macOS: /usr/local/etc/grafana/grafana.ini

    [plugins]
    allow_loading_unsigned_plugins = mullerpeter-databricks-datasource

    Or with Docker:

    docker run -d \
    -p 3000:3000 \
    -v "$(pwd)"/grafana-plugins:/var/lib/grafana/plugins \
    --name=grafana \
    -e "GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=mullerpeter-databricks-datasource" \
    grafana/grafana
  3. Restart Grafana
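
To double-check the unsigned-plugin setting from the steps above, the following Python sketch parses INI text and reports whether a plugin ID is listed under `[plugins]`. It is only an illustration: the helper name `unsigned_plugin_allowed` is hypothetical, and the sketch assumes `allow_loading_unsigned_plugins` holds a comma-separated list of plugin IDs.

```python
import configparser


def unsigned_plugin_allowed(ini_text: str, plugin_id: str) -> bool:
    """Return True if plugin_id is listed in [plugins] allow_loading_unsigned_plugins."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # A dummy section header lets configparser accept keys that appear
    # before the first real section of the file.
    parser.read_string("[__top__]\n" + ini_text)
    raw = parser.get("plugins", "allow_loading_unsigned_plugins", fallback="")
    allowed = {item.strip() for item in raw.split(",") if item.strip()}
    return plugin_id in allowed


# Sample text mirroring the snippet shown in the install steps.
SAMPLE_INI = """
[plugins]
allow_loading_unsigned_plugins = mullerpeter-databricks-datasource
"""

if __name__ == "__main__":
    print(unsigned_plugin_allowed(SAMPLE_INI, "mullerpeter-databricks-datasource"))
```

To check a real installation, read the contents of your `grafana.ini` into a string and pass it in place of `SAMPLE_INI`.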

Configure the Datasource

  • Open the side menu by clicking the Grafana icon in the top header.
  • In the side menu under the Configuration icon you should find a link named Data Sources.
  • Click the + Add data source button in the top header.
  • Select Databricks.

To configure the plugin, use the values provided under JDBC/ODBC in the advanced options of the Databricks cluster (or SQL warehouse), and create a personal access token for Databricks.

Available configuration fields are as follows:

| Name | Description |
| --- | --- |
| Server Hostname | Databricks Server Hostname (without http), e.g. XXX.cloud.databricks.com |
| Server Port | Databricks Server Port (default 443) |
| HTTP Path | HTTP Path value for the existing cluster or SQL warehouse, e.g. sql/1.0/endpoints/XXX |
| Access Token | Personal Access Token for Databricks. |
| Code Auto Completion | If enabled, the SQL editor will fetch catalogs/schemas/tables/columns from Databricks to provide suggestions. |
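
Because the Server Hostname field expects a bare hostname (without http), a value copied from a browser address bar may need trimming first. The sketch below shows one way to do that in Python. The function name `normalize_server_hostname` is hypothetical and not part of the plugin. Note that the result is lowercased, which is harmless because hostnames are case-insensitive.

```python
from urllib.parse import urlsplit


def normalize_server_hostname(value: str) -> str:
    """Strip scheme, port, and path so only the bare hostname remains."""
    value = value.strip()
    # urlsplit only recognises the host part when a scheme or '//' is present.
    if "//" not in value:
        value = "//" + value
    host = urlsplit(value).hostname
    if not host:
        raise ValueError(f"no hostname found in {value!r}")
    return host


if __name__ == "__main__":
    print(normalize_server_hostname("https://XXX.cloud.databricks.com/"))
```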

Supported Macros

All variables used in the SQL query get replaced by their respective values. See Grafana documentation for Global Variables.

Additionally, the following macros can be used within a query to simplify syntax and allow for dynamic parts.

| Macro example | Description |
| --- | --- |
| $__timeFilter(time_column) | Will be replaced by an expression to filter on the selected time range, e.g. time_column BETWEEN '2021-12-31 23:00:00' AND '2022-01-01 22:59:59' |
| $__timeWindow(time_column) | Will be replaced by an expression to group by the selected interval, e.g. window(time_column, '2 HOURS') |
| $__timeFrom | Will be replaced by the start of the selected time range, e.g. '2021-12-31 23:00:00' |
| $__timeTo | Will be replaced by the end of the selected time range, e.g. '2022-01-01 22:59:59' |
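
To make the substitutions concrete, here is a small Python sketch that mimics the four macros listed above using plain text replacement. It is not the plugin's implementation (the backend is written in Go). The function `expand_macros` is hypothetical, and the interval string is passed in explicitly because this README does not describe how the plugin derives it from the dashboard.

```python
import re
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def expand_macros(query: str, time_from: datetime, time_to: datetime, interval: str) -> str:
    """Replace the time macros in query with literal SQL fragments."""
    start = f"'{time_from.strftime(TIME_FORMAT)}'"
    end = f"'{time_to.strftime(TIME_FORMAT)}'"
    # Macros that take a column argument are expanded first.
    query = re.sub(
        r"\$__timeFilter\(\s*([^)]+?)\s*\)",
        lambda m: f"{m.group(1)} BETWEEN {start} AND {end}",
        query,
    )
    query = re.sub(
        r"\$__timeWindow\(\s*([^)]+?)\s*\)",
        lambda m: f"window({m.group(1)}, '{interval}')",
        query,
    )
    # Argument-free macros are simple string replacements.
    return query.replace("$__timeFrom", start).replace("$__timeTo", end)


if __name__ == "__main__":
    sql = "SELECT * FROM t WHERE $__timeFilter(ts) GROUP BY $__timeWindow(ts)"
    print(expand_macros(sql, datetime(2021, 12, 31, 23, 0, 0),
                        datetime(2022, 1, 1, 22, 59, 59), "2 HOURS"))
```

With the inputs above, the printed query filters on `ts BETWEEN '2021-12-31 23:00:00' AND '2022-01-01 22:59:59'` and groups by `window(ts, '2 HOURS')`, matching the examples in the table.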

Write a query

Use the query editor to write a query. You can use Spark SQL syntax as described in the Databricks SQL Reference.

Long to Wide Transformation

By default, the plugin will return the results in wide format. This behavior can be changed in the advanced options of the query editor.
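
For readers unfamiliar with the terms, the sketch below illustrates the difference between the two shapes: long format has one row per (time, label) pair, while wide format has one row per timestamp and one column per label. This is a plain-Python illustration under those assumptions, not the plugin's own conversion. The function `long_to_wide` and the sample column names are hypothetical, and the plugin may name columns and handle missing values differently.

```python
def long_to_wide(rows, time_key, label_key, value_key):
    """Pivot long rows into one row per timestamp with one column per label."""
    labels = []    # labels in first-seen order
    by_time = {}   # timestamp -> {label: value}
    for row in rows:
        label = row[label_key]
        if label not in labels:
            labels.append(label)
        by_time.setdefault(row[time_key], {})[label] = row[value_key]
    # Missing (timestamp, label) combinations are filled with None.
    return [
        {time_key: t, **{label: by_time[t].get(label) for label in labels}}
        for t in sorted(by_time)
    ]


if __name__ == "__main__":
    long_rows = [
        {"start": "2022-01-01 00:00", "o_orderstatus": "F", "avg_price": 10.0},
        {"start": "2022-01-01 00:00", "o_orderstatus": "O", "avg_price": 12.0},
        {"start": "2022-01-01 02:00", "o_orderstatus": "F", "avg_price": 11.0},
    ]
    for wide_row in long_to_wide(long_rows, "start", "o_orderstatus", "avg_price"):
        print(wide_row)
```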

Code Auto Completion

Auto Completion for the code editor is still in development. Basic functionality is implemented, but might not always work perfectly. When enabled, the editor will make requests to Databricks while typing to get the available catalogs, schemas, tables and columns. Only the tables present in the current query will be fetched. Additionally, the editor will also make suggestions for Databricks SQL functions & keywords and Grafana macros.

The feature can be enabled in the Datasource Settings.

Examples

Single Value Time Series

SELECT $__time(time_column), avg(value_column)
FROM catalog.default.table_name 
WHERE $__timeFilter(time_column) 
GROUP BY $__timeWindow(time_column);

Multiple Values Time Series

SELECT window.start, avg(o_totalprice), o_orderstatus
FROM samples.tpch.orders
WHERE $__timeFilter(o_orderdate)
GROUP BY $__timeWindow(o_orderdate), o_orderstatus
ORDER BY start ASC;

Development

What is Grafana Data Source Backend Plugin?

Grafana supports a wide range of data sources, including Prometheus, MySQL, and even Datadog. There’s a good chance you can already visualize metrics from the systems you have set up. In some cases, though, you already have an in-house metrics solution that you’d like to add to your Grafana dashboards. Grafana Data Source Plugins enable integrating such solutions with Grafana.

For more information about backend plugins, refer to the documentation on Backend plugins.

Getting started

A data source backend plugin consists of both frontend and backend components.

Frontend

  1. Install dependencies

    yarn install
  2. Build plugin in development mode or run in watch mode

    yarn dev

    or

    yarn watch
  3. Build plugin in production mode

    yarn build

Backend

  1. Update Grafana plugin SDK for Go dependency to the latest minor version:

    go get -u github.com/grafana/grafana-plugin-sdk-go
    go mod tidy
  2. Build backend plugin binaries for Linux, Windows and Darwin:

    mage -v
  3. List all available Mage targets for additional commands:

    mage -l
