Building the Druid Index for the TPCH Dataset using the Local Indexing Service.

This assumes that you have set up Druid and created the denormalized TPCH dataset.

The following describes how to use the Druid Indexing Service in local mode. Use this procedure when indexing a small dataset in a development environment. For a production environment, use the HadoopDruidIndexer.

Ensure that the Druid overlord service is running. Then issue a POST like the following:

curl -X 'POST' -H 'Content-Type:application/json' \
-d @/Users/hbutani/sparkline/tpch-spark-druid/druid/tpch_index_task.json \
localhost:8090/druid/indexer/v1/task

The overlord listens on port 8090 and accepts indexing tasks posted to it. The index JSON in this case points to the TPCH datascale1 denormalized dataset.
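The same submission can be scripted, which is convenient when the task has to be re-posted or watched from a terminal. The following is a minimal sketch in Python using only the standard library. The endpoint path comes from the curl command above. The helper names, the `{"task": ...}` reply to the POST, and the `/status` resource with its nested `status` field are assumptions based on the Druid overlord HTTP API and may differ between Druid versions.

```python
import json
import time
import urllib.parse
import urllib.request

# Endpoint path taken from the curl command above.
TASK_PATH = "/druid/indexer/v1/task"


def task_url(overlord):
    """URL that accepts new indexing tasks, e.g. for 'localhost:8090'."""
    return "http://" + overlord + TASK_PATH


def status_url(overlord, task_id):
    """URL of the status resource for one task (assumed overlord API)."""
    return task_url(overlord) + "/" + urllib.parse.quote(task_id, safe="") + "/status"


def submit_task(overlord, spec_path):
    """POST the task JSON file to the overlord and return the task id.

    Assumes the overlord replies with a JSON object of the form {"task": "<id>"}.
    """
    with open(spec_path, "rb") as f:
        body = f.read()
    req = urllib.request.Request(
        task_url(overlord),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["task"]


def extract_state(status_doc):
    """Return the state string from a status reply.

    Assumes the shape {"task": ..., "status": {"status": "RUNNING" | ...}}.
    """
    return status_doc["status"]["status"]


def wait_for_task(overlord, task_id, poll_seconds=60):
    """Poll until the task leaves the RUNNING state; return the final state."""
    while True:
        with urllib.request.urlopen(status_url(overlord, task_id)) as resp:
            state = extract_state(json.load(resp))
        if state != "RUNNING":
            return state
        time.sleep(poll_seconds)
```

With the overlord running, `submit_task("localhost:8090", "tpch_index_task.json")` would return the task id, and `wait_for_task("localhost:8090", task_id)` would block until the task finishes. The spec path here is a placeholder for wherever the task JSON lives on your machine.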

The status of the indexing task can be viewed in the overlord console. Note that the local indexing service takes several hours to index even the datascale1 TPCH dataset. For development purposes, consider indexing only a small sample or subset of the datascale1 dataset.
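One way to produce such a sample is to copy a random fraction of the records into a smaller file and point the index task JSON at that file instead. The sketch below assumes the denormalized dataset is stored as line-oriented text with one record per line. That is an assumption about your data layout, and the function name and parameters are hypothetical.

```python
import random


def sample_lines(src_path, dst_path, fraction, seed=42, keep_header=False):
    """Copy roughly `fraction` of the lines in src_path to dst_path.

    Assumes one record per line. Returns the number of data lines written
    (a kept header line is not counted).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    written = 0
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for i, line in enumerate(src):
            if keep_header and i == 0:
                dst.write(line)  # always keep the header row if requested
                continue
            if rng.random() < fraction:
                dst.write(line)
                written += 1
    return written
```

For example, `sample_lines(full_path, sample_path, 0.01)` would keep about one percent of the records. The file is streamed line by line, so memory use stays small regardless of the dataset size.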

