Fixes for new docs (#6167)
jon-wei authored and fjy committed Aug 13, 2018
1 parent c15ceda commit 3690cca
Showing 18 changed files with 126 additions and 127 deletions.
10 changes: 5 additions & 5 deletions docs/_redirects.json
@@ -91,11 +91,11 @@
{"source": "comparisons/druid-vs-hadoop.html", "target": "druid-vs-sql-on-hadoop.html"},
{"source": "comparisons/druid-vs-impala-or-shark.html", "target": "druid-vs-sql-on-hadoop.html"},
{"source": "comparisons/druid-vs-vertica.html", "target": "druid-vs-redshift.html"},
{"source": "configuration/broker.html", "target": "configuration/index.html#broker"},
{"source": "configuration/caching.html", "target": "configuration/index.html#cache-configuration"},
{"source": "configuration/coordinator.html", "target": "configuration/index.html#coordinator"},
{"source": "configuration/historical.html", "target": "configuration/index.html#historical"},
{"source": "configuration/indexing-service.html", "target": "configuration/index.html#overlord"},
{"source": "configuration/broker.html", "target": "../configuration/index.html#broker"},
{"source": "configuration/caching.html", "target": "../configuration/index.html#cache-configuration"},
{"source": "configuration/coordinator.html", "target": "../configuration/index.html#coordinator"},
{"source": "configuration/historical.html", "target": "../configuration/index.html#historical"},
{"source": "configuration/indexing-service.html", "target": "../configuration/index.html#overlord"},
{"source": "configuration/simple-cluster.html", "target": "../tutorials/cluster.html"},
{"source": "design/concepts-and-terminology.html", "target": "index.html"},
{"source": "development/approximate-histograms.html", "target": "extensions-core/approximate-histograms.html"},
3 changes: 2 additions & 1 deletion docs/content/configuration/index.md
@@ -7,8 +7,9 @@ layout: doc_page
This page documents all of the configuration properties for each Druid service type.

## Table of Contents
* [Recommended Configuration File Organization](#recommended-configuration-file-organization)
* [Common configurations](#common-configurations)
* [JVM Configuration Best Practices](#jvm-configuration-best-practices]
* [JVM Configuration Best Practices](#jvm-configuration-best-practices)
* [Extensions](#extensions)
* [Modules](#modules)
* [Zookeeper](#zookeper)
2 changes: 1 addition & 1 deletion docs/content/ingestion/batch-ingestion.md
@@ -8,7 +8,7 @@ Druid can load data from static files through a variety of methods described her

## Native Batch Ingestion

Druid has built-in batch ingestion functionality. See [here](../ingestion/native_tasks.html) for more info.
Druid has built-in batch ingestion functionality. See [here](../ingestion/native-batch.html) for more info.
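As a quick sketch of what using it looks like, a native batch task spec is submitted to the Overlord's task endpoint; the spec file name and port below follow the quickstart tutorials and are only illustrative defaults:

```bash
# Submit a native batch ("index") ingestion spec to the Overlord.
# examples/wikipedia-index.json and port 8090 are quickstart defaults, not requirements.
curl -X 'POST' -H 'Content-Type:application/json' \
  -d @examples/wikipedia-index.json \
  http://localhost:8090/druid/indexer/v1/task
```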

## Hadoop Batch Ingestion

2 changes: 1 addition & 1 deletion docs/content/ingestion/overview.md
@@ -153,7 +153,7 @@ the best one for your situation.

|Method|How it works|Can append and overwrite?|Can handle late data?|Exactly-once ingestion?|Real-time queries?|
|------|------------|-------------------------|---------------------|-----------------------|------------------|
|[Native batch](native_tasks.html)|Druid loads data directly from S3, HTTP, NFS, or other networked storage.|Append or overwrite|Yes|Yes|No|
|[Native batch](native-batch.html)|Druid loads data directly from S3, HTTP, NFS, or other networked storage.|Append or overwrite|Yes|Yes|No|
|[Hadoop](hadoop.html)|Druid launches Hadoop Map/Reduce jobs to load data files.|Append or overwrite|Yes|Yes|No|
|[Kafka indexing service](../development/extensions-core/kafka-ingestion.html)|Druid reads directly from Kafka.|Append only|Yes|Yes|Yes|
|[Tranquility](stream-push.html)|You use Tranquility, a client side library, to push individual records into Druid.|Append only|No - late data is dropped|No - may drop or duplicate data|Yes|
7 changes: 4 additions & 3 deletions docs/content/toc.md
@@ -18,12 +18,13 @@ layout: toc
* [Tutorial: Loading stream data using HTTP push](/docs/VERSION/tutorials/tutorial-tranquility.html)
* [Tutorial: Querying data](/docs/VERSION/tutorials/tutorial-query.html)
* [Further tutorials](/docs/VERSION/tutorials/advanced.html)
* [Tutorial: Rollup](/docs/VERSION/tutorials/rollup.html)
* [Tutorial: Rollup](/docs/VERSION/tutorials/tutorial-rollup.html)
* [Tutorial: Configuring retention](/docs/VERSION/tutorials/tutorial-retention.html)
* [Tutorial: Updating existing data](/docs/VERSION/tutorials/tutorial-update-data.html)
* [Tutorial: Compacting segments](/docs/VERSION/tutorials/tutorial-compaction.html)
* [Tutorial: Deleting data](/docs/VERSION/tutorials/tutorial-delete-data.html)
* [Tutorial: Writing your own ingestion specs](/docs/VERSION/tutorials/tutorial-ingestion-spec.html)
* [Tutorial: Transforming input data](/docs/VERSION/tutorials/tutorial-transform-spec.html)
* [Clustering](/docs/VERSION/tutorials/cluster.html)

## Data Ingestion
@@ -33,8 +34,8 @@ layout: toc
* [Schema Design](/docs/VERSION/ingestion/schema-design.html)
* [Schema Changes](/docs/VERSION/ingestion/schema-changes.html)
* [Batch File Ingestion](/docs/VERSION/ingestion/batch-ingestion.html)
* [Native Batch Ingestion](docs/VERSION/ingestion/native-batch.html)
* [Hadoop Batch Ingestion](docs/VERSION/ingestion/hadoop.html)
* [Native Batch Ingestion](/docs/VERSION/ingestion/native-batch.html)
* [Hadoop Batch Ingestion](/docs/VERSION/ingestion/hadoop.html)
* [Stream Ingestion](/docs/VERSION/ingestion/stream-ingestion.html)
* [Stream Push](/docs/VERSION/ingestion/stream-push.html)
* [Stream Pull](/docs/VERSION/ingestion/stream-pull.html)
15 changes: 8 additions & 7 deletions docs/content/tutorials/index.md
@@ -50,7 +50,7 @@ Before proceeding, please download the [tutorial examples package](../tutorials/

This tarball contains sample data and ingestion specs that will be used in the tutorials.

```
```bash
curl -O http://druid.io/docs/#{DRUIDVERSION}/tutorials/tutorial-examples.tar.gz
tar zxvf tutorial-examples.tar.gz
```
@@ -98,7 +98,8 @@ Later on, if you'd like to stop the services, CTRL-C to exit from the running ja
want a clean start after stopping the services, delete the `log` and `var` directory and run the `init` script again.

From the druid-#{DRUIDVERSION} directory:
```

```bash
rm -rf log
rm -rf var
bin/init
@@ -134,7 +135,7 @@ The sample data has the following columns, and an example event is shown below:
* regionName
* user

```
```json
{
"timestamp":"2015-09-12T20:03:45.018Z",
"channel":"#en.wikipedia",
@@ -164,18 +165,18 @@ The following tutorials demonstrate various methods of loading data into Druid,

This tutorial demonstrates how to perform a batch file load, using Druid's native batch ingestion.

### [Tutorial: Loading stream data from Kafka](../tutorial-kafka.html)
### [Tutorial: Loading stream data from Kafka](./tutorial-kafka.html)

This tutorial demonstrates how to load streaming data from a Kafka topic.

### [Tutorial: Loading a file using Hadoop](../tutorial-batch-hadoop.html)
### [Tutorial: Loading a file using Hadoop](./tutorial-batch-hadoop.html)

This tutorial demonstrates how to perform a batch file load, using a remote Hadoop cluster.

### [Tutorial: Loading data using Tranquility](../tutorial-tranquility.html)
### [Tutorial: Loading data using Tranquility](./tutorial-tranquility.html)

This tutorial demonstrates how to load streaming data by pushing events to Druid using the Tranquility service.

### [Tutorial: Writing your own ingestion spec](../tutorial-ingestion-spec.html)
### [Tutorial: Writing your own ingestion spec](./tutorial-ingestion-spec.html)

This tutorial demonstrates how to write a new ingestion spec and use it to load data.
26 changes: 13 additions & 13 deletions docs/content/tutorials/tutorial-batch-hadoop.md
@@ -20,9 +20,9 @@ For this tutorial, we've provided a Dockerfile for a Hadoop 2.7.3 cluster, which

This Dockerfile and related files are located at `examples/hadoop/docker`.

From the druid-${DRUIDVERSION} package root, run the following commands to build a Docker image named "druid-hadoop-demo" with version tag "2.7.3":
From the druid-#{DRUIDVERSION} package root, run the following commands to build a Docker image named "druid-hadoop-demo" with version tag "2.7.3":

```
```bash
cd examples/hadoop/docker
docker build -t druid-hadoop-demo:2.7.3 .
```
@@ -37,7 +37,7 @@ We'll need a shared folder between the host and the Hadoop container for transfe

Let's create some folders under `/tmp`, we will use these later when starting the Hadoop container:

```
```bash
mkdir -p /tmp/shared
mkdir -p /tmp/shared/hadoop-xml
```
@@ -54,13 +54,13 @@ On the host machine, add the following entry to `/etc/hosts`:

Once the `/tmp/shared` folder has been created and the `etc/hosts` entry has been added, run the following command to start the Hadoop container.

```
```bash
docker run -it -h druid-hadoop-demo -p 50010:50010 -p 50020:50020 -p 50075:50075 -p 50090:50090 -p 8020:8020 -p 10020:10020 -p 19888:19888 -p 8030:8030 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 8040:8040 -p 8042:8042 -p 8088:8088 -p 8443:8443 -p 2049:2049 -p 9000:9000 -p 49707:49707 -p 2122:2122 -p 34455:34455 -v /tmp/shared:/shared druid-hadoop-demo:2.7.3 /etc/bootstrap.sh -bash
```

Once the container is started, your terminal will attach to a bash shell running inside the container:

```
```bash
Starting sshd: [ OK ]
18/07/26 17:27:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [druid-hadoop-demo]
@@ -80,17 +80,17 @@ The `Unable to load native-hadoop library for your platform... using builtin-jav

### Copy input data to the Hadoop container

From the druid-${DRUIDVERSION} package root on the host, copy the `quickstart/wikiticker-2015-09-12-sampled.json.gz` sample data to the shared folder:
From the druid-#{DRUIDVERSION} package root on the host, copy the `quickstart/wikiticker-2015-09-12-sampled.json.gz` sample data to the shared folder:

```
```bash
cp quickstart/wikiticker-2015-09-12-sampled.json.gz /tmp/shared/wikiticker-2015-09-12-sampled.json.gz
```

### Setup HDFS directories

In the Hadoop container's shell, run the following commands to setup the HDFS directories needed by this tutorial and copy the input data to HDFS.

```
```bash
cd /usr/local/hadoop/bin
./hadoop fs -mkdir /druid
./hadoop fs -mkdir /druid/segments
@@ -113,13 +113,13 @@ Some additional steps are needed to configure the Druid cluster for Hadoop batch

From the Hadoop container's shell, run the following command to copy the Hadoop .xml configuration files to the shared folder:

```
```bash
cp /usr/local/hadoop/etc/hadoop/*.xml /shared/hadoop-xml
```

From the host machine, run the following, where {PATH_TO_DRUID} is replaced by the path to the Druid package.

```
```bash
cp /tmp/shared/hadoop-xml/*.xml {PATH_TO_DRUID}/examples/conf/druid/_common/hadoop-xml/
```

@@ -201,14 +201,14 @@ indicating "fully available": [http://localhost:8081/#/](http://localhost:8081/#
Your data should become fully available within a minute or two after the task completes. You can monitor this process on
your Coordinator console at [http://localhost:8081/#/](http://localhost:8081/#/).
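If you prefer the command line, the Coordinator also exposes a load-status endpoint that reports how much of each datasource is loaded; the port below assumes the quickstart's default Coordinator configuration:

```bash
# Returns the percentage of each datasource's segments that are loaded and queryable.
curl http://localhost:8081/druid/coordinator/v1/loadstatus
```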

Please follow the [query tutorial](../tutorial/tutorial-query.html) to run some example queries on the newly loaded data.
Please follow the [query tutorial](../tutorials/tutorial-query.html) to run some example queries on the newly loaded data.

## Cleanup

This tutorial is only meant to be used together with the [query tutorial](../tutorial/tutorial-query.html).
This tutorial is only meant to be used together with the [query tutorial](../tutorials/tutorial-query.html).

If you wish to go through any of the other tutorials, you will need to:
* Shut down the cluster and reset the cluster state by following the [reset instructions](index.html#resetting-the-cluster).
* Shut down the cluster and reset the cluster state by following the [reset instructions](index.html#resetting-cluster-state).
* Revert the deep storage and task storage config back to local types in `examples/conf/druid/_common/common.runtime.properties`
* Restart the cluster

6 changes: 3 additions & 3 deletions docs/content/tutorials/tutorial-batch.md
@@ -19,7 +19,7 @@ A data load is initiated by submitting an *ingestion task* spec to the Druid ove
We have provided an ingestion spec at `examples/wikipedia-index.json`, shown here for convenience,
which has been configured to read the `quickstart/wikiticker-2015-09-12-sampled.json.gz` input file:

```
```json
{
"type" : "index",
"spec" : {
@@ -121,11 +121,11 @@ indicating "fully available": [http://localhost:8081/#/](http://localhost:8081/#
Your data should become fully available within a minute or two. You can monitor this process on
your Coordinator console at [http://localhost:8081/#/](http://localhost:8081/#/).

Once the data is loaded, please follow the [query tutorial](../tutorial/tutorial-query.html) to run some example queries on the newly loaded data.
Once the data is loaded, please follow the [query tutorial](../tutorials/tutorial-query.html) to run some example queries on the newly loaded data.

## Cleanup

If you wish to go through any of the other ingestion tutorials, you will need to reset the cluster and follow these [reset instructions](index.html#resetting-the-cluster), as the other tutorials will write to the same "wikipedia" datasource.
If you wish to go through any of the other ingestion tutorials, you will need to reset the cluster and follow these [reset instructions](index.html#resetting-cluster-state), as the other tutorials will write to the same "wikipedia" datasource.

## Further reading

12 changes: 6 additions & 6 deletions docs/content/tutorials/tutorial-compaction.md
@@ -11,15 +11,15 @@ Because there is some per-segment memory and processing overhead, it can sometim
For this tutorial, we'll assume you've already downloaded Druid as described in
the [single-machine quickstart](index.html) and have it running on your local machine.

It will also be helpful to have finished [Tutorial: Loading a file](/docs/VERSION/tutorials/tutorial-batch.html) and [Tutorial: Querying data](/docs/VERSION/tutorials/tutorial-query.html).
It will also be helpful to have finished [Tutorial: Loading a file](../tutorials/tutorial-batch.html) and [Tutorial: Querying data](../tutorials/tutorial-query.html).

## Load the initial data

For this tutorial, we'll be using the Wikipedia edits sample data, with an ingestion task spec that will create a separate segment for each hour in the input data.

The ingestion spec can be found at `examples/compaction-init-index.json`. Let's submit that spec, which will create a datasource called `compaction-tutorial`:

```
```bash
curl -X 'POST' -H 'Content-Type:application/json' -d @examples/compaction-init-index.json http://localhost:8090/druid/indexer/v1/task
```

@@ -35,7 +35,7 @@ Running a COUNT(*) query on this datasource shows that there are 39,244 rows:
curl -X 'POST' -H 'Content-Type:application/json' -d @examples/compaction-count-sql.json http://localhost:8082/druid/v2/sql
```

```
```json
[{"EXPR$0":39244}]
```
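For reference, the Druid SQL endpoint accepts a JSON body with a `query` field; the inline payload below only illustrates that shape and is not necessarily the contents of the packaged `examples/compaction-count-sql.json`:

```bash
# Illustrative inline equivalent of the packaged SQL query file (the actual file may differ).
curl -X 'POST' -H 'Content-Type:application/json' \
  -d '{"query":"SELECT COUNT(*) FROM \"compaction-tutorial\""}' \
  http://localhost:8082/druid/v2/sql
```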

Expand All @@ -45,7 +45,7 @@ Let's now combine these 24 segments into one segment.

We have included a compaction task spec for this tutorial datasource at `examples/compaction-final-index.json`:

```
```json
{
"type": "compact",
"dataSource": "compaction-tutorial",
@@ -67,7 +67,7 @@ In this tutorial example, only one compacted segment will be created, as the 392

Let's submit this task now:

```
```json
curl -X 'POST' -H 'Content-Type:application/json' -d @examples/compaction-final-index.json http://localhost:8090/druid/indexer/v1/task
```

@@ -88,7 +88,7 @@ Let's try running a COUNT(*) on `compaction-tutorial` again, where the row count
curl -X 'POST' -H 'Content-Type:application/json' -d @examples/compaction-count-sql.json http://localhost:8082/druid/v2/sql
```

```
```json
[{"EXPR$0":39244}]
```

16 changes: 8 additions & 8 deletions docs/content/tutorials/tutorial-delete-data.md
@@ -9,15 +9,15 @@ This tutorial demonstrates how to delete existing data.
For this tutorial, we'll assume you've already downloaded Druid as described in
the [single-machine quickstart](index.html) and have it running on your local machine.

Completing [Tutorial: Configuring retention](/docs/VERSION/tutorials/tutorial-retention.html) first is highly recommended, as we will be using retention rules in this tutorial.
Completing [Tutorial: Configuring retention](../tutorials/tutorial-retention.html) first is highly recommended, as we will be using retention rules in this tutorial.

## Load initial data

In this tutorial, we will use the Wikipedia edits data, with an indexing spec that creates hourly segments. This spec is located at `examples/deletion-index.json`, and it creates a datasource called `deletion-tutorial`.

Let's load this initial data:

```
```bash
curl -X 'POST' -H 'Content-Type:application/json' -d @examples/deletion-index.json http://localhost:8090/druid/indexer/v1/task
```

@@ -48,9 +48,9 @@ In the `rule #2` box at the bottom, click `Drop` and `Forever`.

This will cause the first 12 segments of `deletion-tutorial` to be dropped. However, these dropped segments are not removed from deep storage.

You can see that all 24 segments are still present in deep storage by listing the contents of `druid-{DRUIDVERSION}/var/druid/segments/deletion-tutorial`:
You can see that all 24 segments are still present in deep storage by listing the contents of `var/druid/segments/deletion-tutorial`:

```
```bash
$ ls -l1 var/druid/segments/deletion-tutorial/
2015-09-12T00:00:00.000Z_2015-09-12T01:00:00.000Z
2015-09-12T01:00:00.000Z_2015-09-12T02:00:00.000Z
@@ -90,7 +90,7 @@ The top of the info box shows the full segment ID, e.g. `deletion-tutorial_2016-

Let's disable the hour 14 segment by sending the following DELETE request to the coordinator, where {SEGMENT-ID} is the full segment ID shown in the info box:

```
```bash
curl -XDELETE http://localhost:8081/druid/coordinator/v1/datasources/deletion-tutorial/segments/{SEGMENT-ID}
```

@@ -100,7 +100,7 @@ After that command completes, you should see that the segment for hour 14 has be

Note that the hour 14 segment is still in deep storage:

```
```bash
$ ls -l1 var/druid/segments/deletion-tutorial/
2015-09-12T00:00:00.000Z_2015-09-12T01:00:00.000Z
2015-09-12T01:00:00.000Z_2015-09-12T02:00:00.000Z
@@ -134,13 +134,13 @@ Now that we have disabled some segments, we can submit a Kill Task, which will d

A Kill Task spec has been provided at `examples/deletion-kill.json`. Submit this task to the Overlord with the following command:

```
```bash
curl -X 'POST' -H 'Content-Type:application/json' -d @examples/deletion-kill.json http://localhost:8090/druid/indexer/v1/task
```
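A kill task spec is a small JSON document naming the datasource and the interval whose disabled segments should be removed from deep storage; the inline body below is an illustrative sketch and may not match the packaged `examples/deletion-kill.json`:

```bash
# Illustrative kill task: permanently removes disabled segments of "deletion-tutorial"
# within the given interval. The interval shown here is an assumption, not the packaged file's value.
curl -X 'POST' -H 'Content-Type:application/json' \
  -d '{"type":"kill","dataSource":"deletion-tutorial","interval":"2015-09-12/2015-09-13"}' \
  http://localhost:8090/druid/indexer/v1/task
```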

After this task completes, you can see that the disabled segments have now been removed from deep storage:

```
```bash
$ ls -l1 var/druid/segments/deletion-tutorial/
2015-09-12T12:00:00.000Z_2015-09-12T13:00:00.000Z
2015-09-12T13:00:00.000Z_2015-09-12T14:00:00.000Z