From 9cbc4b209ca13e54aeff634f7c1870f5770d44bf Mon Sep 17 00:00:00 2001
From: Felix Hennig
Date: Wed, 11 Sep 2024 15:59:51 +0200
Subject: [PATCH 1/2] Add descriptions

---
 .../pages/getting_started/first_steps.adoc | 21 ++++++++++++-------
 .../druid/pages/getting_started/index.adoc | 1 +
 .../pages/getting_started/installation.adoc | 1 +
 docs/modules/druid/pages/index.adoc | 2 +-
 .../pages/required-external-components.adoc | 9 ++++++--
 ...nfiguration-and-environment-overrides.adoc | 1 +
 .../druid/pages/usage-guide/deep-storage.adoc | 3 ++-
 .../druid/pages/usage-guide/extensions.adoc | 1 +
 .../druid/pages/usage-guide/ingestion.adoc | 1 +
 .../pages/usage-guide/listenerclass.adoc | 1 +
 .../druid/pages/usage-guide/logging.adoc | 1 +
 .../druid/pages/usage-guide/monitoring.adoc | 1 +
 .../usage-guide/resources-and-storage.adoc | 1 +
 .../druid/pages/usage-guide/security.adoc | 1 +
 14 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/docs/modules/druid/pages/getting_started/first_steps.adoc b/docs/modules/druid/pages/getting_started/first_steps.adoc
index 056cb208..f29dacfd 100644
--- a/docs/modules/druid/pages/getting_started/first_steps.adoc
+++ b/docs/modules/druid/pages/getting_started/first_steps.adoc
@@ -1,6 +1,8 @@
= First steps
+:description: Set up a Druid cluster using the Stackable Operator by installing ZooKeeper, HDFS, and Druid. Ingest and query example data via the web UI or API.

-After going through the xref:getting_started/installation.adoc[] section and having installed all the Operators, you will now deploy a Druid cluster and it's dependencies. Afterwards you can <<_verify_that_it_works, verify that it works>> by ingesting example data and subsequently query it.
+After going through the xref:getting_started/installation.adoc[] section and having installed all the Operators, you will now deploy a Druid cluster and it's dependencies.
+Afterwards you can <<_verify_that_it_works, verify that it works>> by ingesting example data and subsequently query it.

== Setup

@@ -10,7 +12,8 @@ Three things need to be installed to have a Druid cluster:
* An HDFS instance to be used as a backend for deep storage
* The Druid cluster itself

-We will create them in this order, each one is created by applying a manifest file. The Operators you just installed will then create the resources according to the manifest.
+We will create them in this order; each one is created by applying a manifest file.
+The Operators you just installed will then create the resources according to the manifest.

=== ZooKeeper

@@ -61,11 +64,13 @@ include::example$getting_started/getting_started.sh[tag=install-druid]

This will create the actual druid instance.

-WARNING: This Druid instance uses Derby (`dbType: derby`) as a metadata store, which is an interal SQL database. It is not persisted and not suitable for production use! Consult the https://druid.apache.org/docs/latest/dependencies/metadata-storage.html#available-metadata-stores[Druid documentation] for a list of supported databases and setup instructions for production instances.
+WARNING: This Druid instance uses Derby (`dbType: derby`) as a metadata store, which is an internal SQL database.
+It is not persisted and not suitable for production use!
+Consult the https://druid.apache.org/docs/latest/dependencies/metadata-storage.html#available-metadata-stores[Druid documentation] for a list of supported databases and setup instructions for production instances.
== Verify that it works -Next you will submit an ingestion job and then query the ingested data - either through the web interface or the API. +Next you will submit an ingestion job and then query the ingested data - either through the web interface or the API. First, make sure that all the Pods in the StatefulSets are ready: @@ -97,7 +102,8 @@ include::example$getting_started/getting_started.sh[tag=port-forwarding] === Ingest example data -Next, we will ingest some example data using the web interface. If you prefer to use the command line instead, follow the instructions in the collapsed section below. +Next, we will ingest some example data using the web interface. +If you prefer to use the command line instead, follow the instructions in the collapsed section below. [#ingest-cmd-line] @@ -128,7 +134,8 @@ Now load the example data: image::getting_started/load_example.png[] -Click through all pages of the load process. You can also follow the https://druid.apache.org/docs/latest/tutorials/index.html#step-4-load-data[Druid Quickstart Guide]. +Click through all pages of the load process. +You can also follow the https://druid.apache.org/docs/latest/tutorials/index.html#step-4-load-data[Druid Quickstart Guide]. Once you finished the ingestion dialog you should see the ingestion overview with the job, which will eventually show SUCCESS: @@ -136,7 +143,7 @@ image::getting_started/load_success.png[] === Query the data -Query from the user interface by navigating to the "Query" interface in the menu and query the `wikipedia` table: +Query from the user interface by navigating to the "Query" interface in the menu and query the `wikipedia` table: [#query-cmd-line] .Alternative: Using the command line diff --git a/docs/modules/druid/pages/getting_started/index.adoc b/docs/modules/druid/pages/getting_started/index.adoc index f3cc0c5f..b353bb1e 100644 --- a/docs/modules/druid/pages/getting_started/index.adoc +++ b/docs/modules/druid/pages/getting_started/index.adoc @@ -1,4 +1,5 @@ = Getting started +:description: Get started with Druid on Kubernetes using the Stackable Operator. Follow steps to install, configure, and query data. This guide will get you started with Druid using the Stackable Operator. It will guide you through the installation of the Operator and its dependencies, setting up your first Druid instance and connecting to it, ingesting example data and querying that data. diff --git a/docs/modules/druid/pages/getting_started/installation.adoc b/docs/modules/druid/pages/getting_started/installation.adoc index ca429690..07f28129 100644 --- a/docs/modules/druid/pages/getting_started/installation.adoc +++ b/docs/modules/druid/pages/getting_started/installation.adoc @@ -1,4 +1,5 @@ = Installation +:description: Install the Stackable Druid Operator and its dependencies on Kubernetes using stackablectl or Helm. On this page you will install the Stackable Druid Operator and Operators for its dependencies - ZooKeeper and HDFS - as well as the commons, secret and listener operator which are required by all Stackable Operators. diff --git a/docs/modules/druid/pages/index.adoc b/docs/modules/druid/pages/index.adoc index 630374d6..89f52d3b 100644 --- a/docs/modules/druid/pages/index.adoc +++ b/docs/modules/druid/pages/index.adoc @@ -1,5 +1,5 @@ = Stackable Operator for Apache Druid -:description: The Stackable Operator for Apache Druid is a Kubernetes operator that can manage Apache Druid clusters. 
Learn about its features, resources, dependencies, and demos, and see the list of supported Druid versions.
+:description: The Stackable Operator for Apache Druid is a Kubernetes operator that manages Druid clusters, handling setup, dependencies, and integration with tools like Trino.
:keywords: Stackable Operator, Apache Druid, Kubernetes, operator, DevOps, CRD, ZooKeeper, HDFS, S3, Kafka, Trino, OPA
:github: https://github.com/stackabletech/druid-operator/
:crd: {crd-docs-base-url}/druid-operator/{crd-docs-version}/
diff --git a/docs/modules/druid/pages/required-external-components.adoc b/docs/modules/druid/pages/required-external-components.adoc
index 2d9bcf38..c040b22c 100644
--- a/docs/modules/druid/pages/required-external-components.adoc
+++ b/docs/modules/druid/pages/required-external-components.adoc
@@ -1,9 +1,14 @@
# Required external components
+:description: Druid requires an SQL database for metadata and supports various deep storage options like S3, HDFS, and cloud storage.

-Druid uses an SQL database to store metadata. Consult the https://druid.apache.org/docs/latest/dependencies/metadata-storage.html#available-metadata-stores[Druid documentation] for a list of supported databases and setup instructions.
+Druid uses an SQL database to store metadata.
+Consult the https://druid.apache.org/docs/latest/dependencies/metadata-storage.html#available-metadata-stores[Druid documentation] for a list of supported databases and setup instructions.

## Feature specific: S3 and cloud deep storage

-https://druid.apache.org/docs/latest/dependencies/deep-storage.html[Deep storage] is where segments are stored. Druid offers multiple storage backends. For the local storage there are no prerequisites. HDFS deep storage can be set up with the xref:hdfs:index.adoc[Stackable Operator for Apache HDFS]. For S3 deep storage or the Google Cloud and Azure storage backends, you need to set up the storage.
+https://druid.apache.org/docs/latest/dependencies/deep-storage.html[Deep storage] is where segments are stored.
+Druid offers multiple storage backends. For local storage there are no prerequisites.
+HDFS deep storage can be set up with the xref:hdfs:index.adoc[Stackable Operator for Apache HDFS].
+For S3 deep storage or the Google Cloud and Azure storage backends, you need to set up the storage.

Read the xref:usage-guide/deep-storage.adoc[deep storage usage guide] to learn more about configuring Druid deep storage.
diff --git a/docs/modules/druid/pages/usage-guide/configuration-and-environment-overrides.adoc b/docs/modules/druid/pages/usage-guide/configuration-and-environment-overrides.adoc
index 41349766..9760c568 100644
--- a/docs/modules/druid/pages/usage-guide/configuration-and-environment-overrides.adoc
+++ b/docs/modules/druid/pages/usage-guide/configuration-and-environment-overrides.adoc
@@ -1,4 +1,5 @@
= Configuration & Environment Overrides
+:description: Override Druid configuration properties and environment variables per role or role group. Customize runtime.properties, jvm.config, and security.properties as needed.

The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).
diff --git a/docs/modules/druid/pages/usage-guide/deep-storage.adoc b/docs/modules/druid/pages/usage-guide/deep-storage.adoc index 4f1a6359..4ebb29c7 100644 --- a/docs/modules/druid/pages/usage-guide/deep-storage.adoc +++ b/docs/modules/druid/pages/usage-guide/deep-storage.adoc @@ -1,4 +1,5 @@ = Deep storage configuration +:description: Configure Apache Druid deep storage with HDFS or S3. Set up HDFS via a ConfigMap, or use S3 with inline or referenced bucket details. https://druid.apache.org/docs/latest/design/deep-storage/[Deep Storage] is where Druid stores data segments. For a Kubernetes environment, either the HDFS or S3 backend is recommended. @@ -19,7 +20,7 @@ spec: directory: /druid # <2> ... ---- -<1> Name of the HDFS cluster discovery config map. Can be supplied manually for a cluster not provided by Stackable. Needs to contain the `core-site.xml` and `hdfs-site.xml`. +<1> Name of the HDFS cluster discovery ConfigMap. Can be supplied manually for a cluster not provided by Stackable. Needs to contain the `core-site.xml` and `hdfs-site.xml`. <2> The directory where to store the druid data. == [[s3]]S3 diff --git a/docs/modules/druid/pages/usage-guide/extensions.adoc b/docs/modules/druid/pages/usage-guide/extensions.adoc index e48f6ba1..b4bdcbe7 100644 --- a/docs/modules/druid/pages/usage-guide/extensions.adoc +++ b/docs/modules/druid/pages/usage-guide/extensions.adoc @@ -1,6 +1,7 @@ = Druid extensions :druid-extensions: https://druid.apache.org/docs/latest/configuration/extensions/ :druid-community-extensions: https://druid.apache.org/docs/latest/configuration/extensions/#loading-community-extensions +:description: Add functionality to Druid with default or custom extensions. Default extensions include Kafka and HDFS support; community extensions require extra setup. {druid-extensions}[Druid extensions] are used to provide additional functionality at runtime, e.g. for data formats or different types of deep storage. diff --git a/docs/modules/druid/pages/usage-guide/ingestion.adoc b/docs/modules/druid/pages/usage-guide/ingestion.adoc index 1957cfa8..f715d610 100644 --- a/docs/modules/druid/pages/usage-guide/ingestion.adoc +++ b/docs/modules/druid/pages/usage-guide/ingestion.adoc @@ -1,4 +1,5 @@ = Ingestion +:description: Ingest data from S3 by specifying the host and optional credentials. Add external files to Druid pods using extra volumes for client certificates or keytabs. == [[s3]]From S3 diff --git a/docs/modules/druid/pages/usage-guide/listenerclass.adoc b/docs/modules/druid/pages/usage-guide/listenerclass.adoc index e1babfac..c2ef0d61 100644 --- a/docs/modules/druid/pages/usage-guide/listenerclass.adoc +++ b/docs/modules/druid/pages/usage-guide/listenerclass.adoc @@ -1,4 +1,5 @@ = Service exposition with ListenerClasses +:description: Configure Apache Druid service exposure using ListenerClass to control service types: cluster-internal, external-unstable, or external-stable. Apache Druid offers a web UI and an API, both are exposed by the `router` role. Other roles also expose API endpoints such as the `broker` and `coordinator`. diff --git a/docs/modules/druid/pages/usage-guide/logging.adoc b/docs/modules/druid/pages/usage-guide/logging.adoc index defa68d3..83388b81 100644 --- a/docs/modules/druid/pages/usage-guide/logging.adoc +++ b/docs/modules/druid/pages/usage-guide/logging.adoc @@ -1,4 +1,5 @@ = Log aggregation +:description: Forward logs to a Vector aggregator by enabling the log agent and specifying a discovery ConfigMap. 
The logs can be forwarded to a Vector log aggregator by providing a discovery ConfigMap for the aggregator and by enabling the log agent: diff --git a/docs/modules/druid/pages/usage-guide/monitoring.adoc b/docs/modules/druid/pages/usage-guide/monitoring.adoc index 8abe784e..df38bfcd 100644 --- a/docs/modules/druid/pages/usage-guide/monitoring.adoc +++ b/docs/modules/druid/pages/usage-guide/monitoring.adoc @@ -1,4 +1,5 @@ = Monitoring +:description: Managed Druid instances export Prometheus metrics by default for easy monitoring. The managed Druid instances are automatically configured to export Prometheus metrics. See xref:operators:monitoring.adoc[] for more details. diff --git a/docs/modules/druid/pages/usage-guide/resources-and-storage.adoc b/docs/modules/druid/pages/usage-guide/resources-and-storage.adoc index 6cd01580..086c5b10 100644 --- a/docs/modules/druid/pages/usage-guide/resources-and-storage.adoc +++ b/docs/modules/druid/pages/usage-guide/resources-and-storage.adoc @@ -1,4 +1,5 @@ = Storage and resource configuration +:description: Configure storage and resource requests for Druid with default settings for CPU, memory, and additional settings for historical segment caches. == Storage for data volumes diff --git a/docs/modules/druid/pages/usage-guide/security.adoc b/docs/modules/druid/pages/usage-guide/security.adoc index 611055f6..5850ec74 100644 --- a/docs/modules/druid/pages/usage-guide/security.adoc +++ b/docs/modules/druid/pages/usage-guide/security.adoc @@ -1,4 +1,5 @@ = Security +:description: Secure your Druid cluster with TLS encryption, LDAP, or OIDC authentication. Connect with OPA for policy-based authorization. The Druid cluster can be secured and protected in multiple ways. From 4223e6faa7f7771b0cc4aaa0b633e9e29247364f Mon Sep 17 00:00:00 2001 From: Felix Hennig Date: Thu, 12 Sep 2024 11:03:58 +0200 Subject: [PATCH 2/2] Update docs/modules/druid/pages/getting_started/first_steps.adoc Co-authored-by: Malte Sander --- docs/modules/druid/pages/getting_started/first_steps.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/druid/pages/getting_started/first_steps.adoc b/docs/modules/druid/pages/getting_started/first_steps.adoc index f29dacfd..18bb9e8c 100644 --- a/docs/modules/druid/pages/getting_started/first_steps.adoc +++ b/docs/modules/druid/pages/getting_started/first_steps.adoc @@ -1,7 +1,7 @@ = First steps :description: Set up a Druid cluster using the Stackable Operator by installing ZooKeeper, HDFS, and Druid. Ingest and query example data via the web UI or API. -After going through the xref:getting_started/installation.adoc[] section and having installed all the Operators, you will now deploy a Druid cluster and it's dependencies. +After going through the xref:getting_started/installation.adoc[] section and having installed all the Operators, you will now deploy a Druid cluster and its dependencies. Afterwards you can <<_verify_that_it_works, verify that it works>> by ingesting example data and subsequently query it. == Setup
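
The getting-started page reflowed by these patches mentions command-line alternatives for ingesting and querying the example data. As a companion sketch (not part of either patch), submitting an ingestion task to Druid's task API could look like the snippet below, assuming the router has been port-forwarded to `localhost:8888` as in the port-forwarding step and that it proxies management requests; `ingestion_spec.json` is a hypothetical file name for the task spec:

[source,bash]
----
# Submit an ingestion task to the Druid task API.
# Assumes the router is reachable on localhost:8888 via port-forwarding;
# ingestion_spec.json is a placeholder for the actual task spec file.
curl -s -X POST http://localhost:8888/druid/indexer/v1/task \
  -H 'Content-Type: application/json' \
  -d @ingestion_spec.json
----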
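
Once the ingestion job reports SUCCESS, the resulting `wikipedia` table can be queried through Druid's SQL endpoint as an alternative to the web console. A minimal sketch, under the same port-forwarding assumption (`channel` is a column in the Wikipedia example data):

[source,bash]
----
# Run a SQL query over the ingested example data via the Druid SQL API.
curl -s -X POST http://localhost:8888/druid/v2/sql \
  -H 'Content-Type: application/json' \
  -d '{"query": "SELECT channel, COUNT(*) AS edits FROM wikipedia GROUP BY channel ORDER BY edits DESC LIMIT 10"}'
----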