
Add descriptions #620

Merged · 2 commits · Sep 12, 2024
21 changes: 14 additions & 7 deletions docs/modules/druid/pages/getting_started/first_steps.adoc
@@ -1,6 +1,8 @@
= First steps
:description: Set up a Druid cluster using the Stackable Operator by installing ZooKeeper, HDFS, and Druid. Ingest and query example data via the web UI or API.

After going through the xref:getting_started/installation.adoc[] section and having installed all the Operators, you will now deploy a Druid cluster and it's dependencies. Afterwards you can <<_verify_that_it_works, verify that it works>> by ingesting example data and subsequently query it.
After going through the xref:getting_started/installation.adoc[] section and having installed all the Operators, you will now deploy a Druid cluster and its dependencies.

Check notice (GitHub Actions / LanguageTool) on line 4 of docs/modules/druid/pages/getting_started/first_steps.adoc: In American English, 'afterward' is the preferred variant; 'afterwards' is more commonly used in British English and other dialects (AFTERWARDS_US[1]). Suggestion: `Afterward`. Rule: https://community.languagetool.org/rule/show/AFTERWARDS_US?lang=en-US&subId=1
Afterwards you can <<_verify_that_it_works, verify that it works>> by ingesting example data and subsequently querying it.

== Setup

@@ -10,7 +12,8 @@
* An HDFS instance to be used as a backend for deep storage
* The Druid cluster itself

We will create them in this order, each one is created by applying a manifest file. The Operators you just installed will then create the resources according to the manifest.
We will create them in this order; each one is created by applying a manifest file.
The Operators you just installed will then create the resources according to the manifest.
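Each manifest is handed to Kubernetes in the usual way; a minimal sketch, where the file names are placeholders for the manifests shown in the following sections:

[source,bash]
----
# Apply each manifest; the matching Stackable Operator picks it up and
# creates the underlying Kubernetes resources (file names are illustrative).
kubectl apply -f zookeeper.yaml
kubectl apply -f hdfs.yaml
kubectl apply -f druid.yaml
----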

=== ZooKeeper

@@ -61,11 +64,13 @@

This will create the actual Druid instance.

WARNING: This Druid instance uses Derby (`dbType: derby`) as a metadata store, which is an interal SQL database. It is not persisted and not suitable for production use! Consult the https://druid.apache.org/docs/latest/dependencies/metadata-storage.html#available-metadata-stores[Druid documentation] for a list of supported databases and setup instructions for production instances.
WARNING: This Druid instance uses Derby (`dbType: derby`) as a metadata store, which is an internal SQL database.
It is not persisted and not suitable for production use!
Consult the https://druid.apache.org/docs/latest/dependencies/metadata-storage.html#available-metadata-stores[Druid documentation] for a list of supported databases and setup instructions for production instances.
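For production you would point Druid at an external database instead. A minimal sketch of what that could look like in the DruidCluster spec, assuming a PostgreSQL instance reachable as `postgresql`; only `dbType` is confirmed by the warning above, the remaining field names are assumptions, so check the CRD reference for the exact schema:

[source,yaml]
----
spec:
  clusterConfig:
    metadataStorageDatabase:
      dbType: postgresql                                    # instead of derby
      connString: jdbc:postgresql://postgresql:5432/druid   # example JDBC URL
      host: postgresql
      port: 5432
----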

== Verify that it works

Next you will submit an ingestion job and then query the ingested data - either through the web interface or the API.
Next you will submit an ingestion job and then query the ingested data - either through the web interface or the API.

First, make sure that all the Pods in the StatefulSets are ready:
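The exact command is in the collapsed part of this diff; as a sketch, one way to check readiness:

[source,bash]
----
# All StatefulSets should report their full READY count before continuing
kubectl get statefulsets
----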

@@ -97,7 +102,8 @@

=== Ingest example data

Next, we will ingest some example data using the web interface. If you prefer to use the command line instead, follow the instructions in the collapsed section below.
Next, we will ingest some example data using the web interface.
If you prefer to use the command line instead, follow the instructions in the collapsed section below.
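The command-line instructions themselves are collapsed in this diff. As a rough sketch, a native batch ingestion spec can be POSTed to Druid's task API through the Router, assuming a port-forward to the Router service and an `ingestion-spec.json` file (both names are illustrative):

[source,bash]
----
# Forward the Router's HTTP port locally (service name is an assumption)
kubectl port-forward svc/druid-router 8888 &

# Submit the ingestion task; Druid responds with a task ID
curl -s -X POST -H 'Content-Type: application/json' \
  -d @ingestion-spec.json \
  http://localhost:8888/druid/indexer/v1/task
----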


[#ingest-cmd-line]
@@ -128,15 +134,16 @@

image::getting_started/load_example.png[]

Click through all pages of the load process. You can also follow the https://druid.apache.org/docs/latest/tutorials/index.html#step-4-load-data[Druid Quickstart Guide].
Click through all pages of the load process.
You can also follow the https://druid.apache.org/docs/latest/tutorials/index.html#step-4-load-data[Druid Quickstart Guide].

Once you have finished the ingestion dialog, you should see the ingestion overview with the job, which will eventually show SUCCESS:

image::getting_started/load_success.png[]

=== Query the data

Query from the user interface by navigating to the "Query" interface in the menu and query the `wikipedia` table:
Query from the user interface by navigating to the "Query" interface in the menu and querying the `wikipedia` table:
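The command-line alternative below is collapsed in this diff; in general, the same query can be sent to Druid's SQL endpoint through the Router, roughly like this (a sketch, assuming the port-forward from the ingestion step is still running):

[source,bash]
----
# Query the ingested wikipedia datasource via the SQL API
curl -s -X POST -H 'Content-Type: application/json' \
  -d '{"query": "SELECT channel, COUNT(*) AS edits FROM wikipedia GROUP BY channel ORDER BY edits DESC LIMIT 5"}' \
  http://localhost:8888/druid/v2/sql
----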

[#query-cmd-line]
.Alternative: Using the command line
1 change: 1 addition & 0 deletions docs/modules/druid/pages/getting_started/index.adoc
@@ -1,4 +1,5 @@
= Getting started
:description: Get started with Druid on Kubernetes using the Stackable Operator. Follow steps to install, configure, and query data.

This guide will get you started with Druid using the Stackable Operator. It walks you through installing the Operator and its dependencies, setting up your first Druid instance, connecting to it, ingesting example data and querying that data.

1 change: 1 addition & 0 deletions docs/modules/druid/pages/getting_started/installation.adoc
@@ -1,4 +1,5 @@
= Installation
:description: Install the Stackable Druid Operator and its dependencies on Kubernetes using stackablectl or Helm.

On this page you will install the Stackable Druid Operator and Operators for its dependencies - ZooKeeper and HDFS - as well as the commons, secret and listener operators, which are required by all Stackable Operators.
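As a rough sketch of what the installation boils down to, using `stackablectl` (the operator list follows the sentence above; exact versions and flags may differ from the full instructions on this page):

[source,bash]
----
# Install the Druid operator together with its dependencies
stackablectl operator install commons secret listener zookeeper hdfs druid
----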
2 changes: 1 addition & 1 deletion docs/modules/druid/pages/index.adoc
@@ -1,5 +1,5 @@
= Stackable Operator for Apache Druid
:description: The Stackable Operator for Apache Druid is a Kubernetes operator that can manage Apache Druid clusters. Learn about its features, resources, dependencies, and demos, and see the list of supported Druid versions.
:description: The Stackable Operator for Apache Druid is a Kubernetes operator that manages Druid clusters, handling setup, dependencies, and integration with tools like Trino.
:keywords: Stackable Operator, Apache Druid, Kubernetes, operator, DevOps, CRD, ZooKeeper, HDFS, S3, Kafka, Trino, OPA
:github: https://github.com/stackabletech/druid-operator/
:crd: {crd-docs-base-url}/druid-operator/{crd-docs-version}/
9 changes: 7 additions & 2 deletions docs/modules/druid/pages/required-external-components.adoc
@@ -1,9 +1,14 @@
# Required external components
:description: Druid requires an SQL database for metadata and supports various deep storage options like S3, HDFS, and cloud storage

Druid uses an SQL database to store metadata. Consult the https://druid.apache.org/docs/latest/dependencies/metadata-storage.html#available-metadata-stores[Druid documentation] for a list of supported databases and setup instructions.
Druid uses an SQL database to store metadata.
Consult the https://druid.apache.org/docs/latest/dependencies/metadata-storage.html#available-metadata-stores[Druid documentation] for a list of supported databases and setup instructions.

## Feature specific: S3 and cloud deep storage

https://druid.apache.org/docs/latest/dependencies/deep-storage.html[Deep storage] is where segments are stored. Druid offers multiple storage backends. For the local storage there are no prerequisites. HDFS deep storage can be set up with the xref:hdfs:index.adoc[Stackable Operator for Apache HDFS]. For S3 deep storage or the Google Cloud and Azure storage backends, you need to set up the storage.
https://druid.apache.org/docs/latest/dependencies/deep-storage.html[Deep storage] is where segments are stored.
Druid offers multiple storage backends.
For local storage, there are no prerequisites.
HDFS deep storage can be set up with the xref:hdfs:index.adoc[Stackable Operator for Apache HDFS].
For S3 deep storage or the Google Cloud and Azure storage backends, you need to set up the storage.

Read the xref:usage-guide/deep-storage.adoc[deep storage usage guide] to learn more about configuring Druid deep storage.
@@ -1,4 +1,5 @@
= Configuration & Environment Overrides
:description: Override Druid configuration properties and environment variables per role or role group. Customize runtime.properties, jvm.config, and security.properties as needed.

The cluster definition also supports overriding configuration properties and environment variables, either per role or per role group, where the more specific override (role group) has precedence over the less specific one (role).
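As a minimal sketch of the precedence rule, using the `brokers` role as an example (the property name and values are illustrative):

[source,yaml]
----
spec:
  brokers:
    configOverrides:              # role level: applies to all role groups
      runtime.properties:
        druid.server.http.numThreads: "100"
    roleGroups:
      default:
        configOverrides:          # role-group level: takes precedence over the role
          runtime.properties:
            druid.server.http.numThreads: "50"
----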

3 changes: 2 additions & 1 deletion docs/modules/druid/pages/usage-guide/deep-storage.adoc
@@ -1,4 +1,5 @@
= Deep storage configuration
:description: Configure Apache Druid deep storage with HDFS or S3. Set up HDFS via a ConfigMap, or use S3 with inline or referenced bucket details.

https://druid.apache.org/docs/latest/design/deep-storage/[Deep Storage] is where Druid stores data segments.
For a Kubernetes environment, either the HDFS or S3 backend is recommended.
@@ -19,7 +20,7 @@ spec:
directory: /druid # <2>
...
----
<1> Name of the HDFS cluster discovery config map. Can be supplied manually for a cluster not provided by Stackable. Needs to contain the `core-site.xml` and `hdfs-site.xml`.
<1> Name of the HDFS cluster discovery ConfigMap. It can be supplied manually for a cluster not provided by Stackable and needs to contain the `core-site.xml` and `hdfs-site.xml`.
<2> The directory in which to store the Druid data.
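For an HDFS cluster that is not managed by Stackable, such a discovery ConfigMap could be created by hand, for example (the ConfigMap name and file paths are illustrative):

[source,bash]
----
# Bundle the Hadoop client configuration into a ConfigMap that Druid can use
kubectl create configmap hdfs-discovery \
  --from-file=core-site.xml \
  --from-file=hdfs-site.xml
----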

== [[s3]]S3
1 change: 1 addition & 0 deletions docs/modules/druid/pages/usage-guide/extensions.adoc
@@ -1,6 +1,7 @@
= Druid extensions
:druid-extensions: https://druid.apache.org/docs/latest/configuration/extensions/
:druid-community-extensions: https://druid.apache.org/docs/latest/configuration/extensions/#loading-community-extensions
:description: Add functionality to Druid with default or custom extensions. Default extensions include Kafka and HDFS support; community extensions require extra setup.

{druid-extensions}[Druid extensions] are used to provide additional functionality at runtime, e.g. for data formats or different types of deep storage.

1 change: 1 addition & 0 deletions docs/modules/druid/pages/usage-guide/ingestion.adoc
@@ -1,4 +1,5 @@
= Ingestion
:description: Ingest data from S3 by specifying the host and optional credentials. Add external files to Druid pods using extra volumes for client certificates or keytabs.

== [[s3]]From S3

1 change: 1 addition & 0 deletions docs/modules/druid/pages/usage-guide/listenerclass.adoc
@@ -1,4 +1,5 @@
= Service exposition with ListenerClasses
:description: Configure Apache Druid service exposure using ListenerClass to control service types: cluster-internal, external-unstable, or external-stable.

Apache Druid offers a web UI and an API; both are exposed by the `router` role.
Other roles also expose API endpoints such as the `broker` and `coordinator`.
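The supported values are listed in the collapsed part of this page; the general pattern is a single `listenerClass` setting in the cluster config, roughly like this (a sketch; the exact location in the spec is an assumption):

[source,yaml]
----
spec:
  clusterConfig:
    listenerClass: external-unstable   # or cluster-internal / external-stable
----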
1 change: 1 addition & 0 deletions docs/modules/druid/pages/usage-guide/logging.adoc
@@ -1,4 +1,5 @@
= Log aggregation
:description: Forward logs to a Vector aggregator by enabling the log agent and specifying a discovery ConfigMap.

The logs can be forwarded to a Vector log aggregator by providing a discovery ConfigMap for the aggregator and by enabling the log agent:
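The full example is collapsed in this diff; the general shape, assuming the common Stackable pattern of a `vectorAggregatorConfigMapName` in the cluster config and `enableVectorAgent` per role (names are assumptions based on that pattern, not taken from this page):

[source,yaml]
----
spec:
  clusterConfig:
    vectorAggregatorConfigMapName: vector-aggregator-discovery   # discovery ConfigMap
  brokers:
    config:
      logging:
        enableVectorAgent: true
----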

1 change: 1 addition & 0 deletions docs/modules/druid/pages/usage-guide/monitoring.adoc
@@ -1,4 +1,5 @@
= Monitoring
:description: Managed Druid instances export Prometheus metrics by default for easy monitoring.

The managed Druid instances are automatically configured to export Prometheus metrics.
See xref:operators:monitoring.adoc[] for more details.
@@ -1,4 +1,5 @@
= Storage and resource configuration
:description: Configure storage and resource requests for Druid with default settings for CPU, memory, and additional settings for historical segment caches.

== Storage for data volumes

1 change: 1 addition & 0 deletions docs/modules/druid/pages/usage-guide/security.adoc
@@ -1,4 +1,5 @@
= Security
:description: Secure your Druid cluster with TLS encryption, LDAP, or OIDC authentication. Connect with OPA for policy-based authorization.

The Druid cluster can be secured and protected in multiple ways.
