Merge pull request #74 from implydata/202402-metrics
202402 metrics
petermarshallio authored Mar 1, 2024
2 parents 6f655b2 + 99db02c commit d78a480
Showing 1 changed file with 328 additions and 0 deletions.
328 changes: 328 additions & 0 deletions notebooks/05-operations/03-druid-metrics.ipynb
@@ -0,0 +1,328 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "c9fc0614-4eec-4c39-ad22-ea10afa00e9d",
"metadata": {},
"source": [
"# Apache Druid metrics\n",
"\n",
"Metrics give you insight into why your Druid instance is performing in the way that it is.\n",
"\n",
"In this notebook you will take a tour of the out-of-the-box configurations for metrics in Apache Druid, and use some simple terminal commands to inspect them."
]
},
{
"cell_type": "markdown",
"id": "cd9a55db-61ff-4fa4-9655-2fef292ed2aa",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"This tutorial works with Druid 29.0.0 or later, and is designed to be run on a Mac.\n",
"\n",
"Launch this tutorial and all prerequisites using the `jupyter` profile of the Docker Compose file for Jupyter-based Druid tutorials. For more information, see the Learn Druid repository [readme](https://github.com/implydata/learn-druid).\n",
"\n",
"__DO NOT__ use the `jupyter-druid` profile with this tutorial as it will conflict with your locally running copy."
]
},
{
"cell_type": "markdown",
"id": "151ede96-348f-423c-adc7-0e23c6910b9e",
"metadata": {},
"source": [
"## Initialization\n",
"\n",
"To use this notebook, you must have Druid running locally.\n",
"\n",
"You will also make extensive use of the terminal, which you can place alongside this notebook or on another screen.\n",
"\n",
"### Install required tools\n",
"\n",
"Open a local terminal window.\n",
"\n",
"If you haven't installed `wget`, `multitail`, or `jq` yet, run the following command to install them using `brew`. The commands later in this notebook use `jq` to format the metrics JSON.\n",
"\n",
"```bash\n",
"brew install multitail wget jq\n",
"```\n",
"\n",
"To fetch the default configuration for `multitail` to your home folder, execute the following command. Skip this step if you are already running `multitail` as it will overwrite your own configuration.\n",
"\n",
"```bash\n",
"curl https://raw.githubusercontent.com/halturin/multitail/master/multitail.conf > ~/.multitailrc\n",
"```\n",
"\n",
"### Install Apache Druid\n",
"\n",
"Run the following to create a dedicated folder for learn-druid in your home directory:\n",
"\n",
"```bash\n",
"cd ~ ; mkdir learn-druid-local\n",
"cd learn-druid-local\n",
"```\n",
"\n",
"Pull and extract a compatible version of Druid.\n",
"\n",
"```bash\n",
"wget https://dlcdn.apache.org/druid/29.0.0/apache-druid-29.0.0-bin.tar.gz\n",
"tar -xzf apache-druid-29.0.0-bin.tar.gz\n",
"```\n",
"\n",
"Use the following commands to rename the folder.\n",
"\n",
"```bash\n",
"mv apache-druid-29.0.0 apache-druid\n",
"cd apache-druid\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1d07feb7-92dd-42a9-ad6c-c89d5fdad8d5",
"metadata": {},
"source": [
"# Metrics configuration\n",
"\n",
"Metrics configuration is set in the `common.runtime.properties` file. This comprises:\n",
"\n",
"* [Monitors](https://druid.apache.org/docs/latest/configuration/index.html#metrics-monitors), which extend Druid's built-in metrics.\n",
"* [Emitters](https://druid.apache.org/docs/latest/configuration/index.html#metrics-emitters), which push the metrics to a destination location."
]
},
{
"cell_type": "markdown",
"id": "ca62693b-50f4-4831-a092-dbc8e06d8191",
"metadata": {},
"source": [
"## Emitters\n",
"\n",
"In this section you will amend the emitter configuration in order to take a look at the JSON objects that contain the data and description for each metric.\n",
"\n",
"Remembering that Druid has multiple configuration file locations out-of-the-box, run this command to view the `auto` configuration file for emitters:\n",
"\n",
"```bash\n",
"grep druid.emitter ~/learn-druid-local/apache-druid/conf/druid/auto/_common/common.runtime.properties\n",
"```\n",
"\n",
"Notice that, by default, the `druid.emitter` is configured to `noop`, meaning that [no metrics are emitted](https://druid.apache.org/docs/latest/configuration/#metrics-emitters).\n",
"\n",
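"As a rough guide, the matching lines should look something like the following annotated sketch in Druid 29.0.0. Treat this as an illustration rather than captured output; the comments are added here for explanation, and the exact contents can differ between releases.\n",
"\n",
"```properties\n",
"# Emitter selection: noop discards all metrics\n",
"druid.emitter=noop\n",
"# Log level used once the logging emitter is selected\n",
"druid.emitter.logging.logLevel=info\n",
"```\n",
"\n",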
"### Change the emitter to Logging\n",
"\n",
"Enable the emission of metrics from your instance to the [log files](https://druid.apache.org/docs/latest/configuration/#logging-emitter-module) by setting `druid.emitter` to `logging`.\n",
"\n",
"Run the following command to update your configuration.\n",
"\n",
"```bash\n",
"sed -i '' 's/druid.emitter=noop/druid.emitter=logging/' \\\n",
" ~/learn-druid-local/apache-druid/conf/druid/auto/_common/common.runtime.properties\n",
"```\n",
"\n",
"Each service will now write additional log entries containing the JSON data for each metric, according to the [Logging emitter](https://druid.apache.org/docs/latest/configuration/#logging-emitter-module) configuration. These entries are logged at the level set by `druid.emitter.logging.logLevel`, which is INFO here.\n",
"\n",
"### Start a Druid instance\n",
"\n",
"Start Druid with the following command:\n",
"\n",
"```bash\n",
"nohup ~/learn-druid-local/apache-druid/bin/start-druid > log.out 2> log.err < /dev/null & disown\n",
"```\n",
"\n",
"### Look at the JSON metrics messages\n",
"\n",
"Run the following command to display the JSON being emitted to the log files.\n",
"\n",
"* `grep` finds only lines in log files related to metrics from the `LoggingEmitter`.\n",
"* `cut` keeps everything from the 7th space-delimited field onward, which is the JSON payload of each log line.\n",
"* The result is made pretty through `jq`.\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/*.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq\n",
"```\n",
"\n",
"Run the command a few times to build up a good sample.\n",
"\n",
"You will see:\n",
"\n",
"* A timestamp for the event.\n",
"* The server hostname and type that emitted the metric together with its running version, such as \"druid/broker\" and \"29.0.0\".\n",
"* The metric name, such as \"serverview/init/time\".\n",
"* A value for the metric."
]
},
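{
"cell_type": "markdown",
"id": "metrics-event-shape-example",
"metadata": {},
"source": [
"As a rough illustration of the fields listed above, a single metric event might look like the following. The field names are the ones the logging emitter writes, but the `timestamp`, `host`, and `value` shown here are made-up placeholders rather than output captured from a real instance.\n",
"\n",
"```json\n",
"{\n",
"  \"feed\": \"metrics\",\n",
"  \"timestamp\": \"2024-03-01T12:00:00.000Z\",\n",
"  \"service\": \"druid/broker\",\n",
"  \"host\": \"localhost:8082\",\n",
"  \"version\": \"29.0.0\",\n",
"  \"metric\": \"serverview/init/time\",\n",
"  \"value\": 100\n",
"}\n",
"```\n",
"\n",
"Metrics that come from monitors can carry extra dimension fields alongside these."
]
},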
{
"cell_type": "markdown",
"id": "bec19737-2773-4c66-a05e-d0b1a707c4c8",
"metadata": {},
"source": [
"## Monitors\n",
"\n",
"Run this command to view the `auto` configuration file for metrics monitors:\n",
"\n",
"```bash\n",
"grep druid.monitoring ~/learn-druid-local/apache-druid/conf/druid/auto/_common/common.runtime.properties\n",
"```\n",
"\n",
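"Assuming the stock 29.0.0 `auto` configuration, the matching line should resemble the following, listing the two monitors that are enabled by default. This is an illustration, not captured output.\n",
"\n",
"```properties\n",
"druid.monitoring.monitors=[\"org.apache.druid.java.util.metrics.JvmMonitor\", \"org.apache.druid.server.metrics.ServiceStatusMonitor\"]\n",
"```\n",
"\n",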
"### Inspect some metrics\n",
"\n",
"The default configuration for Druid extends the basic metrics with:\n",
"\n",
"* [JVM metrics](https://druid.apache.org/docs/latest/operations/metrics.html#jvm) from the `JvmMonitor` monitor.\n",
"* Service heartbeats from the `ServiceStatusMonitor` monitor.\n",
"\n",
"Use the command below to see a specific JVM metric for your Coordinator process. You may want to run this command a few times to see what is happening.\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/*.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq 'select(.metric == \"jvm/pool/used\" and .service==\"druid/coordinator\")'\n",
"```\n",
"\n",
"Notice that this metric has additional dimensions, `poolKind` and `poolName`. Other monitors emit [other dimensions](https://druid.apache.org/docs/latest/operations/metrics).\n",
"\n",
"Run the following command to return your entire instance to the basic metrics for Druid:\n",
"\n",
"```bash\n",
"sed -i '' 's/\"org.apache.druid.java.util.metrics.JvmMonitor\", \"org.apache.druid.server.metrics.ServiceStatusMonitor\"//' \\\n",
" ~/learn-druid-local/apache-druid/conf/druid/auto/_common/common.runtime.properties\n",
"```\n",
"\n",
"Now restart your instance and - for the purpose of this exercise - clear down your logs.\n",
"\n",
"```bash\n",
"kill $(ps -ef | grep 'supervise' | awk 'NF{print $2}' | head -n 1)\n",
"rm ~/learn-druid-local/apache-druid/log/*.log\n",
"nohup ~/learn-druid-local/apache-druid/bin/start-druid > log.out 2> log.err < /dev/null & disown\n",
"```\n",
"\n",
"Run this command to see the base metrics that are now being emitted:\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/*.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq\n",
"```\n",
"\n",
"### Add a process-specific monitor\n",
"\n",
"Some monitors are designed to work on specific processes. Enabling monitors on unsupported processes will cause that process to fail during startup. In this section you will add the [Historical](https://druid.apache.org/docs/latest/operations/metrics/#historical-1) monitor.\n",
"\n",
"First, stop your cluster with the following command:\n",
"\n",
"```bash\n",
"kill $(ps -ef | grep 'supervise' | awk 'NF{print $2}' | head -n 1)\n",
"```\n",
"\n",
"Now run this command to add the Historical monitor to your Historical process's `runtime.properties`.\n",
"\n",
"```bash\n",
"echo \"druid.monitoring.monitors=[\\\"org.apache.druid.server.metrics.HistoricalMetricsMonitor\\\"]\" >> \\\n",
" ~/learn-druid-local/apache-druid/conf/druid/auto/historical/runtime.properties\n",
"```\n",
"\n",
"Start Druid with the following command:\n",
"\n",
"```bash\n",
"nohup ~/learn-druid-local/apache-druid/bin/start-druid > log.out 2> log.err < /dev/null & disown\n",
"```\n",
"\n",
"Now run this command to review some of the metrics data from the Historical:\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/historical.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq\n",
"```\n",
"\n",
"### Change the emission period\n",
"\n",
"There is a default [emission period](https://druid.apache.org/docs/latest/configuration/#enabling-metrics) of 1 minute. Set `druid.monitoring.emissionPeriod` in your configuration to have metrics emitted at a different rate.\n",
"\n",
"Run this command to have the Historical process emit metrics every 15 seconds:\n",
"\n",
"```bash\n",
"echo \"druid.monitoring.emissionPeriod=PT15S\" >> \\\n",
" ~/learn-druid-local/apache-druid/conf/druid/auto/historical/runtime.properties\n",
"```\n",
"\n",
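"After this step, and assuming you also added the Historical monitor earlier, the end of the Historical `runtime.properties` should contain lines like the ones sketched here. The comments are added for explanation only. The period is an ISO 8601 duration, so `PT15S` means 15 seconds.\n",
"\n",
"```properties\n",
"# Appended in the previous section: a monitor that only runs on Historical processes\n",
"druid.monitoring.monitors=[\"org.apache.druid.server.metrics.HistoricalMetricsMonitor\"]\n",
"# Appended above: emit monitor metrics every 15 seconds instead of every minute\n",
"druid.monitoring.emissionPeriod=PT15S\n",
"```\n",
"\n",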
"To apply the configuration, restart your instance:\n",
"\n",
"```bash\n",
"kill $(ps -ef | grep 'supervise' | awk 'NF{print $2}' | head -n 1)\n",
"rm ~/learn-druid-local/apache-druid/log/*.log\n",
"nohup ~/learn-druid-local/apache-druid/bin/start-druid > log.out 2> log.err < /dev/null & disown\n",
"```\n",
"\n",
"Run this command to see the metrics as they are being emitted. For ease of reading, the `jq` portion of this command only selects the timestamp, metric name, and its value.\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/historical.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq '\"\\(.timestamp) \\(.metric) \\(.value)\"'\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1023e5a3-cedc-435f-a448-30f8c3ae1535",
"metadata": {},
"source": [
"# Clean up\n",
"\n",
"Run this command to stop Druid.\n",
"\n",
"```bash\n",
"kill $(ps -ef | grep 'supervise' | awk 'NF{print $2}' | head -n 1)\n",
"```\n",
"\n",
"Delete the `learn-druid-local` folder from your home folder in the usual way."
]
},
{
"cell_type": "markdown",
"id": "57d5921b-636e-4a85-ae4d-dc0239bfef0b",
"metadata": {},
"source": [
"## Learn more\n",
"\n",
"You've seen how the two components of Druid's configuration for metrics are controlled - through monitors and through emitters - and that you can configure these either at the cluster level or for individual processes.\n",
"\n",
"* Read about [monitors](https://druid.apache.org/docs/latest/configuration/index.html#metrics-monitors) and [emitters](https://druid.apache.org/docs/latest/configuration/index.html#metrics-emitters) in the official documentation.\n",
"* Try out all the other monitors that are available, remembering that some monitors are only applicable to specific processes, requiring you to modify the `runtime.properties` for those processes only.\n",
"* Try out some of the [core emitters](https://druid.apache.org/docs/latest/configuration/#metrics-emitters) that are available as well as those available as [community extensions](https://druid.apache.org/docs/latest/configuration/extensions/#community-extensions), such as the [Apache Kafka](https://druid.apache.org/docs/latest/development/extensions-contrib/kafka-emitter), [statsd](https://druid.apache.org/docs/latest/development/extensions-contrib/statsd), and [Prometheus](https://druid.apache.org/docs/latest/development/extensions-contrib/prometheus) emitters.\n",
"* Experiment by using the Kafka emitter to push your instance's own metrics into a topic that you then consume back into the cluster with real-time ingestion."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a0f5f8da-a874-451f-9cf0-771202142fcf",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
