202402 metrics #74

Merged
merged 4 commits into from
Mar 1, 2024
328 changes: 328 additions & 0 deletions notebooks/05-operations/03-druid-metrics.ipynb
@@ -0,0 +1,328 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "c9fc0614-4eec-4c39-ad22-ea10afa00e9d",
"metadata": {},
"source": [
"# Apache Druid metrics\n",
"\n",
"Metrics give you insight into why your Druid instance is performing in the way that it is.\n",
"\n",
"In this notebook you will take a tour of the out-of-the-box configurations for metrics in Apache Druid, and use some simple terminal commands to inspect them."
]
},
{
"cell_type": "markdown",
"id": "cd9a55db-61ff-4fa4-9655-2fef292ed2aa",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"This tutorial works with Druid 29.0.0 or later, and is designed to be run on a Mac.\n",
"\n",
"Launch this tutorial and all prerequisites using the `jupyter` profile of the Docker Compose file for Jupyter-based Druid tutorials. For more information, see the Learn Druid repository [readme](https://github.com/implydata/learn-druid).\n",
"\n",
"__DO NOT__ use the `jupyter-druid` profile with this tutorial as it will conflict with your locally running copy."
]
},
{
"cell_type": "markdown",
"id": "151ede96-348f-423c-adc7-0e23c6910b9e",
"metadata": {},
"source": [
"## Initialization\n",
"\n",
"To use this notebook, you must have Druid running locally.\n",
"\n",
"You will also make extensive use of the terminal, which you can place alongside this notebook or on another screen.\n",
"\n",
"### Install required tools\n",
"\n",
"Open a local terminal window.\n",
"\n",
"If you haven't installed `wget` or `multitail` yet, run the following commands to install them using `brew`.\n",
"\n",
"```bash\n",
"brew install multitail ; brew install wget\n",
"```\n",
"\n",
"To fetch the default configuration for `multitail` to your home folder, execute the following command. Skip this step if you are already running `multitail` as it will overwrite your own configuration.\n",
"\n",
"```bash\n",
"curl https://raw.githubusercontent.com/halturin/multitail/master/multitail.conf > ~/.multitailrc\n",
"```\n",
"\n",
"### Install Apache Druid\n",
"\n",
"Run the following to create a dedicated folder for learn-druid in your home directory:\n",
"\n",
"```bash\n",
"cd ~ ; mkdir learn-druid-local\n",
"cd learn-druid-local\n",
"```\n",
"\n",
"Pull and extract a compatible version of Druid.\n",
"\n",
"```bash\n",
"wget https://dlcdn.apache.org/druid/29.0.0/apache-druid-29.0.0-bin.tar.gz\n",
"tar -xzf apache-druid-29.0.0-bin.tar.gz\n",
"```\n",
"\n",
"Use the following commands to rename the folder.\n",
"\n",
"```bash\n",
"mv apache-druid-29.0.0 apache-druid\n",
"cd apache-druid\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1d07feb7-92dd-42a9-ad6c-c89d5fdad8d5",
"metadata": {},
"source": [
"# Metrics configuration\n",
"\n",
"Metrics configuration is set in the `common.runtime.properties` file. This comprises:\n",
"\n",
"* [Monitors](https://druid.apache.org/docs/latest/configuration/index.html#metrics-monitors), which extend Druid's built-in metrics.\n",
"* [Emitters](https://druid.apache.org/docs/latest/configuration/index.html#metrics-emitters), which push the metrics to a destination location."
]
},
{
"cell_type": "markdown",
"id": "ca62693b-50f4-4831-a092-dbc8e06d8191",
"metadata": {},
"source": [
"## Emitters\n",
"\n",
"In this section you will amend the emitter configuration in order to take a look at the JSON objects that contain the data and description for each metric.\n",
"\n",
"Remembering that Druid has multiple configuration file locations out-of-the-box, run this command to view the `auto` configuration file for emitters:\n",
"\n",
"```bash\n",
"grep druid.emitter ~/learn-druid-local/apache-druid/conf/druid/auto/_common/common.runtime.properties\n",
"```\n",
"\n",
"Notice that, by default, the `druid.emitter` is configured to `noop`, meaning that [no metrics are emitted](https://druid.apache.org/docs/latest/configuration/#metrics-emitters).\n",
"\n",
"### Change the emitter to Logging\n",
"\n",
"Enable the emission of metrics from your instance to the [log files](https://druid.apache.org/docs/latest/configuration/#logging-emitter-module) by setting the `druid.emitter` to logging.\n",
"\n",
"Run the following command to update your configuration.\n",
"\n",
"```bash\n",
"sed -i '' 's/druid.emitter=noop/druid.emitter=logging/' \\\n",
" ~/learn-druid-local/apache-druid/conf/druid/auto/_common/common.runtime.properties\n",
"```\n",
"\n",
"Additional log entries will be created containing the JSON data for each metric according to the [Logging emitter](https://druid.apache.org/docs/latest/configuration/#logging-emitter-module) configuration. This includes the `druid.emitter.logging.logLevel` of INFO for these entries.\n",
"\n",
"### Start a Druid instance\n",
"\n",
"Start Druid with the following command:\n",
"\n",
"```bash\n",
"nohup ~/learn-druid-local/apache-druid/bin/start-druid & disown > log.out 2> log.err < /dev/null\n",
"```\n",
"\n",
"### Look at the JSON metrics messages\n",
"\n",
"Run the following command to display the JSON being emitted to the log files.\n",
"\n",
"* `grep` finds only lines in log files related to metrics from the `LoggingEmitter`.\n",
"* The `cut` command returns only the 7th field in the data\n",
"* The result is made pretty through `jq`.\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/*.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq\n",
"```\n",
"\n",
"Run the command a few times to build up a good sample.\n",
"\n",
"You will see:\n",
"\n",
"* A timestamp for the event.\n",
"* The server hostname and type that emitted the metric together with its running version, such as \"druid/broker\" and \"29.0.0\".\n",
"* The metric name, such as \"serverview/init/time\".\n",
"* A value for the metric."
]
},
{
"cell_type": "markdown",
"id": "bec19737-2773-4c66-a05e-d0b1a707c4c8",
"metadata": {},
"source": [
"## Monitors\n",
"\n",
"Run this command to view the `auto` configuration file for for metrics monitors:\n",
"\n",
"```bash\n",
"grep druid.monitoring ~/learn-druid-local/apache-druid/conf/druid/auto/_common/common.runtime.properties\n",
"```\n",
"\n",
"### Inspect some metrics\n",
"\n",
"The default configuration for Druid extends the basic metrics with:\n",
"\n",
"* [JVM metrics](https://druid.apache.org/docs/latest/operations/metrics.html#jvm) from the `JvmMonitor` monitor.\n",
"* Service heartbeats from the `ServiceStatusMonitor` monitor.\n",
"\n",
"Use the command below to see a specific JVM metric for your Coordinator process. You may want to run this command a few times to see what is happening.\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/*.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq 'select(.metric == \"jvm/pool/used\" and .service==\"druid/coordinator\")'\n",
"```\n",
"\n",
"Notice that this metric has additional dimensions, `poolKind` and `poolName`. Other monitors emit [other dimensions](https://druid.apache.org/docs/latest/operations/metrics).\n",
"\n",
"Run the following command to return your entire instance to the basic metrics for Druid:\n",
"\n",
"```bash\n",
"sed -i '' 's/\"org.apache.druid.java.util.metrics.JvmMonitor\", \"org.apache.druid.server.metrics.ServiceStatusMonitor\"//' \\\n",
" ~/learn-druid-local/apache-druid/conf/druid/auto/_common/common.runtime.properties\n",
"```\n",
"\n",
"Now restart your instance and - for the purpose of this exercise - clear down your logs.\n",
"\n",
"```bash\n",
"kill $(ps -ef | grep 'supervise' | awk 'NF{print $2}' | head -n 1)\n",
"rm ~/learn-druid-local/apache-druid/log/*.log\n",
"nohup ~/learn-druid-local/apache-druid/bin/start-druid & disown > log.out 2> log.err < /dev/null\n",
"```\n",
"\n",
"Run this command to see the base metrics that are now being emitted:\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/*.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq\n",
"```\n",
"\n",
"### Add a process-specific monitor\n",
"\n",
"Some monitors are designed to work on specific processes. Enabling monitors on unsupported processes will cause that process to fail during startup. In this section you will add the [Historical](https://druid.apache.org/docs/latest/operations/metrics/#historical-1) monitor.\n",
"\n",
"First, stop your cluster with the following command:\n",
"\n",
"```bash\n",
"kill $(ps -ef | grep 'supervise' | awk 'NF{print $2}' | head -n 1)\n",
"```\n",
"\n",
"Now run this command to add the Historical monitor to your Historical process's `runtime.properties`.\n",
"\n",
"```bash\n",
"echo \"druid.monitoring.monitors=[\\\"org.apache.druid.server.metrics.HistoricalMetricsMonitor\\\"]\" >> \\\n",
" ~/learn-druid-local/apache-druid/conf/druid/auto/historical/runtime.properties\n",
"```\n",
"\n",
"Start Druid with the following command:\n",
"\n",
"```bash\n",
"nohup ~/learn-druid-local/apache-druid/bin/start-druid & disown > log.out 2> log.err < /dev/null\n",
"```\n",
"\n",
"Now run this command to review some of the metrics data from the Historical:\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/historical.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq\n",
"```\n",
"\n",
"### Increase the emission period\n",
"\n",
"There is a default [emission period](https://druid.apache.org/docs/latest/configuration/#enabling-metrics) of 1 minute. Apply a `druid.monitoring.emissionPeriod` to your configuration to have metrics emitted at a different rate.\n",
"\n",
"Run this command to have the Historical process emit metrics every 15 seconds:\n",
"\n",
"```bash\n",
"echo \"druid.monitoring.emissionPeriod=PT15S\" >> \\\n",
" ~/learn-druid-local/apache-druid/conf/druid/auto/historical/runtime.properties\n",
"```\n",
"\n",
"To apply the configuration, restart your instance:\n",
"\n",
"```bash\n",
"kill $(ps -ef | grep 'supervise' | awk 'NF{print $2}' | head -n 1)\n",
"rm ~/learn-druid-local/apache-druid/log/*.log\n",
"nohup ~/learn-druid-local/apache-druid/bin/start-druid & disown > log.out 2> log.err < /dev/null\n",
"```\n",
"\n",
"Run this command to see the metrics as they are being emitted. For ease of reading, the `jq` portion of this command only selects the timestamp, metric name, and its value.\n",
"\n",
"```bash\n",
"grep 'org.apache.druid.java.util.emitter.core.LoggingEmitter - \\[metrics\\]' ~/learn-druid-local/apache-druid/log/historical.log \\\n",
" | cut -d ' ' -f 7- \\\n",
" | jq '\"\\(.timestamp) \\(.metric) \\(.value)\"'\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "1023e5a3-cedc-435f-a448-30f8c3ae1535",
"metadata": {},
"source": [
"# Clean up\n",
"\n",
"Run this command to stop Druid.\n",
"\n",
"```bash\n",
"kill $(ps -ef | grep 'supervise' | awk 'NF{print $2}' | head -n 1)\n",
"```\n",
"\n",
"Delete the `learn-druid-local` folder from your home folder in the usual way."
]
},
{
"cell_type": "markdown",
"id": "57d5921b-636e-4a85-ae4d-dc0239bfef0b",
"metadata": {},
"source": [
"## Learn more\n",
"\n",
"You've seen how the two components of Druid's configuration for metrics are controlled - through monitors and through emitters - and that you can configure these either at the cluster level or for individual processes.\n",
"\n",
"* Read about [monitors](https://druid.apache.org/docs/latest/configuration/index.html#metrics-monitors) and [emitters](https://druid.apache.org/docs/latest/configuration/index.html#metrics-emitters) in the official documentation.\n",
"* Try out all the other monitors that are available, remembering that some monitors are only applicable to specific processes, requiring you to modify the `runtime.properties` for those processes only.\n",
"* Try out some of the [core emitters](https://druid.apache.org/docs/latest/configuration/#metrics-emitters) that are available as well as those available as [community extensions](https://druid.apache.org/docs/latest/configuration/extensions/#community-extensions), such as the [Apache Kafka](https://druid.apache.org/docs/latest/development/extensions-contrib/kafka-emitter), [statsd](https://druid.apache.org/docs/latest/development/extensions-contrib/statsd), and [Prometheus](https://druid.apache.org/docs/latest/development/extensions-contrib/prometheus) emitters.\n",
"* Experiment by using the Kafka emitter to push your instance's own metrics into a topic that you then consume back into the cluster with real-time ingestion."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a0f5f8da-a874-451f-9cf0-771202142fcf",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}