Small improvements on the azure module (#13859)

* Refactoring and improvements
* Work on documentation
* Add validation on config
* Update changelog file
* fix double records
* Work on documentation
* Work on review feedback
* fix comment
* remove "deprecated" condition
* Small changes
* Removing Fetch
* Fix dimension name
elastic · Oct 9, 2019 · 1fb41ea · 1fb41ea
1 parent 46e4995
commit 1fb41ea
Showing 19 changed files with 375 additions and 168 deletions.
diff --git a/CHANGELOG.next.asciidoc b/CHANGELOG.next.asciidoc
@@ -400,6 +400,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
 - Add support for NATS version 2. {pull}13601[13601]
 - Add `docker.cpu.*.norm.pct` metrics for `cpu` metricset of Docker Metricbeat module. {pull}13695[13695]
 - Add `instance` label by default when using Prometheus collector. {pull}13737[13737]
+- Add azure module. {pull}13196[13196] {pull}13859[13859]
 - Add Apache Tomcat module {pull}13491[13491]
 - Add ECS `container.id` and `container.runtime` to kubernetes `state_container` metricset. {pull}13884[13884]
 - Add `job` label by default when using Prometheus collector. {pull}13878[13878]

diff --git a/metricbeat/docs/modules/azure.asciidoc b/metricbeat/docs/modules/azure.asciidoc
@@ -10,6 +10,64 @@ beta[]
 
 This is the azure module.
 
+The Azure Monitor feature collects and aggregates logs and metrics from a variety of sources into a common data platform where it can be used for analysis, visualization, and alerting.
+
+
+The azure monitor metrics are numerical values that describe some aspect of a system at a particular point in time. They are collected at regular intervals and are identified with a timestamp, a name, a value, and one or more defining labels.
+
+The azure module will periodically retrieve the azure monitor metrics using the Azure REST APIs as MetricList.
+Additional azure API calls will be executed in order to retrieve information regarding the resources targeted by the user.
+
+The azure module metricsets are `monitor`, `compute_vm` and `compute_vm_scaleset`.
+
+[float]
+=== Module-specific configuration notes
+
+All the tasks executed against the Azure Monitor REST API will use the Azure Resource Manager authentication model.
+Therefore, all requests must be authenticated with Azure Active Directory (Azure AD).
+One approach to authenticate the client application is to create an Azure AD service principal and retrieve the authentication (JWT) token.
+For a more detailed walk-through, see how to use Azure PowerShell to create a service principal to access resources: https://docs.microsoft.com/en-us/powershell/azure/create-azure-service-principal-azureps?view=azps-2.7.0.
+It is also possible to create a service principal via the Azure portal: https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal.
+Users must make sure the roles assigned to the application include at least read permissions for the monitor data. More on the roles here: https://docs.microsoft.com/en-us/azure/role-based-access-control/built-in-roles.
+
+Required credentials for the `azure` module:
+
+`client_id`:: The unique identifier for the application (also known as Application Id)
+
+`client_secret`:: The client/application secret/key
+
+`subscription_id`:: The unique identifier for the azure subscription
+
+`tenant_id`:: The unique identifier of the Azure Active Directory instance
+
+
+If configured, the environment variables `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_TENANT_ID` and `AZURE_SUBSCRIPTION_ID` can be used to supply these credentials.
+
+[float]
+== Metricsets
+
+[float]
+=== `monitor`
+This metricset allows users to retrieve metrics from specified resources. Filters can be applied here, such as the retrieval interval, metric names,
+aggregation list, namespaces and metric dimensions.
+
+[float]
+=== `compute_vm`
+This metricset will collect metrics from the virtual machines. These metrics have a timegrain of 5 minutes,
+so the `period` for the `compute_vm` metricset should be `300s` or a multiple of `300s`.
+
+[float]
+=== `compute_vm_scaleset`
+This metricset will collect metrics from the virtual machine scalesets. These metrics have a timegrain of 5 minutes,
+so the `period` for the `compute_vm_scaleset` metricset should be `300s` or a multiple of `300s`.
+
+
+[float]
+== Additional notes about metrics and costs
+
+Costs: Metric queries are charged based on the number of standard API calls. More information on pricing here https://azure.microsoft.com/id-id/pricing/details/monitor/.
+
+Authentication: we are handling authentication on our side (creating/renewing the authentication token), so we advise users to use dedicated credentials for metricbeat only.
 
 
 [float]
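The documentation above asks for a `period` of `300s` or a multiple of it for the `compute_vm` and `compute_vm_scaleset` metricsets. A minimal sketch of that rule follows. The helper `validPeriod` and the constant `vmTimeGrain` are hypothetical names used only for illustration. The module in this commit is not shown enforcing this check.

```go
package main

import (
	"fmt"
	"time"
)

// vmTimeGrain mirrors the 5 minute timegrain described in the docs.
// The name is an assumption made for this sketch.
const vmTimeGrain = 5 * time.Minute

// validPeriod is a hypothetical helper: it reports whether a period is a
// positive, whole multiple of the 5 minute timegrain.
func validPeriod(period time.Duration) bool {
	return period >= vmTimeGrain && period%vmTimeGrain == 0
}

func main() {
	periods := []time.Duration{
		300 * time.Second,
		600 * time.Second,
		450 * time.Second,
		60 * time.Second,
	}
	for _, p := range periods {
		fmt.Printf("period %v aligned with timegrain: %t\n", p, validPeriod(p))
	}
}
```

A period that is not aligned with the timegrain would likely query the same 5 minute bucket more than once or skip buckets, which is why the docs recommend multiples of `300s`.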

diff --git a/x-pack/metricbeat/module/azure/_meta/config.yml b/x-pack/metricbeat/module/azure/_meta/config.yml
@@ -9,28 +9,7 @@
   subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
   refresh_list_interval: 600s
   resources:
-  - resource_id: "subscriptions/1234qwe-4b1e-1234-1234-123456qwert/resourceGroups/obs-infrastructure/providers/Microsoft.ApiManagement/service/apimanagement"
-    metrics:
-    - name: "Requests"
-      namespace: "Microsoft.ApiManagement/service"
-      aggregations: ["Maximum"]
-      timegrain: "PT1M"
-      dimensions:
-      - name: "Hostname"
-        value: "*"
-    - name: ["Capacity", "Requests"]
-      namespace: "Microsoft.ApiManagement/service"
-      aggregations: ["Average"]
-      dimensions:
-      - name: "Location"
-        value: "West Europe"
-#  - resource_group: "testresourcegroup"
-#    resource_type: "Microsoft.Compute/virtualMachines"
-#    metrics:
-#    - name: "*"
-#      namespace: "Microsoft.Compute/virtualMachines"
-#      timegrain: "PT1M"
-#  - resource_query: "resourceType eq 'Microsoft.DocumentDb/databaseAccounts' and name eq 'databaseAccount'"
+#  - resource_query: "resourceType eq 'Microsoft.DocumentDb/databaseAccounts'"
 #    metrics:
 #    - name: ["DataUsage", "DocumentCount", "DocumentQuota"]
 #      namespace: "Microsoft.DocumentDb/databaseAccounts"

diff --git a/x-pack/metricbeat/module/azure/_meta/docs.asciidoc b/x-pack/metricbeat/module/azure/_meta/docs.asciidoc
@@ -1,2 +1,60 @@
 This is the azure module.
 
+The Azure Monitor feature collects and aggregates logs and metrics from a variety of sources into a common data platform where it can be used for analysis, visualization, and alerting.
+
+
+The azure monitor metrics are numerical values that describe some aspect of a system at a particular point in time. They are collected at regular intervals and are identified with a timestamp, a name, a value, and one or more defining labels.
+
+The azure module will periodically retrieve the azure monitor metrics using the Azure REST APIs as MetricList.
+Additional azure API calls will be executed in order to retrieve information regarding the resources targeted by the user.
+
+The azure module metricsets are `monitor`, `compute_vm` and `compute_vm_scaleset`.
+
+[float]
+=== Module-specific configuration notes
+
+All the tasks executed against the Azure Monitor REST API will use the Azure Resource Manager authentication model.
+Therefore, all requests must be authenticated with Azure Active Directory (Azure AD).
+One approach to authenticate the client application is to create an Azure AD service principal and retrieve the authentication (JWT) token.
+For a more detailed walk-through, see how to use Azure PowerShell to create a service principal to access resources: https://docs.microsoft.com/en-us/powershell/azure/create-azure-service-principal-azureps?view=azps-2.7.0.
+It is also possible to create a service principal via the Azure portal: https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal.
+Users must make sure the roles assigned to the application include at least read permissions for the monitor data. More on the roles here: https://docs.microsoft.com/en-us/azure/role-based-access-control/built-in-roles.
+
+Required credentials for the `azure` module:
+
+`client_id`:: The unique identifier for the application (also known as Application Id)
+
+`client_secret`:: The client/application secret/key
+
+`subscription_id`:: The unique identifier for the azure subscription
+
+`tenant_id`:: The unique identifier of the Azure Active Directory instance
+
+
+If configured, the environment variables `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_TENANT_ID` and `AZURE_SUBSCRIPTION_ID` can be used to supply these credentials.
+
+[float]
+== Metricsets
+
+[float]
+=== `monitor`
+This metricset allows users to retrieve metrics from specified resources. Filters can be applied here, such as the retrieval interval, metric names,
+aggregation list, namespaces and metric dimensions.
+
+[float]
+=== `compute_vm`
+This metricset will collect metrics from the virtual machines. These metrics have a timegrain of 5 minutes,
+so the `period` for the `compute_vm` metricset should be `300s` or a multiple of `300s`.
+
+[float]
+=== `compute_vm_scaleset`
+This metricset will collect metrics from the virtual machine scalesets. These metrics have a timegrain of 5 minutes,
+so the `period` for the `compute_vm_scaleset` metricset should be `300s` or a multiple of `300s`.
+
+
+[float]
+== Additional notes about metrics and costs
+
+Costs: Metric queries are charged based on the number of standard API calls. More information on pricing here https://azure.microsoft.com/id-id/pricing/details/monitor/.
+
+Authentication: we are handling authentication on our side (creating/renewing the authentication token), so we advise users to use dedicated credentials for metricbeat only.
diff --git a/x-pack/metricbeat/module/azure/_meta/shared-azure.asciidoc b/x-pack/metricbeat/module/azure/_meta/shared-azure.asciidoc
@@ -0,0 +1,9 @@
+[float]
+=== Metricset-specific configuration notes
+
+`refresh_list_interval`:: Resources will be retrieved at each fetch call (`period` interval), which means a number of Azure REST calls will be executed each time.
+This is helpful when Azure users add or remove resources that match the configuration options, because the list then reflects those changes right away.
+To reduce the number of API calls executed to retrieve the resources each time, users can configure this setting so that the list of resources is not refreshed as often.
+This is also beneficial for performance and rate/cost reasons (https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits).
+
+`resources`:: This will contain all options for identifying resources and configuring the desired metrics.
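The `refresh_list_interval` note above describes a trade-off between listing resources on every fetch and reusing a cached list. One way to picture it is the sketch below. The type `resourceCache` and its methods are hypothetical and are not taken from the module; the only hint about the real implementation in this commit is the `Expired()` call in the tests.

```go
package main

import (
	"fmt"
	"time"
)

// resourceCache is a hypothetical stand-in for the module's resource list.
type resourceCache struct {
	refreshInterval time.Duration
	lastUpdate      time.Time
	resources       []string
}

// expired reports whether the resource list should be retrieved again.
// With no interval configured, or before the first retrieval, the list is
// always treated as expired, i.e. it is refreshed on every fetch.
func (c *resourceCache) expired(now time.Time) bool {
	if c.refreshInterval <= 0 || c.lastUpdate.IsZero() {
		return true
	}
	return now.Sub(c.lastUpdate) >= c.refreshInterval
}

// refresh stores a newly retrieved list and records when it was retrieved.
func (c *resourceCache) refresh(now time.Time, list []string) {
	c.resources = list
	c.lastUpdate = now
}

func main() {
	cache := &resourceCache{refreshInterval: 600 * time.Second}
	start := time.Date(2019, 10, 9, 12, 0, 0, 0, time.UTC)
	// Simulate four fetch calls with a period of 300s.
	for i := 0; i < 4; i++ {
		now := start.Add(time.Duration(i) * 300 * time.Second)
		if cache.expired(now) {
			cache.refresh(now, []string{"vm-1", "vm-2"})
			fmt.Println(now.Format("15:04"), "listed resources again")
		} else {
			fmt.Println(now.Format("15:04"), "reused cached list")
		}
	}
}
```

With a `period` of `300s` and a `refresh_list_interval` of `600s`, this sketch lists resources on every second fetch, halving the number of resource listing calls.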
diff --git a/x-pack/metricbeat/module/azure/azure.go b/x-pack/metricbeat/module/azure/azure.go
@@ -7,6 +7,8 @@ package azure
 import (
 	"time"
 
+	"github.com/elastic/beats/libbeat/common/cfgwarn"
+
 	"github.com/pkg/errors"
 
 	"github.com/elastic/beats/metricbeat/mb"
@@ -63,3 +65,66 @@ func newModule(base mb.BaseModule) (mb.Module, error) {
 	}
 	return &base, nil
 }
+
+// MetricSet holds any configuration or state information. It must implement
+// the mb.MetricSet interface. And this is best achieved by embedding
+// mb.BaseMetricSet because it implements all of the required mb.MetricSet
+// interface methods except for Fetch.
+type MetricSet struct {
+	mb.BaseMetricSet
+	Client    *Client
+	MapMetric mapMetric
+}
+
+// NewMetricSet will instantiate a new azure metricset
+func NewMetricSet(base mb.BaseMetricSet) (*MetricSet, error) {
+	metricsetName := base.Name()
+	cfgwarn.Beta("The azure %s metricset is beta.", metricsetName)
+	var config Config
+	err := base.Module().UnpackConfig(&config)
+	if err != nil {
+		return nil, errors.Wrap(err, "error unpack raw module config using UnpackConfig")
+	}
+
+	//validate config based on metricset
+	switch metricsetName {
+	case nativeMetricset:
+		// resources must be configured for the monitor metricset
+		if len(config.Resources) == 0 {
+			return nil, errors.Errorf("no resource options defined: module azure - %s metricset", metricsetName)
+		}
+	default:
+		// validate config resource options entered, no resource queries allowed for the compute_vm and compute_vm_scaleset metricsets
+		for _, resource := range config.Resources {
+			if resource.Query != "" {
+				return nil, errors.Errorf("error initializing the monitor client: module azure - %s metricset. No queries allowed, please select one of the allowed options", metricsetName)
+			}
+		}
+
+	}
+	// instantiate monitor client
+	monitorClient, err := NewClient(config)
+	if err != nil {
+		return nil, errors.Wrapf(err, "error initializing the monitor client: module azure - %s metricset", metricsetName)
+	}
+	return &MetricSet{
+		BaseMetricSet: base,
+		Client:        monitorClient,
+	}, nil
+}
+
+// Fetch implements the data gathering and the data conversion for the metricset.
+// It publishes the events, which are then forwarded to the output. In case
+// of an error, set the Error field of mb.Event or simply call report.Error().
+func (m *MetricSet) Fetch(report mb.ReporterV2) error {
+	err := m.Client.InitResources(m.MapMetric, report)
+	if err != nil {
+		return err
+	}
+	// retrieve metrics
+	err = m.Client.GetMetricValues(report)
+	if err != nil {
+		return err
+	}
+	return EventsMapping(report, m.Client.Resources.Metrics, m.BaseMetricSet.Name())
+}
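The validation that `NewMetricSet` performs can be read in isolation as the following sketch. It assumes that `nativeMetricset` is the string `"monitor"` (the comment in the diff suggests this, but the constant's value is not shown) and uses a simplified, hypothetical `resourceConfig` type in place of the module's configuration structs. The error texts are illustrative and not the module's own.

```go
package main

import "fmt"

// resourceConfig is a simplified, hypothetical version of one `resources` entry.
type resourceConfig struct {
	ID    []string
	Group []string
	Query string
}

// validateResources mirrors the two rules in the diff: the monitor metricset
// needs at least one resource entry, and the other metricsets do not accept
// resource queries.
func validateResources(metricset string, resources []resourceConfig) error {
	if metricset == "monitor" { // assumed value of nativeMetricset
		if len(resources) == 0 {
			return fmt.Errorf("no resource options defined: module azure - %s metricset", metricset)
		}
		return nil
	}
	for _, r := range resources {
		if r.Query != "" {
			return fmt.Errorf("resource queries are not allowed: module azure - %s metricset", metricset)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateResources("monitor", nil))
	fmt.Println(validateResources("compute_vm", []resourceConfig{{Query: "resourceType eq 'x'"}}))
	fmt.Println(validateResources("compute_vm", []resourceConfig{{Group: []string{"testresourcegroup"}}}))
}
```

Validating at construction time means a misconfigured metricset fails when Metricbeat starts instead of on the first `Fetch`.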
diff --git a/x-pack/metricbeat/module/azure/client_utils.go b/x-pack/metricbeat/module/azure/client_utils.go
@@ -64,8 +64,13 @@ func mapMetricValues(metrics []insights.Metric, previousMetrics []MetricValue, s
 // metricExists will check if the metric value has been retrieved in the past
 func metricExists(name string, metric insights.MetricValue, metrics []MetricValue) bool {
 	for _, met := range metrics {
-		if name == met.name && metric.TimeStamp.Time == met.timestamp && metric.Average == met.avg && metric.Total == met.total && metric.Minimum == met.min &&
-			metric.Maximum == met.max && metric.Count == met.count {
+		if name == met.name &&
+			metric.TimeStamp.Equal(met.timestamp) &&
+			compareMetricValues(met.avg, metric.Average) &&
+			compareMetricValues(met.total, metric.Total) &&
+			compareMetricValues(met.max, metric.Maximum) &&
+			compareMetricValues(met.min, metric.Minimum) &&
+			compareMetricValues(met.count, metric.Count) {
 			return true
 		}
 	}
@@ -132,3 +137,17 @@ func mapTags(azureTags map[string]*string) map[string]string {
 	}
 	return tags
 }
+
+// compareMetricValues will compare 2 pointer values
+func compareMetricValues(metVal *float64, metricVal *float64) bool {
+	if metVal == nil && metricVal == nil {
+		return true
+	}
+	if metVal == nil || metricVal == nil {
+		return false
+	}
+	if *metVal == *metricVal {
+		return true
+	}
+	return false
+}
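The change to `metricExists` swaps `==` on pointers and on `time.Time` for value-based comparisons. The sketch below shows why that matters in Go. The helper `floatPtrEqual` is a hypothetical condensed equivalent of `compareMetricValues`, and the sample values are made up.

```go
package main

import (
	"fmt"
	"time"
)

// floatPtrEqual compares the values behind two pointers and treats two nil
// pointers as equal, like compareMetricValues in the diff.
func floatPtrEqual(a, b *float64) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

func main() {
	x, y := 1.23, 1.23
	// Comparing pointers with == compares addresses, not the stored values.
	fmt.Println(&x == &y)              // false
	fmt.Println(floatPtrEqual(&x, &y)) // true

	// The same instant expressed in two locations.
	utc := time.Date(2019, 10, 9, 12, 0, 0, 0, time.UTC)
	other := utc.In(time.FixedZone("UTC+1", 3600))
	// == on time.Time also compares the location, Equal compares the instant.
	fmt.Println(utc == other)     // false
	fmt.Println(utc.Equal(other)) // true
}
```

With address-based comparison, a metric value that was already reported would not be recognised as a duplicate, which is consistent with the "fix double records" entry in the commit message.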
diff --git a/x-pack/metricbeat/module/azure/client_utils_test.go b/x-pack/metricbeat/module/azure/client_utils_test.go
@@ -128,3 +128,17 @@ func TestExpired(t *testing.T) {
 	result := resConfig.Expired()
 	assert.True(t, result)
 }
+
+func TestCompareMetricValues(t *testing.T) {
+	var val1 *float64
+	var val2 *float64
+	result := compareMetricValues(val1, val2)
+	assert.True(t, result)
+	float1 := 1.23
+	val1 = &float1
+	result = compareMetricValues(val1, val2)
+	assert.False(t, result)
+	val2 = &float1
+	result = compareMetricValues(val1, val2)
+	assert.True(t, result)
+}
diff --git a/x-pack/metricbeat/module/azure/compute_vm/_meta/docs.asciidoc b/x-pack/metricbeat/module/azure/compute_vm/_meta/docs.asciidoc
@@ -1 +1,19 @@
 This is the compute_vm metricset of the module azure.
+
+This metricset allows users to retrieve all metrics from specified virtual machines.
+
+include::../../_meta/shared-azure.asciidoc[]
+
+[float]
+==== Config options to identify resources
+
+`resource_id`:: (_[]string_) The fully qualified IDs of the resources, including the resource name and resource type. Has the format /subscriptions/{guid}/resourceGroups/{resource-group-name}/providers/{resource-provider-namespace}/{resource-type}/{resource-name}.
+  Should return a list of resources.
+
+`resource_group`:: (_[]string_) This option should return a list of virtual machines we want to apply our metric configuration options on.
+
+If none of the options are entered, all virtual machines from the entire subscription will be selected.
+For each metric the primary aggregation assigned will be retrieved.
+A default, non-configurable timegrain of 5 minutes is set, so users are advised to configure an interval of 300s or a multiple of it.
+
+
diff --git a/x-pack/metricbeat/module/azure/compute_vm/client_helper.go b/x-pack/metricbeat/module/azure/compute_vm/client_helper.go
@@ -5,6 +5,7 @@
 package compute_vm
 
 import (
+	"github.com/Azure/azure-sdk-for-go/services/preview/monitor/mgmt/2019-06-01/insights"
 	"github.com/Azure/azure-sdk-for-go/services/resources/mgmt/2019-03-01/resources"
 	"github.com/pkg/errors"
 
@@ -32,8 +33,12 @@ func mapMetric(client *azure.Client, metric azure.MetricConfig, resource resourc
 		if len(*metricDefinitions.Value) == 0 {
 			return nil, errors.Errorf("no metric definitions were found for resource %s and namespace %s.", *resource.ID, *namespace.Properties.MetricNamespaceName)
 		}
+		var filteredMetricDefinitions []insights.MetricDefinition
+		for _, metricDefinition := range *metricDefinitions.Value {
+			filteredMetricDefinitions = append(filteredMetricDefinitions, metricDefinition)
+		}
 		// map azure metric definitions to client metrics
-		metrics = append(metrics, azure.MapMetricByPrimaryAggregation(client, *metricDefinitions.Value, resource, *namespace.Properties.MetricNamespaceName, nil, azure.DefaultTimeGrain)...)
+		metrics = append(metrics, azure.MapMetricByPrimaryAggregation(client, filteredMetricDefinitions, resource, *namespace.Properties.MetricNamespaceName, nil, azure.DefaultTimeGrain)...)
 	}
 	return metrics, nil
 }