
Add metrics datastream for Azure Functions #7130

Merged · 19 commits · Aug 25, 2023

Conversation

@devamanv (Contributor) commented on Jul 24, 2023

What does this PR do?

This PR adds a metrics data stream for Azure Functions.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

  • Clone the integrations repo
  • Install elastic-package locally
  • Spin up an elastic stack using elastic-package
  • Run elastic-package test from the integrations/packages/azure_functions directory

Related issues

Implements part of #6704

Screenshots

(two screenshots attached)

@devamanv devamanv self-assigned this Jul 24, 2023
@devamanv devamanv linked an issue Jul 24, 2023 that may be closed by this pull request
@elasticmachine commented on Jul 24, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-08-25T09:03:10.081+0000

  • Duration: 14 min 28 sec

Test stats 🧪

Test Results: 10 total — 10 passed, 0 failed, 0 skipped

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine commented on Jul 24, 2023

🌐 Coverage report

| Name | Metrics % (covered/total) | Diff |
|---|---|---|
| Packages | 100.0% (1/1) | 💚 |
| Files | 100.0% (2/2) | 💚 |
| Classes | 100.0% (2/2) | 💚 |
| Methods | 88.235% (15/17) | 👎 -11.765 |
| Lines | 96.753% (149/154) | 👎 -3.247 |
| Conditionals | 100.0% (0/0) | 💚 |

@andrewkroh andrewkroh added the Integration:azure Azure Logs label Aug 1, 2023
@zmoog (Contributor) left a comment:

First pass is good; there are a few details to discuss.

packages/azure_functions/changelog.yml (review thread, outdated, resolved)
packages/azure_functions/changelog.yml (review thread, resolved)
Comment on lines +10 to +48
- name: eventhub
  type: text
  title: Event Hub
  multi: false
  required: true
  show_user: true
  description: >-
    Elastic recommends using one event hub for each integration. Visit [Create an event hub](https://docs.elastic.co/integrations/azure#create-an-event-hub) to learn more. Use event hub names up to 30 characters long to avoid compatibility issues.
- name: consumer_group
  type: text
  title: Consumer Group
  multi: false
  required: true
  show_user: true
  default: $Default
- name: connection_string
  type: password
  title: Connection String
  multi: false
  required: true
  show_user: true
  description: >-
    The connection string required to communicate with Event Hubs. See [Get an Event Hubs connection string](https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-get-connection-string) to learn more.
- name: storage_account
  type: text
  title: Storage Account
  multi: false
  required: true
  show_user: true
  description: >-
    The name of the storage account where the consumer group's state/offsets will be stored and updated.
- name: storage_account_key
  type: password
  title: Storage Account Key
  multi: false
  required: true
  show_user: true
  description: >-
    The storage account key, this key will be used to authorize access to data in your storage account.
Contributor:

Since we are moving these vars from the "integration package scope" to the "data stream scope", we should test the upgrade path from 0.1.0 to the new version.

I know this integration is experimental, but we should see what the user experience is in this case.

@devamanv (Contributor, Author) commented on Aug 13, 2023:

I tried to test the upgrade from 0.0.1 to a new version, including the installation of assets. I was able to ingest logs in ES after the upgrade, so this should be good enough to go ahead.

Contributor:

This is cool.

Did the upgrade keep all the integration settings values (event hub name, connection string, etc.) when moving from the integration scope to the data stream scope?

Comment on lines +1 to +183
- name: domain
  level: extended
  type: keyword
  ignore_above: 1024
  description: 'Name of the domain of which the host is a member.

    For example, on Windows this could be the host''s Active Directory domain or NetBIOS domain name. For Linux this could be the domain of the host''s LDAP provider.'
  example: CONTOSO
  default_field: false
- name: hostname
  level: core
  type: keyword
  ignore_above: 1024
  description: 'Hostname of the host.

    It normally contains what the `hostname` command returns on the host machine.'
- name: id
  level: core
  type: keyword
  ignore_above: 1024
  description: 'Unique host id.

    As hostname is not always unique, use values that are meaningful in your environment.

    Example: The current usage of `beat.name`.'
- name: ip
  level: core
  type: ip
  description: Host ip addresses.
- name: mac
  level: core
  type: keyword
  ignore_above: 1024
  description: Host mac addresses.
- name: name
  level: core
  type: keyword
  ignore_above: 1024
  description: 'Name of the host.

    It can contain what `hostname` returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use.'
- name: os.family
  level: extended
  type: keyword
  ignore_above: 1024
  description: OS family (such as redhat, debian, freebsd, windows).
  example: debian
- name: os.kernel
  level: extended
  type: keyword
  ignore_above: 1024
  description: Operating system kernel version as a raw string.
  example: 4.4.0-112-generic
- name: os.name
  level: extended
  type: keyword
  ignore_above: 1024
  multi_fields:
    - name: text
      type: text
      norms: false
      default_field: false
  description: Operating system name, without the version.
  example: Mac OS X
- name: os.platform
  level: extended
  type: keyword
  ignore_above: 1024
  description: Operating system platform (such centos, ubuntu, windows).
  example: darwin
- name: os.version
  level: extended
  type: keyword
  ignore_above: 1024
  description: Operating system version as a raw string.
  example: 10.14.1
- name: type
  level: core
  type: keyword
  ignore_above: 1024
  description: 'Type of host.

    For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment.'
- name: containerized
  type: boolean
  description: >
    If the host is a container.
- name: os.build
  type: keyword
  example: "18D109"
  description: >
    OS build information.
- name: os.codename
  type: keyword
  example: "stretch"
  description: >
    OS codename, if any.
Contributor:

We should check if we can leverage ECS for these field definitions.

Contributor:

Can we switch to the ECS definition for these fields? They look the same.

packages/azure_functions/manifest.yml (review thread, outdated, resolved)
@devamanv devamanv marked this pull request as ready for review August 9, 2023 08:17
@devamanv devamanv requested a review from a team as a code owner August 9, 2023 08:17
@zmoog (Contributor) left a comment:

Almost there!

Just a couple of double-checks and a nit:

  1. double-check: Did the upgrade keep all the integration settings values (event hub name, connection string, etc.) when moving from the integration scope to the data stream scope?
  2. double-check: Can we use ECS for the field definitions at packages/azure_functions/data_stream/metrics/fields/fields.yml?
  3. nit: what about moving the credentials to the top of the data stream settings?

packages/azure_functions/data_stream/metrics/manifest.yml (review thread, outdated, resolved)
@lalit-satapathy (Collaborator):

Can we upload a sample event to cross-check?

@zmoog (Contributor) commented on Aug 16, 2023

  1. double-check: Did the upgrade keep all the integration settings values (event hub name, connection string, etc.) when moving from the integration scope to the data stream scope?

I ran a quick test setting up 0.0.1 with logs ingestion and then upgrading to 0.1.1. The upgrade triggers the fields conflict flow, and by default users must enter event hub and storage account details again.

(screen recording attached: CleanShot.2023-08-16.at.12.46.45.mp4)

@zmoog (Contributor) commented on Aug 16, 2023

Conflict management like this is fine per se. It's the expected behavior when we remove or move vars from one scope (global) to another (data stream).

@lalit-satapathy, can we somehow hint in the UI that users should pick up the values of the old vars?

In general, how do we usually handle situations like this? Do we just let it be, or mention it in the docs? I have never faced an integration conflict case before.

@lalit-satapathy (Collaborator):

In general, how do we usually handle situations like this? Do we just let it be, or mention it in the docs? I have never faced an integration conflict case before.

Not sure why this happened. Was it triggered by moving the variables to the data stream scope (#7130 (comment))?

@zmoog, maybe there is not much use of the 0.0.1 package currently? So if the change itself is good (please confirm), the impact will be limited.

@zmoog (Contributor) commented on Aug 16, 2023

Not sure why this happened. Was it triggered by moving the variables to the data stream scope (#7130 (comment))?

Yep, with commit 843df5f we moved the vars from:

packages/azure_functions/manifest.yml (package scope)

into:

packages/azure_functions/data_stream/functionapplogs/manifest.yml (integration / data stream scope)

The main reason for this change is that the azure-eventhub input and the monitor metricset use different authentication mechanisms. We moved the azure-eventhub authentication options into the logs integration.

This problem will eventually go away when we upgrade the azure-eventhub input to the new Azure SDK, allowing us to use the same mechanism for both. Here's the tracking issue for the upgrade: elastic/beats#33815. I hope we can work on this in the next platform team iteration.

Maybe there is not much use of the 0.0.1 package currently? So if the change itself is good (please confirm), the impact will be limited.

The changes in this PR are good. The variables conflict management is working as expected, and since this is a technical preview, I believe it's OK to go through this flow.

If there are reasons to ship this version as-is once we are ready, the other points in the previous comment are nits we can address later.

@zmoog (Contributor) commented on Aug 17, 2023

For the record, @devamanv also put together a few other observations about differences in hosting plan options.


### Metrics
**Metrics** give you insight into the state of your Azure costs.
Contributor:

The description here is not relevant to the Azure Functions metrics. Can you please remove the cost-specific information and add an appropriate description?

Contributor (Author):

Removed the unwanted details from the description.

processors:
  - set:
      field: ecs.version
      value: "8.0.0"
Contributor:

Update the ECS version to the latest stable version. Also, make sure both the logs and metrics data streams set the same ECS version.
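The suggestion above amounts to bumping the `set` processor in both data streams' ingest pipelines; a minimal sketch, where the version string "8.11.0" is an assumption standing in for whatever the latest stable ECS release is:

```yaml
# ingest pipeline snippet (sketch) — "8.11.0" is a placeholder,
# use the actual latest stable ECS version in both pipelines
processors:
  - set:
      field: ecs.version
      value: "8.11.0"
```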

| azure.functions.bytes_sent.total | The amount of outgoing bandwidth consumed by the app, in MiB. | long |
| azure.functions.current_assemblies.average | The current number of Assemblies loaded across all AppDomains in this application. | long |
| azure.functions.file_system_usage.average | Percentage of filesystem quota consumed by the app. | long |
| azure.functions.function_execution_count.total | Function Execution Count. For FunctionApps only. | long |
Contributor:

Add metric_type for the applicable fields.

Contributor (Author):

Added metric_type as well as unit wherever applicable.
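For context, adding `metric_type` and `unit` to a data stream field is done in fields.yml; a sketch using one field from the table above, where the specific `metric_type` and `unit` values are assumptions, not taken from the PR:

```yaml
# fields.yml sketch — metric_type/unit values are assumed for illustration
- name: azure.functions.function_execution_count.total
  type: long
  metric_type: counter   # assumed; use gauge for point-in-time values
  unit: count            # assumed unit
  description: Function Execution Count. For FunctionApps only.
```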

@@ -6,19 +6,22 @@ Use this integration to build web APIs, respond to database changes, process IoT


## Data streams
The Azure Functions integration contains two data streams: [Logs](#functionapplogs) and [Metrics](#metrics)
Contributor:

Are you linking any references in the logs and metrics? I don't see a link attached. I see this is done to reference the title below, but when clicking the link I am not able to get to the metrics section.

Contributor (Author):

I fixed the link; it now correctly navigates to the metrics section.
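For reference, in-page README links resolve against heading-derived anchors (lowercased, punctuation stripped, spaces replaced with hyphens), so each link target must match an actual heading; a minimal sketch with hypothetical section bodies:

```markdown
## Data streams
The Azure Functions integration contains two data streams: [Logs](#functionapplogs) and [Metrics](#metrics).

### functionapplogs
<!-- "#functionapplogs" resolves here -->

### Metrics
<!-- "#metrics" resolves here: "Metrics" lowercased -->
```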

packages/azure_functions/docs/README.md (review thread, outdated, resolved)
@muthu-mps (Contributor):

A few more comments:

  • Add the dashboard screenshot image to the img dir. This image gets loaded on the Kibana integrations front page.

  • I see the README file mostly refers to the billing metrics. Please add documentation specific to Azure Functions.

(screenshot attached)

@zmoog (Contributor) commented on Aug 17, 2023

Hey @muthu-mps, thank you for taking the time to read this in detail! I missed the leftovers from the Azure Billing setup information we used as a template.

@devamanv, we need to update the docs specifying the role with the least privileges to access Azure Monitor.

@devamanv (Contributor, Author) commented on Aug 21, 2023

we need to update the docs specifying the role with the least privileges to access Azure Monitor.

I have pushed the changes to keep only the Functions-specific information in the docs.

@zmoog (Contributor) left a comment:

A few nits.

However, since this is a technology preview integration, if we want to ship this to gather feedback, we can address them later.

packages/azure_functions/changelog.yml (review thread, outdated, resolved)
@devamanv devamanv requested a review from muthu-mps August 23, 2023 05:46
@@ -1,4 +1,9 @@
# newer versions go on top
- version: "0.1.0"
  changes:
    - description: Add Azure Functions metrics data stream and a Kibana dashboard
Contributor:

The data stream comes with a dashboard. We can remove "and a Kibana dashboard" from the changelog entry.

Contributor (Author):

I added it to address one of the review comments. However, since a data stream is always expected to have a dashboard, we can remove the explicit mention of it.
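The trimmed entry would look something like this sketch; the `type` and `link` values are illustrative assumptions (the link reuses this PR's URL):

```yaml
# changelog.yml sketch — type/link values are illustrative
- version: "0.1.0"
  changes:
    - description: Add Azure Functions metrics data stream
      type: enhancement
      link: https://github.com/elastic/integrations/pull/7130
```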

  external: ecs
- name: service.address
  type: keyword
  description: Service address
Contributor:

I think there is no need to add a description for ECS fields, as they are generated at build time. Also, this would override the already present description of the ECS field.

Contributor:

@aliabbas-elastic - The next question is why we need to define a type for the ECS field. @devamanv, we need to reference the ECS field here.

Contributor (Author):

Thanks for pointing this out. Yes, we should remove the description and type fields, and add the missing external field. Will make this change.
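The fix described above leaves the ECS field declared by reference only, so its type and description come from the ECS definition at build time; a sketch:

```yaml
# fields.yml sketch — declare ECS fields by reference;
# no local type/description, they are resolved from ECS at build time
- name: service.address
  external: ecs
```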

Collaborator:

@muthu-mps, is there an ECS field mapping guideline document which captures the correct way to map ECS fields?

Contributor:

@lalit-satapathy - I can't find a guidelines document for mapping ECS fields. I do see that service.address is mapped directly in some integrations and referenced from ECS in others.

@muthu-mps (Contributor):

/test

"type": "Microsoft.Web/sites",
"group": "test-rg",
"tags": {
"hidden-link: /app-insights-resource-id": "/subscriptions/12hjkls-78tyu-404f-a3d2-1dc9982f45ds/resourceGroups/test-rg/providers/Microsoft.Insights/components/return-of-the-jedi"
Contributor:

Can you create custom tags and associate them with this function and get the metrics out of it?

Contributor (Author):

These tags are coming from Azure Monitor itself. To add custom tags, we would have to add a new field for tags and conditionals to the agent configuration to set it. I can give that a try, but it could be taken as an enhancement. I will create a GH issue for it.

Contributor:

As the tags field is dynamic, it is not necessary to add a new field. You can just try adding custom tags to the functions and check if the key-value pairs are captured, e.g. tags.*
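Capturing arbitrary tag keys like this usually relies on a dynamic object mapping rather than per-key field definitions; a hedged sketch of what that could look like in fields.yml, where the azure.resource.tags path is an assumption not confirmed by this PR:

```yaml
# fields.yml sketch — map all tag keys dynamically as keywords;
# the azure.resource.tags path is assumed for illustration
- name: azure.resource.tags.*
  type: object
  object_type: keyword
```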

@muthu-mps muthu-mps self-requested a review August 25, 2023 01:41
@muthu-mps (Contributor) left a comment:

Looks good!

@devamanv devamanv merged commit 7cf179e into elastic:main Aug 25, 2023
@elasticmachine:

Package azure_functions - 0.1.0 containing this change is available at https://epr.elastic.co/search?package=azure_functions

Successfully merging this pull request may close these issues.

Add Metrics data stream for Azure Functions
7 participants