Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[telemetry] Fix Maps telemetry with dynamic fields #38394

Closed
Bamieh opened this issue Jun 7, 2019 · 18 comments
Closed

[telemetry] Fix Maps telemetry with dynamic fields #38394

Bamieh opened this issue Jun 7, 2019 · 18 comments
Assignees
Labels
Feature:Maps Feature:Telemetry Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Presentation Presentation Team for Dashboard, Input Controls, and Canvas

Comments

@Bamieh
Copy link
Member

Bamieh commented Jun 7, 2019

Hello maps team 👋

I was updating the telemetry index mapping yesterday for maps, the telemetry sent for maps has 2 dynamic fields. We do not accept dynamic fields so these fields will not be used by telemetry

"layerTypesCount": {
   "dynamic": "true",
   "properties": {}
},
"emsVectorLayersCount": {
   "dynamic": "true",
   "properties": {}
}

I was hoping to understand what gets reported under those fields so we can map them differently to avoid dynamic fields.

Context: https://github.com/elastic/telemetry/pull/108

@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-stack-services

@nreese nreese added the [Deprecated-Use Team:Presentation]Team:Geo Former Team Label for Geo Team. Now use Team:Presentation label Jun 7, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-gis

@Bamieh
Copy link
Member Author

Bamieh commented Jun 7, 2019

After discussing this with @aaronjcaldwell @pickypg we agreed to restructure emsVectorLayersCount and layerTypesCount into an array instead of dynamic fields.

"emsVectorLayersCount": {
   "properties" {
       "key":  { "type": "keyword"},
       "min": { type: "long" },
       "max": { type: "long" },
       "avg": { type: "long" },
   }
},
"layerTypesCount": {
   "properties" {
       "key":  { "type": "keyword"},
       "min": { type: "long" },
       "max": { type: "long" },
       "avg": { type: "long" },
   }
}

We avoid dynamic fields in the telemetry mapping. Additionally frequently adding / changing fieldnames in ES is an anti-pattern.

We'd be using these fields to check overall usage, which specific layers being adopted and which layers are getting used the most.

To achieve a more granular aggregation we'll have to reindex the data, but usually the basic use-cases are enough. We can also automate the reindexing if that is needed to capture some of those aggregations.

The largest benefit is that we can add new layers and types of layers and there’s absolutely no work that needs to get done by anyone to accept it.


An action item for this is to refactor to the above data structure properties: { key, max, min, avg} for the two dynamic telemetry fields the next time we make updates.

@kindsun kindsun self-assigned this Jun 7, 2019
@alexfrancoeur
Copy link

I just ran into this when trying to get some telemetry on our EMS usage 😄 . Any idea when we might be able to fix this? 7.3 doesn't seem to be possible at this point (but correct me if I'm wrong), what about 7.4?

@kindsun
Copy link
Contributor

kindsun commented Jul 16, 2019

Restructuring this on the code side shouldn't be too difficult. I'm happy to coordinate my work with the telemetry team's whenever makes sense!

@bmcconaghy
Copy link
Contributor

@aaronjcaldwell please coordinate with @Bamieh for whatever you need re: telemetry.

@elasticmachine
Copy link
Contributor

Pinging @elastic/pulse (Team:Pulse)

@Bamieh
Copy link
Member Author

Bamieh commented Jan 14, 2020

Meeting summary about solving the dynamic: true fields issues for maps (@aaronjcaldwell @afharo @TinaHeiligers )

Overview

We will aim to find a root solution to this issue with pulse. For now we dont have a go to solution for fields that have variables provided by the users.

Since the custom layers can be added by the users we dont have a pre-defined set of all the fields to map them.

Allowing dynamic fields of any kind in telemetry is not possible.

Solution

The current proposal is to implement a 3-step strategy of discovery, exploration and mapping.

  1. (Discovery) Add an array to store the layers provided by each cluster

This will be used strictly for discovering the layers provided by the users.
layer_key is a combination of ${name}__${type}.

{
  "mappings": {
    "dynamic": false,
    "properties": {
      "maps": {
        "properties": {
          "layers": {
            "properties": {
              "layer_key": {
                "type": "keyword"
              },
              "layer_name": {
                "type": "keyword"
              },
              "layer_type": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
  1. (Exploration) Run aggregation on one of the layer_key, layer_name, or layer_type to discover the layers are provided by all our users.
{
  "size": 0,
  "aggs": {
    "layers": {
      "terms": {
        "field": "maps.layers.layer_key"
      }
    }
  }
}

The Maps team will need to use this data to decide which user-defined layer types and names to gather telemetry from and request an update to the mapping through the pulse team.

  1. (Mapping Update) Layers that seem interesting (from the exploratory phase) can be explicitly mapped under layerTypesCount and emsVectorLayersCount. This way we can control the fields that are mapped, while still keeping the dynamic: false property of telemetry.
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "maps": {
        "properties": {
          "layers": {
            "properties": {
              "layer_key": {
                "type": "keyword"
              },
              "layer_name": {
                "type": "keyword"
              },
              "layer_type": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

Mapping these new fields requires the requesting team (in this case, the Maps team) to open an issue in the telemetry repo for the pulse team to handle. Although this will probably require some back and forth to map new layers to analyze them, it is the only plausible solution we could come up with for now.

Using an analyzer (for exploration of new fields)

It is worth exploring adding an analyzer for the layer_key field to be able to search on the layer_name, or type while still maintaining the relationship between the two since non-nested arrays cannot be queries in that manner.

Other explorations (dynamic_templates)

We have played around with using dynamic templates. The approach is great as it allows us to map fields that can follow a strict pattern. This does not work in the maps situation since users can be sending arbitrary layer names, hence re-creating the main issue why dynamic mapping is set to false in the first place.

{
  "mappings": {
    "dynamic": false,
    "properties": {
      "maps": {
        "type": "object",
        "dynamic": true
      }
    },
    "dynamic_templates": [
        {
          "mapsFields": {
            "path_match": "maps.count_*",
            "mapping": {
                "key": {
                  "type": "keyword"
              },
              "min": {
                  "type": "long"
              },
              "max": {
                  "type": "long"
              },
              "avg": {
                  "type": "long"
              }
            }
          }
        }
      }
    }
  ]
}

@afharo
Copy link
Member

afharo commented Dec 10, 2020

@Bamieh do you think we can close this issue now that we have the schema definitions?

Maybe @mindbat wants to provide some input regarding this piece of telemetry?

@afharo afharo added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc and removed Team:KibanaTelemetry labels Dec 10, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@mindbat
Copy link

mindbat commented Dec 10, 2020

@afharo Perhaps with the push for moving to Telemetry-Next, we can re-open the discussion of dynamic fields for the Maps team, via an issue on the Infra repo?

@alexfrancoeur What do you think?

@afharo
Copy link
Member

afharo commented Dec 17, 2020

@mindbat sure! My point is that the current schema regarding maps provides the DYNAMIC_KEY keyword: https://github.com/elastic/kibana/blob/master/x-pack/plugins/maps/server/maps_telemetry/collectors/register.ts#L44-L50

If that's good enough for Telemetry-Next, I think we can close this issue.

@alexfrancoeur
Copy link

Before closing, we probably want the Maps team to weigh in. While I'm fortunate to have a history with this amazing team, I'm no longer part of it. @kmartastic @thomasneirynck thoughts on the Telemetry thread here?

@kmartastic
Copy link
Contributor

I'm not quite sure I understand what's being discussed. Actually, I'm quite sure I don't.

What is Telemetry-Next? Is that a project? schema? Is there a solution for working with dynamic fields within Telemetry-Next? We have a growing need for capturing this data as we are adding new layer types and want to track usage.

@alexfrancoeur @thomasneirynck @afharo

@afharo
Copy link
Member

afharo commented Dec 22, 2020

Hi @kmartastic,

Telemetry-Next is a new take to the way we store the data in the Remote Telemetry Service. Allowing dynamic keys, or even array structures, if preferred. For more details, reach out on the #telemetry-next Slack channel.

The schema is how Kibana defines the "contract" of the usage reported by each collector so the Remote Telemetry Service knows what to expect and what's changed in each version.

Re this specific issue, Maps have 2 fields that are objects where the keys cannot be known beforehand. For now, we've worked around them when defining their schema by setting the key name DYNAMIC_KEY. My question is whether we think that's enough for this use case or if we want to work further into a better structure/definition.

@teresaalvarezsoler
Copy link

@alexfrancoeur can we close this issue?

@jb1b84 jb1b84 added Feature:Maps Team:Presentation Presentation Team for Dashboard, Input Controls, and Canvas labels Nov 3, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-presentation (Team:Presentation)

@jb1b84 jb1b84 removed the [Deprecated-Use Team:Presentation]Team:Geo Former Team Label for Geo Team. Now use Team:Presentation label Nov 3, 2022
@afharo
Copy link
Member

afharo commented Nov 4, 2022

I'll go ahead and close this issue. From the last few comments, it seems that both teams are asking each other if we could close it 😅

Feel free to reopen if you think that any additional work is needed.

@afharo afharo closed this as completed Nov 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Maps Feature:Telemetry Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Presentation Presentation Team for Dashboard, Input Controls, and Canvas
Projects
None yet
Development

No branches or pull requests