Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Move data stream mappings from index template to component template. #121184

Closed
Tracked by #108554
hop-dev opened this issue Dec 14, 2021 · 12 comments · Fixed by #124013
Closed
Tracked by #108554

[Fleet] Move data stream mappings from index template to component template. #121184

hop-dev opened this issue Dec 14, 2021 · 12 comments · Fixed by #124013
Assignees
Labels
Team:Fleet Team label for Observability Data Collection Fleet team technical debt Improvement of the software architecture and operational architecture v8.2.0

Comments

@hop-dev
Copy link
Contributor

hop-dev commented Dec 14, 2021

Currently, integration data stream mappings are set directly on the index template, we are proposing moving this to being set on a component template.

This will enable us to centrally store and manage the mapping in the component template while potentially having multiple data streams using that component template (e.g a namespace specific data stream).

Not having the data stream in a component template has blocked progress on #108554 (and meant we had to revert some ILM docs #119013).

Technical details

(See Jen's useful comment here)
We set the mappings in the index template here:

We also have the ability for packages to set custom mappings, so our solution would have to still be compatible with this behaviour:

if (registryElasticsearch && registryElasticsearch['index_template.mappings']) {

@hop-dev hop-dev added the Team:Fleet Team label for Observability Data Collection Fleet team label Dec 14, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@simitt
Copy link
Contributor

simitt commented Feb 3, 2022

I remember previous discussions where folks were worried around users potentially destroying their setup by manually changing mappings, if for example pipelines or dashboards/UIs were built with assumptions around the mappings. Have these concerns been addressed?

@hop-dev
Copy link
Contributor Author

hop-dev commented Feb 3, 2022

Hi @simitt, you are right, in the current design moving the mappings to the @mappings component template, the users customisations in the @custom component would currently have higher precedence. I had seen the user customising mappings in a damaging way as a relatively low risk.

One way we could reduce this risk (but not eliminate entirely) is to make the @mapping component template have higher precedence than the @customcomponent template?

Any thoughts here @joshdover ?

@simitt
Copy link
Contributor

simitt commented Feb 3, 2022

One way we could reduce this risk (but not eliminate entirely) is to make the @mapping component template have higher precedence than the @customcomponent template?

So basically only allow fields that are not part of the default @mapping to be customized right? Would this work with field hierarchies?

@joshdover
Copy link
Contributor

I think we should solve this problem in the high-level API we're planning to introduce as part of #121118 rather than trying to fix it here.

The reason being that I think we should view direct edits to the @custom templates as a "escape hatch" in the case we have a package bug or there's something the user really wants to customize, we should let them, but with a big caveat of "you're on your own here". It's also quite helpful for support to have some way to workaround our settings and mappings if needed.

The high-level API can provide more guardrails against this type of thing by only allowing additive mapping changes, for example.

@mostlyjason I think you should weigh in this too as you have more context of what the use case is intended to be for users to have direct access to @custom templates.

@mostlyjason
Copy link
Contributor

++ to what Josh said about offering an escape hatch. I remember in a previous company we were parsing apache status field as an int, and but its really a discrete rather than continuous variable, so keyword is better. The only way someone could fix that is by overriding the default mapping. Sure, a user could mess it up, but they can easily revert to the default behavior by deleting their customization. We can fix bugs in our templates, but it may not be on the user's timeline. This gives users control over their own cluster.

I don't have as much context about the high level API. If its an implementation concern I'd leave it to Josh and the eng team.

@simitt
Copy link
Contributor

simitt commented Feb 24, 2022

I have investigated this a bit further and it seems that mappings configuration can be extended with new fields, and dynamic templates that haven't been set on the index template can be added, but existing mappings and meta information, set on the index template, cannot be overwritten.

Example APM Traces

  1. Fleet setup when installing APM Integration:
  • index template traces-apm
{
  "_meta": {
    "package": {
      "name": "apm"
    },
    "managed_by": "fleet",
    "managed": true
  },
  "dynamic_templates": [
    {
      "strings_as_keyword": {
        "mapping": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "match_mapping_type": "string"
      }
    }
  ],
  "date_detection": false,
  "properties": {
    "container": {
      "properties": { .. }
    ...
    }
  }
}
  • component template traces-apm@mappings
{
  "dynamic": false,
  "dynamic_templates": [
    {
      "numeric_labels": {
        "path_match": "numeric_labels.*",
        "mapping": {
          "scaling_factor": 1000000,
          "type": "scaled_float"
        }
      }
    }
  ]
}
  1. Manual customization of component template:
    Changing metadata, enabling dynamic mapping and adding dynamic templates to receive following request:
PUT _component_template/traces-apm@custom
{
  "template": {
    "mappings": {
      "_source": {
        "enabled": true,
        "includes": [],
        "excludes": []
      },
      "_routing": {
        "required": false
      },
      "dynamic": true,
      "numeric_detection": false,
      "date_detection": true,
      "dynamic_date_formats": [
        "strict_date_optional_time",
        "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"
      ],
      "dynamic_templates": [
        {
          "numeric_labels": {
            "path_match": "numeric_labels.*",
            "mapping": {
              "scaling_factor": 1000,
              "type": "scaled_float"
            }
          }
        },
        {
          "strings_as_keyword": {
            "mapping": {
              "ignore_above": 2048,
              "type": "keyword"
            },
            "match_mapping_type": "string"
          }
        }
      ],
      "properties": {
        "agent.ephemeral_id": {
          "type": "text"
        },
        "simitt-test": {
          "type": "keyword"
        }
      }
    }
  },
  "_meta": {
    "package": {
      "name": "apm-1"
    },
    "managed_by": "fleet-1",
    "managed": false
  }
}
  1. Force a rollover via POST traces-apm-default/_rollover
  2. Inspect APM mapping via GET .ds-traces-apm-default-2022.02.24-000004/_mapping
{
  ".ds-traces-apm-default-2022.02.24-000004" : {
    "mappings" : {
      "dynamic" : "true",
      "_meta" : {
        "package" : {
          "name" : "apm"
        },
        "managed_by" : "fleet",
        "managed" : true
      },
      ...
      "dynamic_templates" : [
        {
          "numeric_labels" : {
            "path_match" : "numeric_labels.*",
            "mapping" : {
              "scaling_factor" : 1000,
              "type" : "scaled_float"
            }
          }
        },
        {
          "strings_as_keyword" : {
            "match_mapping_type" : "string",
            "mapping" : {
              "ignore_above" : 1024,
              "type" : "keyword"
            }
          }
        }
      ],
      "date_detection" : false,
      "numeric_detection" : false,
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "agent" : {
          "properties" : {
            "ephemeral_id" : {
              "type" : "keyword",
              "ignore_above" : 1024
            }..
          }
        },
        "simitt-test" : {
          "type" : "keyword"
        }
      }
    }
  }
}

@joshdover
Copy link
Contributor

I have investigated this a bit further and it seems that mappings configuration can be extended with new fields, and dynamic templates that haven't been set on the index template can be added, but existing mappings and meta information, set on the index template, cannot be overwritten.

That is the current behavior, but the change we're implementing in this issue would allow users to also override the out-of-the-box mappings from the integration as an escape hatch. Do you think this is a big problem for APM?

@simitt
Copy link
Contributor

simitt commented Mar 4, 2022

In #121118 and also in an earlier comment on this issue you mention allowing additive mapping. I am a bit confused now of whether or not users would then actually be able to also change the mappings that are shipped with the package. If that is planned, could you elaborate on the requirements for this? Everything I read in this issue, and the other linked one about a guided API does not motivate or require to change existing mappings, only to add new ones. Maybe I am just overlooking something, so apologies for asking this again.

I am all +1 on making a more guided UI/API and allowing users to add their own mappings. In my comment above I wanted to highlight that adding new mappings is already possible today, as I didn't expect that and thought it is worth sharing, in case others were confused by this too or need a workaround until some more tailored solution is available.

@hop-dev
Copy link
Contributor Author

hop-dev commented Mar 4, 2022

whether or not users would then actually be able to also change the mappings that are shipped with the package. If that is planned, could you elaborate on the requirements for this?

Hi @simitt, thanks for the comments. Allowing users to change the mappings supplied by the package is not a direct goal of this PR but an acceptable side effect. @joshdover has always described the @custom component template as an escape hatch for us so I guess we have now expanded what that template is capable of customising.

@nchaulet
Copy link
Member

nchaulet commented Apr 4, 2022

@hop-dev @joshdover Just tested the upgrade from 8.x and 7.x to 8.2 and the new @package component template is well created, during the upgrade the old and now unused @mappings template are keep it something we want to address and clean or it's an acceptable side-effect?

Screen Shot 2022-04-04 at 3 50 50 PM

@jen-huang
Copy link
Contributor

@nchaulet Thanks for testing! Could you open an issue for following up on the cleanup of @mappings? I suspect that we will probably want to clean them up (I assume we have cleaned up the old @settings templates?), but we can discuss in the new issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Fleet Team label for Observability Data Collection Fleet team technical debt Improvement of the software architecture and operational architecture v8.2.0
Projects
None yet
7 participants