Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Infrastructure UI] investigate any impact of synthetic source in Infrastructure UI #139391

Closed
neptunian opened this issue Aug 24, 2022 · 5 comments
Assignees
Labels
Feature:Metrics UI Metrics UI feature Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services

Comments

@neptunian
Copy link
Contributor

As part of https://github.com/elastic/obs-infraobs-team/issues/805, see if are using _source anywhere that might be impacted with the implementation of synthetic source and track them here.

@neptunian neptunian added the Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services label Aug 24, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)

@smith smith added the Feature:Metrics UI Metrics UI feature label Aug 30, 2022
@Kerry350 Kerry350 self-assigned this Sep 1, 2022
@Kerry350
Copy link
Contributor

Kerry350 commented Sep 6, 2022

I've done a pass of the Metrics code (Felix looked at Logs here: https://github.com/elastic/obs-infraobs-team/issues/805#issuecomment-1195455773).

There are some objects pulled from _source, for example:

system.process.summary typed as rt.record(rt.string, rt.number), with simple keys and values this would be fine.

The process list pulls the following: _source: ['system.process.state', 'user.name', 'process.pid'], which is typed as:

    rt.type({
                  _source: rt.type({
                    process: rt.type({
                      pid: rt.number,
                    }),
                    system: rt.type({
                      process: rt.type({
                        state: rt.string,
                      }),
                    }),
                    user: rt.type({
                      name: rt.string,
                    }),
                  }),
                })

again the actual values are simple types, so this looks fine.

The only two potential issues I could see were:

getNodeInfo pulls _source: ['host.*', 'cloud.*', 'agent.*'],. Given the wildcard nature here and just pulling all fields, some fields may in some way run in to the synthetic caveats.

When dealing with ML records the influencers are typed as follows:

influencers: rt.array(
        rt.type({
          influencer_field_name: rt.string,
          influencer_field_values: rt.array(rt.string),
        })
      ),

Depending on how this is used it could run in to the pushes all arrays to the "leaf" fields caveat in elastic/elasticsearch#85649.

I don't think any of the other caveats are an issue, e.g. sorting etc.

@neptunian
Copy link
Contributor Author

Thanks for looking into this @Kerry350 . So for the two potential issues it sounds like we'd need to dig into the UI and see exactly how we are using them? Was hoping that could be done as part of this issue though not sure you'll have time before 8.5. I can take that over if need be.

Was thinking we could also test by changing the mappings and ingesting data elastic/elasticsearch#85649

@Kerry350
Copy link
Contributor

Kerry350 commented Sep 7, 2022

Was hoping that could be done as part of this issue though not sure you'll have time before 8.5. I can take that over if need be.

I didn't have time to look at this in depth in the UI earlier (I wanted to do a pass on the code first, as there's sometimes combinations / states that get missed testing via the UI), but I had some time this evening.

For the ML issue the influencers are always reduced to an array of strings on the server side. So at least the surface area is reduced to there. I haven't been able to test the ML jobs fully yet with synthetic source (more below).

For the wildcards from getNodeInfo (used for the /api/infra/metadata API) these don't seem to cause an issue within the UI in the way they're used. (Again more below on trying to test this).

Was thinking we could also test by changing the mappings and ingesting data elastic/elasticsearch#85649

100% - however, I don't know if there's a much easier way to test this realistically, but I really struggled. I was using Metricbeat, and wanting to update the mappings to use:

"_source": {
  "mode": "synthetic"
}

(One thing to note is it's mode: synthetic which is slightly different to the PR).

but I kept having problems:

  1. Can't issue a simple partial index template update, as the whole template needs to be sent when issuing a PUT _template/<template>
  2. Tried to send the whole template, received a request too large error.
  3. Tried to compose a separate component template into the index template, however this results in various mapping conflicts. Because synthetic source is only compatible with certain field types, it results in conflicts with many mappings (which are in Metricbeat, but the Infra UI aren't necessarily interested in).
  4. Can't really work around that as the whole set of mappings is applied regardless of modules / metricsets.
  5. Hard to scale down the template to just the fields the UI uses (but also unrealistic as in a real world scenario the error would occur with the other mappings).

Did you have any ideas on how to apply this, whilst also keeping the data ingestion realistic? Even building from source and adding the mapping, it will conflict with many others 🤔

The best I was able to do was just apply the synthetic mode on the index template, and ingest data which fell back to ES guessing the mapping types for the fields. Which is no use really.

These are the sort of things that occur when trying to utilise this against Metricbeat. Although on the last one I think I just found a bug in the index management UI 😅

Screenshot 2022-09-07 at 01 38 08
Screenshot 2022-09-07 at 01 41 21
Screenshot 2022-09-07 at 01 45 10
Screenshot 2022-09-07 at 01 54 07

@neptunian neptunian self-assigned this Sep 12, 2022
@neptunian
Copy link
Contributor Author

neptunian commented Sep 12, 2022

I also tried adding ?force_synthetic_source to queries and got 400 responses when trying to query the metricbeat data stream. Will see if I can more info on the status of metricbeat's mappings compatibility with synthetic source so we can test with real data.

For now just looking at how the code uses the issues mentioned:

getNodeInfo returns a promise that returns the _source of the first hit of results with type InfraMetadataInfo and looking at each type, there doesn't look to be any issues.

For ML anomalies I think there would be an issue. Currently we query .ml-anomalies-shared and would get an anomaly document back that looks like this:

      {
        "_index": ".ml-anomalies-shared",
        "_id": "kibana-metrics-ui-default-default-hosts_memory_usage_record_1662948000000_900_0_97975001574843114786968312574454586785_6",
        "_score": 1,
        "_source": {
          "job_id": "kibana-metrics-ui-default-default-hosts_memory_usage",
          "result_type": "record",
          "probability": 7.758294970121287e-19,
          "multi_bucket_impact": -5,
          "record_score": 53.98588048634285,
          "initial_record_score": 53.98588048634285,
          "bucket_span": 900,
          "detector_index": 0,
          "is_interim": false,
          "timestamp": 1662948000000,
          "partition_field_name": "host.architecture",
          "partition_field_value": "X86_64",
          "function": "max",
          "function_description": "max",
          "typical": [
            0.5418041229645421
          ],
          "actual": [
            0.8999999761581421
          ],
          "field_name": "system.memory.actual.used.pct",
          "influencers": [
            {
              "influencer_field_name": "host.name",
              "influencer_field_values": [
                "host-0",
                "host-1",
                "host-2",
                "host-3",
                "host-4",
                "host-5",
                "host-6",
                "host-7",
                "host-8",
                "host-9"
              ]
            },
            {
              "influencer_field_name": "host.architecture",
              "influencer_field_values": [
                "X86_64"
              ]
            }
          ],
          "host.architecture": [
            "X86_64"
          ],
          "host.name": [
            "host-0",
            "host-1",
            "host-2",
            "host-3",
            "host-4",
            "host-5",
            "host-6",
            "host-7",
            "host-8",
            "host-9"
          ]
        }
      }

We filter on the influencers property above. We filter for all object influencers of host.name and then reduce through each object (would there ever be more than one influencer of the same field, in this case, host.name?) pushing all of the influencer_field_values into one array. We display this list in the Anomaly Table.

Since I can't actually test what it will look like with synthetic source turned on I am thinking influencers would now look something like this (partial doc):

{
  "influencers": {
    "influencer_field_name": ["host.name", "host.architecture"],
    "influencer_field_values": [
        "host-0",
        "host-1",
        "host-2",
        "host-3",
        "host-4",
        "host-5",
        "host-6",
        "host-7",
        "host-8",
        "host-9",
        "X86_64"
      ]
    }
  }
}

If this is the case our code would break. I brought this to the attention of the ML team and @droberts195 and he said they have no plans to support synthetic source for the ML internal indices. The reasoning being that the "low(er) volume ML results indices can stay as they are now" as they would not see much of the benefits as higher volume indices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Metrics UI Metrics UI feature Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services
Projects
None yet
Development

No branches or pull requests

4 participants