[Observability] Load hasData call asynchronously #80644

cauemarcondes · 2020-10-15T12:40:22Z

This PR improves the performance of the Observability page.

The current architecture waits for all hasData requests to complete before deciding which page to redirect the user to.

The new architecture waits only until at least one app returns true from its hasData request, then immediately redirects the user to the correct page while the remaining calls finish in the background.
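A minimal sketch of the "redirect on the first positive answer" idea described above. The helper name `firstAppWithData` and the `AppCheck` shape are hypothetical and are not taken from the PR; the real code lives in `has_data_context.tsx` and is organised differently.

```typescript
// Hypothetical shape: one entry per app, each with its own hasData call.
interface AppCheck {
  app: string;
  hasData: () => Promise<boolean>;
}

// Resolves with the name of the first app whose hasData call returns true.
// Resolves with undefined once every call has settled without a match.
// Calls that are still pending keep running in the background.
function firstAppWithData(checks: AppCheck[]): Promise<string | undefined> {
  return new Promise((resolve) => {
    if (checks.length === 0) {
      resolve(undefined);
      return;
    }
    let pending = checks.length;
    const settle = () => {
      pending -= 1;
      if (pending === 0) {
        // No-op if a match already resolved the promise.
        resolve(undefined);
      }
    };
    checks.forEach(({ app, hasData }) => {
      hasData().then(
        (result) => {
          if (result) {
            resolve(app);
          }
          settle();
        },
        // A failed request is treated as "no data" in this sketch.
        () => settle()
      );
    });
  });
}
```

A caller could await `firstAppWithData(...)` and navigate to the overview when it yields an app name, or to the landing page when it yields `undefined`.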
Before:

After:

elasticmachine · 2020-10-15T13:39:02Z

Pinging @elastic/apm-ui (Team:apm)

elasticmachine · 2020-10-15T13:39:02Z

Pinging @elastic/uptime (Team:uptime)

dgieselaar · 2020-10-16T10:33:24Z

x-pack/plugins/observability/public/context/has_data_context.tsx

+    () => {
+      apps.forEach(async (app) => {
+        try {
+          const params =


Why does the UX app have a different interface for its hasData handler? And looks like it will not be updated because it's not a dependency?

I think @shahzad31 is the right person to answer this question.

Maybe @justinkambic knows?

@dgieselaar I believe it was a compromise between performance and consistency. For the UX case we display data for the service with the most traffic, so the same query that checks whether there is RUM data also fetches the name of the service with the most traffic via a terms aggregation.

I believe we need better types here to make this clear and avoid these if conditions. Alternatively, we could get the service name via a different query, or make two ES queries in the data request. IMO consistency here isn't worth an additional query.

There was also a discussion about this in the original PR where this was added.

Hmm, I would expect that to be unlikely. The fact that you're running a terms aggregation to get the service name means that you cannot terminate early, because all documents have to be collected and aggregated (within the given time range). That will make your hasData call significantly slower. I don't think it's worth it, in fact, it's probably worse in terms of performance. Instead of running things in parallel, and getting certain parts of the data when they're ready, it's now one long blocking request.

It might be the case that the terminate-early flag isn't being passed correctly to the ES request.

dev-next doesn't have a lot of data. IIRC, half of our users have more data, and significantly fewer nodes. So, if it's slow on dev-next it's most likely an issue, but if it's fast, that doesn't really mean it will be fast for our users. In this example, you'll likely have relatively high overhead from security checks etc., which happen for each request, so it's hard to tell how efficient the search is based on network timings.

I think it's good here to take a step back and analyze what's happening with this search:

{
  size: 0,
  query: {
    bool: {
      filter: [{ term: { [TRANSACTION_TYPE]: TRANSACTION_PAGE_LOAD } }],
    },
  },
  aggs: {
    services: {
      filter: {
        range: rangeFilter(start, end),
      },
      aggs: {
        mostTraffic: {
          terms: {
            field: SERVICE_NAME,
            size: 1,
          },
        },
      },
    },
  },
}

First, all page load transactions in all indices are collected, because there is no range query on @timestamp.

Then, a filter aggregation runs to select only documents that match the given time range.

Then, another aggregation runs on all the documents for the given time range. size only removes the bucket from the response - it will still be calculated.

All these steps run sequentially (per document).

Especially the first step could be pretty slow for users that have a lot of RUM data.

If you separate this, however, you'd have two searches that you can run in parallel, and even the total duration will be lower.

For the hasData search, you can set terminate_after to 1, meaning it will exit as soon as it has found a document. The performance cost of this search is negligible in most cases.

For the getServiceName search, you can simply add a range query in your bool clause. This will remove the need for a filter aggregation.

One example from DPP:

one dual-purpose search: 2000ms

hasData: 40ms

getServiceName: 90ms

Queries:

GET apm-*/_search?request_cache=false
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "transaction.type": "page-load" } }
      ]
    }
  },
  "aggs": {
    "services": {
      "filter": {
        "range": {
          "@timestamp": {
            "gte": "2020-05-12T10:00:00.000Z",
            "lt": "2020-05-12T11:00:00.000Z"
          }
        }
      },
      "aggs": {
        "mostTraffic": {
          "terms": { "field": "service.name", "size": 1 }
        }
      }
    }
  }
}

GET apm-*/_search?request_cache=false
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "transaction.type": "page-load" } }
      ]
    }
  },
  "size": 0,
  "track_total_hits": true,
  "terminate_after": 1
}

GET apm-*/_search?request_cache=false
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "transaction.type": "page-load" } },
        {
          "range": {
            "@timestamp": {
              "gte": "2020-05-12T10:00:00.000Z",
              "lt": "2020-05-12T11:00:00.000Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "mostTraffic": {
      "terms": { "field": "service.name", "size": 1 }
    }
  }
}
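The split proposed in this comment could look roughly like this in TypeScript. This is only a sketch: the request bodies mirror the second and third queries in the comment, but `buildHasDataRequest`, `buildServiceNameRequest`, `fetchUxOverview` and the `Search` type are hypothetical names, and `search` stands in for whatever client actually executes the request.

```typescript
interface TimeRange {
  start: string;
  end: string;
}

// Cheap existence check: no time range, stop after the first matching document.
function buildHasDataRequest() {
  return {
    size: 0,
    track_total_hits: true,
    terminate_after: 1,
    query: {
      bool: {
        filter: [{ term: { 'transaction.type': 'page-load' } }],
      },
    },
  };
}

// Service with the most traffic: the range lives in the bool filter,
// so no filter aggregation is needed.
function buildServiceNameRequest({ start, end }: TimeRange) {
  return {
    size: 0,
    query: {
      bool: {
        filter: [
          { term: { 'transaction.type': 'page-load' } },
          { range: { '@timestamp': { gte: start, lt: end } } },
        ],
      },
    },
    aggs: {
      mostTraffic: { terms: { field: 'service.name', size: 1 } },
    },
  };
}

// Hypothetical stand-in for the real search client.
type Search = (body: object) => Promise<unknown>;

// Runs both searches in parallel instead of one dual-purpose request.
async function fetchUxOverview(search: Search, range: TimeRange) {
  const [hasDataResponse, serviceNameResponse] = await Promise.all([
    search(buildHasDataRequest()),
    search(buildServiceNameRequest(range)),
  ]);
  return { hasDataResponse, serviceNameResponse };
}
```

Keeping the two builders separate also means the hasData check can be issued on its own when the service name is not needed.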

x-pack/plugins/observability/public/context/has_data_context.tsx

x-pack/plugins/observability/public/hooks/use_time_range.ts

x-pack/plugins/observability/public/pages/home/index.tsx

dgieselaar · 2020-10-16T10:41:32Z

x-pack/plugins/observability/public/pages/overview/data_sections.tsx

 }

 export function DataSections({ bucketSize, hasData, absoluteTime, relativeTime }: Props) {
  return (
    <EuiFlexItem grow={false}>
      <EuiFlexGroup direction="column">
-        {hasData?.infra_logs && (
+        {hasData?.infra_logs?.hasData && (


We can remove these statements, and add a hidden prop on the EuiFlexItem component. That will ensure the component itself will start loading the data.

Are you sure the hidden property works? I haven't found any description of it here https://elastic.github.io/eui/#/layout/flex, and I tried it and it always shows the child element.

It's a native HTML attribute. It should work, but if it doesn't, there might be some styles from EUI overriding it, or React special-casing it. We don't have to use hidden, but the point is to render the component so it starts the request, but hide it from the UI.

x-pack/plugins/observability/public/pages/overview/overview.stories.tsx

justinkambic

Functionally, looks good. I agree with some of the previous comments that we can probably improve code clarity a bit.

cauemarcondes · 2020-10-17T14:10:34Z

@elasticmachine merge upstream

cauemarcondes · 2020-11-05T12:46:33Z

Looks like there are two issues still outstanding, namely that the hasData interface for the UX app has a different interface,

@dgieselaar @shahzad31 since I'm going to be on PTO for the next few weeks, I'd rather make this change in another PR. I also want to refactor fetchData: today it's called inside each component, and I want to change it to use the newly created Context as well.

and the components still wait until hasData has returned for at least one app

I'm going to fix it.

cauemarcondes · 2020-11-05T16:35:26Z

retest

cauemarcondes · 2020-11-05T19:17:36Z

@elasticmachine merge upstream

sorenlouv · 2020-11-06T11:10:17Z

x-pack/plugins/observability/public/context/has_data_context.tsx

+          if (app !== 'alert') {
+            const params =
+              app === 'ux' ? { absoluteTime: { start: absStart, end: absEnd } } : undefined;


Why the special handling for alert and ux? I thought we had the same interface for all apps

For ux there's a thread where @dgieselaar and @shahzad31 discussed it, #80644 (comment). Removing it requires some refactoring on the UX side, which would be better done in another PR.

For alert, I can solve it if I manually register the alert service with registerDataHandler. WDYT?
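To illustrate the idea of treating alert like any other app, here is a simplified, hypothetical registry. It is not the plugin's actual `registerDataHandler` implementation or signature; `registerHandler`, `createAlertHandler`, `checkAll` and the `Handler` interface are invented for this sketch, and `fetchAlerts` is assumed to return an array.

```typescript
type AppName = 'apm' | 'infra_logs' | 'uptime' | 'ux' | 'alert';

// Every app, including alert, exposes the same hasData signature.
interface Handler {
  hasData: () => Promise<boolean>;
}

const handlers: Partial<Record<AppName, Handler>> = {};

function registerHandler(app: AppName, handler: Handler): void {
  handlers[app] = handler;
}

// Wraps an alert-fetching function so it looks like any other handler:
// "has data" means at least one alert came back.
function createAlertHandler(fetchAlerts: () => Promise<unknown[]>): Handler {
  return {
    hasData: async () => (await fetchAlerts()).length > 0,
  };
}

// With a uniform interface, the caller no longer needs `app !== 'alert'` branches.
async function checkAll(): Promise<Partial<Record<AppName, boolean>>> {
  const result: Partial<Record<AppName, boolean>> = {};
  const apps = Object.keys(handlers) as AppName[];
  await Promise.all(
    apps.map(async (app) => {
      const handler = handlers[app];
      if (handler) {
        result[app] = await handler.hasData();
      }
    })
  );
  return result;
}
```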

sorenlouv · 2020-11-06T11:10:55Z

x-pack/plugins/observability/public/components/app/empty_sections/index.tsx

+    if (id === 'alert') {
+      const { status, hasData: alerts } = hasData.alert || {};
+      return (
+        status === FETCH_STATUS.FAILURE ||
+        (status === FETCH_STATUS.SUCCESS && (alerts as Alert[]).length === 0)
+      );
+    } else {
+      const app = hasData[id];
+      if (app) {
+        const _hasData = id === 'ux' ? (app.hasData as UXHasDataResponse)?.hasData : app.hasData;
+        return app.status === FETCH_STATUS.FAILURE || !_hasData;
+      }
+    }
+    return false;


We should get rid of the special logic for alert and ux here if possible

We can remove the alert logic based on my comment here. UX will demand more changes that I'm not familiar with, so maybe we could leave it like this for now and improve it in another PR.
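Until the interfaces are unified, one way to keep the special cases out of the components is a single normalizing helper. The sketch below is an assumption-laden illustration: `normalizeHasData` is a hypothetical name, and the `UXHasDataResponse` and `Alert` shapes are guessed from the diff above rather than copied from the plugin's types.

```typescript
// Assumed shape, inferred from `(app.hasData as UXHasDataResponse)?.hasData` in the diff.
interface UXHasDataResponse {
  hasData: boolean;
  serviceName?: string;
}

// Placeholder type; the real Alert type has more fields.
interface Alert {
  id: string;
}

type HasDataValue = boolean | UXHasDataResponse | Alert[] | undefined;

// Collapses the three response shapes into a plain boolean.
function normalizeHasData(value: HasDataValue): boolean {
  if (Array.isArray(value)) {
    // alert: "has data" means at least one alert
    return value.length > 0;
  }
  if (typeof value === 'object' && value !== null) {
    // ux: the boolean is nested inside the response object
    return value.hasData;
  }
  // every other app: already a boolean (or not loaded yet)
  return value === true;
}
```

A component could then call `normalizeHasData(hasData[id]?.hasData)` without checking `id` first.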

x-pack/plugins/observability/public/components/app/section/uptime/index.tsx

sorenlouv · 2020-11-06T11:18:03Z

x-pack/plugins/observability/public/context/has_data_context.tsx

+      }
+    }
+
+    fetchAlerts();


Is it the right place to fetch alerts in the HasDataContextProvider?

I thought it would be nice to centralize all the requests that fill the page in a single place. Alert would have behaved the same way as the other apps if I had registered it. Maybe that is the change I should make.

x-pack/plugins/observability/public/hooks/use_time_range.ts

sorenlouv

overall lgtm. Just a few comments

cauemarcondes · 2020-11-20T12:57:42Z

@dgieselaar @sqren @shahzad31 I don't want to spend more time on this PR; it has already taken a lot of my time. I have improved the way hasData is fetched: it no longer waits until all apps return before navigating, it waits only for the first app that returns true. That has already improved the loading time.

I agree we must fix the UX hasData to use the same signature as the others, but that can be done in another PR since this one is already big.

sorenlouv

lgtm

weltenwort

Logs UI changes LGTM

and nice job at improving the responsiveness of the overview 👏

cauemarcondes · 2020-11-23T08:39:55Z

@elasticmachine merge upstream

kibanamachine · 2020-11-23T10:53:54Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 080edfe

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`observability`	108	111	+3

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`observability`	160.7KB	169.1KB	+8.4KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`observability`	72.1KB	68.6KB	-3.5KB

History

💚 Build #89242 succeeded 6d8a567
💔 Build #89233 failed bf819f8
💔 Build #89225 failed 4f0332d
💔 Build #89203 failed c7a12d6
💚 Build #86225 succeeded 88fd2e4

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

* obs perf
* fixing unit tests
* fixing ts issues
* fixing empty state
* addressing pr comments
* addressing pr comments
* fixing TS issue
* fixing some stuff
* refactoring
* fixing ts issues and unit tests
* addressing PR comments
* fixing TS issues
* fixing eslint issue

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>

# Conflicts:
# x-pack/plugins/observability/public/pages/overview/index.tsx

* master: (67 commits)
  [Observability] Load hasData call asynchronously (elastic#80644)
  Implement AnonymousAuthenticationProvider. (elastic#79985)
  Deprecate `visualization:colorMapping` advanced setting (elastic#83372)
  [TSVB] [Rollup] Table tab not working with rollup indexes (elastic#83635)
  Revert "[Search] Search batching using bfetch (elastic#83418)" (elastic#84037)
  skip flaky suite (elastic#83772)
  skip flaky suite (elastic#69849)
  create kbn-legacy-logging package (elastic#77678)
  [Search] Search batching using bfetch (elastic#83418)
  [Security Solution] Refactor Timeline flyout to take a full page (elastic#82033)
  Drop use of console-stamp (elastic#83922)
  skip flaky suite (elastic#84011 , elastic#84012)
  Fixed usage of `isReady` for usage collection of alerts and actions (elastic#83760)
  [maps] support URL drilldowns (elastic#83732)
  Revert "Added default dedupKey value as an {{alertInstanceId}} to provide grouping functionality for PagerDuty incidents. (elastic#83226)"
  [code coverage] Update jest config to collect more data (elastic#83804)
  Added default dedupKey value as an {{alertInstanceId}} to provide grouping functionality for PagerDuty incidents. (elastic#83226)
  [Security Solution] Give notice when endpoint policy is out of date (elastic#83469)
  [Security Solution] Sync url state on any changes to query string (elastic#83314)
  [CI] Initial TeamCity implementation (elastic#81043)
  ...

cauemarcondes added 3 commits October 15, 2020 10:25

obs perf

7702b95

fixing unit tests

5afc484

fixing ts issues

0dd4a60

cauemarcondes added release_note:skip Skip the PR/issue when compiling release notes v7.11.0 labels Oct 15, 2020

cauemarcondes marked this pull request as ready for review October 15, 2020 12:48

cauemarcondes requested review from a team as code owners October 15, 2020 12:48

cauemarcondes requested a review from a team October 15, 2020 12:48

fixing empty state

f35e2c0

botelastic bot added Team:APM All issues that need APM UI Team support Team:Uptime - DEPRECATED Synthetics & RUM sub-team of Application Observability labels Oct 15, 2020

dgieselaar reviewed Oct 16, 2020

x-pack/plugins/observability/public/context/has_data_context.tsx (outdated)

dgieselaar reviewed Oct 16, 2020

x-pack/plugins/observability/public/hooks/use_time_range.ts (outdated)

dgieselaar reviewed Oct 16, 2020

x-pack/plugins/observability/public/pages/home/index.tsx (outdated)

dgieselaar reviewed Oct 16, 2020

x-pack/plugins/observability/public/pages/overview/overview.stories.tsx (outdated)

justinkambic reviewed Oct 16, 2020

kibanamachine and others added 3 commits October 17, 2020 10:10

Merge branch 'master' into obs-improve-perf

38a67b9

addressing pr comments

d37c788

addressing pr comments

e5b149b

cauemarcondes requested a review from dgieselaar October 19, 2020 11:28

cauemarcondes added 4 commits October 19, 2020 14:06

fixing TS issue

af22249

Merge branch 'master' of github.com:elastic/kibana into obs-improve-perf

21ed767

Merge branch 'master' of github.com:elastic/kibana into obs-improve-perf

94eb333

fixing some stuff

7ba1593

refactoring

3f3fdd9

fixing ts issues and unit tests

97be452

Merge branch 'master' into obs-improve-perf

88fd2e4

sorenlouv reviewed Nov 6, 2020

x-pack/plugins/observability/public/components/app/section/uptime/index.tsx (outdated)

sorenlouv reviewed Nov 6, 2020

x-pack/plugins/observability/public/hooks/use_time_range.ts (outdated)

sorenlouv approved these changes Nov 6, 2020

cauemarcondes added 3 commits November 20, 2020 11:52

Merge branch 'master' of github.com:elastic/kibana into obs-improve-perf

77b993a

Merge branch 'obs-improve-perf' of github.com:cauemarcondes/kibana into obs-improve-perf

c7a12d6

addressing PR comments

4f0332d

sorenlouv approved these changes Nov 20, 2020

fixing TS issues

bf819f8

weltenwort approved these changes Nov 20, 2020

fixing eslint issue

6d8a567

shahzad31 approved these changes Nov 20, 2020

Merge branch 'master' into obs-improve-perf

080edfe

cauemarcondes merged commit ac73b6a into elastic:master Nov 23, 2020

cauemarcondes deleted the obs-improve-perf branch November 23, 2020 10:58

cauemarcondes mentioned this pull request Nov 23, 2020

[7.x] [Observability] Load hasData call asynchronously (#80644) #84057

Merged
