
[Index Patterns] Excessive index patterns fetching on a dashboard #82153

Closed
Dosant opened this issue Oct 30, 2020 · 13 comments · Fixed by #82603 or #83368
Labels
bug Fixes for quality problems that affect the customer experience Feature:Dashboard Dashboard related features Feature:Data Views Data Views code and UI - index patterns before 8.0 Feature:Saved Objects

Comments

@Dosant
Contributor

Dosant commented Oct 30, 2020

Kibana version: 7.9.3

Describe the bug:

On cold load of a dashboard with multiple visualisations, each visualisation fetches its own index patterns. It is often the case that the index pattern is the same.

Multiple visualisations call https://github.com/elastic/kibana/blob/master/src/plugins/data/common/index_patterns/index_patterns/index_patterns.ts#L360 in parallel for the same index pattern.

There is a caching mechanism, but the cache is only populated once an index pattern has finished loading. So when the same index pattern is requested from multiple places before it is loaded, savedObjectsClient.get will be called multiple times for the same object.

savedObjectsClient.get has a batching mechanism, but the batch still retrieves multiple identical saved objects.

Steps to reproduce:

  1. Open a dashboard.
  2. Find a bulk_get request in the browser's network tab.
  3. One of the requests will be for index patterns, and the response will contain a bunch of identical objects. An index pattern saved object can be huge, so this can have a significant impact on first load.

Expected behavior:

No redundant index patterns objects flying over the network.

Screenshots (if relevant):

(screenshot omitted)

Any additional context:

  1. We could improve the caching mechanism at the index pattern service level by caching in-flight promises.
  2. Or we could improve the batching mechanism in the saved objects client with deduplication; in that case every consumer would benefit.
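Option 1 could look roughly like the following sketch. This is not Kibana's actual implementation: `loadIndexPattern`, the cache shape, and the counter are all illustrative stand-ins for the real saved-object fetch and field initialization.

```typescript
// Sketch of option 1: cache the in-flight promise rather than only the
// resolved index pattern, so concurrent get() calls for the same id share
// a single load. `loadIndexPattern` stands in for the real saved-object
// fetch plus field initialization.
type IndexPattern = { id: string; title: string };

const inFlight = new Map<string, Promise<IndexPattern>>();
let loadCalls = 0; // counts how often the expensive load actually runs

async function loadIndexPattern(id: string): Promise<IndexPattern> {
  loadCalls += 1;
  return { id, title: `pattern-${id}` }; // placeholder result
}

function get(id: string): Promise<IndexPattern> {
  const cached = inFlight.get(id);
  if (cached) return cached; // a concurrent caller reuses the same promise
  const promise = loadIndexPattern(id);
  inFlight.set(id, promise);
  return promise;
}
```

With this shape, two panels requesting the same id before the first load resolves would share one promise and trigger one fetch.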
@Dosant Dosant added bug Fixes for quality problems that affect the customer experience Feature:Dashboard Dashboard related features Feature:Saved Objects Feature:Data Views Data Views code and UI - index patterns before 8.0 Team:AppArch labels Oct 30, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-app-arch (Team:AppArch)

@Dosant Dosant changed the title [IndexPatterns] Excessive index patterns fetching on a dashboard [Index Patterns] Excessive index patterns fetching on a dashboard Oct 30, 2020
@Dosant
Contributor Author

Dosant commented Oct 30, 2020

Or we could improve the batching mechanism in the saved objects client with deduplication; in that case every consumer would benefit.

@elastic/kibana-platform, do you think it makes sense to improve on SO client level?

@mattkime
Contributor

mattkime commented Nov 2, 2020

Thanks for finding this @Dosant - very helpful. I think it would be worthwhile to alter the Index Pattern service cache to address this, independent of the decision to fix this at a saved object client level.

@pgayvallet
Contributor

@elastic/kibana-platform, do you think it makes sense to improve on SO client level?

Well, the short answer would be no. bulkGet does what it is supposed to do: it bulk-gets from ES, converts the results to saved objects, and returns them. Also, other consumers may be using the same API and expect the number of resulting objects to match the number of requested objects; changing this behavior to remove dupes would be a breaking change for them.

It's really at the consumer's discretion to have the appropriate logic to avoid requesting the same object multiple times when performing a bulkGet call, imho.

@rudolf, wdyt?

@Dosant
Contributor Author

Dosant commented Nov 2, 2020

Also, other consumers may be using the same API and expect the number of resulting objects to match the number of requested objects; changing this behavior to remove dupes would be a breaking change for them.

  • It is still possible to make an inner optimization without changing the observable behavior, correct?
  • Alternatively, this could instead be improved at the processBatchQueue level.

@milkfinch

milkfinch commented Nov 3, 2020

Hi,

One note here: a dashboard with 50 panels loads very fast in Kibana 6.8.3, and in the bulk_get request JSON each index pattern id appears only once, not once per panel on the dashboard.

The behavior in 6.8.3:

  • you have 5 panels: 2 using pattern indexA and 3 using indexB
  • bulk_get sends only one indexA and one indexB ID in the request JSON

The behavior in 7.9.3:

  • you have 5 panels: 2 using pattern indexA and 3 using indexB
  • bulk_get sends the indexA ID two times and the indexB ID three times in the request JSON

Of course, with 5 panels the loading time is not so different, but with 50 panels the browser spends 20+ seconds parsing the bulk_get response in 7.9.3.
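The 6.8-style request described above amounts to deduplicating the {type, id} references before building the bulk_get body. A minimal sketch of that, with an illustrative `ObjectRef` shape rather than Kibana's actual types:

```typescript
// Deduplicate saved-object references by their type:id key before building
// the bulk_get request body, matching the 6.8.3 behavior described above.
type ObjectRef = { type: string; id: string };

function dedupeRefs(refs: ObjectRef[]): ObjectRef[] {
  const seen = new Set<string>();
  return refs.filter((ref) => {
    const key = `${ref.type}:${ref.id}`;
    if (seen.has(key)) return false; // drop repeat references
    seen.add(key);
    return true;
  });
}
```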

Regards,
Robert

@pgayvallet
Contributor

pgayvallet commented Nov 4, 2020

It is still possible to make inner optimization without changing the behavior, correct?

Sorry, I misunderstood the problem. I thought you were calling bulkGet with duplicates... You are actually calling get multiple times with the same type/id tuple, and the resulting bulkGet fired from

private processBatchQueue = throttle(

is sending the duplicates to the backend in the bulk_get call.

In that case you are right. Even though it's an implementation detail, this should be optimized at the SO client's level. I would call that an enhancement / perf optimization though, not a bug, as it doesn't lead to any error.
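Deduping inside the batch queue while still resolving every caller's promise could look roughly like this. This is only a sketch: `fetchBulk` and the queue-entry shape are hypothetical stand-ins, not Kibana's actual internals.

```typescript
// Sketch of deduplication at the batch-queue level: identical {type, id}
// entries share one network fetch, but every queued caller still receives
// its result. `fetchBulk` stands in for the real _bulk_get HTTP call.
type Ref = { type: string; id: string };
type SavedObject = Ref & { attributes: unknown };

interface QueueEntry extends Ref {
  resolve: (so: SavedObject) => void;
}

async function processBatchQueue(
  queue: QueueEntry[],
  fetchBulk: (refs: Ref[]) => Promise<SavedObject[]>
): Promise<void> {
  // Group queued entries by type:id so duplicates collapse into one request.
  const byKey = new Map<string, QueueEntry[]>();
  for (const entry of queue) {
    const key = `${entry.type}:${entry.id}`;
    const list = byKey.get(key) ?? [];
    list.push(entry);
    byKey.set(key, list);
  }
  const uniqueRefs: Ref[] = [...byKey.values()].map(([first]) => ({
    type: first.type,
    id: first.id,
  }));
  const results = await fetchBulk(uniqueRefs);
  // Fan each fetched object back out to every caller that asked for it.
  for (const so of results) {
    for (const entry of byKey.get(`${so.type}:${so.id}`) ?? []) {
      entry.resolve(so);
    }
  }
}
```

The key property is that the response shape seen by each caller is unchanged; only the wire request shrinks.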

Created #82581

@pgayvallet
Contributor

Just opened #82603 that closes #82581. Should I also indicate that the PR fixes this issue?

@Dosant
Contributor Author

Dosant commented Nov 4, 2020

Should I also indicate that the PR fixes this issue?

I think so, yes, thank you!

This could be closed unless @mattkime also wants to polish this at the index pattern level.
IMO we are good with the fix in core.

@mattkime
Contributor

mattkime commented Nov 4, 2020

@Dosant I don't see a reason to make index pattern related changes considering @pgayvallet 's PR.

@Dosant
Contributor Author

Dosant commented Nov 5, 2020

@mattkime, just thought of a small bit we could still improve on the index pattern side:

With #82603 we optimized the data sent over the network, but at the index pattern layer we still waste time on scripting while initializing index patterns from saved objects, because we check for a cache hit only at the beginning of the get() method.

This redundant processing can accumulate into significant blocking time for large index patterns when we load a dashboard with a bunch of visualizations.
Here is a 7.9 example:
(screenshot of a performance profile omitted)

I think we should check for a cache hit twice:

  1. once before fetching the index pattern saved object
  2. and again after the fetch has resolved.

At step 2 it is possible that some other task has already put the final index pattern object into the cache, so we no longer need to spend CPU ticks initializing the same index pattern again. This micro-optimization should save a significant amount of CPU time on the initial load of large dashboards with large index patterns.

Pseudo code:

Now:

```ts
get = async (id) => {
  if (cache.has(id)) return cache.get(id);
  const so = await fetch(id);
  const ip = await init(so);
  cache.set(id, ip);
  return ip;
};
```

Improvement:

```ts
get = async (id) => {
  if (cache.has(id)) return cache.get(id);
  const so = await fetch(id);
  if (cache.has(id)) return cache.get(id); // <---- check the cache again to avoid a redundant init() call and save CPU cycles
  const ip = await init(so);
  cache.set(id, ip);
  return ip;
};
```
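A runnable version of that improved pseudo code might look like this. `fetchSavedObject` and `initIndexPattern` are hypothetical stand-ins for the real fetch and init steps, and `initCalls` just counts how often the expensive init work actually runs:

```typescript
// Double cache check: once before the fetch, once after it resolves.
type IndexPattern = { id: string; fields: string[] };

const cache = new Map<string, IndexPattern>();
let initCalls = 0;

async function fetchSavedObject(id: string): Promise<{ id: string }> {
  return { id }; // placeholder for savedObjectsClient.get()
}

async function initIndexPattern(so: { id: string }): Promise<IndexPattern> {
  initCalls += 1; // the expensive field initialization in the real service
  return { id: so.id, fields: ["@timestamp"] };
}

async function get(id: string): Promise<IndexPattern> {
  const hit = cache.get(id);
  if (hit) return hit;
  const so = await fetchSavedObject(id);
  // Re-check: a concurrent get() may have finished init while we awaited.
  const lateHit = cache.get(id);
  if (lateHit) return lateHit;
  const ip = await initIndexPattern(so);
  cache.set(id, ip);
  return ip;
}
```

Note that the second check only pays off when another task finishes init before this task's fetch resolves, which is exactly the staggered-loading case on large dashboards.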

@Dosant Dosant reopened this Nov 5, 2020
@pgayvallet
Contributor

Unsubscribing now that #82603 is closed. Feel free to re-ping if you need anything else from our side.

@mattkime
Contributor

mattkime commented Nov 5, 2020

@Dosant That's a good point. I'm also thinking of the case where we load the field list on the fly. We'd be hitting the field caps API pretty hard if we're not caching at the appropriate level. I agree this is worth doing.

@mattkime mattkime linked a pull request Nov 13, 2020 that will close this issue
5 participants