Discover: Support accessing failure store documents #184092

Closed
flash1293 opened this issue May 23, 2024 · 21 comments
Assignees
Labels
Feature:Discover Discover Application Team:DataDiscovery Discover, search (e.g. data plugin and KQL), data views, saved searches. For ES|QL, use Team:ES|QL. Team:obs-ux-logs Observability Logs User Experience Team

Comments

@flash1293
Contributor

flash1293 commented May 23, 2024

The failure store is a new feature that will ship as tech preview in Elasticsearch 8.15. It captures documents that couldn't be processed in a separate index per data stream, which allows troubleshooting and re-indexing.

As part of the troubleshooting process, it's important to provide a way to look at the documents in the failure store. As Discover is the main app to look at individual documents, it should be possible to also view these docs.

This is complicated by the fact that data fetching works slightly differently: an additional query parameter needs to be added to specify whether failure store docs are included in the search or not.

Implementation

The functionality should be added as a hidden flag in the discover app state that's passed along to data fetching endpoints:

  • Field list
  • Document search
  • Histogram aggregation
  • Source fetching
  • Context fetching

The flag is synced to a URL parameter. There is no UI element for the user to actively change the value of the flag, but the discover locator has to support it as a parameter to be set when linking from other applications.

It will also be possible to control this flag within the logs explorer.
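
A minimal sketch of how such a hidden flag could round-trip through the URL and reach every fetch, assuming a simplified app state. All names here (`AppState`, `includeFailureStore`, the `failureStore` parameter, `buildFetchOptions`) are hypothetical and are not taken from the Discover codebase:

```typescript
// Hypothetical, simplified app state; the real Discover state has more fields.
interface AppState {
  query: string;
  includeFailureStore: boolean; // hidden flag, no UI control changes it
}

// Assumed URL parameter name, chosen for illustration only.
const FAILURE_STORE_PARAM = "failureStore";

// Serialize the state into a URL query string.
function stateToQueryString(state: AppState): string {
  const parts = [`query=${encodeURIComponent(state.query)}`];
  if (state.includeFailureStore) {
    parts.push(`${FAILURE_STORE_PARAM}=true`);
  }
  return parts.join("&");
}

// Restore the state from a URL query string; the flag defaults to false.
function queryStringToState(queryString: string): AppState {
  const params: Record<string, string> = {};
  for (const pair of queryString.split("&")) {
    if (pair === "") continue;
    const [key, value = ""] = pair.split("=");
    params[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return {
    query: params["query"] ?? "",
    includeFailureStore: params[FAILURE_STORE_PARAM] === "true",
  };
}

// The fetches listed in the description, each receiving the same flag.
type FetchKind = "fieldList" | "documents" | "histogram" | "source" | "context";

function buildFetchOptions(kind: FetchKind, state: AppState) {
  return { kind, includeFailureStore: state.includeFailureStore };
}
```

A locator could set the flag by producing a URL that already contains the parameter, so linking from other applications would need no extra UI.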

References:

@flash1293 flash1293 added the Team:DataDiscovery Discover, search (e.g. data plugin and KQL), data views, saved searches. For ES|QL, use Team:ES|QL. label May 23, 2024
@elasticmachine
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

@flash1293 flash1293 added the Team:obs-ux-logs Observability Logs User Experience Team label May 28, 2024
@elasticmachine
Contributor

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

@weltenwort
Member

In order to work well with the Logs Explorer, this flag would also need to be represented in the app state. That would probably be the case anyway, just wanted to mention it.

@davismcphee
Contributor

To @weltenwort's point and to make it clear on the Discover end, since failure stores are a general data stream feature and not something specific to Logs Explorer, we've agreed that it makes sense to add support to the core Discover codebase including any necessary additions to the app state.

@flash1293
Contributor Author

Thanks to both of you for highlighting that point - this was my understanding as well, I clarified the description to keep everything in one place for the implementation.

@timductive
Member

timductive commented Jun 18, 2024

hey @flash1293 (and @ninoslavmiskovic) I'm trying to catch up on this topic. I understand that the current implementation uses query parameters on data-stream search requests to query the failure store and your POC is tying those query params to the data view. However, we may prefer that the ES API is changed to not use query params and support querying "out-of-the-box" like a normal index. Do you have more context on this?

@timductive
Member

The Elasticsearch team is investigating changing this API to be similar to a regular index search:
GET failures-logs-*/_search
At that point, the UI for searching the failure store should be the same as for any other index.
The UI for managing and fixing failure store data doesn't change; it still needs to modify ingest pipelines, etc.

We need to align on the API before we move forward on the UI work.

@flash1293
Contributor Author

@timductive I wasn't aware of plans to change the API - who is the point of contact about this?

@tylerperk

tylerperk commented Jun 20, 2024

@flash1293 - @dakrone and @jpountz and @javanna are looking into it - we'll discuss it at Friday's ES&Obs sync

@ninoslavmiskovic
Contributor

Agree here with @timductive to wait and get the API right before exploring the UI.

@gbamparop
Contributor

@flash1293 if this will be blocked for a while, we could explore whether it makes sense to surface additional information for the failure store in the dataset quality UI.

@mattkime
Contributor

I'm not sure I have full context and this seems worthy of a good pros and cons list but going with a regular index does seem like it would provide wider support from the start, including ESQL. ...but I can't imagine ESQL would be overlooked so I feel like I must be missing something.

@ghudgins
Contributor

ghudgins commented Jan 6, 2025

elastic/elasticsearch#118614 unblocks this issue. At a minimum, it would be good to add support for the failure store syntax in the data view creation process. Can we use this issue to account for this work, or should we split it off? What do you think, @thomasneirynck?

@davismcphee
Contributor

@ghudgins If it's all index pattern based, I think the only blocker for basic support in data views and Discover is #205109. We'll likely want to do more after like adding a dedicated Discover profile, etc., but I think resolving that issue will allow basic use.

@flash1293
Contributor Author

Agreed @davismcphee - one smaller thing: would this trigger the logs profile right now (e.g. logs-mylogs*::failures)? If yes, we might want to slightly extend the existing profile resolution and let it fall back on the default, as the logs profile is not suited for the failure data at all. A dedicated profile can totally be a follow-up though.

@yngrdyn
Contributor

yngrdyn commented Jan 8, 2025

let it fall back on the default

This is what I was thinking. The logs profile is not useful for failure store documents; a dedicated one will bring more value.
For example, we could better expose error.type, error.message, the stack trace, and whatever else we think suits this case. According to my investigation (manual test), the logs profile is currently triggered when ::failures is in the data view.

Screen.Recording.2025-01-08.at.13.08.06.mov

But this brings me another question, what to do when data view is ::* which means it will bring documents from the normal dataStream and also the one holding documents in the failure store. Should we also stick to the default in that case?

@flash1293
Contributor Author

flash1293 commented Jan 8, 2025

But this brings me another question, what to do when data view is ::* which means it will bring documents from the normal dataStream and also the one holding documents in the failure store. Should we also stick to the default in that case?

Good question, I guess we should, as otherwise the errors won't show. It seems to be an exotic case though. So my suggestion for the near term would be to not use the logs profile for ::failures and ::*.

@davismcphee
Contributor

would this trigger the logs profile right now (e.g. logs-mylogs*::failures)? If yes, we might want to slightly extend the existing profile resolution and let it fall back on the default, as the logs profile is not suited for the failure data at all.

what to do when data view is ::* which means it will bring documents from the normal dataStream and also the one holding documents in the failure store. Should we also stick to the default in that case?

Both good points and I agree, the o11y logs profile should be updated to exclude failure store patterns and possibly all selector patterns for now at least. I'll add this as a topic for the One Discover sync tomorrow to figure it out.

@davismcphee
Contributor

We discussed this today in the sync and decided to exclude patterns that use the selector syntax from the o11y logs profile for now. The issue for that work is #206092.
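
As a rough illustration of that decision, the sketch below treats any pattern containing a `::` selector as ineligible for the logs profile. The function names and the `logs-` prefix check are hypothetical stand-ins for illustration; the actual profile resolution in Discover (tracked in #206092) may work differently:

```typescript
// Split a single index pattern such as "logs-*::failures" into its base
// pattern and optional selector.
function parseSelector(pattern: string): { base: string; selector?: string } {
  const idx = pattern.lastIndexOf("::");
  if (idx === -1) {
    return { base: pattern };
  }
  return { base: pattern.slice(0, idx), selector: pattern.slice(idx + 2) };
}

// True if any comma-separated part of the index pattern uses a selector.
function usesSelectorSyntax(indexPattern: string): boolean {
  return indexPattern
    .split(",")
    .some((part) => parseSelector(part.trim()).selector !== undefined);
}

// Hypothetical resolution: selector patterns always fall back to the default
// profile. The "logs-" prefix test is a placeholder, not the real heuristic.
function resolveProfile(indexPattern: string): "logs" | "default" {
  if (usesSelectorSyntax(indexPattern)) {
    return "default";
  }
  const allLogs = indexPattern
    .split(",")
    .every((part) => part.trim().startsWith("logs-"));
  return allLogs ? "logs" : "default";
}
```

Under these assumptions, `logs-mylogs*` would still resolve to the logs profile, while `logs-mylogs*::failures` and `logs-mylogs*::*` would both fall back to the default.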

@yngrdyn
Contributor

yngrdyn commented Jan 21, 2025

@flash1293 now that #206092 was implemented, can we close this issue?

@flash1293
Contributor Author

Yes, thanks @yngrdyn

@gbamparop gbamparop added the Feature:Discover Discover Application label Mar 4, 2025