[Platform][Alerting] Reducing the cost of change in data migrations #96291
Comments
Pinging @elastic/kibana-alerting-services (Team:Alerting Services) |
I can't comment on much here, but for For |
That's an interesting thought, which hadn't actually crossed my mind. |
🙏 |
Correct, I don't think we'll need this before 8.x. |
Thank you so much for raising this @gmmorris! This has been a consistent pain point on the Detections side as we've continued to add more and more features since debuting back in 7.6. It was in preparation for GA (and a full backlog of features itching to change the mapping :) that led us to create the Detection Alerts Migration API to at least provide some helper to our users in the event we would need to make a migration. For the most part we've been able to just add the new mappings, with the only annoyance being that a privileged user would need to visit the Security App to trigger a rollover for the new mappings to be available, which left us in an interesting place with exposing new features that might not be supported until the right user visited the app. This shouldn't be an issue with RAC though since we're using the Either way, a platform level service for performing migrations on upgrade would go a long way in both ensuring the flexibility is there if needed, and putting us in a position to quickly iterate towards the ideal implementation without the need to over-engineer it up front in order to support what we think we may need for the next two years without a breaking change.
Frankly, we haven't leveraged too many aggregations on top of alerts just yet (only the Alerts Histogram/Last Alert component), so it hasn't really been an issue. That said, we'll probably start introducing them more and more as we start providing advanced alert triage flows with groupings. Either way, I do like the approach of using runtime fields as the trialing of new fields before they get added to the main schema. I'm working on a runtime field reference rule, so should hopefully know a bit more here soon with regards to what that may look like. (This will be crucial for users to add their own fields to alerts since we won't be supporting composable index templates.) |
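To make the "trial a field at query time before adding it to the schema" idea concrete, here is a minimal sketch of a search request body that defines a runtime field and aggregates on it. It relies on the `runtime_mappings` section of the Elasticsearch search API; the helper name `buildTrialFieldSearch`, the field name `alert.trial_group`, and the source field in the Painless script are all hypothetical and not taken from the Detections code.

```typescript
// Shape of a single runtime field definition (only the parts used here).
interface RuntimeField {
  type: 'keyword' | 'long' | 'double' | 'boolean' | 'date' | 'ip';
  script: { source: string };
}

interface TrialFieldSearchBody {
  size: number;
  runtime_mappings: Record<string, RuntimeField>;
  aggs: Record<string, { terms: { field: string } }>;
}

// Hypothetical helper: builds a search body that computes `fieldName` at
// query time and groups documents by it, without touching the index mapping.
function buildTrialFieldSearch(fieldName: string, painlessSource: string): TrialFieldSearchBody {
  return {
    size: 0, // only the aggregation result is needed
    runtime_mappings: {
      [fieldName]: { type: 'keyword', script: { source: painlessSource } },
    },
    aggs: {
      by_trial_field: { terms: { field: fieldName } },
    },
  };
}

// Example: group alerts by a value derived from an existing (assumed) field.
const trialSearch = buildTrialFieldSearch(
  'alert.trial_group',
  "emit(doc['signal.rule.name'].value)"
);
```

If the trial field proves useful, the same definition could later be promoted to a regular mapped field; until then no reindex or rollover is involved.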
That's a different need than what is specified in #91143, right? #91143 is about migrating type A objects to type B, where what you're asking here is to be able to access arbitrary objects from type B during the migration of objects of type A, right? |
Correct 😬 |
Thanks for the context :)
Could you expand on this @spong ? Judging by the docs, I'm guessing it's only the mappings:
I don't think we'd have much of an option of migrating actual data (as the amount of data is going to be huge), but I do wonder if there are changes we will find hard to make in the future if we're limited to only mappings changes. 🤔
This is exactly why I feel we need to discuss formalising this kind of migration.
++ Any chance you could enumerate specific features you'd expect this platform service to provide?
Brilliant, that's a good start. |
Runtime fields and aliases can be helpful ways to reduce the impact of this problem, but for any domain there's inevitably something new you learn that requires you to model your data in a completely different way, where just changing the mappings isn't enough and you also need to change your documents. Ultimately we want to help developers shift this complexity to the migration system so that their business logic is free from if/else clauses dealing with all the past modelling mistakes. To solve (1) and (2) I've been thinking of a slightly different migration algorithm that's eventually consistent in order to give us the performance over large data sets: #96626. For (3), the naive solution is to allow performing an async read for every transformed document, but this would be very inefficient. The more performant solution is what we documented in #34996, which would allow the migration to collect some data before it starts and then use that data for transforming every document. This means you need to be able to fit all the data you need in memory, but this is probably sufficient for most of the migrations we're doing. |
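The "collect first, then transform" approach can be sketched roughly as follows, using the rule/task API key case from the issue description. This is an illustration under stated assumptions, not the design from #34996: the document shapes and the function names `collectApiKeysByTaskId` and `migrateTask` are hypothetical, and no real Kibana migration API is used.

```typescript
// Hypothetical, simplified saved object shapes; real objects carry more fields.
interface SavedObjectDoc<T> {
  id: string;
  type: string;
  attributes: T;
}
interface RuleAttributes {
  taskId: string;
  apiKey: string | null;
}
interface TaskAttributes {
  apiKey?: string;
}

// Phase 1 (runs once, before any document is transformed): read the rules
// and build an in-memory lookup from task id to the API key it should get.
function collectApiKeysByTaskId(
  rules: Array<SavedObjectDoc<RuleAttributes>>
): Map<string, string> {
  const lookup = new Map<string, string>();
  for (const rule of rules) {
    if (rule.attributes.apiKey !== null) {
      lookup.set(rule.attributes.taskId, rule.attributes.apiKey);
    }
  }
  return lookup;
}

// Phase 2 (runs per document): a synchronous transform that only consults
// the pre-collected lookup, so no async read is needed for each task.
function migrateTask(
  task: SavedObjectDoc<TaskAttributes>,
  lookup: Map<string, string>
): SavedObjectDoc<TaskAttributes> {
  const apiKey = lookup.get(task.id);
  if (apiKey === undefined) {
    return task; // no rule references this task; leave it unchanged
  }
  return { ...task, attributes: { ...task.attributes, apiKey } };
}

// Usage with in-memory sample data.
const sampleRules: Array<SavedObjectDoc<RuleAttributes>> = [
  { id: 'rule-1', type: 'rule', attributes: { taskId: 'task-1', apiKey: 'key-1' } },
  { id: 'rule-2', type: 'rule', attributes: { taskId: 'task-2', apiKey: null } },
];
const apiKeyLookup = collectApiKeysByTaskId(sampleRules);
const migratedTask = migrateTask({ id: 'task-1', type: 'task', attributes: {} }, apiKeyLookup);
```

The trade-off is the one noted above: the lookup has to fit in memory, in exchange for a per-document transform that stays synchronous.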
I tend to agree with all of that. :) Regarding #96626: I think it's a great idea. Given the problems we've seen these past few weeks with the growing task and action SOs, this would reduce the likelihood of breaking a customer upgrade. |
Context
As the RAC initiative progresses, we're making decisions about the shape of the Alerts-as-data indices and various Saved Objects.
These decisions are based on what we know now about the needs of Security, O11y and Stack Rules (including ML and Maps), but I think it's safe to assume we're going to get some things wrong, and other things are likely to change (as new requirements come in, and we broaden the scope to include other features as part of 8.x).
As things stand, the cost of changing the shape of data (both in system/hidden indices and SOs) is so high that we tend to think of such changes as prohibitive. This fear has historically made it hard for the Alerting team to make decisions about the correct generic approach, slowing us down.
I'd like to use this issue to discuss how we can change that.
Key Challenges
Below are the key challenges, making the cost of change higher than we'd like it to be:
rule SavedObject has a taskId field that points at a task SO type. We need to pluck the API keys from the rule, and migrate them onto the task that is referenced by the taskId field. We have no way of doing this.

cc @kobelb @spong @tsg @alexh97 @mfinkle @elastic/kibana-core