-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discuss: How to migrate a saved object that contains state from another plugin #56166
Comments
cc @peterschretlen This refers to the issue we talked about earlier with migrations in embedded visualizations |
Pinging @elastic/kibana-platform (Team:Platform) |
In NP we want to limit migrations to the "owner" of the type. If the embeddables plugin registers the embeddable type we don't want another plugin (elastic or third party) to be able to migrate it's data. I think it's important to maintain this ownership aspect. So if embeddables wants to allow other plugins to migrate the data it stores on their behalf, it should explicitly allow this and might want to add certain restrictions or safe guards. The way I see this implemented is for the embeddables plugin to expose a export class EmbeddablesPlugin {
constructor() {}
public setup(core: CoreSetup) {
return {
registerEmbeddableFactory(
factoryId: string,
factory: any,
migration: (input: Embeddable) => Embeddable
) {
/* do other embeddable stuff */
const wrappedMigration = safelyWrapMigration(factoryId, migration);
// This API isn't implemented yet...
core.savedObjects.registerMigration('embeddables', wrappedMigration);
},
};
}
}
function safelyWrapMigration(factoryId, migrationFn) {
return (inputObject: Embeddable) => {
if (inputObject.factoryId === factoryId) {
return migrationFn(inputObject);
} else {
return inputObject; // This migration wasn't allowed to alter this data, ignore and leave data unaltered.
}
}
} |
The suggestion above only solves for the case where the state migration is applied to a single type. If as in the example of saved search strategies the state lives in multiple saved objects owned by different plugins, each of those plugins will be responsible for registering a migration. Like other breaking API changes we will have to coordinate such changes with teams and external plugin authors. |
Discussed this with @kobelb, as we do this with security currently. They manage the "security" type, but it's usually on an object owned by another type. Here are my thoughts: Since a dashboard would have multiple other types, like visualizations, we could extend that support to allow for an array of objects of another type.
Here I am just piggybacking off of the references array, but I don't think that would be necessary as we could skip references and use something other than
Thoughts? |
I like this references approach. My initial reaction to this issue is this seems more like a data schema design problem rather than an issue with the migration mechanism. I think if a plugin needs to have any persisted state, it must be responsible for that state OR the plugin it depends on needs to provide a suite of tools for the state it stores on other plugins' behalf. However, I think even that solution has problems. The overarching design I have for plugin integrations is that all integration points between plugins should be via plugin contracts (setup and start), always in JavaScript. What I mean by that, is that plugins should not have any implicit APIs (eg. HTTP routes, shape of SavedObject state, global functions on However, when we constrain all integrations points to be only via setup and start contracts, we get an opportunity to detect, prevent, and handle breaking changes. Examples:
If we were to design Embeddables' state loading using plugin APIs, we could allow plugins to manage their own state and decide whether or not they need/want to write migrations. What this might mean in practice for embeddable factories, is that they would register:
When a dashboard is loaded, it would look at its list of panels, and instruct the embeddables plugin to load the state for each panel. The embeddables plugin would invoke the registered state retrieval function (2) for each embeddable type. When the dashboard is saved, the embeddables plugin would invoke the registered state persistence function (3). In the case of the URL service, this could mean that the URL service persists a reference to an object, which the consuming plugin would register a function for loading that object and generating a URL for it. That object could be anything (raw string, bag of options, array, etc.) and is totally up to the plugin that is being linked to. |
@tylersmalley - using your approach how would you handle state from actions stored in embeddables, inside dashboards? Who is applying the migration in your references approach? Is is the dashboard applying a migration written by the embeddable authors or are embeddable authors registering migrations on a dashboard? If the latter, then the author of a plugin needs to know every saved object it might appear in and I don't think we can make that assumption. I think what @joshdover suggests is what I am thinking would work - registry items which might end up in persisted storage would have to expose methods for serializing/deserializing state, and those methods should handle legacy state conversions. I just don't see how this ties into the reference approach.
This is what happens now, via I'm not sure I follow the URL comment: "the URL service persists a reference to an object". The URL service registers a URL saved object which the background search collection would point to? The difference between Josh's approach and my thinking is that it's not obvious to a plugin author that they have to handle legacy state migrations so formalizing this system might help, since this will be an issue when any registry item ends up serialized in some other saved object. I'll look to throw some time on the calendar because I'm curious to walk through your thoughts a little more. |
I think my explanation could benefit a lot from a concrete example of what I mean. I'll work on that before we meet. That said, I think the biggest distinction between what I'm proposing and what you are proposing is that I don't think the data should be persisted in the Dashboard at all. I think only an ID referencing a SO should be saved in the Dashboard and the plugin that owns the Embeddable type should handle fetching and persisting that object. With that separation, the plugin that owns the Embeddable type can then also handle migrating its state independently from the Embeddables plugin. If the Embeddables API needs to change, it should only be changing in JavaScript, which the plugin that owns the Embeddable type can then choose how to handle that API change. |
We talked about this briefly in the Area lead sync meeting the other day... continuing with the reference model has its own set of challenges which is why we were starting to go down the path of the nested "by value" model for dashboards. Not to say that we can't present an argument about why we should stop and change our minds, but we have to think about why we wanted to go with the nested/by value approach anyway. One was permission/security related - a user has access to a dashboard and everything on it - no conflict with permissions from other saved object types. The other reason is management and "library" items. A user adds an embeddable to a dashboard, this creates a saved object, but then the user never saves that dashboard - how is that embeddable deleted and cleaned up? If we tried not to create the referenced saved objects until the dashboard itself was saved, even then, what happens if the dashboard is deleted but not the references? What happens if the dashboard gets created but an error occurs with the referenced items? ES doesn't have rollback/transactional support. Maybe the plan would be to have the SO reference model handle all of that? |
To briefly elaborate upon this, references mean something "special" in the context of security especially with the introduction of sharing saved-objects in multiple spaces. The use of references currently also has an impact on end-users with regard to security. If a user only has access to Dashboards, they're unable to create any new Visualizations/Maps, etc. which end up being embedded in the Dashboards. While it's possible for us to continue to use separate Elasticsearch documents to model an "embedded" saved-object, it'll require that we differentiate between a "reference" and an "embedded reference". |
To me this seems like the cleanest option we have discussed so far in terms of making the ownership of portions of state explicit, and making it possible to run state migrations without the containers needing to track the contents of children, for example. If there's a single source of truth for each of the various types of state that need to be persisted, a lot of the other complexities go away (but admittedly, not all of them)
I'll be interested in discussing these more when we look at Josh's example. I think some of these challenges are simply trade-offs... a choice between:
So far, I have a hard time thinking of a third scenario where we aren't dealing with at least one of these two problems, but I would be happy to be proven wrong 😄 |
I think I would say this is Couple more comments:
|
I think going with an embedded reference model would also leave us in a situation with a lot of saved object types. For embeddables, this might feel more natural because the embeddables we have are all backed by saved objects, but what about expression fns? Right now they are stored in saved objects as a string, e.g.: I think what you are proposing is to no longer store just this string in a saved object but a reference to a saved object that represents each function. This would mean a new saved object type for every expression function type. That is a lot of new saved object types. This would also mean pretty awkward storage and parsing to/from expression reference model to expression string. I just want to make it obvious that the problem is not just with embeddables, but any extensible registry where state used to re-create an instance of the registry item ends up stored in a saved object:
This hasn't been a problem so far because:
This will become more of a problem as:
It becomes a communication challenge. The maps team needs to know that state to create their embeddable is stored in an expression function in canvas workpads. They want to change interfaces in a major version, they need to be aware of this persisted state and they need to either migrate it, or to continue to support legacy state. How is this tested and who handles the state migration (should canvas or maps team own the maps embeddable expression function?). Going with a saved object embedded reference model, I suspect, would only get us so far, and introduce a lot of complexity (e.g. a new saved object type for every registry implementation that might end up in a saved object). |
Posting summary of today's meeting, from my perspective. This boils down to two main problems:
We decided that 1. is more important to solve in the short term, while 2. can wait, potentially indefinitely. We talked about how it's important to formalize something so that every developer adding a specific implementation to a registry item that might be serialized and persisted knows they should handle legacy state migrations. Something like a We also mentioned how important it is to keep type history around for legacy types, as long as they are supported, or deprecated. For this reason, I think it might make sense for this system to be something like:
We could use the id as the versioning information, which I think would make typescript easier. Something like:
^^ That way every time you have an id you have the shape that it maps to. I suspect this might be easier to incorporate into our registry system as well since every registry already uses an id and state - no need to introduce a third If that didn't work out well though, we could probably come up with a system that was also closer to:
You'd just then need to store the shapes per id, per version. Regardless, we agreed it's important to keep type information around. |
Overall this aligns with my understanding of the discussion as well.
I don't want to tangent too much here, but technically for expressions this problem could still be solved with on-the-fly migrations using the interpreter's built-in types registry, which has the concept of type casting. You write a custom expression function, you say the types that it accepts for each arg, and each of the types can have built-in logic to handle casting to/from other types for you. Currently there's a plan to add a type converter registry so that built-in types can be extended by others, which could then be used for migrations such as these. Still doesn't solve the problem of an expression that's persisted in a SO with legacy data; I'm only pointing out that we can probably still solve this particular use case at runtime as a first step. |
Not to further derail, but I'm pretty sure the current cast types system in expressions could not handle this. For one, the type is just a string, and the shape depends on the value of Even if the author could type something like: args: {
filterConfig: [GeoFilterConfig, TermsFilterConfig, RangeFilterConfig]
} It can't list every option because it's extensible. The author of a custom filter may know nothing about expressions, so it might not know it should add a type casting for it. Though I do think something like the generic type casting system canvas wrote is what we are after. I just don't think the current version is good enough. |
Walked @timroes through the summary of the meeting this am and want to clarify a few confusing points here with some pseudocode that I think such a system could work like:
We could provide a helper function for these migrations, which would recursively loop through them all to ensure your instance was of the latest version:
Typescript types could do something similar to what search strategies does and what core does, with mappings of ID to shape:
The latest shape can always stay the same, when you write a migration, you rename the old state. This will avoid us having to rename types everywhere and make it easier to read. Plugins can extend this map:
|
Finally got around to putting some code together to demonstrate what I was talking about -- I have a feeling I am not understanding your use case correctly, but hopefully this helps explain my thought process.
The idea for how we handle this in TS all makes sense to me, but I think I'm not 100% following what |
Okay, so I think in your example code, it would work. but it would be built on top of a base migration system. Like you pointed out in your POC - it requires the type to not be I have code in #57496 which shows how I would build a generic system, though I just built it into the URL service. I will try to write a follow up PR for how I would make it work generically. |
The last bit of work to enable this is described in #84907 and likely to be merged soon so I think we can close this discussion. |
Our current migration system assumes that the developer who created and registered a new saved object type is in charge of migrating any outdated state in that saved object. We haven't thought about the situation where a saved object contains state that was built by other, potentially unknown, developers. With many of our new features however, this will become an issue.
Consider:
URLs
The migration issue with URLs is heavily discussed in that above link, but I'll recap the situation here:
Embeddables & actions
I recently realized that once we have embeddables that are "by value" instead of by reference, we will hit the same situation. The developer of dashboard saved objects is in charge of migrating them, but this developer doesn't know how to migrate the panels inside it, because those are extensible and owned by other developers.
There is actually a third layer here as well... actions. Actions can be attached to certain embeddables and will also be stored together with the embeddable input inside the dashboard. How will the author of an action update the state referenced in embeddables, inside dashboards.
Canvas workpads as well - we want to be able to update embeddables, but how do we migrate state from inside canvas workpads.
For URLs, the thought was to have each implementation expose a
migrate
method that the author of the saved objects can call. For Embeddables, this is a tougher problem and there are more considerations:isLegacy
.explicitInput
, and pass other state into the child via its own state. If we reconstruct the embeddable child, we can't just store the new input state fromembeddable.migrate().input
- how do we split the state apart again to store the correct portion inexplicitInput
? What happens if one child renames its input state fromfilters
to something else? Either the container, and every other child, would need to rename its input state to match, or we don't migrate and that child will no longer receive the correct data.Expressions
The expression string is stored in saved objects (Lens, Canvas workpads), but each expression function is extensible. How can a single developer of an expression function update the args or output shape, or expected input context without potentially breaking all those saved objects which reference it?
Even if we came up with a migration plan, we can still hit the issue where if the output or input shape changed, the previous, or following expression function may fail because it didn't update.
Expression system has a "cast types" system which we might be able to use.
Search strategies
What happens if we want to migrate the state a search strategy accepts? What if we need to migrate an expression function to account for that? For instance, hypothetical, SQL search strategy no longer supports some given syntax but this syntax is saved in an essql expression string? We have to potentially migrate canvas workpads, lens objects, dashboards (with in place embeddables). We also don't know for sure where else a plugin developer may be storing outdated state.
Any extension point / registry item
Essentially this boils down to having a generic migration system for anything stored in a registry. We need a way for third party developers to add migrations for state, whether that state is stored as a saved object, in a saved object, or just dynamically in code (consider the case of an embeddable hard coded into an application - how do we warn the user of the deprecation period before we end support completely and they need to migrate by).
Also keep in mind, this can go the other way. Any registry item may want to migrate its state structure, but this can have cascading affects, both up the chain and down. What if a childEmbeddable wants to change it' explicitInput from
{ name }
to{ firstName; lastName }
. Because childEmbeddable state is stored up the chain (in a container, and that container's state stored somewhere, like a dashboard saved object, or hard coded in the SIEM overview page), those objects need to update the state. On the other hand migrations might float downward, for instance if dashboard embeddable wants to change its input state to accept{ filters }
instead of{ timeRange, filters, query }
. All the children embeddable need to update to accept the new version or they will eventually be broken.Another example - think about how much state needs to be migrated if we want to change our Filters shape. Many potential saved objects and also a lot of code. How do we keep all of these in sync so we can support older saved objects that contain references to older versions?
I don't have a good solution to offer yet, though I think this might end up being an extremely generic state migration system that Saved Object migrations could be built on top of. It would be on the fly - only the saved object migration system would touch the Kibana database.
How important is this to tackle now vs later?
Well, we lived without saved object migrations for many years. We also already have lived with this problem with the expression language for awhile. It is going to be a problem when we want to have a stable API for third party developers, but in the short term we can work around this by keeping all teams in sync (e.g. we want to change the shape of Filters, we know all our saved objects and run time places that we need to migrate: expressions, actions, embeddables, etc).
I propose we focus on this in 8.x and I think the platform team should be involved, even if app arch ends up owning this, because of the overlap with saved object migrations.
I think we can continue to move forward with a direct access link service without being blocked on this, and I think we could probably continue to move forward with in place embeddables on a dashboard, even though knowing we may need to make some drastic migrations in order to support a new more robust system. This is by no means a small project though so I doubt there will be any 7.x resources for it.
cc @joshdover @rudolf @timroes @majagrubic
The text was updated successfully, but these errors were encountered: