Method to Synchronize Schema Versions with Git #346
-
Hi @jakeyheath, what if we let you specify a model name or ID when writing the model, and provide a way to query by name?
You'd need to query the model ID based on the name. Would that help?
-
Firstly, being able to grab a model by a name string would be an improvement over what we have now, but without the same functionality on stores you still have some uniqueness in the system that prevents multi-environment workflows. Secondly, it's worth having a quick evaluation of how OpenFGA wants to deal with what is essentially metadata on the objects it stores.
Metadata
Depending on appetite, there are all sorts of approaches OpenFGA can take with metadata. Here are some examples pulled from prior art I've worked with before.
Kubernetes
I'm quite biased as a platform engineer in a Kubernetes context, but the way Kubernetes objects handle metadata appeals here. I'm just thinking out loud on designs and use cases, and on whether they're justified, because any additional complexity has to be paid for.
{
"metadata": {
"name": "77d4c98a51ca75c50d714f10b6fa7c8ee91202ac",
"labels": {
"gitHash": "dee1e34",
"version": "1.3.2"
},
"annotations": {
"gitRepo": "https://github.com/example/repo"
}
},
"type_definitions": [
{ ... }
]
}
The rationale here is that a name is something you can select directly, akin to the query you have. You could then log the annotations to stdout/stderr so that, when a client finishes initialising, there's a log of useful metadata for operations people to track. Additionally, labels could be useful for selecting models.
OCI
Away from the Kubernetes-flavoured examples, we could take inspiration from OCI containers.
{
"tags": [
"1.3.2",
"dee1e34"
],
"type_definitions": [
{ ... }
]
}
In which case you could query on the tag values directly. With immutable tags you'd tie each tag to a model once and only once; with mutable tags you'd have a database table in which you just move the pointer to the model when a tag is reused.
SemVer
You could ditch arbitrary tags for a highly opinionated field using SemVer:
{
"version": "1.3.2",
"type_definitions": [
{ ... }
]
}
The rationale here would be auto-promotion of versions within a clamped semantic version range.
Name
As proposed already above this reply, just having a simple string alias on a model in the database offers a lot of flexibility in what is stored, at the cost of less flexibility in the query capabilities. Since it's a free-form string, you can store SemVer, CalVer, integer versions or git hashes in this field and use it as an alias for change promotion. You'd be unable to do comparisons, ranges or sorting easily, but for selecting a model using a string that can be passed between environments, this would fit the bill.
Note on Idempotency
It might be worth considering storing a hash of the entire model as metadata, so that submitting a model with an identical payload can either issue a soft warning that it's a duplicate, or be rejected with an appropriate error. I can't think of a use case for having multiple identical models available with different model IDs, but this might be by design, so I'm open to hearing any.
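To make the idempotency note concrete, here is a minimal Python sketch of hashing a model payload and using that hash to detect duplicate submissions. It is an illustration under assumptions, not how OpenFGA stores models: `model_fingerprint`, `ModelStore` and `DuplicateModelError` are hypothetical names, and the store is a plain in-memory dictionary.

```python
import hashlib
import json


def model_fingerprint(model: dict) -> str:
    """Return a SHA-256 hex digest of the model's canonical JSON form."""
    # Sorting keys and fixing separators makes the serialisation independent
    # of key order and whitespace in the submitted payload.
    canonical = json.dumps(model, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DuplicateModelError(Exception):
    """Raised when an identical model payload has already been stored."""


class ModelStore:
    """Hypothetical in-memory store; not OpenFGA's storage layer."""

    def __init__(self) -> None:
        self._ids_by_fingerprint = {}

    def write(self, model_id: str, model: dict, strict: bool = False) -> str:
        fingerprint = model_fingerprint(model)
        existing = self._ids_by_fingerprint.get(fingerprint)
        if existing is not None:
            if strict:
                # Hard path: refuse the duplicate submission.
                raise DuplicateModelError(
                    f"identical model already stored as {existing}"
                )
            # Soft path: hand back the ID of the original submission.
            return existing
        self._ids_by_fingerprint[fingerprint] = model_id
        return model_id
```

Note that this treats list order as significant, so two payloads whose `type_definitions` differ only in ordering would hash differently. Whether that should count as "the same model" is a design decision the sketch does not settle.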
-
Rationale
To add some rationale for why this is important to me as a use case, I'll document where this has been most painful as a functionality gap. Let's take a standard environment setup before moving on to the pitfalls and problems:
Problem Case
As with other runtimes (Lambda, ECS, etc.), Kubernetes makes environment variables immutable in the deployment manifests, and changing them triggers a re-deployment of the pod. This is actually ideal behaviour in this scenario, as I'd like a service to restart and re-evaluate the variable when it changes. As we already know, if you publish a model (or create a store) you get a unique identifier back (including on repeated submission of the same model, because it's immutable but not idempotent). So from a CI/CD perspective you have a bit of a problem case with trade-offs.
Multiple OpenFGA Instances
You deploy the same model to multiple OpenFGA instances (and the databases behind them), using a pipeline that pushes to development, staging and production in one go so that a single git hash is the progenitor of the model.
Pros
Cons
Regardless of the pros and cons, the ultimate issue is that you can't "promote" state (and state in this context means the FGA_MODEL_ID) between environments. You end up needing to keep track of the model version in 3 environments, and this model version has no representation in the source to compare against, so if you lose any metadata tying them together you'll be relying on manually pairing a model with its source representation when you need to debug things going wrong. This is especially pertinent when you're doing local testing and want to roll back to an older model to find out what went wrong between a known working state and now.
Note: Since the latest model specs allow you to put comments in the model, you could abuse them to put some metadata into the source code, but then you'd be on the hook for maintaining a system that embeds a git tag into the comments of the model for quick correlation between a model and the source.
Single OpenFGA Instance
On the flip side, you could run a single OpenFGA instance to service your environments, akin to how you'd run a single Flagsmith instance (which itself has an internal data model and representation of environments).
Pros
Cons
Summary
All of this boils down to toil and safety in promoting changes between environments. How we achieve that, be it metadata, a name field, or idempotency (via a model hash field), is less important than tackling the underlying problem: being confident that when an application starts, it starts with the model the developer expected. I've tried both options, and in both the cons were concerning enough to end up at this discussion with these notes.
Wishlist of Preferred Outcomes
What I'd actually like to be able to do:
As a future-gazing wishlist, this might be a big step too far, but here are some thoughts anyway:
And finally, the above are all musings and ideas thrown out there rather than fully fleshed-out feature requests. I'll take a name field on Store and Model objects to resolve the issue in the short term if that expedites things.
-
@aaguiarz I think that would help a lot, yes. It definitely handles my use case. The developers are already querying the OpenFGA server with an authorization_model_id, so it wouldn't be much of a stretch to ask them to query it using the name instead of the ID. This would allow us to name the schema version, associate it with the git branch or commit SHA, and instruct users to always use the git SHA to query for the schema. As an addition to this request, I think it would be nice to have several names associated with a schema version rather than just one. This is similar to the above example of using container tags. That way we could tag with the git SHA, but also with the git branch or another name.
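The multiple-names idea, combined with the immutable versus mutable tag distinction raised earlier in the thread, could be sketched as below. This is only an illustration under assumptions: `TagRegistry` and `TagConflictError` are hypothetical names, the model IDs are placeholders, and nothing here reflects an existing OpenFGA API.

```python
class TagConflictError(Exception):
    """Raised when an immutable tag is reused for a different model."""


class TagRegistry:
    """Hypothetical name-to-model index; not part of the OpenFGA API."""

    def __init__(self, mutable: bool = False) -> None:
        self._mutable = mutable
        self._model_by_tag = {}

    def tag(self, model_id: str, *tags: str) -> None:
        """Point every given tag at model_id."""
        # Validate all tags first so a conflict leaves the registry unchanged.
        if not self._mutable:
            for name in tags:
                current = self._model_by_tag.get(name)
                if current is not None and current != model_id:
                    raise TagConflictError(
                        f"tag {name!r} already points at {current}"
                    )
        for name in tags:
            # With mutable tags this simply moves the pointer.
            self._model_by_tag[name] = model_id

    def resolve(self, name: str) -> str:
        """Return the model ID for a tag; raises KeyError if unknown."""
        return self._model_by_tag[name]

    def tags_for(self, model_id: str) -> list[str]:
        """Return every tag currently pointing at model_id, sorted."""
        return sorted(t for t, m in self._model_by_tag.items() if m == model_id)
```

A pipeline might call `registry.tag(model_id, git_sha, branch_name)` after writing a model, and applications would then call `registry.resolve(git_sha)` at startup. An immutable registry suits git SHAs, while a branch name such as `main` only makes sense with mutable tags, so a real design might need both kinds side by side.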
-
Thanks @danielloader for the context!
-
Was a hell of a morning, the coffee hit just right.
-
We are an organization that has heavily adopted gitops-driven workflows. All our applications use gitops to check in the configuration of their deployed applications. Every application can be directly correlated to a git commit. Ideally, we'd like a way to synchronize our authorization_model_id values with git so that we can use this same development practice to inform applications what version of the schema they are on. If we ever needed to debug a particular schema version, we could go back to that commit and see how the schema was defined.
Right now, authorization_model_id values are randomly generated after the schema is submitted. It would be nice to have a flag or option to set the value of the authorization_model_id (i.e. --set-id <value>). That way we could pass in the git SHA when submitting schema migrations, and we could configure our application to use a git SHA value to select that version of the schema during deployment. We take a very similar approach with Docker build tags.
Here are some solutions we've thought of implementing to work around this:
- authorization_model_id to the git repo
- authorization_model_id to git SHA and magically inject this into applications
- authorization_model_id's (we would love to have a paved road here, so we aren't keen on this solution as it puts a lot of burden on the development teams to do this properly)
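As a rough illustration of the workaround that maps a git SHA to an authorization_model_id and injects the result into applications, the following Python sketch keeps the mapping as a JSON document that could be committed alongside the schema. The function names and the mapping format are assumptions made for the example. `FGA_MODEL_ID` is the variable name used elsewhere in this thread, and the model IDs would come from whatever the server returned when the schema was submitted.

```python
import json


def record_model_id(mapping_json: str, git_sha: str, model_id: str) -> str:
    """Return updated mapping JSON with git_sha pointing at model_id."""
    mapping = json.loads(mapping_json) if mapping_json.strip() else {}
    previous = mapping.get(git_sha)
    if previous is not None and previous != model_id:
        # One commit should correspond to exactly one model.
        raise ValueError(f"{git_sha} is already mapped to {previous}")
    mapping[git_sha] = model_id
    # Stable key order keeps diffs small when the file is committed.
    return json.dumps(mapping, indent=2, sort_keys=True)


def deployment_env(mapping_json: str, git_sha: str) -> dict:
    """Build the environment variables an application would start with."""
    mapping = json.loads(mapping_json)
    if git_sha not in mapping:
        raise KeyError(f"no authorization model recorded for commit {git_sha}")
    return {"FGA_MODEL_ID": mapping[git_sha]}
```

This still leaves each team maintaining the mapping, which is the burden described above. A server-side name or tag on the model would make the lookup file unnecessary.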