Storing and querying CUE in database #1435
Comments
Internally there is a definition of data-only CUE, either fully expanded (data) or allowing references only (graph). There is no file extension for these, but you can invoke them by specifying output or input types. For instance,
is really like Similarly
checks that the input has no references or expressions, while
checks that the input has no expressions other than literals and references. How is this different from what you want? Can you give examples of how this is different? |
@mpvl I see I was not clear enough. I have updated the problem statement:
We want to store CUE expressions alongside literals, so that when the data is retrieved it can be decoded back to CUE and evaluated. A JSONSchema encoder would allow for the encoding of a subset of expressions; the format I am proposing would allow for encoding all CUE expressions. For example:
name: "example"
author: "codewithcheese"
uri: "\(envVars.HOSTNAME):\(envVars.PORT)"
envVars: {
HOSTNAME: string
PORT: number
CONSTANT: 1
}
Would be encoded as follows, which can be stored in a JSON field and queried for any literal like
name: "example"
author: "codewithcheese"
envVars: {
CONSTANT: 1
$$cue: "HOSTNAME: string\nPORT: number"
}
$$cue: "uri: \"\\(envVars.HOSTNAME):\\(envVars.PORT)\""
The result could be decoded and unified with other results that specify values for JSONSchema encoder could encode See |
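To make the proposed encoding concrete, here is a minimal sketch of the decoding direction only, written in Python rather than against the CUE Go API. It assumes the `$$cue` property convention from the example above; the `decode` function and its handling of nesting are illustrative assumptions, not an existing tool. It relies on the fact that JSON values are valid CUE literals, so concrete fields are emitted as JSON and the stored syntax strings are spliced in verbatim.

```python
import json

CUE_KEY = "$$cue"  # property name taken from the example in this thread


def decode(obj, indent=0):
    """Render an encoded object back to CUE source text (illustrative only)."""
    pad = "    " * indent
    lines = []
    for key, value in obj.items():
        if key == CUE_KEY:
            # Stored CUE syntax is spliced in verbatim, one output line per source line.
            lines.extend(pad + line for line in value.split("\n"))
        elif isinstance(value, dict):
            # Nested structs are decoded recursively so their own $$cue is restored.
            lines.append(f"{pad}{json.dumps(key)}: {{")
            lines.append(decode(value, indent + 1))
            lines.append(pad + "}")
        else:
            # JSON scalars and lists are emitted as-is; they are valid CUE literals.
            lines.append(f"{pad}{json.dumps(key)}: {json.dumps(value)}")
    return "\n".join(lines)


encoded = {
    "name": "example",
    "author": "codewithcheese",
    "envVars": {"CONSTANT": 1, "$$cue": "HOSTNAME: string\nPORT: number"},
    "$$cue": 'uri: "\\(envVars.HOSTNAME):\\(envVars.PORT)"',
}

cue_source = decode(encoded)
print(cue_source)
```

The sketch does not look for `$$cue` inside lists, and the encoding direction (splitting a CUE file into literals and expression strings) would still need a real CUE parser.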
@codewithcheese what about storing the concrete values which can be queried for in a jsonb column and storing the original CUE source in a second, text column? |
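A rough sketch of the two-column idea, assuming SQLite's JSON functions as a stand-in for a Postgres jsonb column. The table name `contracts`, the column names, and the hand-written `concrete` values are hypothetical; in practice the concrete values would be derived from the CUE source rather than typed in separately.

```python
import json
import sqlite3

# Original CUE source, kept untouched in a text column.
cue_source = """name: "example"
author: "codewithcheese"
uri: "\\(envVars.HOSTNAME):\\(envVars.PORT)"
envVars: {
    HOSTNAME: string
    PORT: number
    CONSTANT: 1
}"""

# Concrete values only, kept in a JSON column for querying (hand-written here).
concrete = {"name": "example", "author": "codewithcheese", "envVars": {"CONSTANT": 1}}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts (id INTEGER PRIMARY KEY, concrete TEXT, cue_source TEXT)"
)
conn.execute(
    "INSERT INTO contracts (concrete, cue_source) VALUES (?, ?)",
    (json.dumps(concrete), cue_source),
)

# Query on a literal, then return the CUE source for later evaluation.
rows = conn.execute(
    "SELECT cue_source FROM contracts WHERE json_extract(concrete, '$.name') = ?",
    ("example",),
).fetchall()
print(rows[0][0])
```

Because both columns are written in the same insert, they start out consistent; an update that touches only `concrete` would have to be guarded against separately.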
@verdverm that's an option. I think that's a more complex solution than using an encoding. Both options (encoding or separate field) require a change to our data handling; however, a separate field also requires a migration. It would also mean the concrete values could get out of sync with the original source if an update was applied to one of the concrete values, so it might require some extra logic. We're not likely to apply updates to the concrete values alone, but it's a potential pitfall down the track. |
The use case is still mostly unclear to me, but I see what you are proposing. The semantics of this seem somewhat peculiar and specific to a certain job. There are also many aspects to consider, which would make this a massive design undertaking, as far as I can tell. Instead, it seems that users can fairly easily write something themselves that converts an evaluated CUE value to such a format, tailored to their purpose. Alternatively, one could just store CUE as is, and have a small tool to strip any expressions from the AST before evaluation (see ast/astutil). This would just be a few lines of code and be a cleaner alternative, IMO. So unless someone comes up with a good argument that this is generally useful, with good justifications for why a design should be exactly as proposed, and comes up with an easy set of principles from which a design flows naturally, this does not seem to be something that belongs in core. |
@mpvl I think there is still a misunderstanding, or more likely there is something about CUE I am totally misunderstanding. I am keen to continue the discussion to get on the same page if that's ok. I am fine for this not to be included in core; we can use it as a separate tool (if it still makes sense). My intention with the PR is to see if it is something that is generally useful.
As far as I can see, attributes are ignored when exporting to JSON. The purpose here is the very opposite: to maintain the expressions, as syntax strings, in JSON. That way they can be decoded and evaluated later.
As far as I understand, CUE is not "code as data", unlike a Lisp for example. How would you store a CUE file containing literals and expressions "as is" in a database so that the literal values may be queried? |
The answer for CUE is probably similar to other unsupported formats like Yaml or Dhall.
|
@verdverm seems you understand what I am getting at, what do you consider to be the pros and cons of encoding to JSON with embedded syntax strings vs 2 column solution? |
@codewithcheese the embedded idea seems arbitrary and hacky imho. I generally agree with @mpvl that something like this would require a lot more fleshing out.
I would think there might be something around doing this more generally within a CUE value and then being able to support this in both Yaml and JSON. Still, this would seem better as a library than in core. |
@mpvl it seemed there may have been some misunderstanding of the design I proposed; I tried to clarify a few messages above. Is there anything else you would like to add before we close this issue? |
At balena.io we make extensive use of manifest-type files we call contracts. They are a mix of data and schema, and we have an internal database for storing and querying contracts. Contracts are stored in JSONB fields and schema is defined using JSONSchema. I opened a discussion about this, see #1250.
We are keen to adopt CUE for contracts for a few reasons. Mixing data and schema with CUE is more natural and much less verbose than JSONSchema. CUE has some nice ergonomic improvements over plain JSON (see #130), like no requirement to quote property names and inline objects with a single property. Also, contracts can be evaluated together to validate compatibility, solve constraints and generate new output formats, and CUE has some nice properties for evaluation vs JavaScript with JSON/JSONSchema.
As far as I know there is no way to encode and store complex CUE files that include expressions and literals as data, so that concrete values can be queried and the resulting files can be decoded back to CUE and their expressions evaluated.
CUE to JSONSchema is an option, but it would limit encoding and decoding to what can be expressed as JSONSchema.
I propose an (en|de)coder that can convert any arbitrarily complex CUE file to a plain object format, with concrete data as literals and CUE expressions serialized to strings and appended as special properties for later decoding. Internally I am calling this format CUEdata but I am totally open to any name, such as those mentioned in #130, QSON or CUEL.
We do still have a need for JSONSchema: contracts can also include queries that are defined using a schema, and we have a JSONSchema to SQL compiler. I propose that an attribute could be used to control how a struct is encoded. When a format attribute is specified, CUE expressions are encoded to that format instead of being serialized to a syntax string.
cc @myitcv @mpvl