-
-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add recipe validation #70
Conversation
I'm still reading but I wanted to say thank you right now for the amount of details and the care in explaining them. |
First of all, thank you for this extensive and detailed PR. One thing:
I mean Go is a type safe language so there is some validation there already. But you're right that the input validation is very limited here and doesn't guide the user anywhere. We're currently in preperation of the realease of Vanilla OS so I will take a closer look at this next week. |
One question tho: (I mean I'm guessing yes because of this: |
Hello @taukakao
Yes, the reason I am providing CUE recipes is because the playground does not support YAML input. The
Definitely, this is already a massive win over using something like Python but still the The idea is to offload the validation to the schema so that Go can safely assume the input is a valid recipe, but some safeguards will still be needed until the schema is thoroughly tested.
No problem at all. I wise you all good luck and I hope everything goes smoothly. Also, congratulations on the big release 🎉 |
Since this is hard-coding the module types and schemas, will recipes using custom modules fail validation? |
Hello @xynydev
Yes, the schema right now only covers the built-in modules. The schema can of course be extended to cover any use case. I am already mentioning that I found some module types not documented up on Atlas such as the After browsing the code, the I tried using the docs provided but seems like the The same applies for the Go code modules ( This PR is still a work in progress, I have already found some areas that can be improved. I will keep on updating it as new issues are brought forward, so please keep the discussion going and point out any issues. |
We had planned to document the process of creating custom plugins but then the implementation changed and unfortunately we took too long. |
GOAT PR so far! 💪🏼 |
@lambdaclan I think either the custom modules need to pass their CUE recipe over a standardized API call or they have to validate themselves. |
I'm developing the schema validation for BlueBuild (using TypeSpec under the hood), and solved the problem by using a generic CustomModule model when the module type is not something we have schemas for that accepts any parameters. This allows us to implement recipe validation that will not fail when unknown modules are used. (the build process will fail at a slightly if the module configured doesn't exist) The docstrings |
No problem. I am sure we will figure out how to handle custom modules in the end. After that, it might be a good opportunity to also write the documentation for them.
It is not a matter of limiting how custom modules look like but more about standardizing them. I do not think is a bad idea to provide some sort of guideline of how custom modules should be. For example, they should all have a name, their type should be
There are definitely many ways to handle them. I do not want to make any suggestions based on assumptions so once I have some idea of how custom modules work I should be able to come up with some potential solutions that we can then discuss before deciding. What I really want is some recipes using custom modules. If anyone has any examples please share them 🙏
Very cool. Thank you for the input. CUE also supports the |
I think standardizing them limits them. It would make them second class modules, while they should have the same possibilities and freedoms that official modules have. For now, the only implementation of a custom module that I can think of is https://github.com/Vanilla-OS/vib-fsguard |
Hello Tau. I definitely do not want to do anything that will limit the functionality of custom modules. On the contrary what I am aiming to do is to make them easier to author like with the actual recipe and built-in modules. Having a linter validating your recipes as you write them provides some sort of safety net before you even try to build your custom image. It will also make it easier to detect and fix any issues. This can be further expanded once the CUE LSP is ready. It is still under development, but hopefully it will happen sooner than later. This will essentially allow us to write recipes while having the dynamic checks as shown above in the playground from within our IDEs.
OK this is great! Thank you for sharing it with me. This should be enough to at least start some brainstorming on how to approach custom modules. Once I have some suggestions I will be back to discuss in more detail. |
62a03b5
to
55fe41f
Compare
Updates:includes-module.webmid: "my-image-id"
name: "my-image"
stages: [
{
id: "build"
base: "node:lts"
modules: [
{
name: "example-includes"
type: "includes"
includes: [
"modules/00-vanilla-desktop.yml"
]
}
]
runs: {
workdir: "/app"
commands: [
"npm run build"
]
}
}
] id: my-image-id
name: my-image
stages:
- id: build
base: node:lts
modules:
- name: example-includes
type: includes
includes:
- modules/00-vanilla-desktop.yml
runs:
workdir: /app
commands:
- npm run build
copy-from-checker-improved.webmid: "my-image-id"
name: "my-image"
stages: [
{
id: "build"
base: "node:lts"
copy: [
{
workdir: "/app"
srcdst: {
"/app/example.txt": "."
}
}
]
},
{
id: "start"
base: "node:lts"
copy: [
{
from: "build"
workdir: "/app"
srcdst: {
"/app/example.txt": "."
}
}
]
},
{
id: "deploy"
base: "node:lts"
copy: [
{
from: "start"
workdir: "/app"
srcdst: {
"/app/example.txt": "."
}
}
]
}
] id: my-image-id
name: my-image
stages:
- id: build
base: node:lts
copy:
- workdir: /app
srcdst:
/app/example.txt: .
- id: start
base: node:lts
copy:
- from: build
workdir: /app
srcdst:
/app/example.txt: .
- id: deploy
base: node:lts
copy:
- from: start
workdir: /app
srcdst:
/app/example.txt: . |
QuicktipYou can easilly convert your Vib YAML recipes into cue using the CLI. This will allow you to test things out in the playground if needed. test-validation.ymlname: Chronos Frontend Image
id: chronos-fe
stages:
- id: build
base: node:lts
singlelayer: false
labels:
maintainer: Vanilla OS Contributors
adds:
- workdir: /tmp/user/files
srcdst:
go.mod: .
- workdir: /tmp/otheruser/files
srcdst:
go.mod: .
go.sum: .
cmd:
workdir: /app2
exec:
- python
- prog.py
entrypoint:
workdir: /app
exec:
- npm
- run
- build
runs:
workdir: /etc/apt/apt.conf.d
commands:
- echo 'APT::Install-Recommends "1";' > 01norecommends
expose:
"6090": "6090"
modules:
- name: build-app
type: shell
workdir: /app
source:
type: git
url: https://github.com/Vanilla-OS/chronos-frontend
branch: main
commit: latest
commands:
- mv /sources/build-app .
- npm install
- npm run build
|
apples pkl has built in validation system and can be translated to yaml, would it maybe be easier to instead switch vib to use pkl instead of yaml for the recipes? Could reduce maintenance burden somewhat while having more features |
Hello there, thank you very much for contributing to the discussion! I am not sure if you caught up with the whole description of the PR (I know it is too long) but my current proposal is to use CUE for validation. There are of course other alternatives like Pkl that I do mention could also be considered. My current solution completely covers the recipe validation without the need of switching the recipe format meaning that we can just keep on using YAML and get all the benefits of validation though the schema. Now, I also mention in the PR that if the team is happy with the overall checks and feel that YAML is just too finicky we could switch the recipe format to a different language, a config language that is. As you can expect since I spent all this time working on this, my suggestion would be to use CUE as the new format but only if necessary. For now, until things are thoroughly tested, I suggest that we stick with YAML and open the topic of the recipe language at a later stage. The current PR introduces the lint command, but there are still no checks in the pre-build phase or in any automations like CI etc. Getting those implemented first (once the schema is tested and approved) will provide immediate benefits to all the Vib users. Also, there is no point to do all this work only for the lint command, we need to enable the same checks on build as soon as possible. Switching the recipe language will be quite a big change and even if the team decides to do it some sort of grace period where both formats can be used is possibly needed (thankfully CUE provides a CLI that can convert YAML to CUE so huge win right there - see comment above). I am definitely up for switching the language but when the time is right. For example the CUE language server (under active development) could allow users to dynamically lint their recipes as they write them which would be a really nice experience (similar to the playground demos). I am keeping this PR as draft for two reasons:
I would appreciate it if I can at least get some feedback regarding point 2 because If this is not needed or a different approach is more desirable then no point in spending even more time on this 🫠 Now regarding CUE vs Pkl. By no means I am no config language expert (I began learning CUE for this very PR) but unless Pkl provides some substantial benefits over CUE I do not see why we should switch to Pkl. I did consider Pkl amongst other languages, but CUE seemed to be the most mature one. Also CUE has first class and extensive support for Go lang (in fact CUE is inspired by Go hence the absence of things like classes etc), great tooling (CLI, playground, upcoming Language Server etc) and most importantly allow us to validate the recipe without any issues. When I was creating the schema I never felt CUE is too limiting or something.
Could the same validation be done with Pkl? Definitely and many other config languages for that matter. This is highly subjective and at end of the day the real answer is possibly "it does not really matter". Modern config languages like CUE, Pkl, Dhall etc offer big improvements and benefits over using something like YAML, TOML or JSON. Someone needs to make the final decision but for obvious reasons my vote goes to CUE. Thank you all and keep the discussion going! References:
|
I'm not sure what the point of repeating all this information is, I am aware that your current implementation uses cue, my proposal is to use pkl instead. |
I think it would be best to list the benefits of using CUE here compared to pkl. I'm sure there are some. Then we can better discuss both ideas. |
Apologies for going overboard with the explanations.
The only reason both YAML and CUE are needed is that the scope of this PR is to add validation to the current recipe format which is YAML. The schema is written in CUE and can be used to directly validate the Vib YAML recipe. If we were to assume that the Vib recipes will also be written in CUE then like Pkl only one language will be needed.
Yes, of course I am aware of that. It just seems to be a bit OOP oriented but nothing wrong with that.
CUE also has support for packages and modules.
CUE can also have remote repositories for modules.
To be honest I would not say one language is better than the other one. They are both great, they are both trying to solve the same problems albeit a bit differently due to differences in philosophy. I do not feel either language provides groundbreaking features that set it apart. Both can get the job done. Generally speaking, using a config language is infinitely better than using a static config format. Also keep in mind CUE and Pkl are not the only great options out there 😊
I will not do any more work on this until we reach a final decision. If you decide to proceed with CUE I will carry on and adjust as needed after the feedback. If you decide to go with Pkl or any other solution we can just close this PR and move on. Even if CUE is not selected I will be more than happy to help If no one is willing to take the task or support is required. |
which would make it better fit for this as the vib recipes are also based on objects and classes
from what I'm reading this seems to require a lot of extra work than pkls module support, since cue requires some registry thing and all that
that is true, but one would be better fit than the other one for this job
I'd rather switch to a different config language altogether which would allow validation than having to settle with a two language solution, since my general plan with vib is to move it to a better fit config language. If that language ends up being CUE then we can just "repurpose" this pr to switch vib to cue. |
Do we have any updates on the state of this PR? Conflicting files are already popping up, so please let me know how you would like to continue going forwards when you get a chance 🙏 I am assuming since it is holiday season some people might be away, if that is the case no worries. Thank you! |
I still find it better suited to directly move to pkl here |
After reading the entire discussion, I lean towards PKL. If PKL already supports features that would require re-implementation in CUE, it makes sense to consider it now. Anyway, it's crucial to maintain YAML as the default format, as it's what keeps Vib simple. Advanced configuration languages like PKL can be optional for those who need them. |
No problem at all. Better make the hard decisions early on in the process.
I guess this means that we should have a schema to do the linting via PKL instead of CUE while keeping the standard format as YAML. We could make it possible to accept either format (YAML or PKL) for the recipes but just assume the default is YAML. @axtloss Just to clarify, are you going to take over this task? I will be happy to adjust the work done in this PR to use PKL instead of CUE (there bound to be some similarities) to implement the lint command as a first step. We can then proceed to add support for PKL based recipes along with any other requirements. Maybe we should create a new issue/task list to keep track of things. For now just let me know if you would like me to help out with this! Thank you both. |
sorry for not responding to this earlier |
No worries, I myself got busy with work lately.
Sounds good, I will reply on the ticket so if you want you can assign me to it. I will start working on it as soon as possible and open a PR early so that we can discuss further. |
Add recipe file validation
vib test
command tovib lint
. For example, any scripts or CI automations usingvib test
will need to be updated accordingly.Description
This PR adds a new
lint
command for validating Vib recipe files.Why?
The
lint
command offers extensive validation that covers the following:Structural validation: This ensures that recipe files contain only valid keys. This covers root level keys such as the recipe
id
as well as nested keys such as stageid
stage module keys and so on. The relevant recipe, stage and source fields are fully checked. The schema used during the validation process is basically a representation of the data defined in the structs.go module. This means that all validated recipes will always generate a valid container file (buildable by Vib) since the structure defines both required and optional fields.Content validation: This ensures that the previously validated keys contain valid values. Depending on how strict we want the validation to be this can be further extended but for now it mostly ensures that string content is neither null (
id: null
/id: ~
/id:
) or empty (id: ""
). As a proof of concept some other niceties (semantic validation) have also been added:expose:{} / expose:~
).The validation is handled by a schema file written in CUE (cuelang). According to the official documentation:
CUE excels in data validation. It has native support for validating YAML and even native Go packages therefore it can easily be used with Vib recipes and Vib's codebase. CUE enables us to define a schema file which is something similar to YAML schemas but way more versatile and easier to write. Ordinary YAML schema is actually an extension of JSON schema and although it can be used for validation purposes after experimenting with it, I found CUE to be more suitable both in terms of usability and flexibility. CUE is a superset of JSON so it can also work directly with JSON schemas if for some reason that approach is more desirable.
What CUE can do for us is to enable some form of a static type system for the Vib recipe files and also verify if the data is semantically correct. Pairing that with the great tooling and Go integration I feel that CUE is good match for our use case.
Is CUE the best/most suitable solution? This is subjective, so I cannot answer, but you can see what CUE can do in the examples provided in this PR. Plenty of references have been provided in terms of alternative approaches, so please feel free to discuss.
Regardless, I feel that using some sort of schema based approach (whatever that might be) is more effective than parsing the recipe YAML file and doing the checks in a manual fashion (validation logic boilerplate reduction). Keeping the data representation separate from the code base will enable validation checker updates without the need to write any Go code. Furthermore, CUE as a data language is easier to reason with compared to Go therefore non programmers will also be able to update the schema if needed. The CUE playground (showcased in all the samples below) is a great way of testing the validation schema against Vib recipes as it provides real time output enabling experimentation. Alternatively the cue CLI can be used for local testing.
Proposed Changes
test
command into the newlint
command which offers extensive recipe validation.CUE Playground vs vib lint
All the examples listed below are reproducible and can be tested in the CUE playground (links provided). The CUE playground currently does not support YAML input therefore Vib recipes are provided in CUE format and if the validation checks succeed the output is converted into YAML. Until you have a better understanding of CUE while in the playground please refrain from changing the schema definition. The recipe can be adjusted on the "INPUT RECIPE BELOW" section (see reference below).
The
vib lint
command is taking the raw recipe YAML file as input and checking it against the schema. The schema content is the same one used in the playground links (the newly added recipe.cue file).Examples
Each example provides:
1. Generating the minimal valid recipe
Kooha-2024-07-26-17-20-39.webm
CUE recipe
YAML recipe
vib lint
2. Unique recipe id and name
CUE recipe
YAML recipe
vib lint
3. No null values allowed
CUE recipe
YAML recipe
vib lint
4. No empty values allowed
CUE recipe
YAML recipe
vib lint
5. No unknown keys allowed
CUE recipe
YAML recipe
vib lint
6. No empty mappings allowed
CUE recipe
YAML recipe
vib lint
7. No empty lists allowed
CUE recipe
YAML recipe
vib lint
8. Unique stage ids
CUE recipe
YAML recipe
vib lint
9. Unique recipe id
CUE recipe
YAML recipe
vib lint
10. Unique stage level module names
CUE recipe
YAML recipe
vib lint
11. Unique recipe level module names
CUE recipe
YAML recipe
vib lint
12. Unique recipe level names
CUE recipe
YAML recipe
vib lint
13. Copy from stage checker
CUE recipe
YAML recipe
vib lint
14. Working with modules
working-with-modules.webm
CUE recipe
YAML recipe
vib lint
Fast iteration testing and schema prototyping
Note
The example recipe is using the old copy format
Schema changes require application rebuilds in order to be tested which can be time-consuming. Apart from using the playground as showcased in the examples above the
cue CLI
can also be used.Using the following
test-validation.yml
recipe (based on the original chronos-fe recipe):here is how
cue cli
can be used for testing the schema:core
directory in the temporary directoryOpen points
Even though I did the best I could to cover all possible use cases when writing the recipe schema there is always a chance that I might have missed some parts. As such more rigorous testing is needed, for example we could test the schema against recipes from Atlas. Since most of them are still using the old format, the linter can help towards their re-write while at the same time testing the schema itself.
Furthermore, seems like some module types are not documented? After browsing recipes up on Atlas I found some things missing which are also absent from the schema:
There is also the repo key at the root level of the recipe.
Suggestions for next steps
Provided that this proposal is accepted, here are some things to consider for later on:
References
Important
EDIT 1
I just realized, the videos (webm) did not render correctly. Seems like my recordings got cut half way through.. Apologies for that. I will try again but even the existing videos should still give you an idea.
Important
EDIT 2
Recordings should now be fixed.