feat: add recipe validation #70

Closed
wants to merge 17 commits into from


lambdaclan
Contributor

@lambdaclan lambdaclan commented Jul 26, 2024

Add recipe file validation

⚠️NOTE⚠️

  • I am marking this PR as a draft since a lot of things require feedback from the upstream developers.
  • The proposed changes might introduce breaking changes as they rename the existing vib test command to vib lint. For example, any scripts or CI automations using vib test will need to be updated accordingly.
  • I will be happy to do any necessary adjustments to be in line with the upstream developers' vision of this feature.
  • The recipe schema file is mostly complete and, to my knowledge, accurately describes the recipe structure. That said, some modifications might be needed if any of the built-in modules or general commands are deprecated or changed.
  • The proposed solution uses CUE (or "cuelang" for searchability) for the validation process. I personally feel CUE is a great tool that greatly simplifies the validation task (note the minimal required code changes) while offering great flexibility in describing data. That said, I do not want to impose anything on the project, so if upstream feels a pure Go solution or a different tool/configuration language would be preferable, we can pursue a different solution (see below).
  • Please be aware that there are other alternative solutions used for validation that could also be considered:
  • You can read comparisons between CUE and some of the other solutions here and here.

Description

This PR adds a new lint command for validating Vib recipe files.

Why?

YAML's data model can represent arbitrary data, so when an app parses a YAML document it might get back anything. It's up to the developer to check that the data is structured how the app expects and to control what happens when it isn't. Does it report an error to the user? Is the behaviour undefined? Does it crash?

The lint command offers extensive validation that covers the following:

  • Structural validation: This ensures that recipe files contain only valid keys. This covers root-level keys such as the recipe id as well as nested keys such as stage id, stage module keys and so on. The relevant recipe, stage and source fields are fully checked. The schema used during validation is essentially a representation of the data defined in the structs.go module. This means that all validated recipes will always generate a valid container file (buildable by Vib), since the structure defines both required and optional fields.

  • Content validation: This ensures that the previously validated keys contain valid values. Depending on how strict we want the validation to be, this can be extended further, but for now it mostly ensures that string content is neither null (id: null / id: ~ / id: ) nor empty (id: ""). As a proof of concept, some other niceties (semantic validation) have also been added:

    • Ensure recipe ids are unique. This means that no stage id can have the same name as the recipe id.
    • Ensure stage ids are unique.
    • Ensure the recipe name is unique. This means that no module name can have the same name as the recipe name.
    • Ensure module names are unique (per stage but also globally).
    • Ensure recipe name and recipe id are not the same.
    • Ensure copy froms are valid. This means that when using the from functionality of the copy command, the from value must actually refer to a valid/existing stage id.
    • Ensure apt/dnf module path installations use *.inst files.
    • Ensure dpkg module path installations use *.deb files.
    • Ensure list-based (array) fields such as stages, modules, adds, copies etc. have at least one item (if declared, unless required), as specified by the schema (size constraints).
    • Ensure map-like items such as labels, expose, args, copy/add srcdst etc. have at least one item (if declared, unless required), as specified by the schema (size constraints, so expose: {} and expose: ~ are rejected).
    • Enforce type constraints for modules and null checks for all values.
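
As an illustration of the path rules above, here is a minimal Go sketch of the file-extension check, written by hand rather than in CUE. Only the mapping of module types to extensions comes from the list; the function name and the assumption that a module exposes its paths as a plain string slice are hypothetical and not taken from Vib's code.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// requiredExt maps a module type to the file extension its path
// installations must use (taken from the rules listed above).
var requiredExt = map[string]string{
	"apt":  ".inst",
	"dnf":  ".inst",
	"dpkg": ".deb",
}

// invalidPaths returns the paths whose extension does not match the one
// required for moduleType. Module types without a rule are not checked.
// Hypothetical helper, not part of Vib.
func invalidPaths(moduleType string, paths []string) []string {
	want, ok := requiredExt[moduleType]
	if !ok {
		return nil
	}
	var bad []string
	for _, p := range paths {
		if filepath.Ext(p) != want {
			bad = append(bad, p)
		}
	}
	return bad
}

func main() {
	// Made-up file names, for illustration only.
	fmt.Println(invalidPaths("apt", []string{"pkgs.inst", "notes.txt"}))
	fmt.Println(invalidPaths("dpkg", []string{"tool.deb"}))
}
```

In the schema-based approach proposed here, this kind of rule would be a constraint on the relevant field instead of a Go function, which is the boilerplate reduction this PR is aiming for.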

The validation is handled by a schema file written in CUE (cuelang). According to the official documentation:

CUE is an open source language with a rich set of APIs and tooling for defining, generating, and validating all kinds of data.

CUE excels at data validation. It has native support for validating YAML and even native Go packages, so it can easily be used with Vib recipes and Vib's codebase. CUE lets us define a schema file, similar to a YAML schema but far more versatile and easier to write. An ordinary YAML schema is actually an extension of JSON Schema, and although it can be used for validation, after experimenting with it I found CUE more suitable in terms of both usability and flexibility. CUE is a superset of JSON, so it can also work directly with JSON schemas if for some reason that approach is more desirable.

What CUE can do for us is provide a form of static type system for Vib recipe files and also verify that the data is semantically correct. Pairing that with the great tooling and Go integration, I feel that CUE is a good match for our use case.

Is CUE the best/most suitable solution? This is subjective, so I cannot answer, but you can see what CUE can do in the examples provided in this PR. Plenty of references have been provided in terms of alternative approaches, so please feel free to discuss.

Regardless, I feel that a schema-based approach (whatever that might be) is more effective than parsing the recipe YAML file and doing the checks manually, because it reduces validation boilerplate. Keeping the data representation separate from the code base allows validation updates without writing any Go code. Furthermore, CUE as a data language is easier to reason about than Go, so non-programmers will also be able to update the schema if needed. The CUE playground (showcased in all the samples below) is a great way of testing the validation schema against Vib recipes, as it provides real-time output that enables experimentation. Alternatively, the cue CLI can be used for local testing.

Proposed Changes

  • Revamp the existing Vib test command into the new lint command which offers extensive recipe validation.
  • Add Vib recipe schema file.
  • Implement linter functionality using the recipe schema file.
  • Update project dependencies (cuelang).
  • Update Vib CLI structure.
  • Add CUE playground template file that can be used for experimentation (play.cue).
  • Fix doc formatting with prettier and markdownlint.
  • Fix formatting with gofumpt and goimportx.

CUE Playground vs vib lint

All the examples listed below are reproducible and can be tested in the CUE playground (links provided). The CUE playground currently does not support YAML input, so Vib recipes are provided in CUE format, and if the validation checks succeed the output is converted into YAML. Until you have a better understanding of CUE, please refrain from changing the schema definition while in the playground. The recipe can be adjusted in the "INPUT RECIPE BELOW" section (see reference below).

cue-playground-process

cue-playground-ui

The vib lint command takes the raw recipe YAML file as input and checks it against the schema. The schema content is the same as that used in the playground links (the newly added recipe.cue file).

test-validation-input

vib lint test-validation.yml

test-validation-output

Examples

Each example provides:

  • Link to the playground
  • Notes
  • Vib recipe in CUE format
  • Vib recipe in YAML format
  • vib lint command output

1. Generating the minimal valid recipe

  • Note how the schema is guiding the user towards a valid recipe.
Kooha-2024-07-26-17-20-39.webm
CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
vib lint

test-validation-output-example-1

2. Unique recipe id and name

CUE recipe
id: "my-image"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
  }
]
YAML recipe
id: my-image
name: my-image
stages:
  - id: build
    base: node:lts
vib lint

test-validation-output-example-2

3. No null values allowed

CUE recipe
id: null
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
  }
]
YAML recipe
id: null
name: my-image
stages:
  - id: build
    base: node:lts
vib lint

test-validation-output-example-3

4. No empty values allowed

CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: ""
    base: "node:lts"
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: ""
    base: node:lts
vib lint

test-validation-output-example-4

5. No unknown keys allowed

CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    version: 1
    base: "node:lts"
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    version: 1
    base: node:lts
vib lint

test-validation-output-example-5

6. No empty mappings allowed

CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    expose: {}
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    expose: {}
vib lint

test-validation-output-example-6

7. No empty lists allowed

CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    runs: {
      commands: []
    }
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    runs:
      commands: []
vib lint

test-validation-output-example-7

8. Unique stage ids

CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    runs: {
      workdir: "/app"
      commands: [
        "npm run build"
      ]
    }
  },
  {
    id: "build"
    base: "node:lts"
    runs: {
      workdir: "/app"
      commands: [
        "npm run start"
      ]
    }
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    runs:
      workdir: /app
      commands:
        - npm run build
  - id: build
    base: node:lts
    runs:
      workdir: /app
      commands:
        - npm run start
vib lint

test-validation-output-example-8

9. Unique recipe id

CUE recipe
id: "start"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    runs: {
      workdir: "/app"
      commands: [
        "npm run build"
      ]
    }
  },
  {
    id: "start"
    base: "node:lts"
    runs: {
      workdir: "/app"
      commands: [
        "npm run start"
      ]
    }
  }
]
YAML recipe
id: start
name: my-image
stages:
  - id: build
    base: node:lts
    runs:
      workdir: /app
      commands:
        - npm run build
  - id: start
    base: node:lts
    runs:
      workdir: /app
      commands:
        - npm run start
vib lint

test-validation-output-example-9
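
For comparison, the uniqueness rules exercised in examples 2, 8 and 9 could be hand-written in Go roughly as follows. This is only a sketch of the manual approach that the schema is meant to replace: the Recipe and Stage types are trimmed-down, hypothetical stand-ins for the structs in structs.go, and checkUniqueIDs is not an existing Vib function.

```go
package main

import "fmt"

// Stage and Recipe are hypothetical, trimmed-down stand-ins for Vib's structs.
type Stage struct {
	ID   string
	Base string
}

type Recipe struct {
	ID     string
	Name   string
	Stages []Stage
}

// checkUniqueIDs returns one message per violated uniqueness rule:
// recipe id and name must differ, and no stage id may repeat another
// stage id or the recipe id.
func checkUniqueIDs(r Recipe) []string {
	var problems []string
	if r.ID == r.Name {
		problems = append(problems, "recipe id and name must differ")
	}
	// Seed the set with the recipe id so a stage cannot reuse it.
	seen := map[string]bool{r.ID: true}
	for _, s := range r.Stages {
		if seen[s.ID] {
			problems = append(problems, fmt.Sprintf("duplicate id %q", s.ID))
		}
		seen[s.ID] = true
	}
	return problems
}

func main() {
	// Same shape as example 9: the recipe id clashes with a stage id.
	r := Recipe{
		ID:   "start",
		Name: "my-image",
		Stages: []Stage{
			{ID: "build", Base: "node:lts"},
			{ID: "start", Base: "node:lts"},
		},
	}
	for _, p := range checkUniqueIDs(r) {
		fmt.Println(p) // prints: duplicate id "start"
	}
}
```

Every additional rule (module names, copy sources and so on) would need another loop like this one, whereas in the schema it is one more constraint.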

10. Unique stage level module names

CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    modules: [
	{
		name: "my-build-apt-module"
		type: "apt"
		source: {
			"packages": [
				"curl",
			]
		}
		"options": {
			fixMissing: true
		}
	},
	{
		name: "my-build-cmake-module"
		type: "cmake"
		buildflags: "-DCMAKE_BUILD_TYPE=Release"
		source:
		{
			"type": "tar"
			"url":  "https://example.com/example-project.tar.gz"
		}
	},
    ]
    runs: {
      workdir: "/app"
      commands: [
        "npm run build"
      ]
    }
  },
  {
    id: "start"
    base: "node:lts"
    modules: [
	{
		name: "my-start-apt-module"
		type: "apt"
		source: {
			"packages": [
				"curl",
			]
		}
		"options": {
			fixMissing: true
		}
	},
	{
		name: "my-start-apt-module"
		type: "cmake"
		buildflags: "-DCMAKE_BUILD_TYPE=Release"
		source:
		{
			"type": "tar"
			"url":  "https://example.com/example-project.tar.gz"
		}
	},
    ]
    runs: {
      workdir: "/app"
      commands: [
        "npm run start"
      ]
    }
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    modules:
      - name: my-build-apt-module
        type: apt
        source:
          packages:
            - curl
        options:
          fixMissing: true
      - name: my-build-cmake-module
        type: cmake
        buildflags: -DCMAKE_BUILD_TYPE=Release
        source:
          type: tar
          url: https://example.com/example-project.tar.gz
    runs:
      workdir: /app
      commands:
        - npm run build
  - id: start
    base: node:lts
    modules:
      - name: my-start-apt-module
        type: apt
        source:
          packages:
            - curl
        options:
          fixMissing: true
      - name: my-start-apt-module
        type: cmake
        buildflags: -DCMAKE_BUILD_TYPE=Release
        source:
          type: tar
          url: https://example.com/example-project.tar.gz
    runs:
      workdir: /app
      commands:
        - npm run start
vib lint

test-validation-output-example-10

11. Unique recipe level module names

CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    modules: [
	{
		name: "my-build-apt-module"
		type: "apt"
		source: {
			"packages": [
				"curl",
			]
		}
		"options": {
			fixMissing: true
		}
	},
	{
		name: "my-build-cmake-module"
		type: "cmake"
		buildflags: "-DCMAKE_BUILD_TYPE=Release"
		source:
		{
			"type": "tar"
			"url":  "https://example.com/example-project.tar.gz"
		}
	},
    ]
    runs: {
      workdir: "/app"
      commands: [
        "npm run build"
      ]
    }
  },
  {
    id: "start"
    base: "node:lts"
    modules: [
	{
		name: "my-start-apt-module"
		type: "apt"
		source: {
			"packages": [
				"curl",
			]
		}
		"options": {
			fixMissing: true
		}
	},
	{
		name: "my-build-cmake-module"
		type: "cmake"
		buildflags: "-DCMAKE_BUILD_TYPE=Release"
		source:
		{
			"type": "tar"
			"url":  "https://example.com/example-project.tar.gz"
		}
	},
    ]
    runs: {
      workdir: "/app"
      commands: [
        "npm run start"
      ]
    }
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    modules:
      - name: my-build-apt-module
        type: apt
        source:
          packages:
            - curl
        options:
          fixMissing: true
      - name: my-build-cmake-module
        type: cmake
        buildflags: -DCMAKE_BUILD_TYPE=Release
        source:
          type: tar
          url: https://example.com/example-project.tar.gz
    runs:
      workdir: /app
      commands:
        - npm run build
  - id: start
    base: node:lts
    modules:
      - name: my-start-apt-module
        type: apt
        source:
          packages:
            - curl
        options:
          fixMissing: true
      - name: my-build-cmake-module
        type: cmake
        buildflags: -DCMAKE_BUILD_TYPE=Release
        source:
          type: tar
          url: https://example.com/example-project.tar.gz
    runs:
      workdir: /app
      commands:
        - npm run start
vib lint

test-validation-output-example-11

12. Unique recipe level names

CUE recipe
id: "my-image-id"
name: "my-start-apt-module"
stages: [
  {
    id: "build"
    base: "node:lts"
    modules: [
	{
		name: "my-build-apt-module"
		type: "apt"
		source: {
			"packages": [
				"curl",
			]
		}
		"options": {
			fixMissing: true
		}
	},
	{
		name: "my-build-cmake-module"
		type: "cmake"
		buildflags: "-DCMAKE_BUILD_TYPE=Release"
		source:
		{
			"type": "tar"
			"url":  "https://example.com/example-project.tar.gz"
		}
	},
    ]
    runs: {
      workdir: "/app"
      commands: [
        "npm run build"
      ]
    }
  },
  {
    id: "start"
    base: "node:lts"
    modules: [
	{
		name: "my-start-apt-module"
		type: "apt"
		source: {
			"packages": [
				"curl",
			]
		}
		"options": {
			fixMissing: true
		}
	},
	{
		name: "my-start-cmake-module"
		type: "cmake"
		buildflags: "-DCMAKE_BUILD_TYPE=Release"
		source:
		{
			"type": "tar"
			"url":  "https://example.com/example-project.tar.gz"
		}
	},
    ]
    runs: {
      workdir: "/app"
      commands: [
        "npm run start"
      ]
    }
  }
]
YAML recipe
id: my-image-id
name: my-start-apt-module
stages:
  - id: build
    base: node:lts
    modules:
      - name: my-build-apt-module
        type: apt
        source:
          packages:
            - curl
        options:
          fixMissing: true
      - name: my-build-cmake-module
        type: cmake
        buildflags: -DCMAKE_BUILD_TYPE=Release
        source:
          type: tar
          url: https://example.com/example-project.tar.gz
    runs:
      workdir: /app
      commands:
        - npm run build
  - id: start
    base: node:lts
    modules:
      - name: my-start-apt-module
        type: apt
        source:
          packages:
            - curl
        options:
          fixMissing: true
      - name: my-start-cmake-module
        type: cmake
        buildflags: -DCMAKE_BUILD_TYPE=Release
        source:
          type: tar
          url: https://example.com/example-project.tar.gz
    runs:
      workdir: /app
      commands:
        - npm run start
vib lint

test-validation-output-example-12

13. Copy from stage checker

CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    runs: {
      workdir: "/app"
      commands: [
        "npm run build"
      ]
    }
  },
  {
    id: "start"
    base: "node:lts"
    copy: [
      {
        from: "wrong"
        workdir: "/app"
        srcdst: {
          "/app/example.txt": "."
        }
      }
    ]
    runs: {
      workdir: "/app"
      commands: [
        "npm run start"
      ]
    }
  }
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    runs:
      workdir: /app
      commands:
        - npm run build
  - id: start
    base: node:lts
    copy:
      - from: wrong
        workdir: /app
        srcdst:
          /app/example.txt: .
    runs:
      workdir: /app
      commands:
        - npm run start
vib lint

test-validation-output-example-13
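
The copy-from rule in this example amounts to a reference check: collect the stage ids, then make sure each from value points at one of them. A minimal Go sketch of that idea follows. The types and the function name are hypothetical, and it accepts a reference to any stage id; whether the real schema also requires the referenced stage to be a different or an earlier stage is not something this sketch assumes.

```go
package main

import "fmt"

// Copy and Stage are hypothetical stand-ins that keep only the fields
// needed for the reference check.
type Copy struct {
	From string
}

type Stage struct {
	ID     string
	Copies []Copy
}

// danglingCopyFroms reports every copy whose from value does not match
// any stage id in the recipe. An empty from is ignored, since the field
// is optional.
func danglingCopyFroms(stages []Stage) []string {
	ids := make(map[string]bool, len(stages))
	for _, s := range stages {
		ids[s.ID] = true
	}
	var bad []string
	for _, s := range stages {
		for _, c := range s.Copies {
			if c.From != "" && !ids[c.From] {
				bad = append(bad, fmt.Sprintf("stage %q: copy from %q matches no stage id", s.ID, c.From))
			}
		}
	}
	return bad
}

func main() {
	// Same shape as the recipe above: "wrong" is not a stage id.
	stages := []Stage{
		{ID: "build"},
		{ID: "start", Copies: []Copy{{From: "wrong"}}},
	}
	for _, msg := range danglingCopyFroms(stages) {
		fmt.Println(msg)
	}
}
```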

14. Working with modules

  • Note how the schema is guiding the user towards valid module usage.
working-with-modules.webm
CUE recipe
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    modules: [
      {
        name: "example-shell"
        type: "shell"
        workdir: "/app"
        commands: [
          "npm run build"
        ]
      },
      {
        name: "apx-gui"
        type: "meson"
        source: {
          type: "git"
          url: "https://github.com/Vanilla-OS/apx-gui"
          branch: "main"
          commit: "latest"
          //tag: "not-allowed"
        }
      }
    ]
  },
]
YAML recipe
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    modules:
      - name: example-shell
        type: shell
        workdir: /app
        commands:
          - npm run build
      - name: apx-gui
        type: meson
        source:
          type: git
          url: https://github.com/Vanilla-OS/apx-gui
          branch: main
          commit: latest
vib lint

test-validation-output-example-14

Fast iteration testing and schema prototyping

Note

The example recipe is using the old copy format

Schema changes require application rebuilds in order to be tested, which can be time-consuming. Apart from the playground showcased in the examples above, the cue CLI can also be used.

Using the following test-validation.yml recipe (based on the original chronos-fe recipe):

name: Chronos Frontend Image
id: chronos-fe
stages:
  - id: build
    base: node:lts
    singlelayer: false
    labels:
      maintainer: Vanilla OS Contributors
    adds:
      - workdir: /tmp/user/files
        srcdst:
          go.mod: .
      - workdir: /tmp/otheruser/files
        srcdst:
          go.mod: .
          go.sum: .
    copy:
      - workdir: /tmp
        from: build
        paths:
          - src: /app/awesome.txt
            dst: .
          - src: /tmp/test.txt
            dst: .
      - workdir: /etc
        paths:
          - src: /app/hello.txt
            dst: .
    cmd:
      workdir: /app2
      exec:
        - python
        - prog.py
    entrypoint:
      workdir: /app
      exec:
        - npm
        - run
        - build
    runs:
      workdir: /etc/apt/apt.conf.d
      commands:
        - echo 'APT::Install-Recommends "1";' > 01norecommends
    expose:
      "6090": ""
    modules:
      - name: build-app
        type: shell
        workdir: /app
        source:
          type: git
          url: https://github.com/Vanilla-OS/chronos-frontend
          branch: main
          commit: latest
        commands:
          - mv /sources/build-app .
          - npm install
          - npm run build

Here is how the cue CLI can be used to test the schema:

  1. Create a temporary directory
  2. Copy the recipe.cue schema file from the core directory into the temporary directory
  3. Modify schema file as desired
  4. Add some test recipes (YAML format)
  5. Test the recipe schema against some recipes

cue vet recipe.cue my-test-recipe.yml

cue-vet

Open points

Even though I did the best I could to cover all possible use cases when writing the recipe schema, there is always a chance that I missed some parts. More rigorous testing is therefore needed; for example, we could test the schema against recipes from Atlas. Since most of them still use the old format, the linter can help with rewriting them while at the same time testing the schema itself.

Furthermore, it seems that some module types are not documented. After browsing recipes on Atlas, I found the following, which are also absent from the schema:

  • include module type
  • fsguard module type

There is also the repo key at the root level of the recipe.

Suggestions for next steps

Provided that this proposal is accepted, here are some things to consider for later on:

  • Refine and improve the CUE schema (feedback required).
  • Improve error handling with more descriptive error messages.
  • Integrate the lint process in the build task (lint recipe before each build).
  • Incorporate the lint process into CI, for example before uploading a recipe to Atlas.
  • Update modules on Go side from a generic interface to more concrete types (similar to what the schema does).
  • If you really like CUE then we could change recipe format from YAML to CUE.

References

Important

EDIT 1
I just realized that the videos (webm) did not render correctly. It seems my recordings got cut off halfway through. Apologies for that. I will try again, but even the existing videos should still give you an idea.

Important

EDIT 2
Recordings should now be fixed.

@mirkobrombin
Member

I'm still reading but I wanted to say thank you right now for the amount of details and the care in explaining them.

@taukakao
Member

First of all, thank you for this extensive and detailed PR.

One thing:

YAML's data model can represent arbitrary data, so when an app parses a YAML document it might get back anything. It's up to the developer to check that the data is structured how the app expects and to control what happens when it isn't. Does it report an error to the user? Is the behaviour undefined? Does it crash?

I mean Go is a type safe language so there is some validation there already.

But you're right that the input validation is very limited here and doesn't guide the user anywhere.

We're currently in preparation for the release of Vanilla OS so I will take a closer look at this next week.

@taukakao
Member

taukakao commented Jul 26, 2024

One question tho:
Can this approach verify regular YAML recipes without writing a CUE file?

(I mean I'm guessing yes because of this: vib lint test-validation.yml, but I want to make sure)

@lambdaclan
Contributor Author

Hello @taukakao

Can this approach verify regular YAML recipes without writing a CUE file?

Yes, the reason I am providing CUE recipes is because the playground does not support YAML input. The vib lint command uses the same schema file internally to directly validate recipe files in the standard YAML format.

I mean Go is a type safe language so there is some validation there already.

Definitely, this is already a massive win over using something like Python but still the vib lint command is much stricter. Of course more validations can also be enforced through Go but using an external schema is easier and more flexible. You can see in my "future suggestions" section I propose to better define modules for example so that we have concrete types.

The idea is to offload the validation to the schema so that Go can safely assume the input is a valid recipe, but some safeguards will still be needed until the schema is thoroughly tested.
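
To make the difference concrete, here is a small sketch of what typed decoding alone catches and what it lets through. It uses encoding/json from the standard library purely as a stand-in, since JSON is easy to show inline; the struct definitions are hypothetical, and whether Vib's YAML decoder is configured to behave the same way is an assumption not verified here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stage and recipe are hypothetical, trimmed-down recipe types.
type stage struct {
	ID   string `json:"id"`
	Base string `json:"base"`
}

type recipe struct {
	ID     string  `json:"id"`
	Name   string  `json:"name"`
	Stages []stage `json:"stages"`
}

// decode unmarshals a document into the typed recipe struct using the
// decoder's default (lenient) settings.
func decode(doc string) (recipe, error) {
	var r recipe
	err := json.Unmarshal([]byte(doc), &r)
	return r, err
}

func main() {
	// A null id and an unknown "version" key: both decode without error,
	// and the id silently becomes the empty string.
	lenient := `{"id": null, "name": "my-image",
	  "stages": [{"id": "build", "version": 1, "base": "node:lts"}]}`
	r, err := decode(lenient)
	fmt.Printf("err=%v id=%q stages=%d\n", err, r.ID, len(r.Stages))

	// A wrong type is the one case the type system does reject.
	_, err = decode(`{"id": 5}`)
	fmt.Println(err != nil)
}
```

So the type system catches a number where a string is expected, but by default it says nothing about null values, empty strings or unknown keys, which are the cases examples 3 to 5 cover.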

We're currently in preparation for the release of Vanilla OS so I will take a closer look at this next week.

No problem at all. I wish you all good luck and I hope everything goes smoothly. Also, congratulations on the big release 🎉

@xynydev

xynydev commented Jul 26, 2024

Since this is hard-coding the module types and schemas, will recipes using custom modules fail validation?

@lambdaclan
Contributor Author

Hello @xynydev

Since this is hard-coding the module types and schemas, will recipes using custom modules fail validation?

Yes, the schema right now only covers the built-in modules. The schema can of course be extended to cover any use case. I am already mentioning that I found some module types not documented up on Atlas such as the include and fsguard types.

After browsing the code, the include module seems to be a built-in module, so I will update the schema to cover it. Now in terms of custom modules I will need some more information before deciding how to add them to the schema. Specifically things like what is their use case, their overall structure, what options can they have etc. If someone can point me to some examples it would be appreciated.

I tried using the docs provided, but it seems the "making a plugin" section is not available. For now, I will browse the existing plugins' code and see if I can get an idea.

image

The same applies to modules in the Go code ([]interface{}): leaving modules open-ended, where anything goes, can lead to problems. I feel we need some form of structure. For example, we can have concrete types for the built-in modules and then some sort of custom module type where each custom module must follow the same structure. Parts of that can be dynamic, since each module will be different.

This PR is still a work in progress, I have already found some areas that can be improved. I will keep on updating it as new issues are brought forward, so please keep the discussion going and point out any issues.

@mirkobrombin
Member

I tried using the docs provided, but it seems the "making a plugin" section is not available. For now, I will browse the existing plugins' code and see if I can get an idea.

We had planned to document the process of creating custom plugins but then the implementation changed and unfortunately we took too long.

@pietrodicaprio
Member

GOAT PR so far! 💪🏼

@taukakao
Member

@lambdaclan
Custom modules should be as free as possible in my opinion.
I really don't like the idea of limiting what they can look like.

I think either the custom modules need to pass their CUE recipe over a standardized API call or they have to validate themselves.

@xynydev

xynydev commented Jul 27, 2024

Yes, the schema right now only covers the built-in modules. The schema can of course be extended to cover any use case.

I'm developing the schema validation for BlueBuild (using TypeSpec under the hood), and solved the problem by using a generic CustomModule model when the module type is not something we have schemas for that accepts any parameters. This allows us to implement recipe validation that will not fail when unknown modules are used. (the build process will fail at a slightly if the module configured doesn't exist) The docstrings /** */ are shown by the code editor of users with a YAML TSP when hovering over the custom module configuration.

@lambdaclan
Contributor Author

I tried using the docs provided, but it seems the "making a plugin" section is not available. For now, I will browse the existing plugins' code and see if I can get an idea.

We had planned to document the process of creating custom plugins but then the implementation changed and unfortunately we took too long.

No problem. I am sure we will figure out how to handle custom modules in the end. After that, it might be a good opportunity to also write the documentation for them.

@lambdaclan Custom modules should be as free as possible in my opinion. I really don't like the idea of limiting what they can look like.

It is not a matter of limiting what custom modules look like but more about standardizing them. I do not think it is a bad idea to provide some sort of guideline for how custom modules should be structured. For example, they should all have a name, their type should be custom, etc. It is a bit difficult for me to make suggestions right now as I am not familiar with custom modules or their structure. I need some examples to play with and examine, then I should be able to offer some more insight.

I think either the custom modules need to pass their CUE recipe over a standardized API call or they have to validate themselves.

There are definitely many ways to handle them. I do not want to make any suggestions based on assumptions so once I have some idea of how custom modules work I should be able to come up with some potential solutions that we can then discuss before deciding.

What I really want is some recipes using custom modules. If anyone has any examples please share them 🙏

Yes, the schema right now only covers the built-in modules. The schema can of course be extended to cover any use case.

I'm developing the schema validation for BlueBuild (using TypeSpec under the hood), and solved the problem by using a generic CustomModule model when the module type is not something we have schemas for that accepts any parameters. This allows us to implement recipe validation that will not fail when unknown modules are used. (the build process will fail at a slightly if the module configured doesn't exist) The docstrings /** */ are shown by the code editor of users with a YAML TSP when hovering over the custom module configuration.

Very cool. Thank you for the input. CUE also supports the any type as well as injection, meaning that custom module authors could provide some input to the lint command to validate the recipe and their customizations. That said, I personally want to avoid using any as much as possible, but I am not sure yet how I will tackle custom modules. For now, I propose that we define at least some generic structure that all custom modules must follow and then see how best to represent the generic nature of custom modules.
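
As a starting point for that discussion, here is a rough Go sketch of the fallback idea: strict checks for module types we know, and only a minimal common structure (a name and a type) for everything else. All names here are hypothetical, the shell check is deliberately simplified, and this is neither how BlueBuild nor how the current schema implements it.

```go
package main

import (
	"errors"
	"fmt"
)

// Module is a decoded module entry. Hypothetical stand-in for Vib's own type.
type Module map[string]any

// builtinChecks holds strict validators for known module types.
// Only a simplified "shell" check is sketched here.
var builtinChecks = map[string]func(Module) error{
	"shell": func(m Module) error {
		cmds, ok := m["commands"].([]any)
		if !ok || len(cmds) == 0 {
			return errors.New("shell module needs a non-empty commands list")
		}
		return nil
	},
}

// validateModule enforces the common structure for every module and the
// strict check for built-in types. Unknown (custom) types pass as long as
// they have a name and a type; their remaining fields are not inspected.
func validateModule(m Module) error {
	name, _ := m["name"].(string)
	typ, _ := m["type"].(string)
	if name == "" || typ == "" {
		return errors.New("every module needs a non-empty name and type")
	}
	if check, ok := builtinChecks[typ]; ok {
		return check(m)
	}
	return nil
}

func main() {
	// The extra field on the custom module is made up for illustration.
	custom := Module{"name": "guard", "type": "fsguard", "anything": 42}
	fmt.Println(validateModule(custom))

	broken := Module{"name": "build-app", "type": "shell"}
	fmt.Println(validateModule(broken))
}
```

The trade-off is the one raised above: custom modules keep full freedom over their own fields, at the cost of the linter not being able to say anything about them.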

@taukakao
Member

taukakao commented Jul 29, 2024

It is not a matter of limiting what custom modules look like but more about standardizing them

I think standardizing them limits them. It would make them second class modules, while they should have the same possibilities and freedoms that official modules have.

For now, the only implementation of a custom module that I can think of is https://github.com/Vanilla-OS/vib-fsguard
It's used for example in https://github.com/Vanilla-OS/desktop-image

@lambdaclan
Contributor Author

It is not a matter of limiting how custom modules look like but more about standardizing them

I think standardizing them limits them. It would make them second class modules, while they should have the same possibilities and freedoms that official modules have.

Hello Tau. I definitely do not want to do anything that will limit the functionality of custom modules. On the contrary, what I am aiming to do is make them easier to author, as with the actual recipe and built-in modules. Having a linter validate your recipes as you write them provides a safety net before you even try to build your custom image. It will also make it easier to detect and fix any issues.

This can be further expanded once the CUE LSP is ready. It is still under development, but hopefully it will happen sooner rather than later. This will essentially allow us to write recipes with the same dynamic checks shown above in the playground, from within our IDEs.

For now, the only implementation of a custom module that I can think of is https://github.com/Vanilla-OS/vib-fsguard It's used for example in https://github.com/Vanilla-OS/desktop-image

OK this is great! Thank you for sharing it with me. This should be enough to at least start some brainstorming on how to approach custom modules. Once I have some suggestions I will be back to discuss in more detail.

@lambdaclan lambdaclan force-pushed the feat/recipe-validation branch from 62a03b5 to 55fe41f Compare July 30, 2024 01:35
@lambdaclan
Contributor Author

Updates:

includes-module.webm
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    modules: [
      {
        name: "example-includes"
        type: "includes"
        includes: [
          "modules/00-vanilla-desktop.yml"
        ]
      }
    ]
    runs: {
      workdir: "/app"
      commands: [
        "npm run build"
      ]
    }
  }
]
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    modules:
      - name: example-includes
        type: includes
        includes:
          - modules/00-vanilla-desktop.yml
    runs:
      workdir: /app
      commands:
        - npm run build
copy-from-checker-improved.webm
id: "my-image-id"
name: "my-image"
stages: [
  {
    id: "build"
    base: "node:lts"
    copy: [
      {
        workdir: "/app"
        srcdst: {
          "/app/example.txt": "."
        }
      }
    ]
  },
  {
    id: "start"
    base: "node:lts"
    copy: [
      {
        from: "build"
        workdir: "/app"
        srcdst: {
          "/app/example.txt": "."
        }
      }
    ]
  },
  {
    id: "deploy"
    base: "node:lts"
    copy: [
      {
        from: "start"
        workdir: "/app"
        srcdst: {
          "/app/example.txt": "."
        }
      }
    ]
  }
]
id: my-image-id
name: my-image
stages:
  - id: build
    base: node:lts
    copy:
      - workdir: /app
        srcdst:
          /app/example.txt: .
  - id: start
    base: node:lts
    copy:
      - from: build
        workdir: /app
        srcdst:
          /app/example.txt: .
  - id: deploy
    base: node:lts
    copy:
      - from: start
        workdir: /app
        srcdst:
          /app/example.txt: .
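
The thread does not spell out the exact rule the copy-from checker enforces. Assuming it requires every `from` in a `copy` entry to name a stage defined earlier in the recipe, the check could be sketched in Python like this (the `check_copy_from` helper is hypothetical and operates on an already-parsed recipe):

```python
def check_copy_from(recipe: dict) -> list[str]:
    """Report copy entries whose `from` does not name an earlier stage.

    Assumption: `from` may only reference stages that appear before the
    current one; the real checker may be stricter or more lenient.
    """
    errors = []
    seen_ids = set()
    for stage in recipe.get("stages", []):
        for entry in stage.get("copy", []):
            source = entry.get("from")
            if source is not None and source not in seen_ids:
                errors.append(
                    f"stage {stage.get('id')!r}: copy.from {source!r} "
                    "is not an earlier stage"
                )
        # Only after checking a stage does its id become a valid `from` target.
        seen_ids.add(stage.get("id"))
    return errors
```

For the three-stage recipe above (build, start, deploy) this returns an empty list, since each `from` points at the stage before it.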

@lambdaclan
Contributor Author

Quick tip

You can easily convert your Vib YAML recipes into CUE using the CLI. This will allow you to test things out in the playground if needed.

test-validation.yml

name: Chronos Frontend Image
id: chronos-fe
stages:
  - id: build
    base: node:lts
    singlelayer: false
    labels:
      maintainer: Vanilla OS Contributors
    adds:
      - workdir: /tmp/user/files
        srcdst:
          go.mod: .
      - workdir: /tmp/otheruser/files
        srcdst:
          go.mod: .
          go.sum: .
    cmd:
      workdir: /app2
      exec:
        - python
        - prog.py
    entrypoint:
      workdir: /app
      exec:
        - npm
        - run
        - build
    runs:
      workdir: /etc/apt/apt.conf.d
      commands:
        - echo 'APT::Install-Recommends "1";' > 01norecommends
    expose:
      "6090": "6090"
    modules:
      - name: build-app
        type: shell
        workdir: /app
        source:
          type: git
          url: https://github.com/Vanilla-OS/chronos-frontend
          branch: main
          commit: latest
        commands:
          - mv /sources/build-app .
          - npm install
          - npm run build

cue import /path/to/vib/recipe.yml --outfile my-vib-recipe.cue

out

out

@axtloss
Member

axtloss commented Aug 4, 2024

Apple's pkl has a built-in validation system and can be translated to yaml, would it maybe be easier to switch vib to use pkl instead of yaml for the recipes? Could reduce maintenance burden somewhat while having more features

@lambdaclan
Contributor Author

lambdaclan commented Aug 5, 2024

Apple's pkl has a built-in validation system and can be translated to yaml, would it maybe be easier to switch vib to use pkl instead of yaml for the recipes? Could reduce maintenance burden somewhat while having more features

Hello there, thank you very much for contributing to the discussion! I am not sure if you caught up with the whole description of the PR (I know it is too long) but my current proposal is to use CUE for validation. There are of course other alternatives like Pkl that I do mention could also be considered.

My current solution completely covers the recipe validation without the need to switch the recipe format, meaning that we can just keep on using YAML and get all the benefits of validation through the schema. Now, I also mention in the PR that if the team is happy with the overall checks and feels that YAML is just too finicky, we could switch the recipe format to a different language, a config language that is. As you can expect, since I spent all this time working on this, my suggestion would be to use CUE as the new format, but only if necessary. For now, until things are thoroughly tested, I suggest that we stick with YAML and open the topic of the recipe language at a later stage.

The current PR introduces the lint command, but there are still no checks in the pre-build phase or in any automations like CI etc. Getting those implemented first (once the schema is tested and approved) will provide immediate benefits to all the Vib users. Also, there is no point to do all this work only for the lint command, we need to enable the same checks on build as soon as possible.
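
Running the same checks before a build could be sketched as below. This is a Python illustration of the control flow only: `lint` is a stand-in for the real schema validation, `build` and its `do_build` callback are hypothetical, and the list of required top-level keys is inferred from the example recipes in this thread rather than taken from the schema.

```python
from typing import Callable

# Top-level keys assumed to be mandatory; inferred from the example recipes
# in this thread, not taken from the real schema.
REQUIRED_KEYS = ("id", "name", "stages")


def lint(recipe: dict) -> list[str]:
    """Stand-in for the real schema check: report missing top-level keys."""
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in recipe]


def build(recipe: dict, do_build: Callable[[dict], str]) -> str:
    """Run the lint checks first and refuse to build an invalid recipe."""
    errors = lint(recipe)
    if errors:
        raise ValueError("recipe is invalid: " + "; ".join(errors))
    return do_build(recipe)
```

Sharing one validation function between the lint and build paths keeps the two from drifting apart.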

Switching the recipe language will be quite a big change and even if the team decides to do it some sort of grace period where both formats can be used is possibly needed (thankfully CUE provides a CLI that can convert YAML to CUE so huge win right there - see comment above). I am definitely up for switching the language but when the time is right. For example the CUE language server (under active development) could allow users to dynamically lint their recipes as they write them which would be a really nice experience (similar to the playground demos).

I am keeping this PR as draft for two reasons:

  1. Currently, custom modules are not supported so any recipe using custom modules will be marked as invalid. I am happy to say that I have already a working template that adds support for custom modules. I actually have 3 different ways which I will document in this PR for the team to decide which approach is best. I just need some more time to tidy things up and do some more testing.
  2. I have not received approval/confirmation from the team whether my proposal of using CUE is good enough to go upstream.

I would appreciate it if I could at least get some feedback regarding point 2, because if this is not needed or a different approach is more desirable then there is no point in spending even more time on this 🫠

Now regarding CUE vs Pkl. I am by no means a config language expert (I began learning CUE for this very PR), but unless Pkl provides some substantial benefits over CUE I do not see why we should switch to Pkl. I did consider Pkl amongst other languages, but CUE seemed to be the most mature one. Also, CUE has first-class and extensive support for Go (in fact CUE is inspired by Go, hence the absence of things like classes), great tooling (CLI, playground, upcoming language server, etc.) and, most importantly, allows us to validate the recipe without any issues. When I was creating the schema I never felt CUE was too limiting.

from the CUE version 0.8.0 release notes (April 2024, so things are possibly even better now)

image

from the CUE language specification

image

Could the same validation be done with Pkl? Definitely, and with many other config languages for that matter. This is highly subjective and at the end of the day the real answer is possibly "it does not really matter". Modern config languages like CUE, Pkl, Dhall etc. offer big improvements and benefits over using something like YAML, TOML or JSON. Someone needs to make the final decision but for obvious reasons my vote goes to CUE.

Thank you all and keep the discussion going!


@axtloss
Member

axtloss commented Aug 5, 2024

I'm not sure what the point of repeating all this information is, I am aware that your current implementation uses cue, my proposal is to use pkl instead.
Instead of having to use yaml and cue a switch to pkl would allow us to remain at using only one language, which is generally easier to maintain.
pkl also has first class go support, being one of the languages that was supported pretty much since pkl released.
pkl would also fit better for vib as it has built-in module support, which can also pull modules from the web, meaning that it would require even less code than the current yaml implementation as things like the includes module would become much simpler due to being handled by pkl.

@taukakao
Member

taukakao commented Aug 5, 2024

I think it would be best to list the benefits of using CUE here compared to pkl. I'm sure there are some.

Then we can better discuss both ideas.

@lambdaclan
Contributor Author

I'm not sure what the point of repeating all this information is, I am aware that your current implementation uses cue, my proposal is to use pkl instead.

Apologies for going overboard with the explanations.

Instead of having to use yaml and cue a switch to pkl would allow us to remain at using only one language, which is generally easier to maintain.

The only reason both YAML and CUE are needed is that the scope of this PR is to add validation to the current recipe format which is YAML. The schema is written in CUE and can be used to directly validate the Vib YAML recipe. If we were to assume that the Vib recipes will also be written in CUE then like Pkl only one language will be needed.

pkl also has first class go support, being one of the languages that was supported pretty much since pkl released.

Yes, of course I am aware of that. It just seems to be a bit OOP oriented but nothing wrong with that.

pkl would also fit better for vib as it has built-in module support

CUE also has support for packages and modules.

which can also pull modules from the web, meaning that it would require even less code than the current yaml implementation as things like the includes module would become much simpler due to being handled by pkl.

CUE can also have remote repositories for modules.

I think it would be best to list the benefits of using CUE here compared to pkl. I'm sure there are some.

To be honest I would not say one language is better than the other one. They are both great, they are both trying to solve the same problems albeit a bit differently due to differences in philosophy. I do not feel either language provides groundbreaking features that set it apart. Both can get the job done.

Generally speaking, using a config language is infinitely better than using a static config format. Also keep in mind CUE and Pkl are not the only great options out there 😊

I will not do any more work on this until we reach a final decision. If you decide to proceed with CUE I will carry on and adjust as needed after the feedback. If you decide to go with Pkl or any other solution we can just close this PR and move on. Even if CUE is not selected I will be more than happy to help if no one is willing to take the task or support is required.

@axtloss
Member

axtloss commented Aug 6, 2024

Yes, of course I am aware of that. It just seems to be a bit OOP oriented but nothing wrong with that.

which would make it better fit for this as the vib recipes are also based on objects and classes

CUE also has support for packages and modules.

from what I'm reading this seems to require a lot more work than pkl's module support, since cue requires some registry thing and all that

To be honest I would not say one language is better than the other one. They are both great,

that is true, but one would be better fit than the other one for this job

The only reason both YAML and CUE are needed is that the scope of this PR is to add validation to the current recipe format which is YAML.

I'd rather switch to a different config language altogether which would allow validation than having to settle with a two language solution, since my general plan with vib is to move it to a better fit config language. If that language ends up being CUE then we can just "repurpose" this pr to switch vib to cue.

@lambdaclan
Contributor Author

Do we have any updates on the state of this PR? Conflicting files are already popping up, so please let me know how you would like to continue going forwards when you get a chance 🙏 I am assuming since it is holiday season some people might be away, if that is the case no worries.

Thank you!

@axtloss
Member

axtloss commented Aug 19, 2024

I still find it better suited to directly move to pkl here
@mirkobrombin what do you think?

@mirkobrombin
Member

After reading the entire discussion, I lean towards PKL. If PKL already supports features that would require re-implementation in CUE, it makes sense to consider it now. Anyway, it's crucial to maintain YAML as the default format, as it's what keeps Vib simple. Advanced configuration languages like PKL can be optional for those who need them.

@lambdaclan
Contributor Author

@mirkobrombin

After reading the entire discussion, I lean towards PKL. If PKL already supports features that would require re-implementation in CUE, it makes sense to consider it now.

No problem at all. Better make the hard decisions early on in the process.

Anyway, it's crucial to maintain YAML as the default format, as it's what keeps Vib simple. Advanced configuration languages like PKL can be optional for those who need them.

I guess this means that we should have a schema to do the linting via PKL instead of CUE while keeping the standard format as YAML. We could make it possible to accept either format (YAML or PKL) for the recipes but just assume the default is YAML.
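
Accepting either format with YAML as the default could be as simple as dispatching on the file extension. A minimal Python sketch, where the extension table and the `recipe_format` helper are assumptions for illustration and not Vib's actual loader:

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to recipe format.
FORMATS = {".yml": "yaml", ".yaml": "yaml", ".pkl": "pkl"}


def recipe_format(path: str) -> str:
    """Pick the recipe format from the file extension, defaulting to YAML."""
    suffix = PurePosixPath(path).suffix.lower()
    return FORMATS.get(suffix, "yaml")
```

Anything without a recognised extension is treated as YAML, which keeps existing recipes working unchanged.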

@axtloss Just to clarify, are you going to take over this task? I will be happy to adjust the work done in this PR to use PKL instead of CUE (there are bound to be some similarities) to implement the lint command as a first step. We can then proceed to add support for PKL based recipes along with any other requirements. Maybe we should create a new issue/task list to keep track of things.

For now just let me know if you would like me to help out with this!

Thank you both.

@lambdaclan lambdaclan closed this Sep 1, 2024
@axtloss
Member

axtloss commented Sep 3, 2024

Just to clarify, are you going to take over this task? I will be happy to adjust the work done in this PR to use PKL instead of CUE (there are bound to be some similarities) to implement the lint command as a first step. We can then proceed to add support for PKL based recipes along with any other requirements. Maybe we should create a new issue/task list to keep track of things.

sorry for not responding to this earlier
I don't really care who does it, if it's something you want to do you can work on it
creating a new issue is probably a good idea, I'll make one in a bit, for now we'd only do the linting part, like you suggested

@axtloss axtloss mentioned this pull request Sep 3, 2024
3 tasks
@lambdaclan
Contributor Author

sorry for not responding to this earlier

No worries, I myself got busy with work lately.

I don't really care who does it, if it's something you want to do you can work on it
creating a new issue is probably a good idea, I'll make one in a bit, for now we'd only do the linting part, like you suggested.

Sounds good, I will reply on the ticket so if you want you can assign me to it. I will start working on it as soon as possible and open a PR early so that we can discuss further.
