Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new YAML based format for "conda-build" files #54

Merged
merged 13 commits into from
Oct 30, 2023
Merged

Conversation

wolfv
Copy link
Contributor

@wolfv wolfv commented Jun 2, 2023

@wolfv wolfv changed the title add cep 20.1 for only the YAML format A new YAML based format for "conda-build" files Jun 2, 2023
@marcelotrevisani
Copy link
Member

I think that would be very useful, it will reduce the complexity to parse those yaml comments that are a pain.
But I also think that if we could move to use toml files instead of yaml mixed with jinja2 would be way better but that is another history 😬

@wolfv
Copy link
Contributor Author

wolfv commented Jun 2, 2023

How would TOML files work? Wouldn't we still need some Jinja here and there?

The good thing about the pure YAML format is that we can translate it to either the next evolution of the YAML format or a TOML based format quite easily.

In fact, someone has already added TOML support (more specifically recipe embedding inside a pyproject.toml file in boa - precisely because it is possible).

@marcelotrevisani
Copy link
Member

marcelotrevisani commented Jun 2, 2023

How would TOML files work? Wouldn't we still need some Jinja here and there?

The good thing about the pure YAML format is that we can translate it to either the next evolution of the YAML format or a TOML based format quite easily.

In fact, someone has already added TOML support (more specifically recipe embedding inside a pyproject.toml file in boa - precisely because it is possible).

regarding toml files I was thinking preprocess it and postprocess it if necessary
something like poetry but on steroids, 😆
like

[requirements.host]
pytest = {version=">=5.0.0,<6.0.0", platform=["linux", "osx"], preprocess="func_to_preprocess", postprocess="func_to_postprocess"}

if it needs preprocessing it could call some predefined functions to do some magic that we might need

if we need variables to be used inside it can use

[env.variables]
my_var="abc"

[package]
name="${my_var}"

and it can inject it and expand when processing the recipe, if I am not mistaken there is a crate that does these kinda of env var expansion.

but as I mentioned, that is another whole history 😄

What you have proposed I think would improve and simplify a lot of parsing and when we need to pragmatically create the recipes. But I wish to get ride of all the jinja, because they are also a pain sometimes 😆 , however, I also understand that would be very difficult to remove them

Copy link
Member

@jezdez jezdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some general remarks about the design, mostly about the use of chained functions for the selectors while evaluating the boolean expressions.

I'd prefer using YAML objects, given they enable us to evolve the items progressively over time and have a better user experience when having to chain many conditions. I also like the "blocky" look.

E.g. this becomes unwieldy when we'd add another function in the future via another spec update:

sel(cmp(python, "3.11") and foo(bar, "spam"))

While this would just be another line to optionally add:

 requirements:
   build:
     - name: dataclasses
       if: python >=3.6 and python <3.10
       foo: bar == "spam"
     - cryptography

cep-20.1.md Outdated Show resolved Hide resolved
cep-20.1.md Outdated Show resolved Hide resolved
cep-20.1.md Outdated

### The cmp function for variant selection

Furthermore, we have a special "cmp" function that can be used to run a check
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerned that there might be more needs for "special" functions like this and that it could end up recreating Jinja2, bascially

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we have "access" to jinja2 (e.g. to lowercase a string, extract a character from a string, join things etc.). This is about extending it with +1 function specific to conda matchspec comparison and variant-conditional builds (e.g. add this dependency only for python 3.8. I think it warrants a special function. I might however remove it from the proposal for now as it seems to be a point of contention ... and then we introduce a new proposal for this functioanlity.

cep-20.1.md Outdated
against a selected variant version. The `cmp` function looks like the following:

```
cmp(python, "3.6")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we have to use cmp here? We're evaluating the boolean expressions and it might be easier to understand with this:

Suggested change
cmp(python, "3.6")
if: python >= "3.6"

Where it would then look like:

 requirements:
   build:
     - name: dataclasses
       if: python >=3.6 and python <3.10
     - cryptography

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that operator overloading might be harder to implement vs. the cmp function. With the cmp function we'd know that the left hand side is the variant specifier and the right hand side is a version number. We could do python >= Version("3.6") basically, but that is even trickier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe if we know that the left hand side is a VariantSpecifier we could overload the comparison with strings ... but yeah, would need to figure out if that's easy to implement with the Jinja expression system(s).

Copy link

@remkade remkade Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I also find the if: style quite a bit more intuitive. Its similar to how ansible does its conditionals which should help new people transition easier.

You could also do something like unless: python > 3.10 as well, which makes negated conditionals a bit cleaner.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on if as it is consistent with CI providers as well

Copy link
Member

@jakirkham jakirkham Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One caveat with cmp specifically is some languages use that to mean cmp(x, 3) to mean:

  • cmp(x, 3) > 0 # x > 3
  • cmp(x, 3) == 0 # x == 3
  • cmp(x, 3) < 0 # x < 3

Not to bikeshed, but maybe the critiques in the comments above (and the caveat around cmp) can be resolved with a different name? Some options:

  • within(...)
  • contains(...)
  • includes(...)

Same syntax, but maybe this already implies the conditional case evaluating to true? We could also use a different tense (contains -> contained) if preferable

cep-20.1.md Outdated Show resolved Hide resolved
cep-20.1.md Outdated

## Templating with Jinja

The spec supports simple Jinja templating in the `recipe.yaml` file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be worth elaborating what "simple" means in this context, what wouldn't be covered by this?

Would this also work with MiniJinja for example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simple means:

  • Nothing that could go between {% ..., means no for loops, no set, no if / else etc.
  • Everything that can go between {{ ... }} is still supported, including the pipe operator, so that {{ version | lowercase }} and things work. They can be assigned to helper variables inside the context, e.g.
context:
  version: "0.1.2.POST1234"
  lower_version: "{{ version | lower }}"  # evaluates to 0.1.2.post1234
  • Inside the selectors, the Jinja expression language is used which only knows a few primitive types (str, int, float, ...).

This works fantastically with MiniJinja (which is what we use in rattler-build). However, MiniJinja would support much more than what we are currently supporting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to preserve the meta.yaml filename and cause conda-build or rattler-build to detect the format from its contents?

cep-20.1.md Outdated
The conda-build format has grown over the years to become quite complex.
Unfortunately it has never been formally "specified" and it has grown some
features over time that make it hard to parse as straightforward YAML.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one really important motivating factor that isn't mentioned in the motivation section is increasing the ease by which we can parse and modify these files (making them more machine readable friendly). I know that there is currently a desire to be able to do this, but the old format holds us back. It's very hard to parse and multiple attempts have already been made to try to do this if I'm not mistaken.

I know it's mentioned below, but it might be worth emphasizing here briefly too.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1000

@wolfv wolfv mentioned this pull request Jun 7, 2023
6 tasks
@travishathaway
Copy link
Contributor

travishathaway commented Jun 7, 2023

One thing that also came to mind as I was reading the CEP was what the "migration strategy" would be for existing recipes. You mention this a little in the "short comings" section, but maybe it's worth it to have its own section.

Unfortunately, migration will be a PITA because it will most likely involve manually updating recipes (will it?), but maybe there's a tool we could write that could handle translating 90% of the of existing recipes (I'm thinking like a Python 2 to 3 migration tool).

Or, maybe migration isn't even a concern 🤷‍♂️. It should probably be mentioned somewhere in the document.

@wolfv
Copy link
Contributor Author

wolfv commented Jun 7, 2023

Thanks for the comments @travishathaway. Indeed, migration is a concern. In boa we have a small & hacky tool that can convert simple recipes (boa convert meta.yaml > recipe.yaml).

It works by analyzing a line, and if it finds a comment it moves the comment right after the - and wraps it with a sel(...). It also parses the {% set ... expressions and sticks them in a context dictionary.

But obviously this works only for relatively straightforward cases.

cep-20.1.md Outdated

Selectors in the new spec take the following format:

`sel(unix): selected_value`
Copy link

@skupr-anaconda skupr-anaconda Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a simplified variant:
(unix): selected_value
(win and arm64)
((osx or linux) and aarch64)
(something == "test")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is also valid YAML:

Screenshot 2023-06-07 at 14 49 00

However, it would effectively special case all dictionary keys that start with a round bracket. I think that would be OK, but less explicit than the sel(...) operator.

Copy link
Member

@jakirkham jakirkham Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to bikeshed, but what about something like is(...) so is(unix)?

@JeanChristopheMorinPerso

I'm curious, the title says "A new recipe format – part 1". Is there a part 2 somewhere or some written notes about a possible future part 2?

@wolfv
Copy link
Contributor Author

wolfv commented Jun 7, 2023

Hi @JeanChristopheMorinPerso indeed, there is the "full" proposal (of which this is a derivative) here: #53

Also, all this work is based on 2+ years of work on the boa "build tool" as well as the rattler-build build tool. Both are many times faster than the current conda-build and implement this recipe format :)

cep-20.1.md Outdated
recipe with a computer
- iron out some inconsistencies around multiple outputs (build vs. build/script
and more)
- remove any need for recursive parsing & solving
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The recursive parsing and solving was mainly to support things like inheriting specs, like run_constrained, from explicitly specified dependencies. Is there any kind of explicit spec for whether/how packages add/impose dependencies on sections outside of the one where they are specified?

What about the pin_compatible jinja? I'm getting the sense that it will be cut in this spec. I'm not sure whether you could impose limits on it to avoid the need to be recursive. It is very useful, so I hope it can stay somehow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In boa (and rattler-build) we have "deferred" evaluation for the pins, so that still works. We just require that the name of the pin is available, in order to build the proper topologically sorted build-graph :)

A dependency list in rattler-build contains an "enum" of type:

https://github.com/prefix-dev/rattler-build/blob/665c981a392a6ebeba5842eee1d968b8753b2dcd/src/render/dependency_list.rs#L22-L27

Indeed, pin_subpackage is not yet implemented but quite trivial to achieve.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes the package name it isn't known. For example cross-compilation usage or compilers themselves

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakirkham in these cases you can infer those variables from the build-time config though (or variant config) so no recursion needed :)

Copy link

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit that I'm probably missing a bunch of context of why moving to "pure YAML" is such a self-evidently good idea (though I'm a big fan of the mamba/boa effort!).

To me, jinja is a very standard templating engine, and it's not obvious from the OP (nor #53, nor the links I found), what problems we're solving.

So, not knowing what improvements we're "buying" with this, I mainly see the costs, which to me are definitely non-zero.

cep-20.1.md Outdated
Comment on lines 192 to 197
For example, using a `{% for ... %}` loop is prohibited. After searching through
the conda-forge recipes with the Github search, we found for loops mostly used
in tests.

In our view, for loops are a nice helper, but not necessary for many tasks: the
same functionality could be achieved in a testing script, for example. At the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that it's generally possible to move stuff to test scripts, this is IMO very disadvantageous:

  • far away from meta.yaml (if they use the "magic" test script names, they're not even mentioned and then often not up-to-date because people forget to look at them)
  • duplication between platforms (aside from the pain of writing loops in batch)

For example, it would be a real pain to rewrite the following

        {% for each_lib in core_libs + core_cpp_libs + vendored_libs %}
        # presence of shared libs
        - test -f $PREFIX/lib/lib{{ each_lib }}.so              # [linux]
        - test -f $PREFIX/lib/lib{{ each_lib }}.dylib           # [osx]
        - if not exist %LIBRARY_BIN%\{{ each_lib }}.dll exit 1  # [win]
        - if not exist %LIBRARY_LIB%\{{ each_lib }}.lib exit 1  # [win]

        # absence of static libs (unix)
        - test ! -f $PREFIX/lib/lib{{ each_lib }}.a             # [unix]
        {% endfor %}

        # binaries
        {% for each_lang in binaries_plugin_langs %}
        - test -f $PREFIX/bin/grpc_{{ each_lang }}_plugin                    # [unix]
        - if not exist %LIBRARY_BIN%\grpc_{{ each_lang }}_plugin.exe exit 1  # [win]
        {% endfor %}

        # pkg-config (no metadata for vendored libs)
        {% for each_lib in core_libs %}
        - pkg-config --print-errors --exact-version "{{ core_version }}" {{ each_lib }}
        {% endfor %}
        {% for each_lib in core_cpp_libs %}
        - pkg-config --print-errors --exact-version "{{ version }}" {{ each_lib }}
        {% endfor %}

to the point that those tests would just not be written (or rather: maintained) as well as now (I moved them out of the heavily outdated test scripts for exactly that reason).

Now, it's possible that this is still a worthwhile trade-off compared to what "pure YAML" gets us, but it will reduce the quality of our testing (at least, as someone who's added a lot of test coverage to feedstocks I got involved with, I believe it will strongly impact my output that front).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are two options:

  • We can allow Jinja (plus the values from context) inside the script text blocks. I wouldn't mind that too much since it doesn't affect the recipe execution at all.
  • You can rewrite your tests to take advantage of the new test helpers in boa. In that case it would look something like this:
      exists:
        lib:
          - gpr
          - grpc
          - grpc_unsecure
          - grpc++
          - grpc++_unsecure
          - address_sorting
          - upb
        glob:
          # absence of static libs (unix)
          - ~ **/*.a
        bin:
          - grpc_cpp_plugin
          - grpc_csharp_plugin
          - grpc_node_plugin
          - grpc_objective_c_plugin
          - grpc_php_plugin
          - grpc_python_plugin
          - grpc_ruby_plugin
        pkg_config: 
          - grpc
          - grpc++
          - grpc_unsecure
          - grpc++_unsecure
      commands:
        # final CMake test to compile examples
        - ./test_grpc.sh   # [unix]
        - ./test_grpc.bat  # [win]

Note that this is implemented in a prototype-y fashion in boa and not yet implemented in rattler, and we'd spec this out in another CEP. Of course, there is duplication of the core libs. We could think of ways of being able to refer to lists that are defined elsewhere to make that less repetitive but IMO that can come later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also hypothetically allow lists to be defined in the context, and then use some other syntax to extend other lists. For example:

context:
  core_libraries: ["grpc", "a", "b"...]

tests:
  exists:
    lib:
      - @ core_libraries

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, the comment I left in the other thread was meant for both conversations - sorry if that wasn't clear.

I think it'll be manageable in the way that you propose. Being able to define lists would be nice, but isn't critical IMO.

cep-20.1.md Outdated
Comment on lines 198 to 200
same time we also plan to formalize a more powerful testing harness ([prototyped
in
boa](https://github.com/mamba-org/boa/blob/main/docs/source/recipe_spec.md#check-for-file-existence-in-the-final-package)).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks very nice, but it's critically missing a way to check for the absence of files (which is crucial for many recipes, especially multi-output ones).

It would actually be a nice improvement over what conda currently does to check for absence of files in the output under testing, rather than just looking at the path in the prefix. That's because a given path might be satisfied -- intentionally -- by another output; for example, it's currently not possible to check1 that in a lib + bin + bindings setup, the binary output (that's depending on the lib) does not accidentally package any library artefacts.

Footnotes

  1. without an extra output

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just came up with a syntax to test the absence (negation) of the statement by using the ~ prefix. This shoudl be fairly straightforward to add. And as you suggested, these functions would operate on the paths.json of the package (and not the environment).

But on the other hand this is a bit off topic since this is not the right CEP for that :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw. conda doesn't do anything -- what people do is write ugly bash scripts that need special knowledge to handle Windows vs. Unix ... which is why testing is generally pretty bad on conda-forge. Another thing that we want to fix :) (but off-topic).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the gitignore syntax (e.g. ! for negation)? Devs are already familiar with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ! has a special meaning in YAML so you'd have to quote it ... :) YAML is a complex spec

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy if there's a way to test absence of files, syntax is secondary to me. That said:

YAML is a complex spec

I'm not deeply enough in this whole discussion, but I'd be thrilled to move to TOML as well. One thing that scares me about the sel / if discussion is that YAML breaks horrifically if you get the indentation levels wrong within a list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since many folks are already coming from Python, think significant whitespace is already an accepted thing

Not to say moving to a simpler markup language might be beneficial for us, but that seems like a separate discussion altogether. Right now Wolf is trying to just move use from 2 meta languages and 1 markup language to just 1 markup language, which would be a huge improvement over the status quo. Maybe a future discussion could be moving from 1 markup language to another (once we are fully in 1 :)

@wolfv
Copy link
Contributor Author

wolfv commented Jun 8, 2023

The current state of affairs is that we have 2 mini-languages + YAML (comments & Jinja) for the recipes. That makes it really hard to write a parser for it to statically extract information from the recipe or transform it in any way.

The way Jinja support is implemented in conda-build is also quite tricky: it does recursive evaluations of the recipe to find all the necessary values for the used Jinja variables. This contributes greatly to conda-build being extraordinary slow for larger recipes (especially with multiple outputs).

I think the way Jinja is used in conda-build currently is not very deliberate. It's usage is completely unrestricted, but most recipes only use a tiny subset (string interpolation & one or two "set" statements). However, the possibilities of Jinja blur the line between: is this YAML file a script or is this YAML file a static description of the package? Right now it's both, somehow, and not good at either.

If we have decide on a somewhat more limited YAML format as input, it will be very straight-forward to generate the recipes using Python or any other language (we are actually doing this for ROS recipes in the RoboStack project where we've been using boa for 2+ years now). So that can be an "escape-hatch" for some recipes that are more annoying to write without full Jinja support. At the same time, the old recipe format will be supported indefinitely so it's also no problem to stay with what is there.

Another nice property of the new recipe format will be the effortless evolution of it. Since it's static YAML we can come up with version 2 and make a migrator that rewrites the recipe to the new spec - something that's super hard to do with the current format.

So to summarize the motivations:

  • we get rid of a hybrid script/static thing
  • we remove many complications in parsing & evaluating recipe files
  • we only loose functionality for very few recipes for which we can figure out better ways
  • generating and migrating recipes will be super straightforward

@wolfv
Copy link
Contributor Author

wolfv commented Jun 8, 2023

Another motivating example: we have a version autotick bot in the boa-forge collection of recipes and it requires very little code (~600 lines of Python). The part about reading & writing the recipe is trivial: https://github.com/mamba-org/boa-forge/blob/de9a92f755f21f736e380b39bba925c755b1e35f/determine_versions.py#L607-L648

It produces PRs such as: https://github.com/mamba-org/boa-forge/pull/131/files

wolfv and others added 2 commits June 8, 2023 11:33
Co-authored-by: Jannis Leidel <jannis@leidel.info>
Co-authored-by: Jannis Leidel <jannis@leidel.info>
@baszalmstra
Copy link
Contributor

Overall I think this is a great step. Having a more easily machine-readable recipe would open the doors for more tools to generate and parse recipes.

I do also prefer the aforementioned if syntax over the sel(..) syntax because:

  • No need for the weird key formatting. sel(..) as a key looks strange in yaml land.
  • It's the same approach used by Github CI.
  • I think it's a bit easier to read.
  • We still use minijinja (or whatever) to evaluate the selectors which is still a second language. If we use the if syntax we could later also extend that to make it more machine-readable as well!
    if:
        or:
            - platform: win32 # not a minijinja expression
            - platform: osx

This is what it could look like (snippet is actually created by @wolfv , I only changed it slightly). Note that the value for if is also just a minijinja expression just without the {{ }} (this is also similar to Github CI).

context:
  name: xtensor
  version: 0.24.6
  sha256: f87259b51aabafdd1183947747edfff4cff75d55375334f2e81cee6dc68ef655

package:
  name: "{{ name|lower }}"
  version: "{{ version }}"

source:
  fn: "{{ name }}-{{ version }}.tar.gz"
  url: https://github.com/xtensor-stack/xtensor/archive/{{ version }}.tar.gz
  sha256: "{{ sha256 }}"

build:
  number: 0
  skip:
    - if: osx or win # note that the value is a minijinja expression
      value: true
    - if: win
      # Defaults to: (because we can easily derive this from the context) 
      # value: true

requirements:
  build:
    - "{{ compiler('cxx') }}"
    - cmake
    - if: unix
      value: make
  host:
    - xtl >=0.7,<0.8
  run:
    - xtl >=0.7,<0.8
  run_constrained:
    - xsimd >=8.0.3,<10

test:
  commands:
    - if: unix
      value:
        - test -d ${PREFIX}/include/xtensor
        - test -f ${PREFIX}/include/xtensor/xarray.hpp
        - test -f ${PREFIX}/share/cmake/xtensor/xtensorConfig.cmake
        - test -f ${PREFIX}/share/cmake/xtensor/xtensorConfigVersion.cmake
    - if: win
      value:
        - if not exist %LIBRARY_PREFIX%\include\xtensor\xarray.hpp (exit 1)
        - if not exist %LIBRARY_PREFIX%\share\cmake\xtensor\xtensorConfig.cmake (exit 1)
        - if not exist %LIBRARY_PREFIX%\share\cmake\xtensor\xtensorConfigVersion.cmake (exit 1)

@wolfv
Copy link
Contributor Author

wolfv commented Jun 8, 2023

I honestly don't care so much for either approach. I think the sel(...) syntax is more terse which might be nice.

There is one feature that is hard to express with the if syntax which is a dictionary with multiple keys, e.g.:

build:
  number: 0
  script:
    sel(unix): |
      cmake ${CMAKE_ARGS} -DBUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=$PREFIX $SRC_DIR -DCMAKE_INSTALL_LIBDIR=lib
      make install
    sel(win): |
      cmake -G "NMake Makefiles" -D BUILD_TESTS=OFF -D CMAKE_INSTALL_PREFIX=%LIBRARY_PREFIX% %SRC_DIR%
      nmake
      nmake install

However, there was also a bit of internal debate about this because we would need to define the behavior if more than one sel(...) evaluates to true. In that case, in the current implementations, only the first one wins.

@baszalmstra
Copy link
Contributor

baszalmstra commented Jun 8, 2023

I don't see why that would be hard. Instead of a dictionary, just use a list:

build:
  number: 0
  script:
    - if: unix
      value: |
        cmake ${CMAKE_ARGS} -DBUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=$PREFIX $SRC_DIR -DCMAKE_INSTALL_LIBDIR=lib
        make install
    - if: win
      value: |
        cmake -G "NMake Makefiles" -D BUILD_TESTS=OFF -D CMAKE_INSTALL_PREFIX=%LIBRARY_PREFIX% %SRC_DIR%
        nmake
        nmake install

However, I think we could do even better if the selector evaluation part is part of the model. The current implementation in Rust does two passes over the yaml. First, it evaluates the selectors which generates a new yaml file and then it parses it into a specific model format. While flattening the structure it can become unclear whether a list, a string, or an object was expected.

I think it would be better to make the selectors part of an explicit model, evaluate that model with the selectors, lowering it into another (evaluated) model. This will most likely also give the user much better error messages if two selectors accidentally result in two valid values. It also allows us to generate a JSON schema that can be used in IDE's to get autocompletion support.

@wolfv
Copy link
Contributor Author

wolfv commented Jun 8, 2023

The biggest fear of the if: ... based syntax is that we now think that it's more "correct" or something like that, but then it might present a hurdle to adoption because in effect it's longer / more annoying to type :)

@wolfv
Copy link
Contributor Author

wolfv commented Jun 9, 2023

We had some more discussions today and came up with some changes:

  • instead of forcing the user to quote "{{ jinja }}" variables (which always coerces them to be strings in YAML) we would rather go the Github route and use ${{ jinja }} (which plays well with YAML), or use another bracket, like ((. However we're leaning towards the Github syntax.
  • this also enables us to use the ternary syntax in places where the YAML if would be complicated to use. Moreover, the ternary if is infinitely nestable. E.g. this is possible:
build:
  number: ${{ 0 if win else 100 }}

and has the correct integer type etc.

We also looked at the challenge of the following example: suppose win needs an MD5 hash, and unix needs a SHA256 for a given source file (a bit contrieved, admittedly). With the if or sel syntax, the cleanest would be to repeat the block like:

source:
  - if: win
    url: https://github.com/killercup/cargo-edit/archive/refs/tags/v{{ version }}.tar.gz
    sha256: 46670295e2323fc2f826750cdcfb2692fbdbea87122fe530a07c50c8dba1d3d7
  - if: unix
    url: https://github.com/killercup/cargo-edit/archive/refs/tags/v{{ version }}.tar.gz
    md5: 0800fc577294c34e0b28ad2839435945   

Although in this case it would be possible to solve with a Jinja if as well.

For lists, the syntax would look like:

    - if: win
      then:
      - cmake
      - ninja
      else:  # optional
      - ...

The missing then makes the syntax nicer for dictionaries, but also make it less homogenous.

cep-20.1.md Outdated Show resolved Hide resolved
cep-20.1.md Outdated
Comment on lines 144 to 146
## Templating with Jinja

The spec supports simple Jinja templating in the `recipe.yaml` file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly since we are doing a bit of a redesign here, think it is worth asking whether we still want Jinja to play a role here. In other words, if we were not already using Jinja today, how would we have solved these use cases? How do others approach YAML templating? Could we think of pure YAML solutions?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, what does jinja accomplish that we can't do with the other new features?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well other new features would also have to be implemented and I am not sure we would necessary end up with something simpler.
I think there is a future where the entire build tool looks a bit different (and the recipe as well) and wolfi-os / melange have a nice approach: https://github.com/wolfi-dev/os/blob/main/libidn2.yaml

But I am not sure how to proceed here - we could either go down that road right now or build something that is more reasonable for people to port existing recipes to and then iterate in a next step on the recipe format (when we have the necessary infrastructure nicely implemented in rattler-build).

@jakirkham
Copy link
Member

Thanks Wolf! 🙏

Really appreciate you taking the time to write this up with nice examples. Think there are a lot of good improvements here. The biggest simply being in one markup language

Asked a few questions and made a few minor suggestions above. Will probably need to mull on this more and bring a few more examples into the discussion so we can think about what it would mean to move recipes into a format like the one proposed here

Another question that comes up for me frequently, which may be off-topic for this, but I'd like to at least put it out there. The multi-output format that we have today was a bolt on to an existing single output format. Though I wonder if we could do better by moving to some kind of model where the multi-output format is a first-class citizen. In any event this may be CEP in its own right, but I think this would be a valuable discussion to have and would be very beneficial for recipe maintainers. Something to think about 🙂

@ocefpaf
Copy link

ocefpaf commented Oct 20, 2023

yes

2 similar comments
@xhochy
Copy link

xhochy commented Oct 20, 2023

yes

@wolfv
Copy link
Contributor Author

wolfv commented Oct 20, 2023

yes

@beckermr
Copy link

my vote doesn't count, but I am 100% supportive of this and very excited to see it moving ahead!

@marcelotrevisani
Copy link
Member

Yes! 🚀

@msarahan
Copy link
Contributor

Great work on this. Yes.

@kkraus14
Copy link

Yes

@mariusvniekerk
Copy link

yes

@goanpeca
Copy link

Yes

@wolfv
Copy link
Contributor Author

wolfv commented Oct 23, 2023

Wow, I am counting 11 yes votes (out of 15 council members) which appears to me as if the proposal is accepted!

@jezdez
Copy link
Member

jezdez commented Oct 23, 2023

@wolfv We should wait until the end of the voting period to tally all votes, FWIW

@mbargull
Copy link
Member

Yes.

.. But ideally with the main points from #54 (review) addressed. I'll try to compose a changeset and issue a pull request against this one (if not done by tomorrow, it will have to wait for a subsequent CEP).

@wolfv
Copy link
Contributor Author

wolfv commented Oct 29, 2023

I guess we can now celebrate that this has been accepted?! :) 🎉
I'll update and merge the CEP tomorrow with the appropriate number and status.

@marcelotrevisani
Copy link
Member

I guess we can now celebrate that this has been accepted?! :) 🎉 I'll update and merge the CEP tomorrow with the appropriate number and status.

Can I start to modify souschef and grayskull already? 😄

@wolfv
Copy link
Contributor Author

wolfv commented Oct 30, 2023

Voting Results

This was a standard, non-timed-out vote.

Among Steering Council members there are 12 "yes", no "no", and no abstentions.
This vote has reached quorum (at least 60% of 15).

It has passed since it recorded 12 "yes" votes and 0 "no" (80% yes out of 15 eligible voters).

@wolfv wolfv merged commit 9063eba into conda:main Oct 30, 2023
@wolfv wolfv deleted the cep_20.1 branch October 30, 2023 10:26
@dholth
Copy link
Contributor

dholth commented Oct 30, 2023

Does this diverge from the rattler book - sel(osx): somepkg instead of - somepkg # [osx] , Jinja string interpolation needs to be quoted at the beginning of a string, e.g. - "{{ version }}"

@baszalmstra
Copy link
Contributor

Does this diverge from the rattler book - sel(osx): somepkg instead of - somepkg # [osx] , Jinja string interpolation needs to be quoted at the beginning of a string, e.g. - "{{ version }}"

@dholth Yes it does, that is not correct. That should have been updated in the book, can you point me to where it says that?

@dholth
Copy link
Contributor

dholth commented Oct 30, 2023

@jakirkham
Copy link
Member

Oops meant to say "Yes!"

Thanks for working on this Wolf and everyone for reviewing! 🙏

@wolfv
Copy link
Contributor Author

wolfv commented Oct 31, 2023

@dholth we also publish the rust book here: https://prefix-dev.github.io/rattler-build/ where it should be updated. At the same time, we haven't made a release of rattler-build with the new parser.

@dholth
Copy link
Contributor

dholth commented Oct 31, 2023

@wolfv That link has the same text on the recipe format page.

Jinja string interpolation needs to be quoted at the beginning of a string, e.g. - "{{ version }}" in order for it to be valid YAML
Selectors use a YAML dictionary style (vs. comments in conda-build). E.g. - sel(osx): somepkg instead of - somepkg # [osx]

And then these a little further down do not contain quotes?

package:
name: ${{ name }}
version: ${{ version }}

@wolfv
Copy link
Contributor Author

wolfv commented Oct 31, 2023

A PR can be made on: https://github.com/prefix-dev/rattler-build/tree/main/docs
Otherwise we'll fix it asap :)

@SylvainCorlay
Copy link

I think that it would be interesting to create a custom extension for these new pure yaml recipes. Instead of calling the file meta.yaml, it could be whatever.conda-recipe.yaml, or something like this.

The main benefit of having a custom extension is that we could enable developer tools for that extension in IDEs, such as auto-complete for dependency names, or warning users when a recipe does not parse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
vote Voting following governance policy
Projects
None yet
Development

Successfully merging this pull request may close these issues.