Proposal - Support user defined functions in Bicep #9239

Closed
SimonWahlin opened this issue Dec 8, 2022 · 6 comments · Fixed by #10465
Labels
enhancement New feature or request proposal

Comments

@SimonWahlin
Collaborator

Proposal - Support user defined functions in Bicep

Problem statement

In most scenarios I can use variables to store the output of a complex expression and use that output in several places. However, when using complex expressions inside a lambda expression where there is a dependency on a local variable, I would need to re-use the expression, not the output of the expression. This would be a great fit for user defined functions (UDF).

Background

We already have support for UDF in ARM templates. The problem with UDF in ARM, in my opinion, is that they are lengthy and quite complex to define. With Bicep they could be a lot shorter and still give us the same functionality. There are also a few limitations documented. I see the "limitation" that a UDF cannot access parameters or variables outside its own scope as a positive thing, since depending on external scope would make them harder to move between templates.

When would I use functions instead of a variable?
Essentially, any time I want to re-use the result of an expression, I use a variable. If I want to re-use the expression itself but get different output each time based on input, I need a function.

When do I need functions?
When I need to re-use an expression inside a loop or a lambda where the input differs on each iteration. A variable won't work there if the expression needs one of the iterators as input.

Potential Solution

Building an implementation that compiles into ARM UDF would be fairly simple and would enable reusing expressions even inside lambda expressions without any changes to ARM.

This implementation could be iterated on in the future and be replaced or complemented with an implementation based on, for example, lambdas, but for now I think a solution based on ARM UDF will be more than enough.

A function is defined with the following syntax:
<keyword> <identifier> (params tuple) => <expression>
example:
func getName (myInput string, count int) => '${replace(myInput, '-', '')}-${count}'

This would generate the following ARM:

"functions": [
  {
    "namespace": "__bicepFuncs",
    "members": {
      "getName": {
        "parameters": [
          {
            "name": "myInput",
            "type": "string"
          },
          {
            "name": "count",
            "type": "int"
          }
        ],
        "output": {
          "type": "string",
          "value": "[format('{0}-{1}', replace(parameters('myInput'), '-', ''), parameters('count'))]"
        }
      }
    }
  }
],

The namespace for a UDF is not exposed at all in Bicep; a generic namespace is used in the compiled ARM template. I'm not sure how this would be handled on decompile, but I suggest ignoring UDF namespaces on decompile.
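
To make the mapping concrete, here is a minimal sketch of a call site and the ARM expression it might compile to. The call site, the literal arguments and the output name are illustrative assumptions, not part of the proposal. The function is written with an explicit return type because the notes on the merged implementation further down state that a declared return type is required.

```bicep
// Function from the proposal, with an explicit return type added
// (assumption: return type placed after the parameter list).
func getName(myInput string, count int) string => '${replace(myInput, '-', '')}-${count}'

// Hypothetical call site; 'my-app' and 1 are illustrative values.
output appName string = getName('my-app', 1)

// Under the proposal, the call above might be emitted as the ARM expression
//   "[__bicepFuncs.getName('my-app', 1)]"
// replace('my-app', '-', '') yields 'myapp', so the result would be 'myapp-1'.
```

The `__bicepFuncs` prefix only appears in the emitted template; the Bicep author never types it.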

Example today:
In this simplified example, I'm using the same replace(split())-expression twice. It would be handy if I could define it once in a function.

var fqdnList = [
  'subdomain.contoso.com'
  'subdomain.contoso-dev.com'
]

output params object = reduce(any(fqdnList), {}, (prev, cur) => union(prev, {
  '${replace(split(cur, '.')[1], '-', '')}': {
    type: 'String'
    metadata: {
      displayName: replace(split(cur, '.')[1], '-', '')
    }
  }
}))

What I would want to do:

var fqdnList = [
  'subdomain.contoso.com'
  'subdomain.contoso-dev.com'
]

func getName (myInput string) => replace(split(myInput, '.')[1], '-', '')

output params object = reduce(any(fqdnList), {}, (prev, cur) => union(prev, {
  '${getName(cur)}': {
    type: 'String'
    metadata: {
      displayName: getName(cur)
    }
  }
}))
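
For reference, both versions of the template should produce the same output. The sketch below is a hand evaluation of the `params` output for the two sample FQDNs, written as a Bicep object literal; the variable name `expectedParams` is only there for illustration.

```bicep
// Hand-evaluated value of the `params` output for the sample list.
// split('subdomain.contoso.com', '.')[1]     -> 'contoso'
// split('subdomain.contoso-dev.com', '.')[1] -> 'contoso-dev' -> 'contosodev'
var expectedParams = {
  contoso: {
    type: 'String'
    metadata: {
      displayName: 'contoso'
    }
  }
  contosodev: {
    type: 'String'
    metadata: {
      displayName: 'contosodev'
    }
  }
}
```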
@anthony-c-martin
Member

anthony-c-martin commented Dec 9, 2022

Since I promised some code pointers:

  • Here's where we parse top-level declarations. The func keyword would essentially just be another switch case:
    LanguageConstants.TargetScopeKeyword => this.TargetScope(leadingNodes),
    LanguageConstants.MetadataKeyword => this.MetadataDeclaration(leadingNodes),
    LanguageConstants.TypeKeyword => this.TypeDeclaration(leadingNodes),
    LanguageConstants.ParameterKeyword => this.ParameterDeclaration(leadingNodes),
    LanguageConstants.VariableKeyword => this.VariableDeclaration(leadingNodes),
    LanguageConstants.ResourceKeyword => this.ResourceDeclaration(leadingNodes),
    LanguageConstants.OutputKeyword => this.OutputDeclaration(leadingNodes),
    LanguageConstants.ModuleKeyword => this.ModuleDeclaration(leadingNodes),
    LanguageConstants.ImportKeyword => this.ImportDeclaration(leadingNodes),
  • Here's where we parse a lambda syntax. The only major difference I can see is that you'd need to parse 2 elements (<name> & <type>) between each , in the function declaration. You may want to represent the <type> <name> pair as a dedicated syntax (see later note about TypedLocalVariableSyntax):
    var (openParen, expressionsOrCommas, closeParen) = ParenthesizedExpressionList(expressionFlags);
    if (Check(TokenType.Arrow))
    {
        var arrow = this.Expect(TokenType.Arrow, b => b.ExpectedCharacter("=>"));
        var expression = this.WithRecovery(() => this.Expression(ExpressionFlags.AllowComplexLiterals), RecoveryFlags.None, TokenType.NewLine, TokenType.RightParen);
        var variableBlock = GetVariableBlock(openParen, expressionsOrCommas, closeParen);
        return new LambdaSyntax(variableBlock, arrow, expression);
    }
  • We convert declarations into symbols here - I imagine you'll want to introduce a new symbol (or modify FunctionSymbol) to represent a UDF:
    this.ImportDeclarations = declarations.OfType<ImportedNamespaceSymbol>().ToImmutableArray();
    this.MetadataDeclarations = declarations.OfType<MetadataSymbol>().ToImmutableArray();
    this.ParameterDeclarations = declarations.OfType<ParameterSymbol>().ToImmutableArray();
    this.TypeDeclarations = declarations.OfType<TypeAliasSymbol>().ToImmutableArray();
    this.VariableDeclarations = declarations.OfType<VariableSymbol>().ToImmutableArray();
    this.ResourceDeclarations = declarations.OfType<ResourceSymbol>().ToImmutableArray();
    this.ModuleDeclarations = declarations.OfType<ModuleSymbol>().ToImmutableArray();
    this.OutputDeclarations = declarations.OfType<OutputSymbol>().ToImmutableArray();
    this.ParameterAssignments = declarations.OfType<ParameterAssignmentSymbol>().ToImmutableArray();
  • We emit template output here - for UDFs, you'll want to iterate over your FunctionDeclarationSyntax entries and emit UDFs in the template:
    this.EmitMetadata(jsonWriter, emitter);
    this.EmitTypeDefinitionsIfPresent(jsonWriter, emitter);
    this.EmitParametersIfPresent(jsonWriter, emitter);
    this.EmitVariablesIfPresent(jsonWriter, emitter);
    this.EmitImports(jsonWriter, emitter);
    this.EmitResources(jsonWriter, emitter);
    this.EmitOutputsIfPresent(jsonWriter, emitter);
  • Implement type checking for the variables in the function. You'll most likely need to represent the <type> <name> as a dedicated piece of syntax (e.g. TypedLocalVariableSyntax), and add a check to this block to use <type> to resolve the declared type for the variable declared as <name>:
    case LocalVariableSyntax localVariable:
        return new DeclaredTypeAssignment(this.typeManager.GetTypeInfo(localVariable), localVariable);

    Implementing all of the above should get you the basic end-to-end working without any real validation.

The validation checks I think we'd need are:

  • Enforce the various UDF constraints:
    • block access to any symbol declared outside the scope of the function (var, param and func)
    • block functions that are not permitted inside UDFs
  • Type checking. This should hopefully fall into place if you implement the necessary parts, but may need a few tweaks.

@anthony-c-martin
Member

anthony-c-martin commented Dec 9, 2022

I really like this proposal! Thoughts below:

  1. I think in the long run, being unable to refer to var & param symbols outside of the scope of the function, or to call other funcs, may feel limiting, but I also think it's perfectly fine to start simple and add this in the future.

    For example, the following would not work:

    param prefix string
    param suffix string
    
    // this will fail as `prefix` & `suffix` can't be accessed directly
    func getName(string type) => toLower('${prefix}-${type}-${suffix}')
    var vmName = getName('vm')

    And would instead have to be written as:

    param prefix string
    param suffix string
    
    func getName(string prefix, string type, string suffix) => toLower('${prefix}-${type}-${suffix}')
    // all usages of this function will have to copy+paste prefix & suffix as arguments
    var badVmName = getName(prefix, 'vm', suffix)
  2. In terms of syntax, I'm torn between what you've proposed and the following:

    func getName = (myInput string) => replace(split(myInput, '.')[1], '-', '')

    Your proposal is simpler, and easier to read for someone who isn't familiar with functional programming (I suspect a large percentage of Bicep users). Including the = feels more consistent with other Bicep syntax (assigning a lambda on the right-hand side to the symbol on the left-hand side).

    I think I'm leaning slightly more towards the simpler syntax (your proposal), just thinking about catering to newcomers to the language.

@obiwanjacobi

I would want to export functions from one bicep file to be reused (import) in another bicep file.
For exporting functions you could reuse output syntax but importing would be a new thing.
I don't think you would want to go the module route here...

I also think the syntax func myFn = (parm1 string) => <expression> is a better fit.

@zachbugay

It might be beneficial to declare a function's return type as well.

func myFn = (param1 string): string => <expression>

@chriswue

FWIW:

I would prefer the func fun = (param type) => <expression> syntax. I suspect the return type could be deduced automatically.

Also, regarding references to parameters and variables: if the function references them, then they could be added to the parameter list automatically and patched in at all the call sites by the compiler, couldn't they? Sure, that is more implementation work in the compiler, but should these ARM restrictions ever be removed, the Bicep implementation can adapt without users having to change all their templates. Right now you have to map them as parameters explicitly. If this becomes unnecessary, then I suspect a Bicep warning will be added saying so, and since we will want to run under a "treat warnings as errors" setup, this will start breaking our pipelines and force us to needlessly touch gazillions of templates. So the trade-off here would be an increase in implementation complexity for a handful of people vs thousands or tens of thousands of templates having to be changed manually.
And if the ARM restrictions never change then it will save a lot of typing work.
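
A minimal sketch of the rewrite being suggested here (essentially lambda lifting), assuming the compiler appends captured symbols as extra parameters. The names `abbreviation`, `capturedPrefix` and `capturedSuffix` are hypothetical, and the return type is written after the parameter list as an assumption; nothing in the thread says the compiler does or will do this.

```bicep
param prefix string
param suffix string

// What the author would write under this idea (NOT valid under the proposal,
// because `prefix` and `suffix` are declared outside the function):
//   func getName(abbreviation string) string => toLower('${prefix}-${abbreviation}-${suffix}')
//   var vmName = getName('vm')

// What the compiler could generate instead: each captured symbol becomes
// an extra parameter, and every call site is patched to pass it.
func getName(abbreviation string, capturedPrefix string, capturedSuffix string) string => toLower('${capturedPrefix}-${abbreviation}-${capturedSuffix}')

var vmName = getName('vm', prefix, suffix)
```

The lowered form only uses what the ARM UDF restrictions already allow, so the source-level syntax could stay the same if those restrictions were lifted later.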

My main use case currently is consistent naming conventions in templates where resource names are parameterized by 3-4 parameters/variables, and all but one are simply passed in as template parameters (the one that changes between call sites is typically the abbreviation of the specific component name).

@anthony-c-martin
Member

I did a bit of prototyping here: https://github.com/Azure/bicep/compare/ant/exp/func - I left various TODO(functions) comments in places where I hacked things together or implementation is missing.

Regarding declaring types: I think to support the current behavior, we would need to force users to declare the output type of a function.

anthony-c-martin added a commit that referenced this issue May 1, 2023
See
https://github.com/Azure/bicep/blob/ant/exp/func/src/Bicep.Core.Samples/Files/Functions_LF/main.bicep
for a syntax example.

Notes/limitations:
* UDFs in JSON require declaring the return type, so I've modified the spec to require the user to declare the return type.
* User-defined types in UDF params & outputs are unsupported.
* UDFs are currently quite limited - they can't call UDFs, or access outer scoped variables.
* This is currently behind an experimental feature flag ("userDefinedFunctions")

Closes #447
Closes #9239
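
As a hedged illustration of the notes above, here is what a function with a declared return type can look like. The function name, parameters and values are illustrative and not taken from the linked sample file; the sketch assumes the experimental "userDefinedFunctions" flag is enabled where the Bicep version requires it.

```bicep
// The return type (`string`) is declared after the parameter list.
// The body only uses its own parameters and built-in functions, in line
// with the limitations listed above (no outer variables, no calls to other UDFs).
func buildUrl(https bool, hostname string, path string) string => '${https ? 'https' : 'http'}://${hostname}${empty(path) ? '' : '/${path}'}'

// Illustrative call site.
output azureUrl string = buildUrl(true, 'microsoft.com', 'azure')
```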

@ghost ghost locked as resolved and limited conversation to collaborators Jun 1, 2023