This repository has been archived by the owner on Jun 15, 2021. It is now read-only.

docs: added grammar doc
OmarTawfik committed Mar 6, 2019
1 parent 3e3cff2 commit 76a96f9
Showing 3 changed files with 134 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .prettierrc
@@ -0,0 +1,4 @@
{
"printWidth": 120,
"proseWrap": "always"
}
3 changes: 3 additions & 0 deletions README.md
@@ -1,2 +1,5 @@
# workflow-parser-js

A fast, compact parser for GitHub workflows.

- [Grammar](./docs/grammar.md)
127 changes: 127 additions & 0 deletions docs/grammar.md
@@ -0,0 +1,127 @@
# Workflow Grammar

This grammar is based on the one published by
[actions/workflow-parser](https://github.com/actions/workflow-parser/blob/master/language.md). Compiling a `.workflow`
file is divided into three phases:

1. **Scanning**: the text is divided into individual tokens, each marked by a position, a length, and a type.
2. **Parsing**: an initial syntax tree is constructed from the token stream. Parsing is error-tolerant and prefers to
   construct partially valid trees so that diagnostics can be reported in the next phase.
3. **Binding**: a complete list of symbols is compiled, and any advanced analysis and error reporting is done.

A compilation holds the results of these operations. The rest of this document describes them in detail.

## Scanning

The scanner produces a list of the tokens below from a document text. Each token has a start position, a length, and a
type. The valid whitespace characters are ' ', '\n', '\r', and '\t'.

```g4
VERSION_KEYWORD : 'version' ;
WORKFLOW_KEYWORD : 'workflow' ;
ACTION_KEYWORD : 'action';
ON_KEYWORD : 'on' ;
RESOLVES_KEYWORD : 'resolves' ;
USES_KEYWORD : 'uses' ;
NEEDS_KEYWORD : 'needs';
RUNS_KEYWORD : 'runs' ;
ARGS_KEYWORD : 'args' ;
ENV_KEYWORD : 'env' ;
SECRETS_KEYWORD : 'secrets' ;
EQUAL : '=' ;
COMMA : ',' ;
LEFT_CURLY_BRACKET : '{' ;
RIGHT_CURLY_BRACKET : '}' ;
LEFT_SQUARE_BRACKET : '[' ;
RIGHT_SQUARE_BRACKET : ']' ;
IDENTIFIER : [a-zA-Z_] [a-zA-Z0-9_]*;
LINE_COMMENT : ('#' | '//') ~[\r\n]* ;
INTEGER_LITERAL : [0-9]+ ;
STRING_LITERAL : '"' (('\\' ["\\/bfnrt]) | ~["\\\u0000-\u001F\u007F])* '"' ;
```

Two additional token types are created by the parser:

- **UNRECOGNIZED_TOKEN** when a character not supported by the grammar is encountered.
- **MISSING_TOKEN** when a token was expected at a certain location, but was not found. These tokens have zero length.
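
As an illustration of the token rules above, here is a minimal scanner sketch. It is not the repository's implementation: the names `Token`, `KEYWORDS`, `PATTERNS`, and `scan` are hypothetical, and it assumes that unsupported characters are turned into one-character `UNRECOGNIZED_TOKEN`s during scanning. For brevity it re-slices the remaining text on every step, which a production scanner would avoid.

```typescript
// Hypothetical sketch; names and structure are assumptions, not the project's API.
interface Token {
  type: string; // a token name from the grammar above, e.g. "EQUAL"
  start: number;
  length: number;
}

const KEYWORDS = [
  "version", "workflow", "action", "on", "resolves", "uses", "needs", "runs", "args", "env", "secrets",
];

// Each regex is anchored so that it only matches at the start of the remaining text.
const PATTERNS: Array<{ type: string; regex: RegExp }> = [
  { type: "LINE_COMMENT", regex: /^(?:#|\/\/)[^\r\n]*/ },
  { type: "IDENTIFIER", regex: /^[a-zA-Z_][a-zA-Z0-9_]*/ },
  { type: "INTEGER_LITERAL", regex: /^[0-9]+/ },
  { type: "STRING_LITERAL", regex: /^"(?:\\["\\\/bfnrt]|[^"\\\u0000-\u001F\u007F])*"/ },
  { type: "EQUAL", regex: /^=/ },
  { type: "COMMA", regex: /^,/ },
  { type: "LEFT_CURLY_BRACKET", regex: /^\{/ },
  { type: "RIGHT_CURLY_BRACKET", regex: /^\}/ },
  { type: "LEFT_SQUARE_BRACKET", regex: /^\[/ },
  { type: "RIGHT_SQUARE_BRACKET", regex: /^\]/ },
];

function scan(text: string): Token[] {
  const tokens: Token[] = [];
  let pos = 0;
  while (pos < text.length) {
    if (" \n\r\t".indexOf(text.charAt(pos)) >= 0) {
      pos++; // whitespace separates tokens but produces none
      continue;
    }
    const rest = text.substring(pos);
    let type = "UNRECOGNIZED_TOKEN";
    let length = 1; // assumption: an unsupported character becomes a one-character token
    for (const pattern of PATTERNS) {
      const match = pattern.regex.exec(rest);
      if (match) {
        type = pattern.type;
        length = match[0].length;
        break;
      }
    }
    // Keywords are lexically identifiers; reclassify them, e.g. "on" becomes ON_KEYWORD.
    const lexeme = rest.substring(0, length);
    if (type === "IDENTIFIER" && KEYWORDS.indexOf(lexeme) >= 0) {
      type = lexeme.toUpperCase() + "_KEYWORD";
    }
    tokens.push({ type, start: pos, length });
    pos += length;
  }
  return tokens;
}
```

With this sketch, `scan("version = 0 # hi")` yields four tokens: a `VERSION_KEYWORD`, an `EQUAL`, an `INTEGER_LITERAL`, and a `LINE_COMMENT`, each carrying its start offset and length.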

## Parsing

The parser will create a high-level list of blocks, without actually making decisions about which key-value pairs are
legal under which parents. It tries to leave as much work as possible to the binding phase. Each syntax node holds
comment tokens appearing before it. The main document node holds comment tokens appearing after it.

```g4
workflow_file : (version | block)* ;
version : VERSION_KEYWORD EQUAL INTEGER_LITERAL ;
block : block_type STRING_LITERAL LEFT_CURLY_BRACKET
    (key_value_pair)*
    RIGHT_CURLY_BRACKET ;
block_type : WORKFLOW_KEYWORD | ACTION_KEYWORD ;
key_value_pair : key EQUAL value ;
key : ON_KEYWORD
    | RESOLVES_KEYWORD
    | USES_KEYWORD
    | NEEDS_KEYWORD
    | RUNS_KEYWORD
    | ARGS_KEYWORD
    | ENV_KEYWORD
    | SECRETS_KEYWORD ;
value : STRING_LITERAL | string_array | env_variables ;
string_array : LEFT_SQUARE_BRACKET
    ((STRING_LITERAL COMMA)* STRING_LITERAL COMMA?)?
    RIGHT_SQUARE_BRACKET ;
env_variables : LEFT_CURLY_BRACKET (env_variable)* RIGHT_CURLY_BRACKET ;
env_variable : IDENTIFIER EQUAL STRING_LITERAL ;
```
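
To make the error-tolerant behaviour concrete, the sketch below parses the `version` rule and substitutes a zero-length `MISSING_TOKEN` whenever an expected token is absent, so a partially valid node is still produced. The names `TokenStream`, `expect`, `VersionSyntax`, and `parseVersion` are hypothetical, and placing the missing token at the start of the next token (or at the end of input) is an assumption rather than documented behaviour.

```typescript
// Hypothetical sketch; names and the placement of missing tokens are assumptions.
interface Token {
  type: string;
  start: number;
  length: number;
}

interface VersionSyntax {
  keyword: Token;
  equal: Token;
  integer: Token;
}

class TokenStream {
  private index = 0;

  constructor(private readonly tokens: Token[]) {}

  // Consumes the next token if it has the expected type. Otherwise the stream is
  // left untouched and a zero-length MISSING_TOKEN is returned in its place.
  expect(type: string): Token {
    if (this.index < this.tokens.length && this.tokens[this.index].type === type) {
      return this.tokens[this.index++];
    }
    return { type: "MISSING_TOKEN", start: this.position(), length: 0 };
  }

  // Start of the next unconsumed token, or the end of the last token at end of input.
  private position(): number {
    if (this.index < this.tokens.length) {
      return this.tokens[this.index].start;
    }
    if (this.tokens.length === 0) {
      return 0;
    }
    const last = this.tokens[this.tokens.length - 1];
    return last.start + last.length;
  }
}

// version : VERSION_KEYWORD EQUAL INTEGER_LITERAL ;
function parseVersion(stream: TokenStream): VersionSyntax {
  const keyword = stream.expect("VERSION_KEYWORD");
  const equal = stream.expect("EQUAL");
  const integer = stream.expect("INTEGER_LITERAL");
  return { keyword, equal, integer };
}
```

Given the tokens for `version 0` (no `=`), `parseVersion` still returns a complete node: `equal` is a `MISSING_TOKEN` of length zero, and `integer` is the real literal. A later phase can then report the missing `=` without the parser having to stop.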

## Binding

The binder takes the high-level parse tree and produces the following structure, whose elements hold on to their
original syntax nodes and comments. It also validates that:

1. The version number is supported and, if present, is declared at the correct location.
2. All key-value pairs are correct, and appear under the right type of block.
3. Complex values like Docker and GitHub URLs are valid.
4. Environment values and secrets are unique, and have correct keys.
5. The action graph has no circular dependencies.

```typescript
type Document = {
  version?: number;
  workflows: Workflow[];
  actions: Action[];
};

type Workflow = {
  name: string;
  on: string;
  resolves: string[];
};

type Action = {
  name: string;
  uses: string;
  needs: string[];
  runs: string[];
  args: string[];
  env: {
    [key: string]: string;
  };
  secrets: string[];
};
```
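
The circular-dependency check in the validation list can be sketched as a depth-first search over the `needs` edges. This is only one possible approach, not necessarily what the binder does; `ActionNode` and `findCycle` are hypothetical names, and `ActionNode` keeps just the two `Action` fields the check needs.

```typescript
// Hypothetical sketch; the binder's actual algorithm is not described in this document.
interface ActionNode {
  name: string;
  needs: string[];
}

// Returns the action names along one dependency cycle (first name repeated at the
// end), or an empty array if the graph is acyclic.
function findCycle(actions: ActionNode[]): string[] {
  // Null-prototype objects avoid collisions with names such as "constructor".
  const byName: { [name: string]: ActionNode | undefined } = Object.create(null);
  const state: { [name: string]: "visiting" | "done" | undefined } = Object.create(null);
  const path: string[] = [];

  for (const action of actions) {
    byName[action.name] = action;
  }

  function visit(name: string): string[] {
    if (state[name] === "done") {
      return [];
    }
    if (state[name] === "visiting") {
      // The name is already on the current path, so the path loops back to it.
      return path.slice(path.indexOf(name)).concat(name);
    }
    const action = byName[name];
    if (action === undefined) {
      return []; // assumption: unknown names are reported by a separate check
    }
    state[name] = "visiting";
    path.push(name);
    for (const need of action.needs) {
      const cycle = visit(need);
      if (cycle.length > 0) {
        return cycle;
      }
    }
    path.pop();
    state[name] = "done";
    return [];
  }

  for (const action of actions) {
    const cycle = visit(action.name);
    if (cycle.length > 0) {
      return cycle;
    }
  }
  return [];
}
```

Returning the names along the cycle, rather than a boolean, lets a diagnostic name every action involved.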
