Add support for plucking values with @
Co-authored-by: Mingun <Alexander_Sergey@mail.ru>
hildjj and Mingun committed Apr 21, 2021
1 parent 0b7a199 commit f8a44d5
Showing 22 changed files with 1,275 additions and 875 deletions.
14 changes: 14 additions & 0 deletions README.md
@@ -459,6 +459,20 @@ must be a JavaScript identifier.
Labeled expressions are useful together with actions, where saved match results
can be accessed by action's JavaScript code.

#### _@_ ( _label_ : )? _expression_

Match the expression and if the label exists, remember its match result under
given label. The label must be a JavaScript identifier if it exists.

Return the value of this expression from the rule, or "pluck" it. You may not
have an action for this rule. The expression must not be a semantic predicate
(&{predicate} or !{predicate}). There may be multiple pluck expressions in a
given rule, in which case an array of the plucked expressions is returned from
the rule.

Pluck expressions are useful for writing terse grammars, or returning parts of
an expression that is wrapped in parentheses.

#### _expression<sub>1</sub>_ _expression<sub>2</sub>_ ... _expression<sub>n</sub>_

Match a sequence of expressions and return their match results in an array.
33 changes: 32 additions & 1 deletion docs/documentation.html
@@ -62,6 +62,7 @@ <h2 id="table-of-contents">Table of Contents</h2>
<a href="#grammar-syntax-and-semantics">Grammar Syntax and Semantics</a>
<ul>
<li><a href="#grammar-syntax-and-semantics-parsing-expression-types">Parsing Expression Types</a></li>
<li><a href="#parsing-lists">Parsing Lists</a></li>
</ul>
</li>
<li><a href="#compatibility">Compatibility</a></li>
@@ -283,7 +284,7 @@ <h2 id="using-the-parser">Using the Parser</h2>
console.log(options.validWords); // outputs "[ 'boo', 'baz', 'boop' ]"
}

validWord = word:$[a-z]+ &{ return options.validWords.includes(word) } { return word }
validWord = @word:$[a-z]+ &{ return options.validWords.includes(word) }
`);

const result = parser.parse("boo", {
@@ -594,6 +595,24 @@ <h3 id="grammar-syntax-and-semantics-parsing-expression-types">Parsing Expressio
results can be accessed by action's JavaScript code.</p>
</dd>

<dt><code><em>@</em> (<em>label</em>:)? <em>expression</em></code></dt>

<dd>
<p>Match the expression and if the label exists, remember its match result
under given label. The label must be a JavaScript identifier if it
exists.</p>

<p>Return the value of this expression from the rule, or "pluck" it. You
may not have an action for this rule. The expression must not be a
semantic predicate (<code>&{predicate}</code> or
<code>!{predicate}</code>). There may be multiple pluck expressions in a
given rule, in which case an array of the plucked expressions is returned
from the rule.</p>

<p>Pluck expressions are useful for writing terse grammars, or returning
parts of an expression that is wrapped in parentheses.</p>
</dd>

<dt><code><em>expression<sub>1</sub></em> <em>expression<sub>2</sub></em> ... <em>expression<sub>n</sub></em></code></dt>

<dd>
@@ -661,6 +680,18 @@ <h3 id="grammar-syntax-and-semantics-parsing-expression-types">Parsing Expressio
</dd>
</dl>

<h3 id="parsing-lists">Parsing Lists</h3>

<p>One of the most frequent questions about Peggy grammars is how to parse a
delimited list of items. The cleanest current approach is:</p>

<pre><code>list = head:word tail:(_ "," _ @word)* { return [head].concat(tail); }
word = $[a-z]i+
_ = [ \t]*</code></pre>

<p>Note that the <code>@</code> in the tail section plucks the word out of the
parentheses, NOT out of the rule itself.</p>

<h2 id="compatibility">Compatibility</h2>

<p>Both the parser generator and generated parsers should run well in the
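To make the note about the `@` inside the parentheses concrete, here is a minimal hand-written sketch of what the `list` rule from the documentation hunk above returns with and without the pluck. It is not generated by Peggy; `parseList` and its `pluck` flag are hypothetical names used only for illustration, and the sketch assumes that `_ = [ \t]*` yields an array of matched characters.

```javascript
// Hand-written stand-in for:
//   list = head:word tail:(_ "," _ @word)* { return [head].concat(tail); }
// `parseList` and `pluck` are hypothetical; this is not Peggy output.
function parseList(input, pluck) {
  const word = /^[a-z]+/i;                // word = $[a-z]i+
  const sep = /^([ \t]*)(,)([ \t]*)/;     // _ "," _

  const first = word.exec(input);
  if (!first) throw new Error("expected a word");
  const head = first[0];
  let rest = input.slice(head.length);

  const tail = [];
  for (;;) {
    const s = sep.exec(rest);
    if (!s) break;
    const afterSep = rest.slice(s[0].length);
    const w = word.exec(afterSep);
    if (!w) break;                        // group failed: leave `rest` untouched

    // Without `@`, the group returns the whole sequence result.
    const sequence = [s[1].split(""), s[2], s[3].split(""), w[0]];
    tail.push(pluck ? w[0] : sequence);   // with `@word`, only the word survives
    rest = afterSep.slice(w[0].length);
  }

  if (rest.length > 0) throw new Error("unexpected input: " + rest);
  return [head].concat(tail);
}

console.log(parseList("a, b ,c", true));  // [ 'a', 'b', 'c' ]
console.log(JSON.stringify(parseList("a, b", false)));
// ["a",[[],",",[" "],"b"]]
```

With the pluck, `tail` is already a flat array of words, so `[head].concat(tail)` is enough; without it, each tail entry is the full four-element sequence and an index-based helper would be needed to dig the word out.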
47 changes: 25 additions & 22 deletions docs/js/benchmark-bundle.min.js

Large diffs are not rendered by default.

426 changes: 216 additions & 210 deletions docs/js/test-bundle.min.js

Large diffs are not rendered by default.

47 changes: 25 additions & 22 deletions docs/vendor/peggy/peggy.min.js

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion examples/arithmetics.pegjs
@@ -20,7 +20,7 @@ Term
}

Factor
= "(" _ expr:Expression _ ")" { return expr; }
= "(" _ @Expression _ ")"
/ Integer

Integer "integer"
22 changes: 11 additions & 11 deletions examples/css.pegjs
@@ -80,10 +80,10 @@ media
}

media_list
= head:medium tail:("," S* medium)* { return buildList(head, tail, 2); }
= head:medium tail:("," S* @medium)* { return [head].concat(tail); }

medium
= name:IDENT S* { return name; }
= @IDENT S*

page
= PAGE_SYM S* selector:pseudo_page?
@@ -103,15 +103,15 @@ pseudo_page
= ":" value:IDENT S* { return { type: "PseudoSelector", value: value }; }

operator
= "/" S* { return "/"; }
/ "," S* { return ","; }
= @"/" S*
/ @"," S*

combinator
= "+" S* { return "+"; }
/ ">" S* { return ">"; }
= @"+" S*
/ @">" S*

property
= name:IDENT S* { return name; }
= @IDENT S*

ruleset
= selectorsHead:selector
@@ -258,7 +258,7 @@ unicode

escape
= unicode
/ "\\" ch:[^\r\n\f0-9a-f]i { return ch; }
/ "\\" @[^\r\n\f0-9a-f]i

nmstart
= [_a-z]i
@@ -411,8 +411,8 @@ NUMBER "number"
= comment* value:num { return { value: value, unit: null }; }

URI "uri"
= comment* U R L "("i w url:string w ")" { return url; }
/ comment* U R L "("i w url:url w ")" { return url; }
= comment* U R L "("i w @string w ")"
/ comment* U R L "("i w @url w ")"

FUNCTION "function"
= comment* name:ident "(" { return name; }
= comment* @ident "("
6 changes: 3 additions & 3 deletions examples/json.pegjs
@@ -16,7 +16,7 @@
// ----- 2. JSON Grammar -----

JSON_text
= ws value:value ws { return value; }
= ws @value ws

begin_array = ws "[" ws
begin_object = ws "{" ws
@@ -48,7 +48,7 @@ object
= begin_object
members:(
head:member
tail:(value_separator m:member { return m; })*
tail:(value_separator @member)*
{
var result = {};

@@ -73,7 +73,7 @@ array
= begin_array
values:(
head:value
tail:(value_separator v:value { return v; })*
tail:(value_separator @value)*
{ return [head].concat(tail); }
)?
end_array
4 changes: 3 additions & 1 deletion lib/compiler/index.js
@@ -8,6 +8,7 @@ const reportDuplicateRules = require("./passes/report-duplicate-rules");
const reportInfiniteRecursion = require("./passes/report-infinite-recursion");
const reportInfiniteRepetition = require("./passes/report-infinite-repetition");
const reportUndefinedRules = require("./passes/report-undefined-rules");
const reportIncorrectPlucking = require("./passes/report-incorrect-plucking");
const visitor = require("./visitor");

function processOptions(options, defaults) {
@@ -42,7 +43,8 @@ const compiler = {
reportDuplicateRules: reportDuplicateRules,
reportDuplicateLabels: reportDuplicateLabels,
reportInfiniteRecursion: reportInfiniteRecursion,
reportInfiniteRepetition: reportInfiniteRepetition
reportInfiniteRepetition: reportInfiniteRepetition,
reportIncorrectPlucking: reportIncorrectPlucking
},
transform: {
removeProxyRules: removeProxyRules
9 changes: 9 additions & 0 deletions lib/compiler/opcodes.js
@@ -17,6 +17,7 @@ const opcodes = {
APPEND: 10, // APPEND
WRAP: 11, // WRAP n
TEXT: 12, // TEXT
PLUCK: 36, // PLUCK n, k, p1, ..., pK

// Conditions and Loops

@@ -49,6 +50,14 @@ const opcodes = {

SILENT_FAILS_ON: 28, // SILENT_FAILS_ON
SILENT_FAILS_OFF: 29 // SILENT_FAILS_OFF

// Because the tests have hard-coded opcode numbers, don't renumber
// existing opcodes. New opcodes that have been put in the correct
// sections above are repeated here in order to ensure we don't
// reuse them.
//
// 30-35 reserved for @mingun
// PLUCK: 36
};

module.exports = opcodes;
33 changes: 31 additions & 2 deletions lib/compiler/passes/generate-bytecode.js
@@ -70,6 +70,15 @@ const visitor = require("../visitor");
//
// stack.push(input.substring(stack.pop(), currPos));
//
// [36] PLUCK n, k, p1, ..., pK
//
// value = [stack[p1], ..., stack[pK]]; // when k != 1
// -or-
// value = stack[p1]; // when k == 1
//
// stack.pop(n);
// stack.push(value);
//
// Conditions and Loops
// --------------------
//
@@ -298,6 +307,7 @@ function generateBytecode(ast) {
node.bytecode = generate(node.expression, {
sp: -1, // stack pointer
env: { }, // mapping of label names to stack positions
pluck: [], // fields that have been picked
action: null // action nodes pass themselves to children here
});
},
@@ -381,13 +391,15 @@ function generateBytecode(ast) {
generate(elements[0], {
sp: context.sp,
env: context.env,
pluck: context.pluck,
action: null
}),
buildCondition(
[op.IF_NOT_ERROR],
buildElementsCode(elements.slice(1), {
sp: context.sp + 1,
env: context.env,
pluck: context.pluck,
action: context.action
}),
buildSequence(
@@ -398,6 +410,13 @@ function generateBytecode(ast) {
)
);
} else {
if (context.pluck.length > 0) {
return buildSequence(
[ op.PLUCK, node.elements.length + 1, context.pluck.length ],
context.pluck.map(eSP => context.sp - eSP)
);
}

if (context.action) {
const functionIndex = addFunctionConst(
Object.keys(context.env),
@@ -425,15 +444,25 @@ function generateBytecode(ast) {
buildElementsCode(node.elements, {
sp: context.sp + 1,
env: context.env,
pluck: [],
action: context.action
})
);
},

labeled(node, context) {
const env = cloneEnv(context.env);
let env = context.env;
const label = node.label;
const sp = context.sp + 1;

context.env[node.label] = context.sp + 1;
if (label) {
env = cloneEnv(context.env);
context.env[node.label] = sp;
}

if (node.pick) {
context.pluck.push(sp);
}

return generate(node.expression, {
sp: context.sp,
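The offset arithmetic in the generate-bytecode.js hunks above (`context.pluck.map(eSP => context.sp - eSP)`) can be traced with a small stand-alone sketch. `emitPluck` and its parameters are hypothetical names, not part of Peggy, and the sketch assumes that a sequence pushes one saved-position slot before its elements, which is one reading of the `node.elements.length + 1` operand.

```javascript
// Opcode number taken from lib/compiler/opcodes.js in this commit.
const PLUCK = 36;

// Hypothetical helper: derive the PLUCK operands for a sequence of
// `elementCount` elements in which the elements at `pickedIndices` carry `@`.
// `baseSp` is the stack pointer before the sequence starts (-1 at rule level).
function emitPluck(elementCount, pickedIndices, baseSp = -1) {
  // Assumption: the sequence pushes the current position first.
  let sp = baseSp + 1;
  const pluck = [];

  for (let i = 0; i < elementCount; i++) {
    // As in `labeled`: the element's result will land at sp + 1.
    if (pickedIndices.includes(i)) {
      pluck.push(sp + 1);
    }
    sp += 1; // the element's result is now on the stack
  }

  // Offsets are measured downward from the top of the stack.
  return [PLUCK, elementCount + 1, pluck.length, ...pluck.map(eSP => sp - eSP)];
}

// "(" _ @Expression _ ")" : five elements, the third one is plucked.
console.log(emitPluck(5, [2]));    // [ 36, 6, 1, 2 ]
// @a "," @b : three elements, first and last are plucked.
console.log(emitPluck(3, [0, 2])); // [ 36, 4, 2, 2, 0 ]
```

Recording absolute stack positions while walking the elements and converting them to top-relative offsets only at the end keeps the emitted operands independent of how deep the stack already is when the sequence starts.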
34 changes: 33 additions & 1 deletion lib/compiler/passes/generate-js.js
@@ -3,7 +3,7 @@
const asts = require("../asts");
const js = require("../js");
const op = require("../opcodes");
const VERSION = require('../../version')
const VERSION = require("../../version");

// Generates parser JavaScript code.
function generateJS(ast, options) {
@@ -296,6 +296,22 @@ function generateJS(ast, options) {
" ip++;",
" break;",
"",
" case " + op.PLUCK + ": {", // PLUCK n, k, p1, ..., pK
" params = bc.slice(ip + 3, ip + 3 + bc[ip + 2]);",
" params = bc[ip + 2] === 1",
" ? stack[stack.length - 1 - params[0]]",
" : params.map(function(p) { return stack[stack.length - 1 - p]; });",
"",
" stack.splice(",
" stack.length - bc[ip + 1],",
" bc[ip + 1],",
" params",
" );",
"",
" ip += bc[ip + 2] + 3;",
" break;",
" }",
"",
" case " + op.IF + ":", // IF t, f
indent10(generateCondition("stack[stack.length - 1]", 0)),
"",
@@ -594,6 +610,22 @@ function generateJS(ast, options) {
ip++;
break;

case op.PLUCK: { // PLUCK n, k, p1, ..., pK
const baseLength = 3;
const paramsLength = bc[ip + baseLength - 1];
const n = baseLength + paramsLength;
value = bc.slice(ip + baseLength, ip + n);
value = paramsLength === 1
? stack.index(value[0])
: `[ ${
value.map(p => stack.index(p)).join(", ")
} ]`;
stack.pop(bc[ip + 1]);
parts.push(stack.push(value));
ip += n;
break;
}

case op.IF: // IF t, f
compileCondition(stack.top(), 0);
break;
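The runtime effect of `PLUCK n, k, p1, ..., pK`, as implemented by the interpreter branch in the generate-js.js hunk above, can be reproduced in isolation. The sketch below is a simplified stand-in rather than the generated parser; `runPluck` is a hypothetical name, and the sample stacks assume the bottom slot holds a saved input position.

```javascript
// Execute one PLUCK instruction located at bc[ip] against `stack`.
// Returns the instruction pointer of the next instruction.
function runPluck(stack, bc, ip) {
  const n = bc[ip + 1];                       // number of stack slots to drop
  const k = bc[ip + 2];                       // number of plucked positions
  const offsets = bc.slice(ip + 3, ip + 3 + k);

  // Each offset counts down from the top of the stack.
  const picked = offsets.map(p => stack[stack.length - 1 - p]);

  // One pluck yields the bare value; several yield an array.
  const value = k === 1 ? picked[0] : picked;

  // Replace the top n slots with the single plucked value.
  stack.splice(stack.length - n, n, value);
  return ip + 3 + k;
}

// "(" _ @Expression _ ")" with the expression already evaluated to 42.
const single = [0, "(", " ", 42, " ", ")"];
console.log(runPluck(single, [36, 6, 1, 2], 0), single); // 4 [ 42 ]

// @a "," @b : two plucks collapse into one array on the stack.
const multi = [0, "a", ",", "b"];
console.log(runPluck(multi, [36, 4, 2, 2, 0], 0), JSON.stringify(multi));
// 5 [["a","b"]]
```

Passing the plucked array as a single `splice` argument matters: it is inserted as one stack entry, so a rule with several plucks still leaves exactly one result on the stack.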
7 changes: 4 additions & 3 deletions lib/compiler/passes/report-duplicate-labels.js
@@ -33,11 +33,12 @@ function reportDuplicateLabels(ast) {
action: checkExpressionWithClonedEnv,

labeled(node, env) {
if (Object.prototype.hasOwnProperty.call(env, node.label)) {
const label = node.label;
if (label && Object.prototype.hasOwnProperty.call(env, label)) {
throw new GrammarError(
"Label \"" + node.label + "\" is already defined "
+ "at line " + env[node.label].start.line + ", "
+ "column " + env[node.label].start.column + ".",
+ "at line " + env[label].start.line + ", "
+ "column " + env[label].start.column + ".",
node.location
);
}