Bootstrap AST and parser #1
Merged
4 commits
@@ -0,0 +1,174 @@

```rust
// Copyright (c) 2021 Brendan Molloy <brendan@bbqsrc.net>,
//                    Ilya Solovyiov <ilya.solovyiov@gmail.com>,
//                    Kai Ren <tyranron@gmail.com>
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! [Cucumber Expressions][1] [AST].
//!
//! See details in the [grammar spec][0].
//!
//! [0]: crate#grammar
//! [1]: https://github.com/cucumber/cucumber-expressions#readme
//! [AST]: https://en.wikipedia.org/wiki/Abstract_syntax_tree

use derive_more::{AsRef, Deref, DerefMut};
use nom::{error::ErrorKind, Err, InputLength};
use nom_locate::LocatedSpan;

use crate::parse;

/// [`str`] along with its location information in the original input.
pub type Spanned<'s> = LocatedSpan<&'s str>;

/// Top-level `expression` defined in the [grammar spec][0].
///
/// See [`parse::expression()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(AsRef, Clone, Debug, Deref, DerefMut, Eq, PartialEq)]
pub struct Expression<Input>(pub Vec<SingleExpression<Input>>);

impl<'s> TryFrom<&'s str> for Expression<Spanned<'s>> {
    type Error = parse::Error<Spanned<'s>>;

    fn try_from(value: &'s str) -> Result<Self, Self::Error> {
        parse::expression(Spanned::new(value))
            .map_err(|e| match e {
                Err::Error(e) | Err::Failure(e) => e,
                Err::Incomplete(n) => parse::Error::Needed(n),
            })
            .and_then(|(rest, parsed)| {
                rest.is_empty()
                    .then(|| parsed)
                    .ok_or(parse::Error::Other(rest, ErrorKind::Verify))
            })
    }
}

impl<'s> Expression<Spanned<'s>> {
    /// Parses the given `input` as an [`Expression`].
    ///
    /// # Errors
    ///
    /// See [`parse::Error`] for details.
    pub fn parse<I: AsRef<str> + ?Sized>(
        input: &'s I,
    ) -> Result<Self, parse::Error<Spanned<'s>>> {
        Self::try_from(input.as_ref())
    }
}

/// `single-expression` defined in the [grammar spec][0], representing a single
/// entry of an [`Expression`].
///
/// See [`parse::single_expression()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum SingleExpression<Input> {
    /// [`alternation`][0] expression.
    ///
    /// [0]: crate#grammar
    Alternation(Alternation<Input>),

    /// [`optional`][0] expression.
    ///
    /// [0]: crate#grammar
    Optional(Optional<Input>),

    /// [`parameter`][0] expression.
    ///
    /// [0]: crate#grammar
    Parameter(Parameter<Input>),

    /// Text without whitespaces.
    Text(Input),

    /// Whitespaces are treated as a special case to avoid placing every `text`
    /// character in a separate [AST] node, as described in the
    /// [grammar spec][0].
    ///
    /// [0]: crate#grammar
    /// [AST]: https://en.wikipedia.org/wiki/Abstract_syntax_tree
    Whitespaces(Input),
}

/// `single-alternation` defined in the [grammar spec][0], representing a
/// building block of an [`Alternation`].
///
/// [0]: crate#grammar
pub type SingleAlternation<Input> = Vec<Alternative<Input>>;

/// `alternation` defined in the [grammar spec][0], allowing to match one of
/// [`SingleAlternation`]s.
///
/// See [`parse::alternation()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(AsRef, Clone, Debug, Deref, DerefMut, Eq, PartialEq)]
pub struct Alternation<Input>(pub Vec<SingleAlternation<Input>>);

impl<Input: InputLength> Alternation<Input> {
    /// Returns length of this [`Alternation`]'s span in the `Input`.
    pub(crate) fn span_len(&self) -> usize {
        self.0
            .iter()
            .flatten()
            .map(|alt| match alt {
                Alternative::Text(t) => t.input_len(),
                Alternative::Optional(opt) => opt.input_len() + 2,
            })
            .sum::<usize>()
            + self.len()
            - 1
    }

    /// Indicates whether any of [`SingleAlternation`]s consists only from
    /// [`Optional`]s.
    pub(crate) fn contains_only_optional(&self) -> bool {
        (**self).iter().any(|single_alt| {
            single_alt
                .iter()
                .all(|alt| matches!(alt, Alternative::Optional(_)))
        })
    }
}

/// `alternative` defined in the [grammar spec][0].
///
/// See [`parse::alternative()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Alternative<Input> {
    /// [`optional`][1] expression.
    ///
    /// [1]: crate#grammar
    Optional(Optional<Input>),

    /// Text.
    Text(Input),
}

/// `optional` defined in the [grammar spec][0], allowing to match an optional
/// `Input`.
///
/// See [`parse::optional()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(AsRef, Clone, Copy, Debug, Deref, DerefMut, Eq, PartialEq)]
pub struct Optional<Input>(pub Input);

/// `parameter` defined in the [grammar spec][0], allowing to match some special
/// `Input` described by a [`Parameter`] name.
///
/// See [`parse::parameter()`] for the detailed grammar and examples.
///
/// [0]: crate#grammar
#[derive(AsRef, Clone, Copy, Debug, Deref, DerefMut, Eq, PartialEq)]
pub struct Parameter<Input>(pub Input);
```
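
To make the arithmetic in `span_len()` and the `any`/`all` logic in `contains_only_optional()` easier to follow, here is a minimal standalone sketch. It is not the code from this PR: the types below are simplified stand-ins that use plain `&str` instead of the generic `Input` (so `str::len()` replaces `InputLength::input_len()`), and `saturating_sub` is an assumption added here to keep the sketch from underflowing on an empty alternation.

```rust
/// Simplified stand-in for the PR's `Alternative<Input>`, with `&str` input.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum Alternative<'s> {
    /// Contents of an `(optional)`, without the parentheses.
    Optional(&'s str),
    /// Plain text.
    Text(&'s str),
}

/// One branch of an alternation, i.e. what sits between two `/`.
type SingleAlternation<'s> = Vec<Alternative<'s>>;

/// Simplified stand-in for the PR's `Alternation<Input>`.
#[derive(Clone, Debug, Eq, PartialEq)]
struct Alternation<'s>(Vec<SingleAlternation<'s>>);

impl<'s> Alternation<'s> {
    /// Length of the alternation as written in the source text.
    fn span_len(&self) -> usize {
        let content: usize = self
            .0
            .iter()
            .flatten()
            .map(|alt| match alt {
                Alternative::Text(t) => t.len(),
                // `+ 2` accounts for the surrounding `(` and `)`.
                Alternative::Optional(opt) => opt.len() + 2,
            })
            .sum();
        // One `/` separator between each pair of neighbouring branches.
        // `saturating_sub` is an addition of this sketch, not of the PR.
        content + self.0.len().saturating_sub(1)
    }

    /// `true` if at least one branch is made up of optionals only.
    fn contains_only_optional(&self) -> bool {
        self.0.iter().any(|single| {
            single
                .iter()
                .all(|alt| matches!(alt, Alternative::Optional(_)))
        })
    }
}

fn main() {
    // Models the alternation `left/(opt)right`.
    let alternation = Alternation(vec![
        vec![Alternative::Text("left")],
        vec![Alternative::Optional("opt"), Alternative::Text("right")],
    ]);
    assert_eq!(alternation.span_len(), "left/(opt)right".len());
    assert!(!alternation.contains_only_optional());

    // Models `(a)/b`: the first branch consists of an optional only.
    let only_optional = Alternation(vec![
        vec![Alternative::Optional("a")],
        vec![Alternative::Text("b")],
    ]);
    assert!(only_optional.contains_only_optional());
}
```

In the sketch, the span length is the sum of all text lengths, plus two bytes per optional for its parentheses, plus one separator per branch boundary.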
This deviates from the original grammar by missing the `parameter` variant. Need to discuss this on voice.

@tyranron there is a `Note:` section covering this. Basically, `ARCHITECTURE.md` describes an AST that tries to be wider than a Cucumber Expression for some reason (the only one I can think of is to make the parser implementation simpler and then error on conversion to a real Cucumber Expression).

But in reality it doesn't make much sense to me, especially the `alternative` definition. This grammar suggests that `alternative` may have unescaped whitespaces, which is not true (example). They have to be escaped, at least to avoid ambiguity.
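
The ambiguity mentioned in the comment above can be illustrated with a small sketch. It assumes, as the comment suggests, that unescaped whitespace delimits the words an alternation is built from, so only an escaped space can be part of an alternative. `split_unescaped` and `alternatives_per_word` are hypothetical helpers written for this illustration; they are not part of this PR or of the upstream parser.

```rust
/// Splits `input` on every unescaped occurrence of `sep`.
/// An escape sequence (a backslash plus the following character) is kept
/// verbatim inside the current piece and never treated as a separator.
fn split_unescaped(input: &str, sep: char) -> Vec<String> {
    let mut parts = vec![String::new()];
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            // Copy the backslash and the escaped character together.
            let last = parts.last_mut().expect("`parts` is never empty");
            last.push(c);
            if let Some(escaped) = chars.next() {
                last.push(escaped);
            }
        } else if c == sep {
            parts.push(String::new());
        } else {
            parts.last_mut().expect("`parts` is never empty").push(c);
        }
    }
    parts
}

/// Splits an expression into whitespace-delimited words first, and then
/// splits every word into its `/`-separated alternatives.
fn alternatives_per_word(expr: &str) -> Vec<Vec<String>> {
    split_unescaped(expr, ' ')
        .iter()
        .map(|word| split_unescaped(word, '/'))
        .collect()
}

fn main() {
    // Unescaped space: `black` is a separate word, so the alternation
    // offers only `cat` or `dog`.
    assert_eq!(
        alternatives_per_word("black cat/dog"),
        vec![vec!["black".to_string()], vec!["cat".into(), "dog".into()]],
    );

    // Escaped space: `black\ cat` is a single alternative next to `dog`.
    assert_eq!(
        alternatives_per_word(r"black\ cat/dog"),
        vec![vec![r"black\ cat".to_string(), "dog".into()]],
    );
}
```

Under this assumption, a grammar that allowed unescaped whitespace inside `alternative` would leave it unclear which of the two readings of `black cat/dog` is intended.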

It's not even EBNF, as I understand it: Wikipedia says that repetition is described with `{...}` and not with `(...)*`. It looks more like regex, but still has `,` for concatenation.

@ilslv would you be so kind as to make a PR upstream that adjusts the described grammar to be accurate and precise enough? It really bothers me: having a spec which doesn't reflect reality, while implementations don't follow the spec 😕

@ilslv From your link: `*` is a repetition, and `( ... )` is grouping. So we have a group repetition here. I don't see any mistakes in that. And in Markdown the block is fenced as `ebnf` 🤷‍♂️
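
As a rough illustration of what "group repetition" means in practice, the sketch below recognizes a made-up rule, `word ( "/" word )*`, which in brace notation would read `word, { "/", word }`. Both spellings describe the same loop: match one `word`, then repeat the group zero or more times. The rule itself and the names `count_words` and `take_word` are assumptions chosen for this example; they are not taken from the upstream grammar or from this PR.

```rust
/// Consumes one `word` (one or more ASCII letters) from the start of `input`
/// and returns the remaining input, or `None` if no letter is present.
fn take_word(input: &str) -> Option<&str> {
    let end = input
        .find(|c: char| !c.is_ascii_alphabetic())
        .unwrap_or(input.len());
    (end > 0).then(|| &input[end..])
}

/// Recognizes `word ( "/" word )*` over the whole `input` and returns the
/// number of words matched, or `None` if the input does not fit the rule.
fn count_words(input: &str) -> Option<usize> {
    // The leading `word`.
    let mut rest = take_word(input)?;
    let mut count = 1;
    // The loop body is the group `( "/" word )`;
    // the `while` itself is the `*` (zero or more repetitions).
    while let Some(after_slash) = rest.strip_prefix('/') {
        rest = take_word(after_slash)?;
        count += 1;
    }
    // The rule must cover the whole input.
    rest.is_empty().then_some(count)
}

fn main() {
    assert_eq!(count_words("cat"), Some(1)); // zero repetitions of the group
    assert_eq!(count_words("cat/dog/bird"), Some(3)); // two repetitions
    assert_eq!(count_words("cat/"), None); // group started but not completed
    assert_eq!(count_words(""), None); // the leading `word` is mandatory
}
```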