Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opt-in automatic collection of expected values on alt #415

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
242 changes: 242 additions & 0 deletions src/error.rs → src/error/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,10 +26,14 @@ use crate::lib::std::fmt;
use core::num::NonZeroUsize;

use crate::stream::AsBStr;
use crate::stream::AsOrd;
use crate::stream::Stream;
#[allow(unused_imports)] // Here for intra-doc links
use crate::Parser;

#[cfg(test)]
mod tests;

/// For use with [`Parser::parse_peek`] which allows the input stream to be threaded through a
/// parser.
///
Expand Down Expand Up @@ -227,6 +231,21 @@ impl<I, C, E: AddContext<I, C>> AddContext<I, C> for ErrMode<E> {
}
}

impl<E: MergeContext> MergeContext for ErrMode<E> {
#[inline(always)]
fn merge_context(self, other: Self) -> Self {
match other.into_inner() {
Some(other) => self.map(|err| err.merge_context(other)),
None => self,
}
}

#[inline]
fn clear_context(self) -> Self {
self.map(MergeContext::clear_context)
}
}

impl<T: Clone> ErrMode<InputError<T>> {
/// Maps `ErrMode<InputError<T>>` to `ErrMode<InputError<U>>` with the given `F: T -> U`
pub fn map_input<U: Clone, F>(self, f: F) -> ErrMode<InputError<U>>
Expand Down Expand Up @@ -313,6 +332,15 @@ pub trait AddContext<I, C = &'static str>: Sized {
}
}

/// Merge contexts while backtracking.
pub trait MergeContext: Sized {
/// Apply the context from `other` into `self`
fn merge_context(self, _other: Self) -> Self;

/// Remove all context
fn clear_context(self) -> Self;
}

/// Create a new error with an external error, from [`std::str::FromStr`]
///
/// This trait is required by the [`Parser::try_map`] combinator.
Expand Down Expand Up @@ -536,6 +564,32 @@ impl<C, I> AddContext<I, C> for ContextError<C> {
}
}

impl<C> MergeContext for ContextError<C> {
#[inline]
fn merge_context(mut self, other: Self) -> Self {
// self and other get consumed to produce the new Context error.
// We choose the vector with the larger capacity to reduce the chance of reallocations.
#[cfg(feature = "alloc")]
{
let (mut context, other) = if self.context.capacity() >= other.context.capacity() {
(self.context, other.context)
} else {
(other.context, self.context)
};
context.extend(other);
self.context = context;
}
self
}

#[inline]
fn clear_context(mut self) -> Self {
#[cfg(feature = "alloc")]
self.context.clear();
self
}
}

#[cfg(feature = "std")]
impl<C, I, E: std::error::Error + Send + Sync + 'static> FromExternalError<I, E>
for ContextError<C>
Expand Down Expand Up @@ -695,6 +749,194 @@ impl crate::lib::std::fmt::Display for StrContextValue {
}
}

/// Collect context of the longest matching parser while backtracking on errors
#[derive(Clone, Debug)]
pub struct LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
E: MergeContext,
{
checkpoint: Option<<I as Stream>::Checkpoint>,
inner: E,
}

impl<I, E> LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
E: MergeContext + Default,
{
/// Create an empty error
pub fn new() -> Self {
Self {
checkpoint: None,
inner: Default::default(),
}
}
}
Comment on lines +764 to +777
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we offer an empty error?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And is this the only reason checkpoint is an Option?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for checkpoint to be Option is the LongestMatch currently implements the MergeContext trait as well. When calling clear_context we must set checkpoint: None, such that the empty context is always < then any non-empty context. Otherwise merging 2 context would give us some inconsistent output.

I'm removing impl MergeContext for LongestMatch, then we should get rid of the Option for checkpoint.


// For tests
impl<I, E> core::cmp::PartialEq for LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd + core::cmp::PartialEq,
E: MergeContext + core::cmp::PartialEq,
{
fn eq(&self, other: &Self) -> bool {
self.checkpoint == other.checkpoint && self.inner == other.inner
}
}
Comment on lines +779 to +789
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? ContextError needs it because its the default and so we need this for testing parsers. Here, we likely shouldn't be doing equality check tests

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are used by the tests.

As LongestMatch is no error on itself itself, but a decorator most likely used with ContextError I did feel like being able to compare LongestMatch<ContextError> would be the right thing to have.


impl<I, E> LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
E: MergeContext,
{
/// Extract the error for the longest matching parser
#[inline]
pub fn into_inner(self) -> E {
self.inner
}

#[inline]
fn cmp_checkpoints(&self, other: &Self) -> core::cmp::Ordering {
let a = self.checkpoint.as_ref().map(|c| c.as_ord());
let b = other.checkpoint.as_ref().map(|c| c.as_ord());
a.cmp(&b)
}

#[inline]
fn cmp_with_checkpoint(&self, other: &<I as Stream>::Checkpoint) -> core::cmp::Ordering {
match self.checkpoint {
None => core::cmp::Ordering::Less,
Some(ref c) => c.as_ord().cmp(&other.as_ord()),
}
}
}

impl<I, E> Default for LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
E: MergeContext + Default,
{
fn default() -> Self {
Self::new()
}
}

impl<I, E> ParserError<I> for LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
E: ParserError<I> + MergeContext,
{
#[inline]
fn from_error_kind(input: &I, kind: ErrorKind) -> Self {
Self {
checkpoint: Some(input.checkpoint()),
inner: E::from_error_kind(input, kind),
}
}

#[inline]
fn append(mut self, input: &I, kind: ErrorKind) -> Self {
let checkpoint = input.checkpoint();
match self.cmp_with_checkpoint(&checkpoint) {
Comment on lines +845 to +847
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is this needed?

core::cmp::Ordering::Less => {
self.checkpoint = Some(checkpoint);
self.inner = self.inner.clear_context().append(input, kind);
}
core::cmp::Ordering::Equal => {
self.inner = self.inner.append(input, kind);
}
core::cmp::Ordering::Greater => {}
}
self
}

#[inline]
fn or(self, other: Self) -> Self {
self.merge_context(other)
epage marked this conversation as resolved.
Show resolved Hide resolved
}
}

impl<I, E, C> AddContext<I, C> for LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
E: AddContext<I, C> + MergeContext,
{
#[inline]
fn add_context(mut self, input: &I, ctx: C) -> Self {
let checkpoint = input.checkpoint();
match self.cmp_with_checkpoint(&checkpoint) {
core::cmp::Ordering::Less => {
self.checkpoint = Some(checkpoint);
self.inner = self.inner.clear_context().add_context(input, ctx);
}
core::cmp::Ordering::Equal => {
self.inner = self.inner.add_context(input, ctx);
}
core::cmp::Ordering::Greater => {}
}
self
}
}

impl<I, E: MergeContext> MergeContext for LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
{
#[inline]
fn merge_context(mut self, other: Self) -> Self {
match self.cmp_checkpoints(&other) {
core::cmp::Ordering::Less => other,
core::cmp::Ordering::Greater => self,
core::cmp::Ordering::Equal => {
self.inner = self.inner.merge_context(other.inner);
self
}
}
}

fn clear_context(self) -> Self {
Self {
checkpoint: None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to reset it when clearing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment here #415 (comment)

I'm removing impl MergeContext or LongestMatch. This also removes the need or checkpoint to be Option.

inner: self.inner.clear_context(),
}
}
}

impl<I, E, EX> FromExternalError<I, EX> for LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
E: FromExternalError<I, EX> + MergeContext,
{
#[inline]
fn from_external_error(input: &I, kind: ErrorKind, e: EX) -> Self {
Self {
checkpoint: Some(input.checkpoint()),
inner: E::from_external_error(input, kind, e),
}
}
}

impl<I, E> crate::lib::std::fmt::Display for LongestMatch<I, E>
where
I: Stream,
<I as Stream>::Checkpoint: AsOrd,
E: crate::lib::std::fmt::Display + MergeContext,
{
fn fmt(&self, f: &mut crate::lib::std::fmt::Formatter<'_>) -> crate::lib::std::fmt::Result {
self.inner.fmt(f)
}
}

/// Trace all error paths, particularly for tests
#[derive(Debug)]
#[cfg(feature = "std")]
Expand Down
Loading
Loading