This release fixes an incompatibility with regex.
- faster compilation time by up to 2x
- expected tokens in failed parses are more accurate
- support for unicode when using builtin tokenizer
- fix a compatibility issue with regex 1.8
- fewer warnings about clippy/unused imports in generated code
- updated to edition 2021
- updated mdbook
This release addresses backwards compatibility warnings and security alerts, along with a few smaller features.
Special thanks to our newest maintainers, Yann Hamdaoui and YunWon Jeong, for helping to coordinate this release.
- stream errors to the terminal as they are computed (thanks to Anders Kaseorg!)
- lex raw identifiers (thanks to Karl Meakin!)
- upgrade to the 2018 edition (thanks to blakehawkins!)
- fix bug in lane table state splitting (thanks Anders Kaseorg!)
- fix warnings, typos, and links (thanks ggsh, Kian-Meng Ang, Ian Alexander Joiner, absurdhero, and Ömer Sinan Ağacan!)
- fix licensing to be spdx-conformant (thanks chayleaf!)
- fixes to the whitespace example (thanks Thalia Archibald!)
- update `regex` to 1.5.5
- update `thread-local` to 1.1.7
- update to `is-terminal` from `atty` (thanks kpcyrd!)
- update petgraph to v0.6.0 (3b482293)
- Fix inlining of fallible productions
- Update dependencies
- Allow string literals in patterns #557
- Enable precedence annotations #555
- Shake dependency tree #559
- Reduce work for LLVM's inliner
- Move the symbol mismatch panic into a colder path (0c69e999)
- Avoid subtracting in goto (8a47ed8c)
- Emit the GOTO table as nested matches (c5070af2)
- parse_table: Avoid generating unused rows in the matrix (688b9193)
- Use FnMut/FnOnce in ParseErrors map functions (8f73c9dc)
- Don't include whitespace in the span with empty nonterminals (11a50e70)
- Remove eprintln which I thought were removed (a9a775eb)
- Allow the tokenizer to contain custom skip regexes/literals (ee2f7060)
- states does not need to be passed to reduce actions (c156b4b2)
- action does not need to be passed to reduce actions (f69bce30)
- Only generate simulate_reduce if error recovery is used (d0a3ccba)
- Accept slices as types (#507) (c3e1cda5, closes #493)
- The `lexer` feature is now necessary when `lalrpop` generates the lexer.
- Add support for allowing for mutable x in action code.
- The minimum Rust version has been updated to 1.32.0 to fix deprecations
- Split apart UnrecognizedEOF variant from UnrecognizedToken (#446)
- Formatting and clippy warnings have been fixed
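For code that inspects parse errors, the `UnrecognizedEOF` split noted above means end-of-input gets its own match arm. The sketch below uses a simplified local stand-in enum rather than the real `lalrpop_util::ParseError`; the field names and the `describe` helper are assumptions made for illustration, so check the crate documentation for the actual definition.

```rust
/// Simplified stand-in for a parse error type (hypothetical field layout,
/// not the real `lalrpop_util::ParseError`).
#[derive(Debug)]
pub enum ParseError {
    /// Input ended where more tokens were required.
    UnrecognizedEOF { location: usize, expected: Vec<String> },
    /// A token was found that the grammar does not allow at this point.
    UnrecognizedToken { token: (usize, String, usize), expected: Vec<String> },
}

/// Hypothetical helper: turn an error into a human-readable message.
pub fn describe(err: &ParseError) -> String {
    match err {
        // End of input is handled on its own, with no token to report.
        ParseError::UnrecognizedEOF { location, expected } => format!(
            "unexpected end of input at {}; expected one of: {}",
            location,
            expected.join(", ")
        ),
        // An unexpected token carries its start offset, text, and end offset.
        ParseError::UnrecognizedToken { token: (start, text, end), expected } => format!(
            "unexpected `{}` at {}..{}; expected one of: {}",
            text,
            start,
            end,
            expected.join(", ")
        ),
    }
}
```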
Thanks to the following contributors:
- @ra-kete
- @Eijebong
- @rofrol
- @mikeyhew
- @jwinnie
- @jespersm
- @nwtwnni
- @Songbird0
- Avoid some memcpying in reductions (4968e5a6)
- Allow the deprecated use of trim_left (bdd65184, closes #428)
- Don't make generated files read-only (0c67bbed)
- Fix type annotation for inline actions
Thanks to the following contributors:
- @tjade273
- Add a setting to strip indentation from the generated grammar (9f3a978f)
- Allow setting features from the commandline (c8df4987)
- Don't depend on lalrpop-snap to compile lalrpop (3ff1b4c4)
- Allow setting out_dir on the command line (9c26e517)
- Let parse rules be conditionally compiled (e6b6a07f)
- Process escape sequences in string literals appearing as terminals (0b7e1e1d)
Thanks to the following contributors:
- @Marwes
- @jimblandy
Features:
- Allow attributes to be specified in lalrpop_mod! (#398)
Fixes:
- Don't generate reduce actions which do not fit in the integer size (#399)
- Generate files in OUT_DIR (#353)
Dependencies:
- Update atty (0.2), bit-set (0.5), ena (0.9) (#374)
- Regex to 1.0 (#375)
Thanks to the following contributors:
- @Marwes
- @KRITZCREEK
- @asyosec
- snsmac
- Eijebong
- @sanxiyn
Features:
- Make semicolon after `}` in rules optional (#355)
Fixes:
- Use hash to decide whether to recompile (#369)
- Reduce the compile times of generated parse table parsers (#366)
Thanks to the following contributors for this release:
- @matklad
- @psl8
- @Marwes
Fixes:
- Don't overflow the stack in parse table debug builds (#337)
- Use the correct type for `!` in macro expanded productions (#335)
- Allow lalrpop parsers to be used with include! (#338)
- Remove dependency on docopt, rustc-serialize, update itertools (#344, #345)
- Correctly anchor regex at the beginning (#358)
Thanks to the following contributors for this release:
- @Marwes
- @mbrubek
- @waywardmonkeys
- @sanxiyn
- @17cupsofcoffee
- @matklad
Features:
- The source and binary size of generated parsers has been reduced (#324, #306)
- Regex compilation as part of the generated lexer can now be cached (#318)
- The documentation is now provided as an mdbook (#298)
Bugs fixed:
- Fixed a stack overflow in debug builds of large grammars (#337)
- The error terminal now gets the correct type assigned when part of macros (#335)
- Character literals now parse correctly in the parser files (#320)
Compatibility notes:
- To let regex compilation be cached, each parser is now generated as a struct
  with a `parse` method instead of just a function. To upgrade, change each
  parse call from `parse_X(..)` to `XParser::new().parse(..)`.
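The shape of this change can be sketched with a hand-written stand-in. `NumParser` below is not LALRPOP output; it is a hypothetical type that only mimics the `new()` plus `parse(..)` pattern described in the note, and its error type is an assumption chosen to keep the example self-contained.

```rust
use std::num::ParseIntError;

/// Hand-written stand-in for a generated parser struct (hypothetical).
/// A real generated parser could keep cached lexer state, such as
/// compiled regexes, inside the struct.
pub struct NumParser {
    _private: (),
}

impl NumParser {
    /// Construct the parser once; any one-time setup would happen here.
    pub fn new() -> NumParser {
        NumParser { _private: () }
    }

    /// Parse one input. Replaces the old free-function style `parse_Num(..)`.
    pub fn parse(&self, input: &str) -> Result<i64, ParseIntError> {
        input.trim().parse::<i64>()
    }
}

/// Build the parser once and reuse it for every input.
pub fn parse_all(inputs: &[&str]) -> Vec<Result<i64, ParseIntError>> {
    let parser = NumParser::new();
    inputs.iter().map(|s| parser.parse(s)).collect()
}
```

Constructing the parser once and calling `parse` repeatedly is what lets the setup cost be paid a single time.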
Thanks to the following contributors for this release:
- @Phlosioneer
- @waywardmonkeys
- @brendanzab
- @dtkerr
- @Marwes
- @ahmedcharles
- @udoprog
Bugs fixed:
- Infinite loop in error recovery fixed (#240).
- Fixed bad error messages when a `;` was forgotten (#276).
- Grammar errors were sometimes incorrectly reported as "extra tokens" (#278)
- `extern` blocks now allowed even when not using a custom tokenizer (#261)
- `ParseError` now implements `Display`
- actions can now return a grammar's type parameter's associated type (#247)
- generated files are now rebuilt when there is a new LALRPOP version (#243)
Compatibility notes:
- As part of making `ParseError` implement `Display`, the default error type
  changed from `()` to `&'static str`, so the type of parse errors may change
  from `lalrpop_util::ParseError<..., ()>` to
  `lalrpop_util::ParseError<..., &'static str>`.
Thanks to the following contributors for this release:
- @fitzgen
- @joerivanruth
- @pyfisch
- @nick70
- @notriddle
- @vmx
- We now support `#![..]` attributes in `.lalrpop` files.
- We now use lane table by default: since the lane table algorithm
  automatically generates compressed tables where possible, the
  `#[lalr]` attribute is still accepted, but has no effect.
  - If you encounter problems, please report bugs! In the meantime,
    though, you can use the `LALRPOP_LANE_TABLE=disabled`
    environment variable to change back.
- When the `<>` string is found within `{}` inside of an action, it now generates a series of `x: x` pairs for each named value `x`. This is useful for struct constants, since you can do something like: `<a:Foo> <b:Bar> => MyStruct { <> }`, if `MyStruct` had two fields named `a` and `b`.
- We now support character literal patterns in the external tokenizer pattern syntax.
- The lalrpop executable now supports `--version`.
- We are (for now, at least) testing for compatibility with Rust 1.13. This minimal supported rustc version may change in the future, however.
- Misc bug fixes.
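To make the `{ <> }` shorthand from the list above concrete, here is a rough picture in plain Rust of what such an action stands for. The field types and the `action` function are assumptions for illustration; the generated code itself will look different.

```rust
/// Example struct with two fields named `a` and `b`, as in the note above.
/// The field types are hypothetical stand-ins for `Foo` and `Bar`.
#[derive(Debug, PartialEq)]
pub struct MyStruct {
    pub a: u32,
    pub b: String,
}

/// Plain-Rust equivalent of the action
/// `<a:Foo> <b:Bar> => MyStruct { <> }`:
/// each named value `x` contributes an `x: x` pair.
pub fn action(a: u32, b: String) -> MyStruct {
    MyStruct { a: a, b: b }
}
```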
Thanks to the following contributors for this release:
- @jchlapinski
- @minijackson
- @nikomatsakis
- @ravenexp
- @ruuda
- @wieczyk
- @withoutboats
This is a bug-fix release for LALRPOP. First, we have two major improvements to the generated lexer:
- The lexer now generates code that uses the `regex` crate. This results in far less code than the older style, and seems to preserve performance.
- The lexer now supports custom priorities for regular expression tokens,
  making it possible to support case-insensitive keywords.
  - See the calculator2b example for details.
Second, we have a beta release of the new lane-table based
LR-table generation. Lane tables handle the full set of LR(1)
grammars but typically produce much smaller state tables. This
feature eliminates the need to manually mark grammars as `#[LALR]`.
Lane tables are not on by default; you can enable them by setting
`LALRPOP_LANE_TABLE=enabled` in your environment (use
`std::env::set_var` in your `build.rs`).
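A minimal sketch of the environment-variable approach, assuming the variable only needs to be set before LALRPOP runs in the same build script. The helper name `enable_lane_table` is hypothetical; only `LALRPOP_LANE_TABLE=enabled` and `std::env::set_var` come from the notes above.

```rust
use std::env;

/// Hypothetical helper for a `build.rs`: opt in to lane-table generation.
/// Call it at the top of `main`, before invoking LALRPOP.
// `set_var` is an `unsafe fn` in the 2024 edition and a safe one in earlier
// editions; the `unsafe` block plus this `allow` compiles in both.
#[allow(unused_unsafe)]
pub fn enable_lane_table() {
    unsafe {
        env::set_var("LALRPOP_LANE_TABLE", "enabled");
    }
}
```

Setting the variable in the shell that launches the build should have the same effect and avoids changing the build script.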
Finally, the `lalrpop` executable now has the ability to generate
standalone reports (`--report`).
Fixed bugs:
- Fix #157: We now recognize single quote (`'`) properly in our tokenizer.
- Fix #179: Fix bug in recursive ascent code generation.
Thanks to the following contributors to this release:
- @ahmedcharles
- @king6cong
- @nikomatsakis
- @nixpulvis
- @wagenet
- @wieczyk
- Add the expected successor tokens to `UnrecognizedToken` errors (thanks @Marwes!).
- Fix to error recovery doing a bad cast (thanks @Marwes!).
Major new feature! @Marwes added support for error recovery.
There have also been a number of other improvements:
- The `ParseError` type now implements `Error` and `Display` (thanks @Marwes!).
- We no longer emit comments in generated code by default (thanks @Marwes!).
(Yanked due to minor backwards incompatibility.)
Bug fix release. Major bugs addressed:
Also, there is now a tutorial for writing custom lexers. Thanks @malleusinferni!
Enabled a new table-driven code-generator by default. This generates less code than the older recursive-ascent-based generation scheme, but may parse less efficiently. To go back to the old scheme, annotate the grammar declaration:
#[recursive_ascent] grammar;
Also, the syntax for requesting LALR-generation has changed to use an annotation:
#[LALR] grammar;
We no longer emit module-level attributes, which means that unused imports in your .lalrpop file may start getting warnings. Thanks @dflemstr!
An overflow bug in LALRPOP was fixed. Thanks @larsluthman!
We no longer depend on `time`, but now use `std::time`. Thanks @serprex!
There is now a `Configuration` object for use in your `build.rs`
scripts. Thanks to @dflemstr, it permits you to configure the
directory where LALRPOP output is generated.
Fixed a bug in the LALRPOP option parsing. Thanks @Nemikolh!
Various typos and small corrections. Thanks @reuben and @ashleygwilliams!
Updated to use the `regex-syntax` crate for regular expression
parsing instead of rolling our own parser. This means we can now
support the same regular expression syntax as the regex crate,
and in particular can support unicode character classes like `\p{Greek}`.
Note that some regex features -- such as non-greedy repetition and
named capture groups -- are still not supported (or just not meaningful).
Optimized LR(1) construction time by approximately 5x.
Improved handling of location tokens `@L` and `@R` so that they can be
freely used without ever causing parse conflicts.
Major update to LALRPOP error messages in cases of shift/reduce and reduce/reduce conflicts. The messages now try to explain the problem in terms of your grammar, as well as diagnosing common problem scenarios and suggesting solutions.
Added a standalone LALRPOP executable.
We no longer generate incomplete files when grammar generation fails (Issue #57).
Miscellaneous bug fixes, mostly. Processing for a `build.rs` file now
starts from the project directory, rather than being hardcoded to
start from `src`.
Add support for inlining nonterminals. Nonterminals can now be
annotated with `#[inline]`. If you do so, each use of the nonterminal
will be inlined into its place. This can be very helpful in addressing
shift-reduce or reduce-reduce conflicts, at the cost of a larger
grammar. We now inline `Foo*`, `Foo?`, and `(Foo Bar)` nonterminals by
default.
This is mostly a bug-fix release.
Various minor issues were addressed:
- Issue #25: Unbalanced parens in string literals appearing in code now work properly.
- Issue #32: Regular expression parsing consumed infinite memory when a `.` appeared.
- Issue #34: Automatic tokenizer generation did not play well with generic type parameters.
I hadn't yet started writing release notes, sorry.