forked from rrevenantt/antlr4
Rust target, named alternatives/childs
Almost full Rust target support, Rust target implementation
All Rust target tests passing, added CI, and target related docs
fix testsuite, some cleanup

Squashed 'runtime/Rust/' changes from 307b806d9..a44046d9e
a44046d9e some cleanup, documentation and performace optimization
git-subtree-dir: runtime/Rust
git-subtree-split: a44046d9eb6feb0405383aa846a709128a20e5ec

add CI and CD
add CI and CD
minor fix

Squashed 'runtime/Rust/' changes from a44046d9e..b94028f34
b94028f34 update README, fix for getter on optional rule
git-subtree-dir: runtime/Rust
git-subtree-split: b94028f34b12e49a861a931194ed6de008094eb5

Squashed 'runtime/Rust/' changes from b94028f34..13d5a35cd
13d5a35cd fixed sometimes missing hash for prediction context
d562544f2 update README
20491fc2f update README, fix for getter on optional rule
REVERT: b94028f34 update README, fix for getter on optional rule
git-subtree-dir: runtime/Rust
git-subtree-split: 13d5a35cd3b11763f73278633290a23d2f61caf1

rust target v0.2, visitor, zero-copy, custom tokens

Squashed 'runtime/Rust/' changes from 13d5a35cd..f8beaf8b6
f8beaf8b6 fixed visitor architecture
f8da12f9e update readme and use rustfmt for formatting
d28736137 Fix `enterXXX` listener calls for alternative labels (antlr#13)
f0a2da766 fully finished support for zero-copy, generic token, and generic underlying data.
fdbf64f0f finished generic token support(almost, amend this)
d765c850a preliminary byte parser support, almost fully refotmatted with rustfmt
6bb617b51 more flexible tree structure and listener can have any lifetime now, more type safety
1188be780 zero-copy done, input stream changed accordingly.
2e75727b8 zero-copy, input_stream rewritten, docs improved
679319354 wip zero-copy, almost done, most of the tests passing
7833ab8fe wip zerocopy, compiles/passes tests successfully, only parse tree changes remaining
466b370dc wip zerocopy x2, lib compiles successfully
97cb6f8e5 wip zerocopy, compiles successfully
d8078f5fa minor adjustments
6aa622437 added proper build.rs, first change for next version - generic over token type
5bf0b080f fixed sometimes missing hash for prediction context
REVERT: 13d5a35cd fixed sometimes missing hash for prediction context
git-subtree-dir: runtime/Rust
git-subtree-split: f8beaf8b6d54cffa9d262abc54ef8d89511544d3

update CI
added proper downcasting, more generic input and error strategy, fixed context downcasting for predicates

Squashed 'runtime/Rust/' changes from f8beaf8b6..73e3450fe
73e3450fe remove redundant generation artifacts
b39c86a1a update dependencies
57775f2fa updated documentation and readme
be1ccd343 support downcasting in parser code, fixed some warnings
b054ca838 remove unnecessary boxing from apis, make ErrorStrategy generic to allow more inlining
git-subtree-dir: runtime/Rust
git-subtree-split: 73e3450fefc210949328d6cfd2cb0dc960e5972c

fixed cargo flags for CI
Rust target: support arbitrary visitor lifetime

Squashed 'runtime/Rust/' changes from 73e3450fe..bc5460134
bc5460134 rewrite visitor architecture to support arbitrary visitor lifetime
fa24927d5 exclude build.rs from package
8a0b33619 remove unused functions
git-subtree-dir: runtime/Rust
git-subtree-split: bc546013454e49badcdbb520e21a271054edf486

refactorings for 0.2 release, fix for separate file lexer and parser
move rust target to submodule
Use fork antlr-rust
1 parent e4c1a74, commit 51edd90
Showing 54 changed files with 4,364 additions and 252 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
[submodule "runtime/PHP"] | ||
path = runtime/PHP | ||
url = https://github.com/antlr/antlr-php-runtime.git | ||
[submodule "runtime/Rust"] | ||
path = runtime/Rust | ||
url = https://github.com/nmeylan/antlr4rust |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
#!/bin/bash | ||
|
||
set -euo pipefail | ||
|
||
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain nightly-2020-12-23 -y | ||
export PATH=$HOME/.cargo/bin:$PATH | ||
( rustc --version ; cargo --version ) || true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#!/bin/bash | ||
|
||
set -euo pipefail | ||
|
||
export PATH=$HOME/.cargo/bin:$PATH | ||
mvn test -Dtest=rust.*Left* -q |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
[package] | ||
name = "antlr-rust" | ||
version = "0.2.0-dev.2" | ||
authors = ["Konstantin Anisimov <rrevenantt@gmail.com>"] | ||
homepage = "https://github.com/rrevenantt/antlr4rust" | ||
repository = "https://github.com/rrevenantt/antlr4rust" | ||
documentation = "https://docs.rs/antlr-rust" | ||
description = "ANTLR4 runtime for Rust" | ||
readme = "README.md" | ||
edition = "2018" | ||
license = "BSD-3-Clause" | ||
keywords = ["ANTLR","ANTLR4","parsing","runtime"] | ||
categories = ["parsing"] | ||
exclude = ["build.rs"] | ||
|
||
[dependencies] | ||
lazy_static = "^1.4" | ||
uuid = "=0.8.*" | ||
byteorder = "^1" | ||
murmur3 = "=0.4" # 0.5 is incompatible currently | ||
bit-set = "=0.5.*" | ||
once_cell = "^1.2" | ||
#backtrace = "=0.3" | ||
typed-arena = "^2.0" | ||
better_any = "=0.1" | ||
|
||
[lib] | ||
|
||
#[[test]] | ||
#name = "my_test" | ||
#path="tests/my_test.rs" | ||
|
||
|
||
[profile.release] | ||
#opt-level = 3 | ||
#debug = true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,64 +1,114 @@ | ||
# ANTLR v4 | ||
|
||
[![Java 7+](https://img.shields.io/badge/java-7+-4c7e9f.svg)](http://java.oracle.com) | ||
[![License](https://img.shields.io/badge/license-BSD-blue.svg)](https://raw.githubusercontent.com/antlr/antlr4/master/LICENSE.txt) | ||
|
||
**Build status** | ||
|
||
[![Github CI Build Status (MacOSX)](https://img.shields.io/github/workflow/status/antlr/antlr4/MacOSX?label=MacOSX)](https://github.com/antlr/antlr4/actions) | ||
[![AppVeyor CI Build Status (Windows)](https://img.shields.io/appveyor/build/parrt/antlr4?label=Windows)](https://ci.appveyor.com/project/parrt/antlr4) | ||
[![Circle CI Build Status (Linux)](https://img.shields.io/circleci/build/gh/antlr/antlr4/master?label=Linux)](https://app.circleci.com/pipelines/github/antlr/antlr4) | ||
[![Travis-CI Build Status (Swift-Linux)](https://img.shields.io/travis/antlr/antlr4.svg?label=Linux-Swift&branch=master)](https://travis-ci.com/github/antlr/antlr4) | ||
|
||
**ANTLR** (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build parse trees and also generates a listener interface (or visitor) that makes it easy to respond to the recognition of phrases of interest. | ||
|
||
*Given day-job constraints, my time working on this project is limited so I'll have to focus first on fixing bugs rather than changing/improving the feature set. Likely I'll do it in bursts every few months. Please do not be offended if your bug or pull request does not yield a response! --parrt* | ||
|
||
[![Donate](https://www.paypal.com/en_US/i/btn/x-click-butcc-donate.gif)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=BF92STRXT8F8Q) | ||
|
||
## Authors and major contributors | ||
|
||
* [Terence Parr](http://www.cs.usfca.edu/~parrt/), parrt@cs.usfca.edu | ||
ANTLR project lead and supreme dictator for life | ||
[University of San Francisco](http://www.usfca.edu/) | ||
* [Sam Harwell](http://tunnelvisionlabs.com/) (Tool co-author, Java and original C# target) | ||
* [Eric Vergnaud](https://github.com/ericvergnaud) (Javascript, Python2, Python3 targets and maintenance of C# target) | ||
* [Peter Boyer](https://github.com/pboyer) (Go target) | ||
* [Mike Lischke](http://www.soft-gems.net/) (C++ completed target) | ||
* Dan McLaughlin (C++ initial target) | ||
* David Sisson (C++ initial target and test) | ||
* [Janyou](https://github.com/janyou) (Swift target) | ||
* [Ewan Mellor](https://github.com/ewanmellor), [Hanzhou Shi](https://github.com/hanjoes) (Swift target merging) | ||
* [Ben Hamilton](https://github.com/bhamiltoncx) (Full Unicode support in serialized ATN and all languages' runtimes for code points > U+FFFF) | ||
* [Marcos Passos](https://github.com/marcospassos) (PHP target) | ||
* [Lingyu Li](https://github.com/lingyv-li) (Dart target) | ||
|
||
## Useful information | ||
|
||
* [Release notes](https://github.com/antlr/antlr4/releases) | ||
* [Getting started with v4](https://github.com/antlr/antlr4/blob/master/doc/getting-started.md) | ||
* [Official site](http://www.antlr.org/) | ||
* [Documentation](https://github.com/antlr/antlr4/blob/master/doc/index.md) | ||
* [FAQ](https://github.com/antlr/antlr4/blob/master/doc/faq/index.md) | ||
* [ANTLR code generation targets](https://github.com/antlr/antlr4/blob/master/doc/targets.md)<br>(Currently: Java, C#, Python2|3, JavaScript, Go, C++, Swift, Dart, PHP) | ||
* [Java API](http://www.antlr.org/api/Java/index.html) | ||
* [ANTLR v3](http://www.antlr3.org/) | ||
* [v3 to v4 Migration, differences](https://github.com/antlr/antlr4/blob/master/doc/faq/general.md) | ||
|
||
You might also find the following pages useful, particularly if you want to mess around with the various target languages. | ||
# antlr4rust | ||
[![Crate](https://flat.badgen.net/crates/v/antlr-rust)](https://crates.io/crates/antlr_rust) | ||
[![docs](https://flat.badgen.net/badge/docs.rs/v0.2.0-dev.2)](https://docs.rs/antlr-rust/0.2.0-dev.2) | ||
|
||
[ANTLR4](https://github.com/antlr/antlr4) runtime for the Rust programming language. | ||
|
||
The tool (generator) part currently lives in the rust-target branch of my antlr4 fork: [rrevenantt/antlr4/tree/rust-target](https://github.com/rrevenantt/antlr4/tree/rust-target). | ||
The latest version is automatically built and published to [releases](https://github.com/rrevenantt/antlr4rust/releases) on this repository. | ||
You can also check it out and build it yourself with `mvn -DskipTests install`. | ||
|
||
For examples, see [grammars](grammars), [tests/gen](tests/gen) for the corresponding generated code, | ||
and [tests/my_test.rs](tests/my_test.rs) for actual usage examples. | ||
|
||
### Implementation status | ||
|
||
For now, development happens in this repository, | ||
but eventually it will be merged into the main ANTLR4 repo. | ||
|
||
Currently, a nightly version of Rust is required. | ||
This will likely remain the case until `coerce_unsized` or some other kind of coercion trait is stabilized. | ||
Other unstable features are in use, but only `CoerceUnsized` is essential. | ||
|
||
Remaining things before merge: | ||
- API stabilization | ||
- [ ] Rust api guidelines compliance | ||
- [ ] more tests for API because it is quite different from Java | ||
|
||
Can be done after merge: | ||
- more profiling and performance optimizations | ||
- Documentation | ||
- [ ] Some things are already documented, but the docs are still far from perfect, and more links are needed. | ||
- Code quality | ||
- [ ] Clippy sanitation | ||
- [ ] Not all warnings are fixed | ||
- cfg to not build potentially unnecessary parts | ||
(no Lexer if custom token stream, no ParserATNSimulator if LL(1) grammar) | ||
- run rustfmt on generated parser | ||
###### Long term improvements | ||
- generate enum for labeled alternatives without redundant `Error` option | ||
- option to generate fields instead of getters by default and make visiting based on fields | ||
- make tree generic over pointer type and allow tree nodes to be allocated in an arena. | ||
(requires GAT, otherwise it would be a problem for users that want ownership for parse tree) | ||
- support stable rust | ||
- support no_std(although alloc would still be required) | ||
|
||
### Usage | ||
|
||
Use the ANTLR4 "tool" to generate a parser that will use the ANTLR | ||
runtime located here. You can run it with the following command: | ||
```bash | ||
java -jar <path to ANTLR4 tool> -Dlanguage=Rust MyGrammar.g4 | ||
``` | ||
For a full list of antlr4 tool options, please visit the | ||
[tool documentation page](https://github.com/antlr/antlr4/blob/master/doc/tool-options.md). | ||
|
||
You can also see [build.rs](build.rs) for an example of a `build.rs` configuration | ||
that regenerates the parser automatically when the grammar file changes. | ||
|
||
Then add the following to the `Cargo.toml` of the crate in which the generated parser | ||
is going to be used: | ||
```toml | ||
[dependencies] | ||
antlr-rust = "=0.2.0-dev.1" | ||
``` | ||
and add `#![feature(try_blocks)]` to your project's root module. | ||
|
||
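The `build.rs` step described above could be sketched roughly as follows. This is a minimal illustration and not the repository's actual `build.rs`: the jar name `antlr4-tool.jar`, the grammar name `MyGrammar.g4`, and the helper `antlr_command` are placeholders, and the script assumes `java` is on the `PATH`. Only the `-Dlanguage=Rust` flag and the general `java -jar` invocation come from the usage notes.

```rust
// build.rs sketch. The jar path and grammar file name are placeholders.
use std::process::Command;

/// Builds the generator invocation from the usage notes:
/// `java -jar <path to ANTLR4 tool> -Dlanguage=Rust <grammar>`.
fn antlr_command(tool_jar: &str, grammar: &str) -> Command {
    let mut cmd = Command::new("java");
    cmd.args(&["-jar", tool_jar, "-Dlanguage=Rust", grammar]);
    cmd
}

fn main() {
    let grammar = "MyGrammar.g4"; // hypothetical grammar file

    // Ask Cargo to rerun this script only when the grammar changes.
    println!("cargo:rerun-if-changed={}", grammar);

    // Run the generator; report a failure as a Cargo warning instead of panicking.
    match antlr_command("antlr4-tool.jar", grammar).status() {
        Ok(status) if status.success() => {}
        Ok(status) => println!("cargo:warning=ANTLR tool exited with {}", status),
        Err(err) => println!("cargo:warning=could not start java: {}", err),
    }
}
```

Emitting a warning keeps the sketch harmless on machines without Java; a real build script would more likely fail the build when generation fails.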
* [How to build ANTLR itself](https://github.com/antlr/antlr4/blob/master/doc/building-antlr.md) | ||
* [How we create and deploy an ANTLR release](https://github.com/antlr/antlr4/blob/master/doc/releasing-antlr.md) | ||
|
||
## The Definitive ANTLR 4 Reference | ||
|
||
Programmers run into parsing problems all the time. Whether it’s a data format like JSON, a network protocol like SMTP, a server configuration file for Apache, a PostScript/PDF file, or a simple spreadsheet macro language—ANTLR v4 and this book will demystify the process. ANTLR v4 has been rewritten from scratch to make it easier than ever to build parsers and the language applications built on top. This completely rewritten new edition of the bestselling Definitive ANTLR Reference shows you how to take advantage of these new features. | ||
|
||
You can buy the book [The Definitive ANTLR 4 Reference](http://amzn.com/1934356999) at amazon or an [electronic version at the publisher's site](https://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference). | ||
### Parse Tree structure | ||
|
||
It is possible to generate idiomatic Rust syntax trees. To do this, use the labels feature of the ANTLR tool. | ||
See the [Labels](grammars/Labels.g4) grammar for an example. | ||
Consider the following rule: | ||
```text | ||
e : a=e op='*' b=e # mult | ||
| left=e '+' b=e # add | ||
``` | ||
For such a rule, ANTLR will generate an enum `EContextAll` containing `mult` and `add` alternatives, | ||
so you will be able to match on them in your code. | ||
The corresponding struct for each alternative will also contain the fields you labeled. | ||
For example, the `MultContext` struct will contain `a` and `b` fields holding child subtrees and | ||
an `op` field of type `TerminalNode`, which corresponds to an individual `Token`. | ||
It is also possible to disable generic parse tree creation and keep only selected children via | ||
`parser.build_parse_trees = false`, but unfortunately this currently prevents visitors from working. | ||
|
||
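To give a feel for what matching on labeled alternatives looks like, here is a small self-contained sketch. It does not use the generated code: `Expr` is a hand-written, hypothetical stand-in for `EContextAll`, its variants only mirror the `mult` and `add` labels from the rule above, and the `Int` leaf is an assumption added so the example can be evaluated.

```rust
// Hypothetical stand-in for a generated `EContextAll`-style enum.
// The real generated types and field accessors look different.
#[derive(Debug)]
enum Expr {
    // e : a=e op='*' b=e   # mult
    Mult { a: Box<Expr>, op: char, b: Box<Expr> },
    // e : left=e '+' b=e   # add
    Add { left: Box<Expr>, b: Box<Expr> },
    // Leaf alternative, assumed here for illustration only.
    Int(i64),
}

/// Evaluates the tree by matching on the alternative and reading its labeled fields.
fn eval(e: &Expr) -> i64 {
    match e {
        Expr::Mult { a, op, b } => {
            debug_assert_eq!(*op, '*');
            eval(a) * eval(b)
        }
        Expr::Add { left, b } => eval(left) + eval(b),
        Expr::Int(v) => *v,
    }
}

fn main() {
    // Tree for `2 * 3 + 4`.
    let tree = Expr::Add {
        left: Box::new(Expr::Mult {
            a: Box::new(Expr::Int(2)),
            op: '*',
            b: Box::new(Expr::Int(3)),
        }),
        b: Box::new(Expr::Int(4)),
    };
    println!("{}", eval(&tree)); // prints 10
}
```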
### Differences with Java | ||
Although the Rust runtime API has been made as close as possible to Java's, | ||
there are quite a few differences because Rust is not an OOP language and is much more explicit. | ||
|
||
- If you are using labeled alternatives, | ||
the struct generated for the rule is an enum with a variant for each alternative. | ||
- The parser needs to own its listeners, but it is possible to get a listener back via `ListenerId`; | ||
otherwise `ParseTreeWalker` should be used. | ||
- In embedded actions, use the `recog` variable instead of `self`/`this` to access the parser. | ||
This is because predicates have to be inserted into two syntactically different places in the generated parser, | ||
and in one of them it is impossible to have the parser as `self`. | ||
- `str`-based `InputStream`s have different index behavior when there are Unicode characters. | ||
If you need exactly the same behavior, use a `[u32]`-based `InputStream` or implement a custom `CharStream`. | ||
- In actions, you have to escape `'` in Rust lifetimes with `\ ` because ANTLR treats them as strings, e.g. `Struct<\'lifetime>`. | ||
- To make custom tokens, use the `@tokenfactory` custom action instead of the usual `TokenLabelType` parser option. | ||
ANTLR parser options accept only single identifiers, while the Rust target needs to know about the lifetime as well. | ||
Also, in the Rust target `TokenFactory` is the way to specify the token type. For an example, see the [CSV](grammars/CSV.g4) test grammar. | ||
- All rule context variables (rule arguments or rule return values) should implement `Default + Clone`. | ||
|
||
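The index difference for `str`-based input can be illustrated with the standard library alone. The sketch below assumes the `str`-based stream reports positions as byte offsets, which is how Rust's `str` is indexed; the notes above do not spell out the exact rule, so treat that as an assumption. The helper names `byte_offsets` and `code_points` are made up for this example and are not part of the runtime.

```rust
/// Byte offset at which each character starts (how `&str` is indexed).
fn byte_offsets(s: &str) -> Vec<usize> {
    s.char_indices().map(|(i, _)| i).collect()
}

/// The same input decoded into code points, so position n is the n-th character.
fn code_points(s: &str) -> Vec<u32> {
    s.chars().map(|c| c as u32).collect()
}

fn main() {
    // 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes), 'b' (1 byte).
    let text = "a\u{e9}\u{20ac}b";

    // Byte-based positions skip ahead after multi-byte characters.
    println!("{:?}", byte_offsets(text)); // prints [0, 1, 3, 6]

    // Code-point positions are simply 0, 1, 2, 3.
    println!("{}", code_points(text).len()); // prints 4
}
```

With a `[u32]`-style input, the character `b` sits at index 3, which matches what a code-point-based runtime would report; with byte offsets it sits at 6.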
### Unsafe | ||
Currently, unsafe is used only for downcasting (through a separate crate) | ||
and to update data inside `Rc` via `get_mut_unchecked` (the returned mutable reference is used immediately and not stored anywhere). | ||
|
||
You will find the [Book source code](http://pragprog.com/titles/tpantlr2/source_code) useful. | ||
### Versioning | ||
In addition to the usual Rust semantic versioning, | ||
patch version changes of the crate should not require updating the generator part. | ||
|
||
## Licence | ||
|
||
## Additional grammars | ||
[This repository](https://github.com/antlr/grammars-v4) is a collection of grammars without actions where the | ||
root directory name is the all-lowercase name of the language parsed | ||
by the grammar. For example, java, cpp, csharp, c, etc... | ||
BSD 3-clause |