Rust target implementation
Rust target, named alternatives/childs
Almost full Rust target support,
Rust target implementation
All Rust target tests passing, added CI, and target related docs

fix testsuite, some cleanup

Squashed 'runtime/Rust/' changes from 307b806d9..a44046d9e

a44046d9e some cleanup, documentation and performance optimization

git-subtree-dir: runtime/Rust
git-subtree-split: a44046d9eb6feb0405383aa846a709128a20e5ec

add CI and CD

add CI and CD

minor fix

Squashed 'runtime/Rust/' changes from a44046d9e..b94028f34

b94028f34 update README, fix for getter on optional rule

git-subtree-dir: runtime/Rust
git-subtree-split: b94028f34b12e49a861a931194ed6de008094eb5

Squashed 'runtime/Rust/' changes from b94028f34..13d5a35cd

13d5a35cd fixed sometimes missing hash for prediction context
d562544f2 update README
20491fc2f update README, fix for getter on optional rule
REVERT: b94028f34 update README, fix for getter on optional rule

git-subtree-dir: runtime/Rust
git-subtree-split: 13d5a35cd3b11763f73278633290a23d2f61caf1

rust target v0.2, visitor, zero-copy, custom tokens

Squashed 'runtime/Rust/' changes from 13d5a35cd..f8beaf8b6

f8beaf8b6 fixed visitor architecture
f8da12f9e update readme and use rustfmt for formatting
d28736137 Fix `enterXXX` listener calls for alternative labels (antlr#13)
f0a2da766 fully finished support for zero-copy, generic token, and generic underlying data.
fdbf64f0f finished generic token support(almost, amend this)
d765c850a preliminary byte parser support, almost fully reformatted with rustfmt
6bb617b51 more flexible tree structure and listener can have any lifetime now, more type safety
1188be780 zero-copy done, input stream changed accordingly.
2e75727b8 zero-copy, input_stream rewritten, docs improved
679319354 wip zero-copy, almost done, most of the tests passing
7833ab8fe wip zerocopy, compiles/passes tests successfully, only parse tree changes remaining
466b370dc wip zerocopy x2, lib compiles successfully
97cb6f8e5 wip zerocopy, compiles successfully
d8078f5fa minor adjustments
6aa622437 added proper build.rs, first change for next version - generic over token type
5bf0b080f fixed sometimes missing hash for prediction context
REVERT: 13d5a35cd fixed sometimes missing hash for prediction context

git-subtree-dir: runtime/Rust
git-subtree-split: f8beaf8b6d54cffa9d262abc54ef8d89511544d3

update CI

added proper downcasting, more generic input and error strategy, fixed context downcasting for predicates

Squashed 'runtime/Rust/' changes from f8beaf8b6..73e3450fe

73e3450fe remove redundant generation artifacts
b39c86a1a update dependencies
57775f2fa updated documentation and readme
be1ccd343 support downcasting in parser code, fixed some warnings
b054ca838 remove unnecessary boxing from apis, make ErrorStrategy generic to allow more inlining

git-subtree-dir: runtime/Rust
git-subtree-split: 73e3450fefc210949328d6cfd2cb0dc960e5972c

fixed cargo flags for CI

Rust target: support arbitrary visitor lifetime

Squashed 'runtime/Rust/' changes from 73e3450fe..bc5460134

bc5460134 rewrite visitor architecture to support arbitrary visitor lifetime
fa24927d5 exclude build.rs from package
8a0b33619 remove unused functions

git-subtree-dir: runtime/Rust
git-subtree-split: bc546013454e49badcdbb520e21a271054edf486

refactorings for 0.2 release, fix for separate file lexer and parser

move rust target to submodule

Use fork antlr-rust
rrevenantt authored and nmeylan committed May 14, 2022
1 parent e4c1a74 commit 51edd90
Showing 54 changed files with 4,364 additions and 252 deletions.
100 changes: 6 additions & 94 deletions .gitignore
@@ -1,98 +1,10 @@
# Maven build folders
target/
# ... but not code generation targets
!tool/src/org/antlr/v4/codegen/target/

# Node.js (npm and typings) cached dependencies
node_modules/
typings/

# Ant build folders
build/
dist/
lib/
user.build.properties

# MacOSX files
.DS_Store

## Python, selected lines from https://raw.githubusercontent.com/github/gitignore/master/Python.gitignore
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

## CSharp and VisualStudio, selected lines from https://raw.githubusercontent.com/github/gitignore/master/VisualStudio.gitignore
# User-specific files
*.suo
*.user
*.userosscache
*.sln.docstates

# User-specific files (MonoDevelop/Xamarin Studio)
*.userprefs
*.user
.vs/
project.lock.json

# Build results
[Dd]ebug/
[Dd]ebugPublic/
[Rr]elease/
[Rr]eleases/
x64/
x86/
bld/
[Bb]in/
[Oo]bj/
[Ll]og/

# Visual Studio 2015 cache/options directory
.vs/

# NetBeans user configuration files
nbactions*.xml
/nbproject/private/
*/nbproject/private/

# IntelliJ projects
*.iws
*.iml
.idea/

# Eclipse projects
.classpath
.project
.settings/
.metadata

# Profiler results
*.hprof

# parrt's bash prompt data
.fetch_time_cache

# Playground
#/tool/playground/

# Generated files
/out/
/gen/
/gen3/
/gen4/
/tool/playground/
tmp/

# Configurable build files
bilder.py
bilder.pyc
bild.log

bild_output.txt
runtime/Cpp/demo/generated
xcuserdata
*.jar
.idea
.vscode
/target
/tests/gen/*.tokens
/tests/gen/*.interp
**/*.rs.bk
Cargo.lock

# VSCode Java plugin temporary files
javac-services.0.log
6 changes: 6 additions & 0 deletions .gitmodules
@@ -0,0 +1,6 @@
[submodule "runtime/PHP"]
path = runtime/PHP
url = https://github.com/antlr/antlr-php-runtime.git
[submodule "runtime/Rust"]
path = runtime/Rust
url = https://github.com/nmeylan/antlr4rust
7 changes: 7 additions & 0 deletions .travis/before-install-linux-rust.sh
@@ -0,0 +1,7 @@
#!/bin/bash

set -euo pipefail

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain nightly-2020-12-23 -y
export PATH=$HOME/.cargo/bin:$PATH
( rustc --version ; cargo --version ) || true
6 changes: 6 additions & 0 deletions .travis/run-tests-rust.sh
@@ -0,0 +1,6 @@
#!/bin/bash

set -euo pipefail

export PATH=$HOME/.cargo/bin:$PATH
mvn test -Dtest=rust.*Left* -q
36 changes: 36 additions & 0 deletions Cargo.toml
@@ -0,0 +1,36 @@
[package]
name = "antlr-rust"
version = "0.2.0-dev.2"
authors = ["Konstantin Anisimov <rrevenantt@gmail.com>"]
homepage = "https://github.com/rrevenantt/antlr4rust"
repository = "https://github.com/rrevenantt/antlr4rust"
documentation = "https://docs.rs/antlr-rust"
description = "ANTLR4 runtime for Rust"
readme = "README.md"
edition = "2018"
license = "BSD-3-Clause"
keywords = ["ANTLR","ANTLR4","parsing","runtime"]
categories = ["parsing"]
exclude = ["build.rs"]

[dependencies]
lazy_static = "^1.4"
uuid = "=0.8.*"
byteorder = "^1"
murmur3 = "=0.4" # 0.5 is incompatible currently
bit-set = "=0.5.*"
once_cell = "^1.2"
#backtrace = "=0.3"
typed-arena = "^2.0"
better_any = "=0.1"

[lib]

#[[test]]
#name = "my_test"
#path="tests/my_test.rs"


[profile.release]
#opt-level = 3
#debug = true
172 changes: 111 additions & 61 deletions README.md
@@ -1,64 +1,114 @@
# ANTLR v4

[![Java 7+](https://img.shields.io/badge/java-7+-4c7e9f.svg)](http://java.oracle.com)
[![License](https://img.shields.io/badge/license-BSD-blue.svg)](https://raw.githubusercontent.com/antlr/antlr4/master/LICENSE.txt)

**Build status**

[![Github CI Build Status (MacOSX)](https://img.shields.io/github/workflow/status/antlr/antlr4/MacOSX?label=MacOSX)](https://github.com/antlr/antlr4/actions)
[![AppVeyor CI Build Status (Windows)](https://img.shields.io/appveyor/build/parrt/antlr4?label=Windows)](https://ci.appveyor.com/project/parrt/antlr4)
[![Circle CI Build Status (Linux)](https://img.shields.io/circleci/build/gh/antlr/antlr4/master?label=Linux)](https://app.circleci.com/pipelines/github/antlr/antlr4)
[![Travis-CI Build Status (Swift-Linux)](https://img.shields.io/travis/antlr/antlr4.svg?label=Linux-Swift&branch=master)](https://travis-ci.com/github/antlr/antlr4)

**ANTLR** (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build parse trees and also generates a listener interface (or visitor) that makes it easy to respond to the recognition of phrases of interest.

*Given day-job constraints, my time working on this project is limited so I'll have to focus first on fixing bugs rather than changing/improving the feature set. Likely I'll do it in bursts every few months. Please do not be offended if your bug or pull request does not yield a response! --parrt*

[![Donate](https://www.paypal.com/en_US/i/btn/x-click-butcc-donate.gif)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=BF92STRXT8F8Q)

## Authors and major contributors

* [Terence Parr](http://www.cs.usfca.edu/~parrt/), parrt@cs.usfca.edu
ANTLR project lead and supreme dictator for life
[University of San Francisco](http://www.usfca.edu/)
* [Sam Harwell](http://tunnelvisionlabs.com/) (Tool co-author, Java and original C# target)
* [Eric Vergnaud](https://github.com/ericvergnaud) (Javascript, Python2, Python3 targets and maintenance of C# target)
* [Peter Boyer](https://github.com/pboyer) (Go target)
* [Mike Lischke](http://www.soft-gems.net/) (C++ completed target)
* Dan McLaughlin (C++ initial target)
* David Sisson (C++ initial target and test)
* [Janyou](https://github.com/janyou) (Swift target)
* [Ewan Mellor](https://github.com/ewanmellor), [Hanzhou Shi](https://github.com/hanjoes) (Swift target merging)
* [Ben Hamilton](https://github.com/bhamiltoncx) (Full Unicode support in serialized ATN and all languages' runtimes for code points > U+FFFF)
* [Marcos Passos](https://github.com/marcospassos) (PHP target)
* [Lingyu Li](https://github.com/lingyv-li) (Dart target)

## Useful information

* [Release notes](https://github.com/antlr/antlr4/releases)
* [Getting started with v4](https://github.com/antlr/antlr4/blob/master/doc/getting-started.md)
* [Official site](http://www.antlr.org/)
* [Documentation](https://github.com/antlr/antlr4/blob/master/doc/index.md)
* [FAQ](https://github.com/antlr/antlr4/blob/master/doc/faq/index.md)
* [ANTLR code generation targets](https://github.com/antlr/antlr4/blob/master/doc/targets.md)<br>(Currently: Java, C#, Python2|3, JavaScript, Go, C++, Swift, Dart, PHP)
* [Java API](http://www.antlr.org/api/Java/index.html)
* [ANTLR v3](http://www.antlr3.org/)
* [v3 to v4 Migration, differences](https://github.com/antlr/antlr4/blob/master/doc/faq/general.md)

You might also find the following pages useful, particularly if you want to mess around with the various target languages.
# antlr4rust
[![Crate](https://flat.badgen.net/crates/v/antlr-rust)](https://crates.io/crates/antlr_rust)
[![docs](https://flat.badgen.net/badge/docs.rs/v0.2.0-dev.2)](https://docs.rs/antlr-rust/0.2.0-dev.2)

[ANTLR4](https://github.com/antlr/antlr4) runtime for the Rust programming language.

The tool (generator) part currently lives in the `rust-target` branch of my antlr4 fork: [rrevenantt/antlr4/tree/rust-target](https://github.com/rrevenantt/antlr4/tree/rust-target).
The latest version is automatically built and published to the [releases](https://github.com/rrevenantt/antlr4rust/releases) of this repository.
You can also check out that branch and build it with `mvn -DskipTests install`.

For examples, see [grammars](grammars), [tests/gen](tests/gen) for the corresponding generated code,
and [tests/my_test.rs](tests/my_test.rs) for actual usage examples.

### Implementation status

For now, development happens in this repository,
but eventually it will be merged into the main ANTLR4 repo.

Currently, a nightly version of Rust is required.
This will likely remain the case until `coerce_unsized` or some other kind of coercion trait is stabilized.
Other unstable features are in use, but only `CoerceUnsized` is essential.

Remaining things before merge:
- API stabilization
- [ ] Rust api guidelines compliance
- [ ] more tests for API because it is quite different from Java

Can be done after merge:
- more profiling and performance optimizations
- Documentation
- [ ] Some things are already documented but still far from perfect, also more links needed.
- Code quality
- [ ] Clippy sanitation
- [ ] Not all warnings are fixed
- cfg to not build potentially unnecessary parts
(no Lexer if custom token stream, no ParserATNSimulator if LL(1) grammar)
- run rustfmt on generated parser
###### Long term improvements
- generate enum for labeled alternatives without redundant `Error` option
- option to generate fields instead of getters by default and make visiting based on fields
- make tree generic over pointer type and allow tree nodes to be allocated in an arena.
(requires GAT, otherwise it would be a problem for users that want ownership of the parse tree)
- support stable rust
- support no_std(although alloc would still be required)

### Usage

You should use the ANTLR4 "tool" to generate a parser, that will use the ANTLR
runtime, located here. You can run it with the following command:
```bash
java -jar <path to ANTLR4 tool> -Dlanguage=Rust MyGrammar.g4
```
For a full list of antlr4 tool options, please visit the
[tool documentation page](https://github.com/antlr/antlr4/blob/master/doc/tool-options.md).

You can also see [build.rs](build.rs) as an example of a `build.rs` configuration
that rebuilds the parser automatically when the grammar file changes.

Then add the following to the `Cargo.toml` of the crate that will use
the generated parser:
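
As a rough illustration of the idea (not the repository's actual `build.rs`), a build script can tell Cargo to re-run when the grammar changes and then invoke the tool with the command shown above. The jar path, grammar path, and output directory below are hypothetical placeholders, and `generator_args` is a helper introduced only for this sketch.

```rust
use std::path::Path;
use std::process::Command;

/// Builds the argument list for `java`, mirroring the command line shown above.
/// `-o` is the standard ANTLR tool option for the output directory.
fn generator_args(tool_jar: &str, grammar: &str, out_dir: &str) -> Vec<String> {
    vec![
        "-jar".to_string(),
        tool_jar.to_string(),
        "-Dlanguage=Rust".to_string(),
        "-o".to_string(),
        out_dir.to_string(),
        grammar.to_string(),
    ]
}

fn main() {
    // Hypothetical locations; adjust them to your project layout.
    let tool_jar = "tools/antlr4-tool.jar";
    let grammar = "grammars/MyGrammar.g4";
    let out_dir = "src/gen";

    // Ask Cargo to re-run this script only when the grammar file changes.
    println!("cargo:rerun-if-changed={}", grammar);

    // Skip generation instead of failing the build if the tool is not present.
    if !Path::new(tool_jar).exists() {
        println!("cargo:warning=ANTLR tool jar not found at {}, skipping generation", tool_jar);
        return;
    }

    let status = Command::new("java")
        .args(generator_args(tool_jar, grammar, out_dir))
        .status()
        .expect("failed to launch java");
    assert!(status.success(), "ANTLR parser generation failed");
}
```

Keeping the argument list in a separate function makes it easy to check without a Java installation.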
```toml
[dependencies]
antlr-rust = "=0.2.0-dev.1"
```
and `#![feature(try_blocks)]` in your project root module.

* [How to build ANTLR itself](https://github.com/antlr/antlr4/blob/master/doc/building-antlr.md)
* [How we create and deploy an ANTLR release](https://github.com/antlr/antlr4/blob/master/doc/releasing-antlr.md)

## The Definitive ANTLR 4 Reference

Programmers run into parsing problems all the time. Whether it’s a data format like JSON, a network protocol like SMTP, a server configuration file for Apache, a PostScript/PDF file, or a simple spreadsheet macro language—ANTLR v4 and this book will demystify the process. ANTLR v4 has been rewritten from scratch to make it easier than ever to build parsers and the language applications built on top. This completely rewritten new edition of the bestselling Definitive ANTLR Reference shows you how to take advantage of these new features.

You can buy the book [The Definitive ANTLR 4 Reference](http://amzn.com/1934356999) at amazon or an [electronic version at the publisher's site](https://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference).
### Parse Tree structure

It is possible to generate idiomatic Rust syntax trees. To do so, use the labels feature of the ANTLR tool.
See the [Labels](grammars/Labels.g4) grammar for an example.
Consider the following rule:
```text
e : a=e op='*' b=e # mult
| left=e '+' b=e # add
```
For such a rule ANTLR will generate the enum `EContextAll` containing `mult` and `add` alternatives,
so you will be able to match on them in your code.
The corresponding struct for each alternative will also contain the fields you labeled.
For example, the `MultContext` struct will contain `a` and `b` fields holding the child subtrees and
an `op` field of type `TerminalNode`, which corresponds to an individual `Token`.
It is also possible to disable generic parse tree creation and keep only selected children via
`parser.build_parse_trees = false`, but unfortunately this currently prevents visitors from working.
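
To make the shape described above concrete, here is a minimal, self-contained sketch of matching on an enum of labeled alternatives. The types `ExprAlt`, `MultNode`, and `AddNode` are simplified, hypothetical stand-ins written for this illustration; they are not the generated `EContextAll`/`MultContext` types, whose exact definitions and accessors are not shown here. The `Atom` variant is also an assumption, added only so the example tree has leaves.

```rust
// Hypothetical stand-ins that mimic the described layout: one enum variant per
// labeled alternative, and one struct per alternative holding the labeled children.
#[allow(dead_code)]
enum ExprAlt {
    Mult(MultNode),
    Add(AddNode),
    Atom(i64), // not part of the rule above; added so the tree has leaves
}

#[allow(dead_code)]
struct MultNode {
    a: Box<ExprAlt>,
    op: &'static str, // stands in for the `op` terminal node
    b: Box<ExprAlt>,
}

struct AddNode {
    left: Box<ExprAlt>,
    b: Box<ExprAlt>,
}

/// Evaluates the tree by matching on the alternative, as one would on the generated enum.
fn eval(e: &ExprAlt) -> i64 {
    match e {
        ExprAlt::Mult(m) => eval(&m.a) * eval(&m.b),
        ExprAlt::Add(n) => eval(&n.left) + eval(&n.b),
        ExprAlt::Atom(v) => *v,
    }
}

/// Builds the tree for `2 * (3 + 4)`.
fn sample_tree() -> ExprAlt {
    ExprAlt::Mult(MultNode {
        a: Box::new(ExprAlt::Atom(2)),
        op: "*",
        b: Box::new(ExprAlt::Add(AddNode {
            left: Box::new(ExprAlt::Atom(3)),
            b: Box::new(ExprAlt::Atom(4)),
        })),
    })
}

fn main() {
    println!("{}", eval(&sample_tree())); // prints 14
}
```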

### Differences with Java
Although the Rust runtime API has been kept as close as possible to the Java one,
there are quite a few differences because Rust is not an OOP language and is much more explicit.

- If you use labeled alternatives,
the struct generated for the rule is an enum with a variant for each alternative.
- The parser needs ownership of its listeners, but a listener can be retrieved via `ListenerId`;
otherwise `ParseTreeWalker` should be used.
- In embedded actions, use the `recog` variable instead of `self`/`this` to access the parser.
This is because predicates have to be inserted into two syntactically different places in the generated parser,
and in one of them it is impossible to have the parser as `self`.
- The str-based `InputStream` has different index behavior when there are Unicode characters.
If you need exactly the same behavior, use the `[u32]`-based `InputStream`, or implement a custom `CharStream`.
- In actions you have to escape `'` in Rust lifetimes with `\ ` because ANTLR treats them as strings, e.g. `Struct<\'lifetime>`
- To make custom tokens, use the `@tokenfactory` custom action instead of the usual `TokenLabelType` parser option.
ANTLR parser options can accept only single identifiers, while the Rust target needs to know about the lifetime as well.
Also, in the Rust target `TokenFactory` is the way to specify the token type. For an example, see the [CSV](grammars/CSV.g4) test grammar.
- All rule context variables (rule arguments or rule returns) should implement `Default + Clone`.
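
The index difference mentioned above for `str` input can be illustrated with plain Rust, independent of this crate. The sketch assumes the difference comes from `str` positions being UTF-8 byte offsets while a `[u32]` buffer is indexed per code point; the README does not spell out the exact cause, so treat that as an assumption. `byte_and_char_offsets` is a helper written only for this example.

```rust
/// Returns the position of `needle` both as a UTF-8 byte offset and as a
/// code point index, to show how the two diverge after a non-ASCII character.
fn byte_and_char_offsets(text: &str, needle: char) -> Option<(usize, usize)> {
    let byte_idx = text.find(needle)?;
    let char_idx = text.chars().position(|c| c == needle)?;
    Some((byte_idx, char_idx))
}

fn main() {
    // U+00E9 takes two bytes in UTF-8, so '=' sits at byte 2 but code point 1.
    let text = "\u{e9}=1";
    println!("{:?}", byte_and_char_offsets(text, '='));

    // A code point buffer gives Java-like indices: one slot per character.
    let code_points: Vec<u32> = text.chars().map(|c| c as u32).collect();
    println!("{}", code_points.len());
}
```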

### Unsafe
Currently, `unsafe` is used only for downcasting (through a separate crate)
and to update data inside `Rc` via `get_mut_unchecked` (the returned mutable reference is used immediately and not stored anywhere).

You will find the [Book source code](http://pragprog.com/titles/tpantlr2/source_code) useful.
### Versioning
In addition to the usual Rust semantic versioning,
patch version changes of the crate should not require updating the generator part.

## Licence

## Additional grammars
[This repository](https://github.com/antlr/grammars-v4) is a collection of grammars without actions where the
root directory name is the all-lowercase name of the language parsed
by the grammar. For example, java, cpp, csharp, c, etc...
BSD 3-clause