Better to see it once than to hear about it a thousand times: Ruby to JSON online :)
1 + 3.14
["send"
["int", 1],
"+",
["float", 3.14]
]
a = 42
["lvasgn", "a", ["int", 42]]
$object.attribute = 123
["send",
["gvar", "$object"],
"attribute=",
["int", 123]
]
As expected, just calling a method.
a = 1
b = a
["begin",
["lvasgn", "a", ["int", 1]],
["lvasgn", "b", ["lvar", "a"]]
]
The second occurrence of a (in b = a) is treated as a local variable, ["lvar","a"], not as a method call.
Here begin is a synthetic wrapper node for more than one statement.
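Because a single statement arrives bare while several statements arrive inside a begin node, a consumer of the AST usually wants to normalize both shapes. A minimal sketch, assuming the JSON form shown above; the helper name `statements` is hypothetical and not part of this project:

```javascript
// Hypothetical helper: return the list of statements a node stands for.
function statements(node) {
  if (Array.isArray(node) && node[0] === "begin") {
    return node.slice(1); // children of the wrapper node
  }
  return [node]; // a single statement is its own one-element list
}

const program = ["begin",
  ["lvasgn", "a", ["int", 1]],
  ["lvasgn", "b", ["lvar", "a"]]
];
const single = ["lvasgn", "a", ["int", 42]];

console.log(statements(program).length); // 2
console.log(statements(single).length);  // 1
```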
b = a
["lvasgn", "b", ["send", null, "a"]]
Calling a method on nil (null in the JSON output) means calling it on the self of the current scope. Or anything your AST processor needs it to mean :)
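To make the receiver rule concrete, here is a toy evaluator for a handful of node types from the examples above. It is only a sketch of one possible AST processor: the `evaluate` function, the `scope` object layout, and the choice to map Ruby's `+` onto JavaScript's `+` are all assumptions made for illustration.

```javascript
// Toy evaluator for a tiny subset of the AST: int, float, lvasgn, lvar, begin, send.
// `scope` is a hypothetical object: { self: <receiver for bare calls>, locals: {} }.
function evaluate(node, scope) {
  const [type, ...rest] = node;
  switch (type) {
    case "int":
    case "float":
      return rest[0];
    case "lvasgn":
      return (scope.locals[rest[0]] = evaluate(rest[1], scope));
    case "lvar":
      return scope.locals[rest[0]];
    case "begin": {
      let result = null;
      for (const stmt of rest) result = evaluate(stmt, scope);
      return result;
    }
    case "send": {
      const [receiverNode, name, ...argNodes] = rest;
      // A null receiver means "call it on self".
      const receiver = receiverNode === null ? scope.self : evaluate(receiverNode, scope);
      const args = argNodes.map((arg) => evaluate(arg, scope));
      if (name === "+") return receiver + args[0];
      return receiver[name](...args);
    }
    default:
      throw new Error("unsupported node type: " + type);
  }
}

const scope = { self: { a: () => 7 }, locals: {} };
evaluate(["lvasgn", "b", ["send", null, "a"]], scope);
console.log(scope.locals.b); // 7
console.log(evaluate(["send", ["int", 1], "+", ["int", 2]], scope)); // 3
```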
[42, 2.7, 'hello', %{world}]
["array",
["int", 42],
["float", 2.7],
["str", "hello"],
["str", "world"]
]
{a: 1, :b => 2, c => 3}
["hash",
["pair", ["sym","a"], ["int",1]],
["pair", ["sym","b"], ["int",2]],
["pair", ["send",null,"c"], ["int",3]]
]
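Literal nodes like these map naturally onto plain JavaScript values. The following is a sketch under assumptions: `literalValue` is a hypothetical name, symbols and strings are both collapsed into JS strings for simplicity, and anything that is not a literal (such as the `["send",null,"c"]` key above) is rejected.

```javascript
// Hypothetical converter from literal AST nodes to JavaScript values.
function literalValue(node) {
  const [type, ...children] = node;
  switch (type) {
    case "int":
    case "float":
    case "str":
    case "sym": // simplification: a symbol becomes a plain string
      return children[0];
    case "array":
      return children.map(literalValue);
    case "hash": {
      const result = new Map();
      // each child looks like ["pair", keyNode, valueNode]
      for (const [, keyNode, valueNode] of children) {
        result.set(literalValue(keyNode), literalValue(valueNode));
      }
      return result;
    }
    default:
      throw new Error("not a literal: " + type);
  }
}

const list = literalValue(
  ["array", ["int", 42], ["float", 2.7], ["str", "hello"], ["str", "world"]]
);
console.log(JSON.stringify(list)); // [42,2.7,"hello","world"]

const table = literalValue(
  ["hash", ["pair", ["sym", "a"], ["int", 1]], ["pair", ["sym", "b"], ["int", 2]]]
);
console.log(table.get("a"), table.get("b")); // 1 2
```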
puts(<<AAA, <<BBB, <<CCC)
content of AAA
AAA
content of BBB
BBB
CCC
["send",
null, "puts",
// arguments
["str","content of AAA\n"],
["str","content of BBB\n"],
["dstr"] // empty string (#1)
]
Cascade of heredocs.
<<"AAA"
aaa
#{
<<BBB
bbb
#{<<CCC
ccc
CCC
}
BBB
}
AAA
["dstr",
["str", "aaa\n"],
["begin",
["dstr",
["str", "bbb\n"],
["begin",
["str", "ccc\n"]
],
["str", "\n"]
]
],
["str", "\n"]
]
Lots of dynamic strings.
$a.map { |e| e ** 2 }
["block",
// method call
["send", ["gvar","$a"], "map"],
// block arguments
["args",["arg","e"]],
// block body
["send",["lvar","e"],"**",["int",2]]
]
The send node is wrapped by the block node. AST is so abstract :)
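Since a send node can sit at any depth (inside a block, an argument list, a pair and so on), a generic tree walk is a handy building block. A minimal sketch, assuming every node is an array whose first element is the type; `walk` and `calledMethods` are hypothetical names:

```javascript
// Yield every node of the tree in depth-first order.
function* walk(node) {
  if (!Array.isArray(node)) return; // skip names, numbers and null receivers
  yield node;
  for (const child of node.slice(1)) yield* walk(child);
}

// Collect the names of all methods called anywhere in the tree.
function calledMethods(ast) {
  const names = [];
  for (const node of walk(ast)) {
    if (node[0] === "send") names.push(node[2]);
  }
  return names;
}

const ast = ["block",
  ["send", ["gvar", "$a"], "map"],
  ["args", ["arg", "e"]],
  ["send", ["lvar", "e"], "**", ["int", 2]]
];
console.log(JSON.stringify(calledMethods(ast))); // ["map","**"]
```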
This yet-another Ruby parser consists of three parts:
* Lexer
* Parser
* Builder
The original Ruby lexer/parser/AST generator in parse.y keeps all three parts in one file, unfortunately tangled tightly together without any abstraction or namespace isolation. All the other Ruby parsers out there have introduced strong isolation between the parts. So did we.
The lexer here is a line-by-line, bit-by-bit (I hope so) port of the original parse.y lexer from C to JavaScript, with all the gotos emulated somehow. Only a few regexps were introduced, for trivial things like checking for alphabetical chars with /^[a-zA-Z]/. It tries to mimic each and every warning and error it can reach.
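JavaScript has no goto, so a port like this has to fake it. One common way to emulate a backward jump such as a C-style `goto retry` is a labeled loop with `continue`. The sketch below only illustrates that technique; it is not taken from this lexer, and `nextToken` with its token shapes is made up for the example.

```javascript
// Sketch: emulate `goto retry` with a labeled loop.
// Returns the next token starting at `pos` plus the position after it.
function nextToken(input, pos) {
  retry: for (;;) {
    const c = input[pos];
    if (c === undefined) return { type: "EOF", pos };
    if (c === " " || c === "\t") {
      pos++;
      continue retry; // plays the role of `goto retry`
    }
    if (c === "#") {
      // skip the comment up to the end of the line, then start over
      while (pos < input.length && input[pos] !== "\n") pos++;
      continue retry;
    }
    if (/^[a-zA-Z]/.test(c)) {
      let end = pos;
      while (end < input.length && /^[a-zA-Z]/.test(input[end])) end++;
      return { type: "IDENT", value: input.slice(pos, end), pos: end };
    }
    return { type: "CHAR", value: c, pos: pos + 1 };
  }
}

const first = nextToken("   foo bar", 0);
const second = nextToken("   foo bar", first.pos);
console.log(first.value, second.value); // foo bar
```

Forward jumps into the middle of another branch are harder; those usually need a state variable and a switch inside the loop.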
The parser is a precise copy of the original rules from the bison part of parse.y. At this phase the parser relies on a port of the bison state machine to JavaScript. It's a separate project called bison-lalr1.js; AFAICT, the JS port makes exactly the same state transitions and produces the same error messages as the original yacc.c bison skeleton does (compared with ruby -yc).
The builder is a simplified port of the ruby parser AST generation part. It has a brilliant API, so the JS port tries to copy it word for word. As a pleasant outcome, the JS port got the sexy sexp AST format too, and the AST documentation as well.
It can lex, parse and build an AST for giant.rb, ~49000 lines of Ruby code copied from the Opal project, the ActiveRecord gem, and the Realties gem. The parser takes 1.3 sec to do the whole job: 37ms for bootstrap and 1249ms for parsing.
Further optimizations are needed to reduce the garbage produced by the lexer and builder phases.
brew install node
then:
node ruby2json.js 'puts(1 + 2.0)'
Prerequisites:
brew install node
npm install jasmine-node -g
Run:
make check
Standard Jasmine output follows. All the dots have to be green ;)
brew install v8 # for d8
brew install bison # 2.7.1 at the moment
and for make check to work:
npm install jasmine-node -g
then change the sample code in debug.rb and run:
make compare
which will compare the parsing processes. If you hit a bug, run:
make debug
this will log all steps of the parser with all intermediate values.
In case you feel that the problem is in the lexer, run:
make debug-lexer
it will show all the token values the lexer supplies.
If you need to touch the bison rules, please run make diff to see what changed. There must be no diff to parse.js.output :)
Then check the lexer.
To briefly test the lexer+parser tandem, run make compare. It will compare the parsing log of the original ruby with that of this parser. To do so you need to compile ruby 2.0.0-p195 with Bison 2.7.
To update the parser of the original ruby from Bison 2.3 (2006) to Bison 2.7 (2012), install the new version of bison and delete the file parse.c from the ruby sources to force make to call bison.
And one more thing: make your very own ruby 2.0.0-p195 executable (the patch level matters in corner cases) visible under the name ruby20.