update
yui-knk committed Sep 22, 2024
1 parent dee8ffe commit eebee02
Showing 1 changed file with 93 additions and 9 deletions.
102 changes: 93 additions & 9 deletions doc/development/compressed_state_table/main.md
@@ -1,8 +1,45 @@
# Compressed State Table

An LR parser generator produces two large tables: the action table and the GOTO table.
Action table is a matrix of current state and token. Each cell of action table indicates next action (shift, reduce, accept and error).
GOTO table is a matrix of current state and nonterminal symbol. Each cell of GOTO table indicates next state.
The action table is a matrix of states and tokens. Each cell indicates the next action (shift, reduce, accept, or error).
The GOTO table is a matrix of states and nonterminal symbols. Each cell indicates the next state.

Action table of "parse.y":

| |EOF| LF|NUM|'+'|'*'|'('|')'|
|--------|--:|--:|--:|--:|--:|--:|--:|
|State 0| r1| | s1| | | s2| |
|State 1| r3| r3| r3| r3| r3| r3| r3|
|State 2| | | s1| | | s2| |
|State 3| s6| | | | | | |
|State 4| | s7| | s8| s9| | |
|State 5| | | | s8| s9| |s10|
|State 6|acc|acc|acc|acc|acc|acc|acc|
|State 7| r2| r2| r2| r2| r2| r2| r2|
|State 8| | | s1| | | s2| |
|State 9| | | s1| | | s2| |
|State 10| r6| r6| r6| r6| r6| r6| r6|
|State 11| | r4| | r4| s9| | r4|
|State 12| | r5| | r5| r5| | r5|

GOTO table of "parse.y":

| |$accept|program|expr|
|--------|------:|------:|---:|
|State 0| | g3| g4|
|State 1| | | |
|State 2| | | g5|
|State 3| | | |
|State 4| | | |
|State 5| | | |
|State 6| | | |
|State 7| | | |
|State 8| | | g11|
|State 9| | | g12|
|State 10| | | |
|State 11| | | |
|State 12| | | |


Both the action table and the GOTO table are sparse. Therefore, an LR parser generator compresses them into several smaller tables.
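
To make the sparsity concrete, the sketch below stores the GOTO table of "parse.y" shown above as a dense Ruby matrix and counts the used cells. The constant name `GOTO_TABLE` and the use of `nil` for blank cells are choices made for this illustration only; they are not part of the generated parser.

```ruby
# GOTO table of "parse.y" as a dense matrix.
# Rows are State 0..12, columns are [$accept, program, expr].
# nil stands for a blank cell (an assumption of this sketch).
GOTO_TABLE = [
  [nil,   3,   4], # State 0
  [nil, nil, nil], # State 1
  [nil, nil,   5], # State 2
  [nil, nil, nil], # State 3
  [nil, nil, nil], # State 4
  [nil, nil, nil], # State 5
  [nil, nil, nil], # State 6
  [nil, nil, nil], # State 7
  [nil, nil,  11], # State 8
  [nil, nil,  12], # State 9
  [nil, nil, nil], # State 10
  [nil, nil, nil], # State 11
  [nil, nil, nil], # State 12
]

total_cells = GOTO_TABLE.sum(&:size)
used_cells  = GOTO_TABLE.flatten.compact.size

puts "#{used_cells} of #{total_cells} cells are used"
```

Only 5 of the 39 cells carry information, which is why storing offsets into one shared array pays off.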

Expand All @@ -17,7 +54,8 @@ See also: https://speakerdeck.com/yui_knk/what-is-expected?slide=52

### `yypact` & `yypgoto`

`yypact` specifies what to do on the current state.
`yypact` specifies the offset into `yytable` for the current state.
As an optimization, `yypact` also marks the states for which only the default reduce action exists.
The value is accessed by `state`. For example,

```ruby
Expand Down Expand Up @@ -48,7 +86,11 @@ end

### `yytable`

`yytable` specifies what actually to do on the current state.
`yytable` is a mixture of the action table and the GOTO table.

#### For action table

For the action table, `yytable` specifies what to actually do in the current state.

A positive number means shift and specifies the next state.
For example, `yytable[yyn] == 1` means shift and the next state is State 1.
Expand All @@ -59,6 +101,13 @@ For example, `yytable[yyn] == YYTABLE_NINF` means syntax error.
Zero and any other negative number mean reduce with the rule whose number is the negation of the value.
For example, `yytable[yyn] == -1` means reduce with Rule 1.
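
The three cases above can be sketched as a small decoder. This is only an illustration: the helper name `decode_action` and the value chosen for `YYTABLE_NINF` are hypothetical, since the real `YYTABLE_NINF` depends on the grammar.

```ruby
# Hypothetical value for illustration; the real YYTABLE_NINF
# is computed from the generated tables.
YYTABLE_NINF = -100

# Interpret a value read from the action part of yytable.
def decode_action(value)
  if value > 0
    [:shift, value]    # positive: shift, value is the next state
  elsif value == YYTABLE_NINF
    [:error]           # YYTABLE_NINF: syntax error
  else
    [:reduce, -value]  # zero or other negative: reduce with rule -value
  end
end

p decode_action(1)            # => [:shift, 1]
p decode_action(-1)           # => [:reduce, 1]
p decode_action(YYTABLE_NINF) # => [:error]
```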

#### For GOTO table

For the GOTO table, `yytable` specifies the next state for the given LHS nonterminal.

The value is always a positive number, which is the id of the next state.
It is never `YYTABLE_NINF`.

### `yycheck`

`yycheck` validates accesses to `yytable`.
Expand Down Expand Up @@ -90,7 +139,10 @@ yytable = [
`yypact` is an array that holds the offset of each state.

```ruby
yypact = [0, 1]
yypact = [
0, # State 0 is not shifted
1 # State 1 is shifted one to right
]
```

We can access the value of `state1[2]` by consulting `yypact`.
Expand Down Expand Up @@ -298,7 +350,7 @@ In this case, `0` is the minimum offset number then `YYTABLE_NINF` is `-1`.
### `yypact` & `yypgoto`

`yypact` and `yypgoto` hold a mixture of offsets into `yytable` and `YYPACT_NINF` (which signals the default reduce action).
The index in `yypact` is state id, the index in `yypgoto` is nonterminal symbol id.
The index into `yypact` is a state id, and the index into `yypgoto` is a nonterminal symbol id.
`YYPACT_NINF` is a negative number smaller than every valid offset.
In this case, the minimum offset is `-3`, so `YYPACT_NINF` is `-4`.

Expand Down Expand Up @@ -374,10 +426,10 @@ yydefgoto = [

### `yyr1` & `yyr2`

Both of them are Rule table.
Both are tables that describe rules.
`yyr1` specifies the nonterminal symbol id of the rule's left-hand side.
`yyr2` specifies the length of the rule, number of symbols on the rule's Right-Hand-Side.
Index 0
`yyr2` specifies the length of the rule, that is, the number of symbols on the rule's right-hand side.
Index 0 is not used because rule ids start at 1.
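
As a small illustration of how `yyr2` is consulted during a reduce, the sketch below pops one state per right-hand-side symbol. The `yyr2` values are the ones listed in the "How to use tables" section; the helper name `pop_states_for` and the sample stack are hypothetical.

```ruby
# Rule lengths; index 0 is unused because rule ids start at 1.
yyr2 = [0, 2, 0, 2, 1, 3, 3, 3]

# Hypothetical helper: remove one state per RHS symbol of `rule`
# from the state stack and return the remaining stack.
def pop_states_for(stack, yyr2, rule)
  stack.pop(yyr2[rule])
  stack
end

stack = [0, 5, 9, 12]           # sample state stack, made up for this sketch
pop_states_for(stack, yyr2, 5)  # yyr2[5] == 3, so three states are popped
p stack                         # => [0]
```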

```ruby
yyr1 = [
Expand All @@ -394,6 +446,10 @@ yyr2 = [

## How to use tables

See also "parse.rb", which implements an LALR parser based on the "parse.y" file.

First, define the important constants and arrays:

```ruby
YYNTOKENS = 9

Expand All @@ -419,6 +475,9 @@ yyr2 = [ 0, 2, 0, 2, 1, 3, 3, 3]

Determine what to do next based on current state (`state`) and next token (`yytoken`).

The first step in deciding the action is to look up `yypact` by the current state.
If only the default reduce exists for the current state, `yypact` returns `YYPACT_NINF`.

```ruby
# Case 1: Only default reduce exists for the state
#
Expand All @@ -438,6 +497,11 @@ if offset == YYPACT_NINF # true
end
```

If both shift and default reduce exist for the current state, `yypact` returns an offset into `yytable`.
The index is the sum of `offset` and `yytoken`.
Before accessing `yytable`, the index must be validated by consulting `yycheck`.
The index can also be out of range because blank cells at the head and tail of each row are omitted, so first check that the index is not less than 0 and not greater than `YYLAST`. See how `yycheck` is constructed in the example above.
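
The checks described above can be sketched as follows. The miniature tables are invented for this illustration and do not correspond to "parse.y". The sketch also assumes, as in Bison's skeleton, that `yycheck` stores the token id that owns each `yytable` cell. The helper name `lookup_action` is hypothetical.

```ruby
# Hypothetical miniature tables, invented for this sketch.
YYPACT_NINF = -4
YYLAST      = 3
YYPACT  = [YYPACT_NINF, 1]   # indexed by state
YYTABLE = [0, 7, -2, 5]
YYCHECK = [-1, 0, 1, 2]      # token id that owns each yytable cell

# Returns the raw yytable value, or :default when the default
# reduce action should be taken instead.
def lookup_action(state, yytoken)
  offset = YYPACT[state]
  return :default if offset == YYPACT_NINF

  index = offset + yytoken
  return :default if index < 0 || index > YYLAST  # omitted head/tail cells
  return :default if YYCHECK[index] != yytoken    # cell belongs to another entry

  YYTABLE[index]
end

p lookup_action(0, 0) # => :default
p lookup_action(1, 0) # => 7
p lookup_action(1, 1) # => -2
p lookup_action(1, 3) # => :default
```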

```ruby
# Case 2: Both shift and default reduce exist for the state
#
Expand Down Expand Up @@ -493,6 +557,26 @@ end

### Execute (default) reduce

Once the next action is decided to be the default reduce, we need to determine

1. the rule to be applied
2. the next state from the GOTO table

The rule id for the default reduce is stored in `yydefact`.
`0` in `yydefact` means syntax error, so check that the value is not `0` before continuing the process.

Once the rule is determined, the length of the rule can be read from `yyr2` and the LHS nonterminal can be read from `yyr1`.

The next state is determined by the LHS nonterminal and the state on top of the stack after the reduce.
The GOTO table is also compressed into `yytable`, so the process to decide the next state is similar to the one for `yypact`.

1. Look up `yypgoto` by the LHS nonterminal. Note that `yypact` is indexed by state but `yypgoto` is indexed by nonterminal.
2. Check whether the value in `yypgoto` is `YYPACT_NINF`.
3. Check whether the index, the sum of the offset and the state, is out of range.
4. Check the `yycheck` table before accessing `yytable`.

Finally, push the next state onto the stack.
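
The four steps above can be sketched as a single lookup. The miniature tables are invented for this illustration and do not correspond to "parse.y". The sketch makes three assumptions that mirror Bison's skeleton: it falls back to `yydefgoto` whenever a check fails, `lhs` is already an index into `yypgoto`, and `yycheck` stores the state that owns each cell. The helper name `goto_state` is hypothetical.

```ruby
# Hypothetical miniature tables, invented for this sketch.
TABLES = {
  yypact_ninf: -4,
  yylast:      3,
  yypgoto:     [-4, 0],       # indexed by nonterminal
  yydefgoto:   [2, 6],        # fallback next state per nonterminal
  yytable:     [7, 9, 0, 5],
  yycheck:     [0, 1, -1, 3]  # state that owns each yytable cell
}

# Returns the next state after reducing to nonterminal `lhs`
# while `state` is on top of the stack.
def goto_state(lhs, state, t)
  offset = t[:yypgoto][lhs]                                     # step 1
  return t[:yydefgoto][lhs] if offset == t[:yypact_ninf]        # step 2

  index = offset + state
  return t[:yydefgoto][lhs] if index < 0 || index > t[:yylast]  # step 3
  return t[:yydefgoto][lhs] if t[:yycheck][index] != state      # step 4

  t[:yytable][index]
end

stack = [1]
stack.push(goto_state(1, stack.last, TABLES))
p stack # => [1, 9]
```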

```ruby
# State 11
#
Expand Down
